Search Results

Search found 1769 results on 71 pages for 'architechture concepts'.

Page 20/71 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

Understanding Node.js and concept of non-blocking I/O

- by Saif Bechan

Recently I became interested in using Node.js to tackle some of the parts of my web-application. I love the part that its full JavaScript and its very light weight so no use anymore to call an JavaScript-PHP call but a lighter JavaScript-JavaScript call. I however do not understand all the concepts explained. Basic concepts Now in the presentation for Node.js Ryan Dahl talks about non-blocking IO and why this is the way we need to create our programs. I can understand the theoretical concept. You just don't wait for a response, you go ahead and do other things. You make a callback for the response, and when the response arrives millions of clock-cycles later, you can fire that. If you have not already I recommend to watch this presentation. It is very easy to follow and pretty detailed. There are some nice concepts explained on how to write your code in a good manner. There are also some examples given and I am going to work with the basic example given. Examples The way we do thing now: puts("Enter your name: "); var name = gets(); puts("Name: " + name); Now the problem with this is that the code is halted at line 1. It blocks your code. The way we need to do things according to node puts("Enter your name: "); gets(function (name) { puts("Name: " + name); }); Now with this your program does not halt, because the input is a function within the output. So the programs continues to work without halting. Questions Now the basic question I have is how does this work in real-life situations. I am talking here for the use in web-applications. The application I am writing does I/O, bit is still does it in am blocking matter. I think that most of the time, if not all, you need to block, because you have to wait on what the response is you have to work with. When you need to get some information from the database, most of the time this data needs to be verified before you can further with the code. Example 1 If you take a login for example. You have to wait for the database to response to return, because you can not do anything else. I can't see a way around this without blocking. Example 2 Going back to the basic example. The use just request something from a database which does not need any verification. You still have to block because you don't have anything to do more. I can not come up with a single example where you want to do other things while you wait for the response to return. Possible answers I have read that this frees up recourses. When you program like this it takes less CPU or memory usage. So this non-blocking IO is ONLY meant to free up recourses and does not have any other practical use. Not that this is not a huge plus, freeing up recourses is always good. Yet I fail to see this as a good solution. because in both of the above examples, the program has to wait for the response of the user. Whether this is inside a function, or just inline, in my opinion there is a program that wait for input. Resources I looked at I have looked at some recourses before I posted this question. They talk a lot about the theoretical concept, which is quite clear. Yet i fail to see some real-life examples where this is makes a huge difference. Stackoverflow: What is in simple words blocking IO and non-blocking IO? Blocking IO vs non-blocking IO; looking for good articles tidy code for asynchronous IO Other recources: Wikipedia: Asynchronous I/O Introduction to non-blocking I/O The C10K problem

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Lies, damned lies, and statistics Part 2

- by Maria Colgan

There was huge interest in our OOW session last year on Managing Optimizer Statistics. It seems statistics and the maintenance of them continues to baffle people. In order to help dispel the mysteries surround statistics management we have created a two part white paper series on Optimizer statistics. Part one of this series was released in November last years and describes in detail, with worked examples, the different concepts of Optimizer statistics. Today we have published part two of the series, which focuses on the best practices for gathering statistics, and examines specific use cases including, the fears that surround histograms and statistics management of volatile tables like Global Temporary Tables. Here is a quick look at the Introduction and the start of the paper. You can find the full paper here. Happy Reading! Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:"Times New Roman","serif";} Introduction The Oracle Optimizer examines all of the possible plans for a SQL statement and picks the one with the lowest cost, where cost represents the estimated resource usage for a given plan. In order for the Optimizer to accurately determine the cost for an execution plan it must have information about all of the objects (table and indexes) accessed in the SQL statement as well as information about the system on which the SQL statement will be run. This necessary information is commonly referred to as Optimizer statistics. Understanding and managing Optimizer statistics is key to optimal SQL execution. Knowing when and how to gather statistics in a timely manner is critical to maintaining acceptable performance. This whitepaper is the second of a two part series on Optimizer statistics. The first part of this series, Understanding Optimizer Statistics, focuses on the concepts of statistics and will be referenced several times in this paper as a source of additional information. This paper will discuss in detail, when and how to gather statistics for the most common scenarios seen in an Oracle Database. The topics are · How to gather statistics · When to gather statistics · Improving the efficiency of gathering statistics · When not to gather statistics · Gathering other types of statistics How to gather statistics The preferred method for gathering statistics in Oracle is to use the supplied automatic statistics-gathering job. Automatic statistics gathering job The job collects statistics for all database objects, which are missing statistics or have stale statistics by running an Oracle AutoTask task during a predefined maintenance window. Oracle internally prioritizes the database objects that require statistics, so that those objects, which most need updated statistics, are processed first. The automatic statistics-gathering job uses the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure, which uses the same default parameter values as the other DBMS_STATS.GATHER_*_STATS procedures. The defaults are sufficient in most cases. However, it is occasionally necessary to change the default value of one of the statistics gathering parameters, which can be accomplished by using the DBMS_STATS.SET_*_PREF procedures. Parameter values should be changed at the smallest scope possible, ideally on a per-object bases. You can find the full paper here. Happy Reading! +Maria Colgan

Read the article
Windows 8 Camp–Ways to Prepare

- by Lori Lalonde

When Windows 8 was announced at the BUILD conference back in September, it created quite a buzz among the developer community. By the spring of 2012, Windows 8 Developer Camps started popping up everywhere imaginable. I received a lot of questions from CTTDNUG members about whether or not we would be hosting one locally. If you recall my post about the Windows Phone/Azure Developer Workshop that CTTDNUG hosted back in March, you’ll remember that the biggest hurdle to overcome when planning this type of event was finding the right venue. It took some time, but I finally found a venue that was available and provided the prerequisites needed to ensure this camp is a success. I am very excited that CTTDNUG will be hosting a Windows 8 Camp this summer in the Kitchener/Waterloo area. In fact, it’s coming up in less than 2 weeks. Clearly other developers are excited as well, because our registration numbers show that the event is already 70% full! On top of that, I was fortunate enough to also book two well-known evangelists to present and teach at this full day developer camp: Andrei Marukovich and Atley Hunter. This was the icing on the cake. With the content provided by Microsoft, and two local experts that live and breathe Windows 8 development, I know that I, along with other developers that attend this event, will have the opportunity to maximize our learning potential and hit the ground running. If you plan on attending a Windows 8 Developer Camp soon, and want to ensure you get the most “bang for your buck” (figuratively speaking, since these camps are free), there are some things you can do to prepare before the big day: 1) Install the prerequisites on your own device before the big day I can’t stress this enough. Otherwise, you will be spending valuable time during the hands-on period downloading and installing what is needed, rather than digging into the development and using that time to ask the experts on-hand about programming challenges, issues, questions you may have with respect to your development. Prerequisites: Windows 8 Release Preview Visual Studio 2012 RC Download the Windows 8 SDK Samples 2) Purchase, download, and read Charles Petzold’s newest book: Programming Windows 6th Edition This is a great introduction to the type of content you will be learning about during the camp. Doing some light reading beforehand might raise some questions about the concepts discussed in the book, which will give you the opportunity to write them down and bring them with you to the camp. The experts on hand will be able to answer them for you. 3) Make use of the freebies that are available Telerik has recently released a preview of their RadControls for Metro. You can sign up to receive a license code to give you access to install the preview for free and start playing around with it. Syncfusion also offers a free download of their Metro Studio package, which is a collection of metro style icons that you can customize and use in your own applications. Last but not least, once you’ve installed the Windows 8 Release Preview on your own device, go to the Windows 8 Store and download a handful of the free apps that are available. Testing out other Metro apps may give you ideas of what you can do in your own apps and analyze what features you like: application flow, type of animations used, concepts that were leveraged, how live tiles were used, etc. I hope you found these tips to be useful as you embark on a new development journey! Although this post focused on how to prepare for a Windows 8 camp, the same ideas are there whichever developer camp/workshop/event you attend. Learning does not begin and end on the day of the event. Attending a developer camp is just one step of many to master whatever technology you are interested in. It is a continuous process, which is fully maximized when you do your homework beforehand, actively participate during, and follow up by putting what you learned to practice afterwards. Happy coding!

Read the article
Developer Training – 6 Online Courses to Learn SQL Server, MySQL and Technology

- by Pinal Dave

Video courses are the next big thing and I am so happy that I have so far authored 6 different video courses with Pluralsight. Here is the list of the courses. I have listed all of my video courses over here. Note: If you click on the courses and it does not open, you need to login to Pluralsight with a valid username and password or sign up for a FREE trial. Please leave a comment with your favorite course in the comment section. Random 10 winners will get surprise gift via email. Bonus: If you list your favorite module from the course site. SQL Server Performance: Introduction to Query Tuning SQL Server performance tuning is an in-depth topic, and an art to master. A key component of overall application performance tuning is query tuning. Writing queries in an efficient manner, and making sure they execute in the most optimal way possible, is always a challenge. The basics revolve around the details of how SQL Server carries out query execution, so the optimizations explored in this course follow along the same lines. Click to View Course SQL Server Performance: Indexing Basics Indexes are the most crucial objects of the database. They are the first stop for any DBA and Developer when it is about performance tuning. There is a good side as well evil side of the indexes. To master the art of performance tuning one has to understand the fundamentals of the indexes and the best practices associated with the same. This course is for every DBA and Developer who deals with performance tuning and wants to use indexes to improve the performance of the server. Click to View Course SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. This course is for anyone working with SQL Server databases who wants to improve her knowledge and understanding of this complex platform. Click to View Course MySQL Fundamentals MySQL is a popular choice of database for use in web applications, and is a central component of the widely used LAMP open source web application software stack. This course covers the fundamentals of MySQL, including how to install MySQL as well as written basic data retrieval and data modification queries. Click to View Course Building a Successful Blog Expressing yourself is the most common behavior of humans. Blogging has made easy to express yourself. Just like a letter or book has a structure and formula, blogging also has structure and formula. In this introductory course on blogging we will go over a few of the basics of blogging and show the way to get started with blogging immediately. If you already have a blog, this course will be even more relevant as this will discuss many of the common questions and issue you face in your blogging routine. Click to View Course Introduction to ColdFusion ColdFusion is rapid web application development platform. In this course you will learn the basics of how to use ColdFusion platform and rapidly develop web sites. The course begins with learning basics of ColdFusion Markup Language and moves to common development language practices. From there we move to frequent database operations and advanced concepts of Forms, Sessions and Cookies. The last module sums up all the concepts covered in the course with sample application. Click to View Course Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
SQL SERVER – #TechEdIn – Presenting Tomorrow on Speed Up! – Parallel Processes and Unparalleled Performance at TechEd India 2012

- by pinaldave

Performance tuning is always a very hot topic when it is about SQL Server. SQL Server Performance Tuning is a very challenging subject that requires expertise in Database Administration and Database Development. I always have enjoyed talking about SQL Server Performance tuning subject. However, in India, it’s actually the very first time someone is presenting on this interesting subject, so this time I had the biggest challenge to present this session. Frequently enough, we get these two kind of questions: How to turn off parallelism as it is reducing performance? How to turn on parallelism as I want more performance? The reality is that not everyone knows what exactly is needed by their system. In this session, I have attempted to answer this very question. I’ve decided to provide a balanced view but stay away from theory, which leads us to say “It depends”. The session will have a clear message about this towards its end. Deck Details Slides: 45+ Demos: 7+ Bonus Quiz: 5 Images: 10+ Session delivery time: 52 Mins + 8 Mins of Q & A I have presented this session a couple of times to my friends and so far have received good feedback. Oftentimes, when people hear that I am going to present 45 slides, they all say it is too much to cover. However, when I am done with the session the usual reaction is that I truly gave justice to those slides. Action Item Here are a few of the action items for all of those who are going to attend this session: If you want to attend the session, just come early. There’s a good chance that you may not get a seat because right before me, there is a session from SQL Guru Vinod Kumar. He performs a powerful delivery of million concepts in just a little time. Quiz. I will be asking few questions during the session as well as before the session starts. If you get the correct answer, I will give unique learning material for you. You may not want to miss this learning opportunity at any cosst. Session Details Title: Speed Up! – Parallel Processes and Unparalleled Performance (Add to Calendar) Abstract: “More CPU, More Performance” – A very common understanding is that usage of multiple CPUs can improve the performance of the query. To get a maximum performance out of any query, one has to master various aspects of the parallel processes. In this deep-dive session, we will explore this complex subject with a very simple interactive demo. Attendees will walk away with proper understanding of CX_PACKET wait types, MAXDOP, parallelism threshold and various other concepts. Date and Time: March 23, 2012, 12:15 to 13:15 Location: Hotel Lalit Ashok - Kumara Krupa High Grounds, Bengaluru – 560001, Karnataka, India. Add to Calendar Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
Come see us at JavaU at JavaOne!

- by tmcginn

In just a little under a month, JavaOne will be in full swing (no pun intended) and thousands of Java developers will gather to hear the latest Java news, immerse themselves in Java technology and learn some new things. This year, I am fortunate enough to be able to attend, along with my Java curriculum development colleagues Matt Heimer and Mike Williams. We start our week at JavaOne teaching a one-day session at JavaU on Sunday morning. If you have never attended a training session through JavaU, you should check it out. There are some terrific sessions this year, and it might help to justify your trip to JavaOne if you can say it was for training! This year I am teaching a one day session on Java SE 7 New Features - a great session for anyone interested in the specific details of what is new in Java SE 7. Matt is teaching a one-day session on Developing Portable Java EE applications with the Enterprise JavaBeans 3.1 API and Java Persistence 2.0 API EJB, and Mike is doing a one-day session on developing Rich Client applications with Java SE 7 using Java FX 2. I asked Matt and Mike to tell me what developers can expect from their sessions. Matt: "My session will get you up to speed on everything you need to know to create portable Java EE 6 applications using EJB 3.1 and JPA 2. I am going to cover why everyone can benefit from using EJBs (and why developers should relearn them if they haven't looked at them for years). Students who attend my session will see JPA examples showcasing how to use relational databases in an enterprise applications without programming to JDBC and without writing SQL statements. EJB and JPA benefit from being paired together, so I will also show how transaction management is easier in a container. I encourage students to bring a laptop and code as they learn!" Mike: "My session covers how to develop a rich client application using Java FX 2. Starting with the basic concepts of JavaFX, students will see how a JavaFX application is built from its layout, to its controls, to its data structures. In addition, more advanced controls like charts, smart tables, and transitions will be added to the application. Finally, a quick review of JavaFX concurrency and data binding is included. Blended with the core concepts the session will include some of the latest JavaFX technology. This includes using Scene Builder to create a JavaFX UI and connecting your XML UI definition to Java code. In addition, packaging of the JavaFX application will be covered with some examples of the new native packaging features." As I mentioned, my session covers the changes in the Java for SE 7, including the language changes that were voted into Java SE 7 from Project Coin. I will also look at how you can take advantage if the the new I/O library (NIO.2) for writing applications that work with files, directories and file systems. We will also look at the changes in Asynchronous I/O that are a part of the changes in NIO/2. We will spend some time looking at the changes to the Java Virtual Machine as well, including support for dynamically typed languages (JSR-292). We will spend some time looking at the Java Concurrency enhancements (JSR-166), including the new Fork/Join framework. And we'll round out the day with a look at changes in Swing, XML and a number of smaller changes in the API's. And, if these topics aren't grabbing your interest, take a look at the other 10 sessions that range from topics on architecture to how to pass the Oracle Certified Programmer I and II exams. See you soon!

Read the article
October in Review

- by Richard Bingham

With OpenWorld over October was time to get back to serious work for everyone, including the Fusion Applications Developer Relations team. Don't forget the OpenWorld content is still available, including presentation downloads, for a limited period of time so be sure to grab anything you found useful or take another scan for anything you might have missed. Of all the announcements, the continued evolution of the Oracle Cloud services for extending and integrating with Fusion Applications is increasing in popularity, and certainly the Cloud Marketplace is something we're becoming involved in. More details to follow. Fusion Concepts Last week Vik from our team started the new "Fusion Concepts" series of articles, providing those new to Fusion Applications an explanation of the architectural basics, with the aim to reduce the learning curve and lay the platform for more efficient and effective development. The series begun with an insightful first post on the different schemas that exist in the Fusion Applications database. Look out for upcoming posts on multi-lingual entities, profile options, look-ups and more. New Learning Resources Our YouTube channel continued to expand with more 'how to' videos on using page composer, extending the Simplified UI (aka FUSE), and integrating BI reports and analytics. Also the Oracle Learning Library is now well established as a central resource for knowledge, now with thousands of tutorials, videos, and documents. Of particular note are the great new extensibility-related videos added by the CRM Product Management team, including more on the ever-expanding capabilities of Application Composer. To see some examples of these search using keyword 'customization' or the product 'Sales Cloud'. Finally on learning resources, as Oliver mentioned the Oracle Press book on Fusion Application Customization and Extensibility is now available for pre-order on Amazon (due out 1st Jan). Out And About October also saw us attend the annual Apps Conference held by the UK Oracle User Group in London. Interestingly there was an Applications Transformation stream of sessions and content that included Fusion Applications with all the latest in the Oracle Applications evolution, as always focused around the three tenets of social, mobile, and cloud. Read more in Richard's post-event write up. Other teams around Oracle have also been busy. Angelo from the Platform Technical Services group has done quite a bit of work using web services with Fusion SaaS and has published many interesting findings on his blog. It's definitely recommended reading if you are working on any related integration projects. The middleware-for-applications group has built a new tool called "AppAdvantage" offering an online assessment of your use of Fusion Middleware technologies with Oracle Applications. As the popularity of integrating cloud applications with on-premises systems continued to grow, leveraging existing middleware technologies (and licenses) to support the integration solution is likely to be of paramount importance. Similarly the "Build Enterprise Application Extensions with Ease" section of the related webpage has AppsUX director Killan Evers speaking about customization using the composer tools. Both are useful resources for those just getting started with a move to Fusion Applications. The Oracle A-Team, specialists in middleware technical architecture, always publish superb content via their 'chronicles' site, now with a substantial amount specifically related to Fusion Applications. Click on the Fusion Applications menu on the top right of their homepage to see more. Last month of particular note was an article on customizing the timeout pop-up message that shows to inactive users, providing design-time insight and easy-to-follow steps. Finally if you're looking at using Oracle Middleware and Cloud to tailor and extend your applications then you may also be interested in this new blog post on the roadmap for Oracle SOA and the latest on-demand Cloud Development webcast.

Read the article
Understanding the Customer Form in Release 12 from an AR Perspective!!

- by user793553

Confused by the Customer Form in Release 12?? Read on, to get some insight into the evolution of this screen, and how it links in with Trading Community Architecture. Historically, the customer data model was owned by Oracle Receivables (AR). However, as the data model changed and more complex relationships and attributes had to be tracked and monitored, the Trading Community Architecture (TCA) product was created. All applications within the E-Business suite that require interaction with a customer integrate with TCA. Customer information is no longer stored in the individual applications but rather in a central repository/registry maintained within TCA. It is important to understand the following entities/concepts stored in TCA: Party: A party is an entity with whom you can have a potential business relationship. A party can be either a Person or an Organization. The Party entity is completely independent of any business relationship; this means that a Party can exist even if you have no transactions with it. The Party is the "umbrella" entity under which you capture all other attributes listed below. Customer: A customer is a party with whom you have an existing business relationship. From an AR perspective, you can simplify the concepts by thinking of a Customer as a Party. This definition however does not apply to all other applications. In the Oracle Receivables Customer form, the information displayed at the Customer level is from TCA's Party information record. Customer Account (also called Account): An account contains information about how you transact business with a particular customer. You can create multiple accounts for a customer. When you create invoices and receipts you associate it to a particular Account of a Customer. Location: A Location is an address. It is a point in space, typically identified by a street number, a street name, a city, a state or province, a country. A location is independent of what it is used for - you do not associate a purpose to a location. Party Site: A Party Site is associated to a Party. It is the location where a party is physically located. When defining sites for a Party, only one can be an identifying address. However, you can define other party sites associated to a party. You can define purposes/usage for Party Sites. Account Site: An Account Site is associated to a Customer Account. It is the location associated to the account you are transacting business with. You can define business purposes (also called site uses) for an Account site. Read more about the Customer Workbench in these notes: Doc ID 1436547.1 Oracle Receivables: Understanding the Customer Form in Release 12 Doc ID 1437866.1 Customer Form - Address: Troubleshooting, Known Issues and Patches Doc ID 1448442.1 Oracle Receivables (AR): Customer Workbench Information Center Do you find this type of blog entry useful? Please add comments to let us know how we can help you more effectively. Thank you!

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
What do you need to know to be a world-class master software developer? [closed]

- by glitch

I wanted to bring up this question to you folks and see what you think, hopefully advise me on the matter: let's say you had 30 years of learning and practicing software development in front of you, how would you dedicate your time so that you'd get the biggest bang for your buck. What would you both learn and work on to be a world-class software developer that would make a large impact on the industry and leave behind a legacy? I think that most great developers end up being both broad generalists and specialists in one-two areas of interest. I'm thinking Bill Joy, John Carmack, Linus Torvalds, K&R and so on. I'm thinking that perhaps one approach would be to break things down by categories and establish a base minimum of "software development" greatness. I'm thinking: Operating Systems: completely internalize the core concepts of OS, perhaps gain a lot of familiarity with an OSS one such as Linux. Anything from memory management to device drivers has to be complete second nature. Programming Languages: this is one of those topics that imho has to be fully grokked even if it might take many years. I don't think there's quite anything like going through the process of developing your own compiler, understanding language design trade-offs and so on. Programming Language Pragmatics is one of my favorite books actually, I think you want to have that internalized back to back, and that's just the start. You could go significantly deeper, but I think it's time well spent, because it's such a crucial building block. As a subset of that, you want to really understand the different programming paradigms out there. Imperative, declarative, logic, functional and so on. Anything from assembly to LISP should be at the very least comfortable to write in. Contexts: I believe one should have experience working in different contexts to truly be able to appreciate the trade-offs that are being made every day. Embedded, web development, mobile development, UX development, distributed, cloud computing and so on. Hardware: I'm somewhat conflicted about this one. I think you want some understanding of computer architecture at a low level, but I feel like the concepts that will truly matter will be slightly higher level, such as CPU caching / memory hierarchy, ILP, and so on. Networking: we live in a completely network-dependent era. Having a good understanding of the OSI model, knowing how the Web works, how HTTP works and so on is pretty much a pre-requisite these days. Distributed systems: once again, everything's distributed these days, it's getting progressively harder to ignore this reality. Slightly related, perhaps add solid understanding of how browsers work to that, since the world seems to be moving so much to interfacing with everything through a browser. Tools: Have a really broad toolset that you're familiar with, one that continuously expands throughout the years. Communication: I think being a great writer, effective communicator and a phenomenal team player is pretty much a prerequisite for a lot of a software developer's greatness. It can't be overstated. Software engineering: understanding the process of building software, team dynamics, the requirements of the business-side, all the pitfalls. You want to deeply understand where what you're writing fits from the market perspective. The better you understand all of this, the more of your work will actually see the daylight. This is really just a starting list, I'm confident that there's a ton of other material that you need to master. As I mentioned, you most likely end up specializing in a bunch of these areas as you go along, but I was trying to come up with a baseline. Any thoughts, suggestions and words of wisdom from the grizzled veterans out there who would like to share their thoughts and experiences with this? I'd really love to know what you think!

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
??????????? - Java SE Embedded 8

- by kshimizu-Oracle

Java?OS??????1?????????????????????????????????3?????????????? HEAP: Java????????????????????????????????? NON-HEAP: NON-HEAP????JVM???????????????????Code Cache?Metaspace???2????????????? Code Cache: ????JIT??????????????????????????? Metaspace: HEAP?????????????????????????? JavaVM??????????: VM?????????????????? ??????????????? ????????????????????????????????????????????????????????????????????????? HEAP?Java Mission Control???????????????????? (????)? ????Java SE?????????????API????????????????????????????????????? Mission Control?????API?????????????????????????????????API??????????????? HEAP???????????? VM????????"-Xmx"???????????????? java.lang.Runtime.maxMemory(); ?????HEAP????????? ?????VM????????"-Xms"? ????????????? "-Xms"???????"-Xmx"?????????? java.lang.Runtime.totalMemory(); ???????????HEAP????????????? java.lang.Runtime.freeMemory(); ??NON-HEAP???????????? API??????????? Java Mission Control?????????? ????????????Java Mission Control??????????????????????? ????"NON_HEAP"?????????NON-HEAP?????? ???? HEAP????NON-HEAP?????????????? Java VM???????????????????????????????????????? ?????????????????????????????????? ????HEAP/NON-HEAP?????????????????????????? OS?????????????? Linux???????procfs?Java??????????????????? (VmHWM or VmRSS) ????? ????HEAP/NON-HEAP??????????????????????????? ?????????????????? ??????JVM?????????????????? ?????????????????JVM???????????????????? ???JVM?????? ????????????? Embedded??JVM?????????? ??Embedded???Oracle JVM??????CPU????????????????????????????????????????? ??????CPU??????????????????????????????????????? Minimal/Client/Server??JVM???????????????? ????JVM??????????????????? ??????Compact????????????????? ? 2 - 3?????? Concept Guide (http://docs.oracle.com/javase/8/embedded/embedded-concepts/basic-concepts.htm) ???????? ??JVM??????????? ????????????????????? -Xms: ??????????? ?????????? ?????????????????????????????????????????????????? -Xmx: ??????????? -XX:ReservedCodeCacheSize: Code Cache??????? ?) JIT??????????????Code Cache????????????0???????? -Xint: JIT??????????? ????????????? JIT?????????????????????? ????????????????? -Xss: ???????????????????? ????????????????????????? ????????????????????????????? -XX:CompileThreshold: JIT?????????????????????????????????? ?????????????????????? ????????? ?????????????????? Code Cache?????????? ?????????? ????????????????????? ????????????????????????? ??????????????????????? ?????????????????????

Read the article
What should every developer know about databases?

- by Aaronaught

Whether we like it or not, many if not most of us developers either regularly work with databases or may have to work with one someday. And considering the amount of misuse and abuse in the wild, and the volume of database-related questions that come up every day, it's fair to say that there are certain concepts that developers should know - even if they don't design or work with databases today. So: What are the important concepts that developers and other software professionals ought to know about databases? Guidelines for Responses: Keep your list short. One concept per answer is best. Be specific. "Data modelling" may be an important skill, but what does that mean precisely? Explain your rationale. Why is your concept important? Don't just say "use indexes." Don't fall into "best practices." Convince your audience to go learn more. Upvote answers you agree with. Read other people's answers first. One high-ranked answer is a more effective statement than two low-ranked ones. If you have more to add, either add a comment or reference the original. Don't downvote something just because it doesn't apply to you personally. We all work in different domains. The objective here is to provide direction for database novices to gain a well-founded, well-rounded understanding of database design and database-driven development, not to compete for the title of most-important.

Read the article
Which is better? OpenCyc or ConceptNet?

- by Daniel Loureiro

Hi, I'm doing a NLP project where I need to recognise concepts in sentences to find other similar concepts. I do this to infer word valences from a list I already have. I started using WordNet, but it gave many contradictory results. By contradictory results I mean word expansions that had contradictory valences. So now I'm looking into ConceptNet and OpenCyc. I've already implemented ConceptNet and it was all very easy and I love it. Problem is that OpenCyc appears to have a much larger and more logically rigid database, which is important when I found so many "contradictions" on WordNet... But I wouldn't know because I haven't tried it. Could someone tell me if it's worth going through the (considerable, for me) effort to implement OpenCyc, or is ConceptNet good enough to infer word valences? Are they that different? I'll be happy to explain myself further, if needed. Trying to keep it short for now! Thanks!

Read the article
Architecture for new ASP.NET web application

- by Anders Abel

I'm maintaining an application which currently is just a web service (built with WCF) and a database backend. The web service is built in layers with a linq-to-sql data access part with core functionality in an own assembly and on top of that the web service assembly which contains the WCF code. The core assembly also handles all business logic rules (very few actually). The customer now wants a Web interface for the application instead of just accessing it through other applications which are consuming the web service. I'm quite lost on modern web application design, so I would like some advice on what architecture and frameworks to use for the web application. The web application will be using the same core assembly with business rules and the linq-to-sql data access layer as the web service. Some concepts I've thought about are: ASP.NET MVC Webforms AJAX controls - possibly leting the AJAX controls access the existing web service through JSON. Are there any more concepts I should look into? Which one is the best for a fresh project? The development tools are Visual Studio 2008 Team Edition for Developers targeting .NET 3.5. An upgrade to Visual Studio 2010 Premium (or maybe even Ultimate) is possible if it gives any benefits.

Read the article
why do we need advanced knowledge of mathematics & physics for programming?

- by Sumeet

Guys, I have been very good in mathematics and physics in my schools and colleges. Right now I am a programmer. Even in the colleges I have to engrossed my self into computers and programming things all the time. As I used to like it very much. But I have always felt the lack of advanced mathematics and physics in all the work I have done (Programs). Programming never asked me any advanced mathematics and physics knowledge in what I was very good. It always ask u some optimized loops, and different programming technologies which has never been covered in advanced mathematics and physics. Even at the time of selection in big College , such a kind of advanced knowledge is required. Time by time I got out of touch of all that facts and concepts (advanced mathematics and physics). And now after, 5 years in job I found it hard to resolve Differentiations and integrations from Trigonometry. Which sometimes make me feel like I have wasted time in those concepts because they are never used. (At that time I knew that I am going to be a programmer) If one need to be a programmer why do all this advanced knowledge is required. One can go with elementry knowledge a bit more. You never got to think like scientists and R&D person in your Schols and colleges for being a programmer? Just think and let me know your thoughts. I must be wrong somewhere in what I think , but not able to figure that out..? Regards Sumeet

Read the article
node.js beginner tutorials?

- by TreyK

Hey all, I'm working on creating my first real node.js http server, and I'm sort of drowning in it. As a good teacher of mine always said, "I'll just shove you in the water for now, and then I'll show you how to swim." Fortunately, she wasn't a swimming instructor, but it's a good analogy nonetheless. I feel like I've jumped into node.js and I've only found a ping pong ball to help, that is to say, most of the tutorials I've read stop shortly after the "Hello World" example and I've mostly been trying to make sense of copied and pasted code (or they assume I have knowledge of lower level HTTP and webserver concepts that have been done for me as an Apache/PHP developer). I have experience in both client-side Javascript and PHP, but node seems to be a beast all of its own. I don't quite have the low-level knowledge that seems necessary for creating a node server, and connect, which seems to be a nice module for simplifying things, seems quite sparsely explained, even in the docs on its Git. Where could I find some tutorials to help me in this situation? TL;DR - Are there any tutorials for node.js that go beyond "Hello World" but don't require much low-level knowledge? Or any tutorials that explain lower-level HTTP and webserver concepts that I would need to effectively create a node HTTP server? Thanks for any help. -Trey

Read the article
Should a programmer have mastery over C++

- by Yogendra

I was wondering if it is necessary for programmers to have expertise on at least 1 programming language? Programming languages like C#, java, VB.Net etc change every year or two. Should a programmer have mastery over C++, which is a stable language and rarely undergoes changes? I am a C# developer and using it for about 7 years now, I still don't have mastery on it. EDIT I think my question is being misunderstood. I am not against changes or evolution. I love the new features and abstraction provided by languages such as C#, VB, Java. And I keep waiting for new features if it makes a programmers life easy. But this fact also make this languages very difficult to master. They are continuously evolving. Languages like C++ have slow evolution cycle. So given this scenario, Is it helpful to be master of C++? This is what my original question meant. Note:- Based on the answers by friends below, I have understood that languages and framework are tools for expressing the concepts. Also it might be a good idea to express the concepts in different programming languages.

Read the article
Need help/guidance about creating a desktop application with gui

- by Somebody still uses you MS-DOS

I'm planning to do an Desktop application using Python, to learn some Desktop concepts. I'm going to use GTK or Qt, I still haven't decided which one. Fact is: I would like to create an application with the possibility to be called from command line, AND using a GUI. So it would be useful for cmd fans, and GUI users as well. It would be interesting to create a web interface too in the future, so it could be run in a server somewhere using an html interface created with a template language. I'm thinking about two approaches: - Creating a "model" with a simple interface which is called from a desktop/web implementation; - Creating a "model" with an html interface, and embeb a browser component so I could reuse all the code in both desktop/web scenarios. My question is: which exactly concepts are involved in this project? What advantages/disadvantages each approach has? Are they possible? By naming "interface", I'm planning to just do some interfaces.py files with def calls. Is this a bad approach? I would like to know some book recommendations, or resources to both options - or source code from projects which share the same GUI/cmd/web goals I'm after. Thanks in advance!

Read the article
Should I learn two (or more) programming languages in parallel?

- by c_maker

I found entries on this site about learning a new programming language, however, I have not come across anything that talks about the advantages and disadvantages of learning two languages at the same time. Let's say my goal is to learn two new languages in a year. I understand that the definition of learning a new language is different for everyone and you can probably never know everything about a language. I believe in most cases the following things are enough to include the language in your resume and say that you are proficient in it (list is not in any particular order): Know its syntax so you can write a simple program in it Compare its underlying concepts with concepts of other languages Know best practices Know what libraries are available Know in what situations to use it Understand the flow of a more complex program At least know most of what you do not know I would probably look for a good book and pick an open source project for both of these languages to start with. My questions: Is it best to spend 5 months learning language#1 then 5 months learning language#2, or should you mix the two. Mixing them I mean you work on them in parallel. Should you pick two languages that are similar or different? Are there any advantages/disadvantages of let's say learning Lisp in tandem with Ruby? Is it a good idea to pick two languages with similar syntax or would it be too confusing? Please tell me what your experiences are regarding this. Does it make a difference if you are a beginner or a senior programmer?

Read the article
Financial Market Developer dilemma...

- by Sahat

...In the future I am planning to work in the financial sector as a programmer. I have a couple of options right now (1 or 2): Learn and master .NET since presumably that's widely used in that industry OR Learn the programming concepts, learn algorithms, learn a little bit of c,c++,c#,java,objective-c,sql,oracle,cobol - in other words learn the fundamental principles that tie all programming languages together without going too deep in any particular language. Someone has told me that most of the time as a programmer you won't be writing any code, but instead maintaing and existing code that people before you have built. Does that mean I don't really need to master any specific language and as long as I have general concepts it'll be good enough? If you or if you know someone who has worked in the financial industry as a software developer could you please share the experience and what is the daily routine consists of? Also what should I be learning right now while I am still young and in college? Do I have to thoroughly understand the market and the current economy? What about Oracle or SQL Databases - do I need to know them inside out as a programmer? Thanks if you have anything else to add that I have not mentioned then please do so! Thanks in advance!

Read the article
Objective-measures of the expressiveness of programming languages [closed]

- by Casebash

I am very interested in the expressiveness of different languages. Everyone who has programmed in multiple languages knows that sometimes a language allows you to express concepts which you can't express in other languages. You can have all kinds of subjective discussion about this, but naturally it would be better to have an objective measure. There do actually exist objective measures. One is Turing-Completeness, which means that a language is capable of generating any output that could be generated by following a sequential set of steps. There are also other lesser levels of expressiveness such as Finite State Automata. Now, except for domain specific languages, pretty much all modern languages are Turing complete. It is therefore natural to ask the following question: Can we can define any other formal measures of expressiveness which are greater than Turing completeness? Now of course we can't define this by considering the output that a program can generate, as Turing machines can already produce the same output that any other program can. But there are definitely different levels in what concepts can be expressed - surely no-one would argue that assembly language is as powerful as a modern object oriented language like Python. You could use your assembly to write a Python interpreter, so clearly any accurate objective measure would have to exclude this possibility. This also causes a problem with trying to define the expressiveness using the minimum number of symbols. How exactly to do so is not clear and indeed appears extremely difficult, but we can't assume that just because we don't know how to solve a problem, that nobody know how to. It is also doesn't really make sense to demand a definition of expressiveness before answering the question - after all the whole point of this question is to obtain such a definition. I think that my explanation will be clear enough for anyone with a strong theoretical background in computer science to understand what I am looking for. If you do have such a background and you disagree, please comment why, but if you don't thats probably why you don't understand the question.

Read the article
Where does ASP.NET Web API Fit?

- by Rick Strahl

With the pending release of ASP.NET MVC 4 and the new ASP.NET Web API, there has been a lot of discussion of where the new Web API technology fits in the ASP.NET Web stack. There are a lot of choices to build HTTP based applications available now on the stack - we've come a long way from when WebForms and Http Handlers/Modules where the only real options. Today we have WebForms, MVC, ASP.NET Web Pages, ASP.NET AJAX, WCF REST and now Web API as well as the core ASP.NET runtime to choose to build HTTP content with. Web API definitely squarely addresses the 'API' aspect - building consumable services - rather than HTML content, but even to that end there are a lot of choices you have today. So where does Web API fit, and when doesn't it? But before we get into that discussion, let's talk about what a Web API is and why we should care. What's a Web API? HTTP 'APIs' (Microsoft's new terminology for a service I guess) are becoming increasingly more important with the rise of the many devices in use today. Most mobile devices like phones and tablets run Apps that are using data retrieved from the Web over HTTP. Desktop applications are also moving in this direction with more and more online content and synching moving into even traditional desktop applications. The pending Windows 8 release promises an app like platform for both the desktop and other devices, that also emphasizes consuming data from the Cloud. Likewise many Web browser hosted applications these days are relying on rich client functionality to create and manipulate the browser user interface, using AJAX rather than server generated HTML data to load up the user interface with data. These mobile or rich Web applications use their HTTP connection to return data rather than HTML markup in the form of JSON or XML typically. But an API can also serve other kinds of data, like images or other binary files, or even text data and HTML (although that's less common). A Web API is what feeds rich applications with data. ASP.NET Web API aims to service this particular segment of Web development by providing easy semantics to route and handle incoming requests and an easy to use platform to serve HTTP data in just about any content format you choose to create and serve from the server. But .NET already has various HTTP Platforms The .NET stack already includes a number of technologies that provide the ability to create HTTP service back ends, and it has done so since the very beginnings of the .NET platform. From raw HTTP Handlers and Modules in the core ASP.NET runtime, to high level platforms like ASP.NET MVC, Web Forms, ASP.NET AJAX and the WCF REST engine (which technically is not ASP.NET, but can integrate with it), you've always been able to handle just about any kind of HTTP request and response with ASP.NET. The beauty of the raw ASP.NET platform is that it provides you everything you need to build just about any type of HTTP application you can dream up from low level APIs/custom engines to high level HTML generation engine. ASP.NET as a core platform clearly has stood the test of time 10+ years later and all other frameworks like Web API are built on top of this ASP.NET core. However, although it's possible to create Web APIs / Services using any of the existing out of box .NET technologies, none of them have been a really nice fit for building arbitrary HTTP based APIs. Sure, you can use an HttpHandler to create just about anything, but you have to build a lot of plumbing to build something more complex like a comprehensive API that serves a variety of requests, handles multiple output formats and can easily pass data up to the server in a variety of ways. Likewise you can use ASP.NET MVC to handle routing and creating content in various formats fairly easily, but it doesn't provide a great way to automatically negotiate content types and serve various content formats directly (it's possible to do with some plumbing code of your own but not built in). Prior to Web API, Microsoft's main push for HTTP services has been WCF REST, which was always an awkward technology that had a severe personality conflict, not being clear on whether it wanted to be part of WCF or purely a separate technology. In the end it didn't do either WCF compatibility or WCF agnostic pure HTTP operation very well, which made for a very developer-unfriendly environment. Personally I didn't like any of the implementations at the time, so much so that I ended up building my own HTTP service engine (as part of the West Wind Web Toolkit), as have a few other third party tools that provided much better integration and ease of use. With the release of Web API for the first time I feel that I can finally use the tools in the box and not have to worry about creating and maintaining my own toolkit as Web API addresses just about all the features I implemented on my own and much more. ASP.NET Web API provides a better HTTP Experience ASP.NET Web API differentiates itself from the previous Microsoft in-box HTTP service solutions in that it was built from the ground up around the HTTP protocol and its messaging semantics. Unlike WCF REST or ASP.NET AJAX with ASMX, it’s a brand new platform rather than bolted on technology that is supposed to work in the context of an existing framework. The strength of the new ASP.NET Web API is that it combines the best features of the platforms that came before it, to provide a comprehensive and very usable HTTP platform. Because it's based on ASP.NET and borrows a lot of concepts from ASP.NET MVC, Web API should be immediately familiar and comfortable to most ASP.NET developers. Here are some of the features that Web API provides that I like: Strong Support for URL Routing to produce clean URLs using familiar MVC style routing semantics Content Negotiation based on Accept headers for request and response serialization Support for a host of supported output formats including JSON, XML, ATOM Strong default support for REST semantics but they are optional Easily extensible Formatter support to add new input/output types Deep support for more advanced HTTP features via HttpResponseMessage and HttpRequestMessage classes and strongly typed Enums to describe many HTTP operations Convention based design that drives you into doing the right thing for HTTP Services Very extensible, based on MVC like extensibility model of Formatters and Filters Self-hostable in non-Web applications Testable using testing concepts similar to MVC Web API is meant to handle any kind of HTTP input and produce output and status codes using the full spectrum of HTTP functionality available in a straight forward and flexible manner. Looking at the list above you can see that a lot of functionality is very similar to ASP.NET MVC, so many ASP.NET developers should feel quite comfortable with the concepts of Web API. The Routing and core infrastructure of Web API are very similar to how MVC works providing many of the benefits of MVC, but with focus on HTTP access and manipulation in Controller methods rather than HTML generation in MVC. There’s much improved support for content negotiation based on HTTP Accept headers with the framework capable of detecting automatically what content the client is sending and requesting and serving the appropriate data format in return. This seems like such a little and obvious thing, but it's really important. Today's service backends often are used by multiple clients/applications and being able to choose the right data format for what fits best for the client is very important. While previous solutions were able to accomplish this using a variety of mixed features of WCF and ASP.NET, Web API combines all this functionality into a single robust server side HTTP framework that intrinsically understands the HTTP semantics and subtly drives you in the right direction for most operations. And when you need to customize or do something that is not built in, there are lots of hooks and overrides for most behaviors, and even many low level hook points that allow you to plug in custom functionality with relatively little effort. No Brainers for Web API There are a few scenarios that are a slam dunk for Web API. If your primary focus of an application or even a part of an application is some sort of API then Web API makes great sense. HTTP ServicesIf you're building a comprehensive HTTP API that is to be consumed over the Web, Web API is a perfect fit. You can isolate the logic in Web API and build your application as a service breaking out the logic into controllers as needed. Because the primary interface is the service there's no confusion of what should go where (MVC or API). Perfect fit. Primary AJAX BackendsIf you're building rich client Web applications that are relying heavily on AJAX callbacks to serve its data, Web API is also a slam dunk. Again because much if not most of the business logic will probably end up in your Web API service logic, there's no confusion over where logic should go and there's no duplication. In Single Page Applications (SPA), typically there's very little HTML based logic served other than bringing up a shell UI and then filling the data from the server with AJAX which means the business logic required for data retrieval and data acceptance and validation too lives in the Web API. Perfect fit. Generic HTTP EndpointsAnother good fit are generic HTTP endpoints that to serve data or handle 'utility' type functionality in typical Web applications. If you need to implement an image server, or an upload handler in the past I'd implement that as an HTTP handler. With Web API you now have a well defined place where you can implement these types of generic 'services' in a location that can easily add endpoints (via Controller methods) or separated out as more full featured APIs. Granted this could be done with MVC as well, but Web API seems a clearer and more well defined place to store generic application services. This is one thing I used to do a lot of in my own libraries and Web API addresses this nicely. Great fit. Mixed HTML and AJAX Applications: Not a clear Choice For all the commonality that Web API and MVC share they are fundamentally different platforms that are independent of each other. A lot of people have asked when does it make sense to use MVC vs. Web API when you're dealing with typical Web application that creates HTML and also uses AJAX functionality for rich functionality. While it's easy to say that all 'service'/AJAX logic should go into a Web API and all HTML related generation into MVC, that can often result in a lot of code duplication. Also MVC supports JSON and XML result data fairly easily as well so there's some confusion where that 'trigger point' is of when you should switch to Web API vs. just implementing functionality as part of MVC controllers. Ultimately there's a tradeoff between isolation of functionality and duplication. A good rule of thumb I think works is that if a large chunk of the application's functionality serves data Web API is a good choice, but if you have a couple of small AJAX requests to serve data to a grid or autocomplete box it'd be overkill to separate out that logic into a separate Web API controller. Web API does add overhead to your application (it's yet another framework that sits on top of core ASP.NET) so it should be worth it .Keep in mind that MVC can generate HTML and JSON/XML and just about any other content easily and that functionality is not going away, so just because you Web API is there it doesn't mean you have to use it. Web API is not a full replacement for MVC obviously either since there's not the same level of support to feed HTML from Web API controllers (although you can host a RazorEngine easily enough if you really want to go that route) so if you're HTML is part of your API or application in general MVC is still a better choice either alone or in combination with Web API. I suspect (and hope) that in the future Web API's functionality will merge even closer with MVC so that you might even be able to mix functionality of both into single Controllers so that you don't have to make any trade offs, but at the moment that's not the case. Some Issues To think about Web API is similar to MVC but not the Same Although Web API looks a lot like MVC it's not the same and some common functionality of MVC behaves differently in Web API. For example, the way single POST variables are handled is different than MVC and doesn't lend itself particularly well to some AJAX scenarios with POST data. Code Duplication I already touched on this in the Mixed HTML and Web API section, but if you build an MVC application that also exposes a Web API it's quite likely that you end up duplicating a bunch of code and - potentially - infrastructure. You may have to create authentication logic both for an HTML application and for the Web API which might need something different altogether. More often than not though the same logic is used, and there's no easy way to share. If you implement an MVC ActionFilter and you want that same functionality in your Web API you'll end up creating the filter twice. AJAX Data or AJAX HTML On a recent post's comments, David made some really good points regarding the commonality of MVC and Web API's and its place. One comment that caught my eye was a little more generic, regarding data services vs. HTML services. David says: I see a lot of merit in the combination of Knockout.js, client side templates and view models, calling Web API for a responsive UI, but sometimes late at night that still leaves me wondering why I would no longer be using some of the nice tooling and features that have evolved in MVC ;-) You know what - I can totally relate to that. On the last Web based mobile app I worked on, we decided to serve HTML partials to the client via AJAX for many (but not all!) things, rather than sending down raw data to inject into the DOM on the client via templating or direct manipulation. While there are definitely more bytes on the wire, with this, the overhead ended up being actually fairly small if you keep the 'data' requests small and atomic. Performance was often made up by the lack of client side rendering of HTML. Server rendered HTML for AJAX templating gives so much better infrastructure support without having to screw around with 20 mismatched client libraries. Especially with MVC and partials it's pretty easy to break out your HTML logic into very small, atomic chunks, so it's actually easy to create small rendering islands that can be used via composition on the server, or via AJAX calls to small, tight partials that return HTML to the client. Although this is often frowned upon as to 'heavy', it worked really well in terms of developer effort as well as providing surprisingly good performance on devices. There's still plenty of jQuery and AJAX logic happening on the client but it's more manageable in small doses rather than trying to do the entire UI composition with JavaScript and/or 'not-quite-there-yet' template engines that are very difficult to debug. This is not an issue directly related to Web API of course, but something to think about especially for AJAX or SPA style applications. Summary Web API is a great new addition to the ASP.NET platform and it addresses a serious need for consolidation of a lot of half-baked HTTP service API technologies that came before it. Web API feels 'right', and hits the right combination of usability and flexibility at least for me and it's a good fit for true API scenarios. However, just because a new platform is available it doesn't meant that other tools or tech that came before it should be discarded or even upgraded to the new platform. There's nothing wrong with continuing to use MVC controller methods to handle API tasks if that's what your app is running now - there's very little to be gained by upgrading to Web API just because. But going forward Web API clearly is the way to go, when building HTTP data interfaces and it's good to see that Microsoft got this one right - it was sorely needed! Resources ASP.NET Web API AspConf Ask the Experts Session (first 5 minutes) © Rick Strahl, West Wind Technologies, 2005-2012Posted in Web Api Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Plan Caching and Query Memory Part II (Hash Match) – When not to use stored procedure - Most common performance mistake SQL Server developers make.

- by sqlworkshops

SQL Server estimates Memory requirement at compile time, when stored procedure or other plan caching mechanisms like sp_executesql or prepared statement are used, the memory requirement is estimated based on first set of execution parameters. This is a common reason for spill over tempdb and hence poor performance. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. This article covers Hash Match operations with examples. It is recommended to read Plan Caching and Query Memory Part I before this article which covers an introduction and Query memory for Sort. In most cases it is cheaper to pay for the compilation cost of dynamic queries than huge cost for spill over tempdb, unless memory requirement for a query does not change significantly based on predicates. This article covers underestimation / overestimation of memory for Hash Match operation. Plan Caching and Query Memory Part I covers underestimation / overestimation for Sort. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance. To read additional articles I wrote click here. The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Most of these concepts are also covered in our webcasts: www.sqlworkshops.com/webcasts Let’s create a Customer’s State table that has 99% of customers in NY and the rest 1% in WA.Customers table used in Part I of this article is also used here.To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'. --Example provided by www.sqlworkshops.com drop table CustomersState go create table CustomersState (CustomerID int primary key, Address char(200), State char(2)) go insert into CustomersState (CustomerID, Address) select CustomerID, 'Address' from Customers update CustomersState set State = 'NY' where CustomerID % 100 != 1 update CustomersState set State = 'WA' where CustomerID % 100 = 1 go update statistics CustomersState with fullscan go Let’s create a stored procedure that joins customers with CustomersState table with a predicate on State. --Example provided by www.sqlworkshops.com create proc CustomersByState @State char(2) as begin declare @CustomerID int select @CustomerID = e.CustomerID from Customers e inner join CustomersState es on (e.CustomerID = es.CustomerID) where es.State = @State option (maxdop 1) end go Let’s execute the stored procedure first with parameter value ‘WA’ – which will select 1% of data. set statistics time on go --Example provided by www.sqlworkshops.com exec CustomersByState 'WA' goThe stored procedure took 294 ms to complete. The stored procedure was granted 6704 KB based on 8000 rows being estimated. The estimated number of rows, 8000 is similar to actual number of rows 8000 and hence the memory estimation should be ok. There was no Hash Warning in SQL Profiler. To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'. Now let’s execute the stored procedure with parameter value ‘NY’ – which will select 99% of data. -Example provided by www.sqlworkshops.com exec CustomersByState 'NY' go The stored procedure took 2922 ms to complete. The stored procedure was granted 6704 KB based on 8000 rows being estimated. The estimated number of rows, 8000 is way different from the actual number of rows 792000 because the estimation is based on the first set of parameter value supplied to the stored procedure which is ‘WA’ in our case. This underestimation will lead to spill over tempdb, resulting in poor performance. There was Hash Warning (Recursion) in SQL Profiler. To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'. Let’s recompile the stored procedure and then let’s first execute the stored procedure with parameter value ‘NY’. In a production instance it is not advisable to use sp_recompile instead one should use DBCC FREEPROCCACHE (plan_handle). This is due to locking issues involved with sp_recompile, refer to our webcasts, www.sqlworkshops.com/webcasts for further details. exec sp_recompile CustomersByState go --Example provided by www.sqlworkshops.com exec CustomersByState 'NY' go Now the stored procedure took only 1046 ms instead of 2922 ms. The stored procedure was granted 146752 KB of memory. The estimated number of rows, 792000 is similar to actual number of rows of 792000. Better performance of this stored procedure execution is due to better estimation of memory and avoiding spill over tempdb. There was no Hash Warning in SQL Profiler. Now let’s execute the stored procedure with parameter value ‘WA’. --Example provided by www.sqlworkshops.com exec CustomersByState 'WA' go The stored procedure took 351 ms to complete, higher than the previous execution time of 294 ms. This stored procedure was granted more memory (146752 KB) than necessary (6704 KB) based on parameter value ‘NY’ for estimation (792000 rows) instead of parameter value ‘WA’ for estimation (8000 rows). This is because the estimation is based on the first set of parameter value supplied to the stored procedure which is ‘NY’ in this case. This overestimation leads to poor performance of this Hash Match operation, it might also affect the performance of other concurrently executing queries requiring memory and hence overestimation is not recommended. The estimated number of rows, 792000 is much more than the actual number of rows of 8000. Intermediate Summary: This issue can be avoided by not caching the plan for memory allocating queries. Other possibility is to use recompile hint or optimize for hint to allocate memory for predefined data range.Let’s recreate the stored procedure with recompile hint. --Example provided by www.sqlworkshops.com drop proc CustomersByState go create proc CustomersByState @State char(2) as begin declare @CustomerID int select @CustomerID = e.CustomerID from Customers e inner join CustomersState es on (e.CustomerID = es.CustomerID) where es.State = @State option (maxdop 1, recompile) end go Let’s execute the stored procedure initially with parameter value ‘WA’ and then with parameter value ‘NY’. --Example provided by www.sqlworkshops.com exec CustomersByState 'WA' go exec CustomersByState 'NY' go The stored procedure took 297 ms and 1102 ms in line with previous optimal execution times. The stored procedure with parameter value ‘WA’ has good estimation like before. Estimated number of rows of 8000 is similar to actual number of rows of 8000. The stored procedure with parameter value ‘NY’ also has good estimation and memory grant like before because the stored procedure was recompiled with current set of parameter values. Estimated number of rows of 792000 is similar to actual number of rows of 792000. The compilation time and compilation CPU of 1 ms is not expensive in this case compared to the performance benefit. There was no Hash Warning in SQL Profiler. Let’s recreate the stored procedure with optimize for hint of ‘NY’. --Example provided by www.sqlworkshops.com drop proc CustomersByState go create proc CustomersByState @State char(2) as begin declare @CustomerID int select @CustomerID = e.CustomerID from Customers e inner join CustomersState es on (e.CustomerID = es.CustomerID) where es.State = @State option (maxdop 1, optimize for (@State = 'NY')) end go Let’s execute the stored procedure initially with parameter value ‘WA’ and then with parameter value ‘NY’. --Example provided by www.sqlworkshops.com exec CustomersByState 'WA' go exec CustomersByState 'NY' go The stored procedure took 353 ms with parameter value ‘WA’, this is much slower than the optimal execution time of 294 ms we observed previously. This is because of overestimation of memory. The stored procedure with parameter value ‘NY’ has optimal execution time like before. The stored procedure with parameter value ‘WA’ has overestimation of rows because of optimize for hint value of ‘NY’. Unlike before, more memory was estimated to this stored procedure based on optimize for hint value ‘NY’. The stored procedure with parameter value ‘NY’ has good estimation because of optimize for hint value of ‘NY’. Estimated number of rows of 792000 is similar to actual number of rows of 792000. Optimal amount memory was estimated to this stored procedure based on optimize for hint value ‘NY’. There was no Hash Warning in SQL Profiler. This article covers underestimation / overestimation of memory for Hash Match operation. Plan Caching and Query Memory Part I covers underestimation / overestimation for Sort. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance. Summary: Cached plan might lead to underestimation or overestimation of memory because the memory is estimated based on first set of execution parameters. It is recommended not to cache the plan if the amount of memory required to execute the stored procedure has a wide range of possibilities. One can mitigate this by using recompile hint, but that will lead to compilation overhead. However, in most cases it might be ok to pay for compilation rather than spilling sort over tempdb which could be very expensive compared to compilation cost. The other possibility is to use optimize for hint, but in case one sorts more data than hinted by optimize for hint, this will still lead to spill. On the other side there is also the possibility of overestimation leading to unnecessary memory issues for other concurrently executing queries. In case of Hash Match operations, this overestimation of memory might lead to poor performance. When the values used in optimize for hint are archived from the database, the estimation will be wrong leading to worst performance, so one has to exercise caution before using optimize for hint, recompile hint is better in this case. I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts. Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here. Disclaimer and copyright information:This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners. Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work. This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk. R Meyyappan [email protected] LinkedIn: http://at.linkedin.com/in/rmeyyappan

Read the article

Search Results

Search found 1769 results on 71 pages for 'architechture concepts'.

Page 20/71 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

- by Saif Bechan

- by rchrd

- by Maria Colgan

- by Lori Lalonde

- by Pinal Dave

- by pinaldave

- by tmcginn

- by Richard Bingham

- by user793553

- by Tony Davis

- by glitch

- by Tony Davis

- by kshimizu-Oracle

- by Aaronaught

- by Daniel Loureiro

- by Anders Abel

- by Sumeet

- by TreyK

- by Yogendra

- by Somebody still uses you MS-DOS

- by c_maker

- by Sahat

- by Casebash

- by Rick Strahl

- by sqlworkshops

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >