Search Results

Search found 454 results on 19 pages for 'optimizations'.

Page 2/19 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • SQLAuthority News – Whitepaper Download – Using Star Join and Few-Outer-Row Optimizations to Improve Data Warehousing Queries

    - by pinaldave
    Size of the database is growing every day. Many organizations now a days have more than TB of the Data in their system. Performance is always part of the issue. Microsoft is really paying attention to the same and also focusing on improving performance for Data Warehousing. Microsoft has recently released whitepaper on the performance tuning subject of Data Warehousing. Here is the abstract about the whitepaper from official site: In this white paper we discuss two of the new features introduced in SQL Server 2008, Star Join and Few-Outer-Row optimizations. These two features are in SQL Server 2008 R2 as well.  We test the performance of SQL Server 2008 on a set of complex data warehouse queries designed to highlight the effect of these two features and observed a significant performance gain over SQL Server 2005 (without these two features). The results observed also apply to SQL Server 2008 R2.  On average, about 75 percent of the query execution time has been reduced, compared to SQL Server 2005. We also include data that shows a reduction in the number of rows processed and improved balance in parallel queries, both of which highlight the important role the Star Join and Few Outer-Row features played. I encouraged all of those interested in Data Warehouse to read it and see if they can learn the tricks. Using Star Join and Few-Outer-Row Optimizations to Improve Data Warehousing Queries Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology

    Read the article

  • Will these optimizations to my Ruby implementation of diff improve performance in a Rails app?

    - by grg-n-sox
    <tl;dr> In source version control diff patch generation, would it be worth it to use the optimizations listed at the very bottom of this writing (see <optimizations>) in my Ruby implementation of diff for making diff patches? </tl;dr> <introduction> I am programming something I have never done before and there might already be tools out there to do the exact thing I am programming but at this point I am having too much fun to care so I am still going to do it from scratch, even if there is a tool for this. So anyways, I am working on a Ruby on Rails app and need a certain feature. Basically I want each entry in a table of mine, let's say for example a table of video games, to have a stored chunk of text that represents a review or something of the sort for that table entry. However, I want this text to be both editable by any registered user and also keep track of different submissions in a version control system. The simplest solution I could think of is just implement a solution that keeps track of the text body and the diff patch history of different versions of the text body as objects in Ruby and then serialize it, preferably in human readable form (so I'll most likely use YAML for this) for editing if needed due to corruption by a software bug or a mistake is made by an admin doing some version editing. So at first I just tried to dive in head first into this feature to find that the problem of generating a diff patch is more difficult that I thought to do efficiently. So I did some research and came across some ideas. Some I have implemented already and some I have not. However, it all pretty much revolves around the longest common subsequence problem, as you would already know if you have already done anything with diff or diff-like features, and optimization the function that solves it. Currently I have it so it truncates the compared versions of the text body from the beginning and end until non-matching lines are found. Then it solves the problem using a comparison matrix, but instead of incrementing the value stored in a cell when it finds a matching line like in most longest common subsequence algorithms I have seen examples of, I increment when I have a non-matching line so as to calculate edit distance instead of longest common subsequence. Although as far as I can tell between the two approaches, they are essentially two sides of the same coin so either could be used to derive an answer. It then back-traces through the comparison matrix and notes when there was an incrementation and in which adjacent cell (West, Northwest, or North) to determine that line's diff entry and assumes all other lines to be unchanged. Normally I would leave it at that, but since this is going into a Rails environment and not just some stand-alone Ruby script, I started getting worried about needing to optimize at least enough so if a spammer that somehow knew how I implemented the version control system and knew my worst case scenario entry still wouldn't be able to hit the server that bad. After some searching and reading of research papers and articles through the internet, I've come across several that seem decent but all seem to have pros and cons and I am having a hard time deciding how well in this situation that the pros and cons balance out. So are the ones listed here worth it? I have listed them with known pros and cons. </introduction> <optimizations> Chop the compared sequences into multiple chucks of subsequences by splitting where lines are unchanged, and then truncating each section of unchanged lines at the beginning and end of each section. Then solve the edit distance of each subsequence. Pro: Changes the time increase as the changed area gets bigger from a quadratic increase to something more similar to a linear increase. Con: Figuring out where to split already seems like you have to solve edit distance except now you don't care how it is changed. Would be fine if this was solvable by a process closer to solving hamming distance but a single insertion would throw this off. Use a cryptographic hash function to both convert all sequence elements into integers and ensure uniqueness. Then solve the edit distance comparing the hash integers instead of the sequence elements themselves. Pro: The operation of comparing two integers is faster than the operation of comparing two strings, so a slight performance gain is received after every comparison, which can be a lot overall. Con: Using a cryptographic hash function takes time to convert all the sequence elements and may end up costing more time to do the conversion that you gain back from the integer comparisons. You could use the built in hash function for a string but that will not guarantee uniqueness. Use lazy evaluation to only calculate the three center-most diagonals of the comparison matrix and then only calculate additional diagonals as needed. And then also use this approach to possibly remove the need on some comparisons to compare all three adjacent cells as desribed here. Pro: Can turn an algorithm that always takes O(n * m) time and make it so only worst case scenario is that time, best case becomes practically linear, and average case is somewhere between the two. Con: It is an algorithm I've only seen implemented in functional programming languages and I am having a difficult time comprehending how to convert this into Ruby based on how it is described at the site linked to above. Make a C module and do the hard work at the native level in C and just make a Ruby wrapper for it so Ruby can make all the calls to it that it needs. Pro: I have to imagine that evaluating something like this in could be a LOT faster. Con: I have no idea how Rails handles apps with ruby code that has C extensions and it hurts the portability of the app. This is an optimization for after the solving of edit distance, but idea is to store additional combined diffs with the ones produced by each version to make a delta-tree data structure with the most recently made diff as the root node of the tree so getting to any version takes worst case time of O(log n) instead of O(n). Pro: Would make going back to an old version a lot faster. Con: It would mean every new commit, the delta-tree would get a new root node that will cost time to reorganize the delta-tree for an operation that will be carried out a lot more often than going back a version, not to mention the unlikelihood it will be an old version. </optimizations> So are these things worth the effort?

    Read the article

  • Is there a detailed description of optimizations in the Android build process?

    - by Daniel Lew
    I've been curious as to all the optimizations that go into the building of an .apk. I'm curious because of two things I've tried in the past to bring down the size of my .apk: I have had a few large json assets in projects before, as well as a static sqlite database. I tried bringing down the size of the apk by gzipping them before the build process, but the resulting size is exactly the same. I just today tried pngcrush on my /drawable/ folders. The resulting build was exactly the same size as before. I would think that perhaps #1 could be explained by the zip process, but simply zipping the /drawable/ folders in #2 result in different-sized files. Perhaps the build process runs something akin to pngcrush? Regardless, I was wondering if anyone knew where to find a detailed description of all the optimizations in the Android build process. I don't want to waste my time trying to optimize what is already automated, and also I think it'd help my understanding of the resulting apk. Does anyone know if this is documented anywhere?

    Read the article

  • Use Cherokee Instead of nginx in Front of Varnish to Get HTTP 1.1 Optimizations?

    - by espeed
    We have been running nginx - uWSGI, and now we are evaluating putting Varnish as a caching layer between nginx and uWSGI (similar to http://www.heroku.com/how/architecture). But, nginx only supports HTTP 1.0 on the back so it will have to create new connections with Varnish for each request. Many recommend running nginx in front of Varnish, but wouldn't it make much more sense to use something like Cherokee so that you eliminate the HTTP connection overhead since it supports HTTP 1.1 on the back?

    Read the article

  • Why does VS2005 skip execution of lines when debugging managed C++ without optimizations?

    - by Sakin
    I ran into a rather odd behavior that I don't even know how to start describing. I wrote a piece of managed C++ code that makes calls to native methods. A (very) simplified version of the code would look like this (I know it looks like a full native function, just assume there is managed stuff being done all over the place): int somefunction(ptrHolder x) { // the accessptr method returns a native pointer if (x.accessptr() != nullptr) // I tried this with nullptr, NULL, 0) { try { x->doSomeNativeVeryImportantStuff(); // or whatever, doesn't matter } catch (SomeCustomExceptionClass &) { return 0; } } SomeOtherNativeClass::doStaticMagic(); return 1; } I compiled this code without optimizations using the /clr flag (VS.NET 2005, SP2) and when running it in the debugger I get to the if statement, since the pointer is actually null, I don't enter the if, but surprisingly, the cursor jumps directly to the return 1 statement, ignoring the doStaticMagic() method completely!!! When looking at the assembly code, I see that it really jumps directly to that line. If I force the debugger to enter the if block, I also jump to the return 1 statement after I press F10. Any ideas why this is happening? Thanks, Ariel

    Read the article

  • Does the .NET CLR Really Optimize for the Current Processor

    - by dewald
    When I read about the performance of JITted languages like C# or Java, authors usually say that they should/could theoretically outperform many native-compiled applications. The theory being that native applications are usually just compiled for a processor family (like x86), so the compiler cannot make certain optimizations as they may not truly be optimizations on all processors. On the other hand, the CLR can make processor-specific optimizations during the JIT process. Does anyone know if Microsoft's (or Mono's) CLR actually performs processor-specific optimizations during the JIT process? If so, what kind of optimizations?

    Read the article

  • MySQL Connect 9 Days Away – Optimizer Sessions

    - by Bertrand Matthelié
    72 1024x768 Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Following my previous blog post focusing on InnoDB talks at MySQL Connect, let us review today the sessions focusing on the MySQL Optimizer: Saturday, 11.30 am, Room Golden Gate 6: MySQL Optimizer Overview—Olav Sanstå, Oracle The goal of MySQL optimizer is to take a SQL query as input and produce an optimal execution plan for the query. This session presents an overview of the main phases of the MySQL optimizer and the primary optimizations done to the query. These optimizations are based on a combination of logical transformations and cost-based decisions. Examples of optimization strategies the presentation covers are the main query transformations, the join optimizer, the data access selection strategies, and the range optimizer. For the cost-based optimizations, an overview of the cost model and the data used for doing the cost estimations is included. Saturday, 1.00 pm, Room Golden Gate 6: Overview of New Optimizer Features in MySQL 5.6—Manyi Lu, Oracle Many optimizer features have been added into MySQL 5.6. This session provides an introduction to these great features. Multirange read, index condition pushdown, and batched key access will yield huge performance improvements on large data volumes. Structured explain, explain for update/delete/insert, and optimizer tracing will help users analyze and speed up queries. And last but not least, the session covers subquery optimizations in Release 5.6. Saturday, 7.00 pm, Room Golden Gate 4: BoF: Query Optimizations: What Is New and What Is Coming? This BoF presents common techniques for query optimization, covers what is new in MySQL 5.6, and provides a discussion forum in which attendees can tell the MySQL optimizer team which optimizations they would like to see in the future. Sunday, 1.15 pm, Room Golden Gate 8: Query Performance Comparison of MySQL 5.5 and MySQL 5.6—Øystein Grøvlen, Oracle MySQL Release 5.6 contains several improvements in the query optimizer that create improved performance for complex queries. This presentation looks at how MySQL 5.6 improves the performance of many of the queries in the DBT-3 benchmark. Based on the observed improvements, the presentation discusses what makes the specific queries perform better in Release 5.6. It describes the relevant new optimization techniques and gives examples of the types of queries that will benefit from these techniques. Sunday, 4.15 pm, Room Golden Gate 4: Powerful EXPLAIN in MySQL 5.6—Evgeny Potemkin, Oracle The EXPLAIN command of MySQL has long been a very useful tool for understanding how MySQL will execute a query. Release 5.6 of the MySQL database offers several new additions that give more-detailed information about the query plan and make it easier to understand at the same time. This presentation gives an overview of new EXPLAIN features: structured EXPLAIN in JSON format, EXPLAIN for INSERT/UPDATE/DELETE, and optimizer tracing. Examples in the session give insights into how you can take advantage of the new features. They show how these features supplement and relate to each other and to classical EXPLAIN and how and why the MySQL server chooses a particular query plan. You can check out the full program here as well as in the September edition of the MySQL newsletter. Not registered yet? You can still save US$ 300 over the on-site fee – Register Now!

    Read the article

  • SQL SERVER – DMV – sys.dm_exec_query_optimizer_info – Statistics of Optimizer

    - by pinaldave
    Incredibly, SQL Server has so much information to share with us. Every single day, I am amazed with this SQL Server technology. Sometimes I find several interesting information by just querying few of the DMV. And when I present this info in front of my client during performance tuning consultancy, they are surprised with my findings. Today, I am going to share one of the hidden gems of DMV with you, the one which I frequently use to understand what’s going on under the hood of SQL Server. SQL Server keeps the record of most of the operations of the Query Optimizer. We can learn many interesting details about the optimizer which can be utilized to improve the performance of server. SELECT * FROM sys.dm_exec_query_optimizer_info WHERE counter IN ('optimizations', 'elapsed time','final cost', 'insert stmt','delete stmt','update stmt', 'merge stmt','contains subquery','tables', 'hints','order hint','join hint', 'view reference','remote query','maximum DOP', 'maximum recursion level','indexed views loaded', 'indexed views matched','indexed views used', 'indexed views updated','dynamic cursor request', 'fast forward cursor request') All occurrence values are cumulative and are set to 0 at system restart. All values for value fields are set to NULL at system restart. I have removed a few of the internal counters from the script above, and kept only documented details. Let us check the result of the above query. As you can see, there is so much vital information that is revealed in above query. I can easily say so many things about how many times Optimizer was triggered and what the average time taken by it to optimize my queries was. Additionally, I can also determine how many times update, insert or delete statements were optimized. I was able to quickly figure out that my client was overusing the Query Hints using this dynamic management view. If you have been reading my blog, I am sure you are aware of my series related to SQL Server Views SQL SERVER – The Limitations of the Views – Eleven and more…. With this, I can take a quick look and figure out how many times Views were used in various solutions within the query. Moreover, you can easily know what fraction of the optimizations has been involved in tuning server. For example, the following query would tell me, in total optimizations, what the fraction of time View was “reference“. As this View also includes system Views and DMVs, the number is a bit higher on my machine. SELECT (SELECT CAST (occurrence AS FLOAT) FROM sys.dm_exec_query_optimizer_info WHERE counter = 'view reference') / (SELECT CAST (occurrence AS FLOAT) FROM sys.dm_exec_query_optimizer_info WHERE counter = 'optimizations') AS ViewReferencedFraction Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL DMV, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • The Significance of SEO

    SEO is not rocket science. It means some simple approaches to mark relevant keywords (on-page optimizations) and link building (off-page optimizations). In many cases SEO means only few simple modifications of the page to emphasize few keywords.

    Read the article

  • Optimizing Solaris 11 SHA-1 on Intel Processors

    - by danx
    SHA-1 is a "hash" or "digest" operation that produces a 160 bit (20 byte) checksum value on arbitrary data, such as a file. It is intended to uniquely identify text and to verify it hasn't been modified. Max Locktyukhin and others at Intel have improved the performance of the SHA-1 digest algorithm using multiple techniques. This code has been incorporated into Solaris 11 and is available in the Solaris Crypto Framework via the libmd(3LIB), the industry-standard libpkcs11(3LIB) library, and Solaris kernel module sha1. The optimized code is used automatically on systems with a x86 CPU supporting SSSE3 (Intel Supplemental SSSE3). Intel microprocessor architectures that support SSSE3 include Nehalem, Westmere, Sandy Bridge microprocessor families. Further optimizations are available for microprocessors that support AVX (such as Sandy Bridge). Although SHA-1 is considered obsolete because of weaknesses found in the SHA-1 algorithm—NIST recommends using at least SHA-256, SHA-1 is still widely used and will be with us for awhile more. Collisions (the same SHA-1 result for two different inputs) can be found with moderate effort. SHA-1 is used heavily though in SSL/TLS, for example. And SHA-1 is stronger than the older MD5 digest algorithm, another digest option defined in SSL/TLS. Optimizations Review SHA-1 operates by reading an arbitrary amount of data. The data is read in 512 bit (64 byte) blocks (the last block is padded in a specific way to ensure it's a full 64 bytes). Each 64 byte block has 80 "rounds" of calculations (consisting of a mixture of "ROTATE-LEFT", "AND", and "XOR") applied to the block. Each round produces a 32-bit intermediate result, called W[i]. Here's what each round operates: The first 16 rounds, rounds 0 to 15, read the 512 bit block 32 bits at-a-time. These 32 bits is used as input to the round. The remaining rounds, rounds 16 to 79, use the results from the previous rounds as input. Specifically for round i it XORs the results of rounds i-3, i-8, i-14, and i-16 and rotates the result left 1 bit. The remaining calculations for the round is a series of AND, XOR, and ROTATE-LEFT operators on the 32-bit input and some constants. The 32-bit result is saved as W[i] for round i. The 32-bit result of the final round, W[79], is the SHA-1 checksum. Optimization: Vectorization The first 16 rounds can be vectorized (computed in parallel) because they don't depend on the output of a previous round. As for the remaining rounds, because of step 2 above, computing round i depends on the results of round i-3, W[i-3], one can vectorize 3 rounds at-a-time. Max Locktyukhin found through simple factoring, explained in detail in his article referenced below, that the dependencies of round i on the results of rounds i-3, i-8, i-14, and i-16 can be replaced instead with dependencies on the results of rounds i-6, i-16, i-28, and i-32. That is, instead of initializing intermediate result W[i] with: W[i] = (W[i-3] XOR W[i-8] XOR W[i-14] XOR W[i-16]) ROTATE-LEFT 1 Initialize W[i] as follows: W[i] = (W[i-6] XOR W[i-16] XOR W[i-28] XOR W[i-32]) ROTATE-LEFT 2 That means that 6 rounds could be vectorized at once, with no additional calculations, instead of just 3! This optimization is independent of Intel or any other microprocessor architecture, although the microprocessor has to support vectorization to use it, and exploits one of the weaknesses of SHA-1. Optimization: SSSE3 Intel SSSE3 makes use of 16 %xmm registers, each 128 bits wide. The 4 32-bit inputs to a round, W[i-6], W[i-16], W[i-28], W[i-32], all fit in one %xmm register. The following code snippet, from Max Locktyukhin's article, converted to ATT assembly syntax, computes 4 rounds in parallel with just a dozen or so SSSE3 instructions: movdqa W_minus_04, W_TMP pxor W_minus_28, W // W equals W[i-32:i-29] before XOR // W = W[i-32:i-29] ^ W[i-28:i-25] palignr $8, W_minus_08, W_TMP // W_TMP = W[i-6:i-3], combined from // W[i-4:i-1] and W[i-8:i-5] vectors pxor W_minus_16, W // W = (W[i-32:i-29] ^ W[i-28:i-25]) ^ W[i-16:i-13] pxor W_TMP, W // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) movdqa W, W_TMP // 4 dwords in W are rotated left by 2 psrld $30, W // rotate left by 2 W = (W >> 30) | (W << 2) pslld $2, W_TMP por W, W_TMP movdqa W_TMP, W // four new W values W[i:i+3] are now calculated paddd (K_XMM), W_TMP // adding 4 current round's values of K movdqa W_TMP, (WK(i)) // storing for downstream GPR instructions to read A window of the 32 previous results, W[i-1] to W[i-32] is saved in memory on the stack. This is best illustrated with a chart. Without vectorization, computing the rounds is like this (each "R" represents 1 round of SHA-1 computation): RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR With vectorization, 4 rounds can be computed in parallel: RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR Optimization: AVX The new "Sandy Bridge" microprocessor architecture, which supports AVX, allows another interesting optimization. SSSE3 instructions have two operands, a input and an output. AVX allows three operands, two inputs and an output. In many cases two SSSE3 instructions can be combined into one AVX instruction. The difference is best illustrated with an example. Consider these two instructions from the snippet above: pxor W_minus_16, W // W = (W[i-32:i-29] ^ W[i-28:i-25]) ^ W[i-16:i-13] pxor W_TMP, W // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) With AVX they can be combined in one instruction: vpxor W_minus_16, W, W_TMP // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) This optimization is also in Solaris, although Sandy Bridge-based systems aren't widely available yet. As an exercise for the reader, AVX also has 256-bit media registers, %ymm0 - %ymm15 (a superset of 128-bit %xmm0 - %xmm15). Can %ymm registers be used to parallelize the code even more? Optimization: Solaris-specific In addition to using the Intel code described above, I performed other minor optimizations to the Solaris SHA-1 code: Increased the digest(1) and mac(1) command's buffer size from 4K to 64K, as previously done for decrypt(1) and encrypt(1). This size is well suited for ZFS file systems, but helps for other file systems as well. Optimized encode functions, which byte swap the input and output data, to copy/byte-swap 4 or 8 bytes at-a-time instead of 1 byte-at-a-time. Enhanced the Solaris mdb(1) and kmdb(1) debuggers to display all 16 %xmm and %ymm registers (mdb "$x" command). Previously they only displayed the first 8 that are available in 32-bit mode. Can't optimize if you can't debug :-). Changed the SHA-1 code to allow processing in "chunks" greater than 2 Gigabytes (64-bits) Performance I measured performance on a Sun Ultra 27 (which has a Nehalem-class Xeon 5500 Intel W3570 microprocessor @3.2GHz). Turbo mode is disabled for consistent performance measurement. Graphs are better than words and numbers, so here they are: The first graph shows the Solaris digest(1) command before and after the optimizations discussed here, contained in libmd(3LIB). I ran the digest command on a half GByte file in swapfs (/tmp) and execution time decreased from 1.35 seconds to 0.98 seconds. The second graph shows the the results of an internal microbenchmark that uses the Solaris libpkcs11(3LIB) library. The operations are on a 128 byte buffer with 10,000 iterations. The results show operations increased from 320,000 to 416,000 operations per second. Finally the third graph shows the results of an internal kernel microbenchmark that uses the Solaris /kernel/crypto/amd64/sha1 module. The operations are on a 64Kbyte buffer with 100 iterations. third graph shows the results of an internal kernel microbenchmark that uses the Solaris /kernel/crypto/amd64/sha1 module. The operations are on a 64Kbyte buffer with 100 iterations. The results show for 1 kernel thread, operations increased from 410 to 600 MBytes/second. For 8 kernel threads, operations increase from 1540 to 1940 MBytes/second. Availability This code is in Solaris 11 FCS. It is available in the 64-bit libmd(3LIB) library for 64-bit programs and is in the Solaris kernel. You must be running hardware that supports Intel's SSSE3 instructions (for example, Intel Nehalem, Westmere, or Sandy Bridge microprocessor architectures). The easiest way to determine if SSSE3 is available is with the isainfo(1) command. For example, nehalem $ isainfo -v $ isainfo -v 64-bit amd64 applications sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu If the output also shows "avx", the Solaris executes the even-more optimized 3-operand AVX instructions for SHA-1 mentioned above: sandybridge $ isainfo -v 64-bit amd64 applications avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu No special configuration or setup is needed to take advantage of this code. Solaris libraries and kernel automatically determine if it's running on SSSE3 or AVX-capable machines and execute the correctly-tuned code for that microprocessor. Summary The Solaris 11 Crypto Framework, via the sha1 kernel module and libmd(3LIB) and libpkcs11(3LIB) libraries, incorporated a useful SHA-1 optimization from Intel for SSSE3-capable microprocessors. As with other Solaris optimizations, they come automatically "under the hood" with the current Solaris release. References "Improving the Performance of the Secure Hash Algorithm (SHA-1)" by Max Locktyukhin (Intel, March 2010). The source for these SHA-1 optimizations used in Solaris "SHA-1", Wikipedia Good overview of SHA-1 FIPS 180-1 SHA-1 standard (FIPS, 1995) NIST Comments on Cryptanalytic Attacks on SHA-1 (2005, revised 2006)

    Read the article

  • Advantages of compilers for functional languages over compilers for imperative languages

    - by Onorio Catenacci
    As a follow up to this question What are the advantages of built-in immutability of F# over C#?--am I correct in assuming that the F# compiler can make certain optimizations knowing that it's dealing with largely immutable code? I mean even if a developer writes "Functional C#" the compiler wouldn't know all of the immutability that the developer had tried to code in so that it couldn't make the same optimizations, right? In general would the compiler of a functional language be able to make optimizations that would not be possible with an imperative language--even one written with as much immutability as possible?

    Read the article

  • Oracle Solaris Studio Express 6/10 and its Customer Feedback Program are now available

    - by pieter.humphrey
    Oracle Solaris Studio Express 6/10 and the Customer Feedback Program for it are now available. Oracle Solaris Studio Express 6/10 is available on Solaris 10 (SPARC, x86), OEL 5 (x86), RHEL 5 (x86), SuSE 11 (x86) today and will be available for OpenSolaris in the near future. New feature highlights since the last release include: C/C++/Fortran compiler optimizations for the latest UltraSPARC and SPARC64-based architectures such as UltraSPARC T2 and SPARC64 VII C/C++/Fortran compiler optimizations for the latest x86 architectures including the Intel Xeon 7500 processor series (Nehalem-EX) and the Intel Xeon 5600 processor series (Westmere-EP) Enhanced debugging and code coverage tooling Improved application profiling with the Performance Analyzer Updated IDE based on NetBeans 6.8 To find more information and download go to http://developers.sun.com/sunstudio/downloads/express/ To participate in the customer feedback program for Oracle Solaris Studio Express 6/10 go to http://developers.sun.com/sunstudio/customerfeedback/index.jsp Please get the word out, try out this new release and send us your feedback! Technorati Tags: developer,development,solaris,sparc,Oracle Solaris Studio,Solaris Studio,Sun Studio,oracle,otn del.icio.us Tags: developer,development,solaris,sparc,Oracle Solaris Studio,Solaris Studio,Sun Studio,oracle,otn

    Read the article

  • Google I/O 2010 - Optimizing apps with the GWT Compiler

    Google I/O 2010 - Optimizing apps with the GWT Compiler Google I/O 2010 - Faster apps faster - Optimizing apps with the GWT Compiler GWT 201 Ray Cromwell The GWT compiler isn't just a Java to JavaScript transliterator. It performs many optimizations along the way. In this session, we'll show you not only the optimizations performed, but how you can get more out of the compiler itself. Learn how to speed up compiles, use -draftCompile, compile for only one locale/browser permutation, and more. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 7 0 ratings Time: 56:17 More in Science & Technology

    Read the article

  • Google I/O 2010 - Appstats - instrumentation for App Engine

    Google I/O 2010 - Appstats - instrumentation for App Engine Google I/O 2010 - Appstats - RPC instrumentation and optimizations for App Engine App Engine 201 Guido van Rossum Appstats is a pure userland library (for Python and Java) that inserts instrumentation hooks into the App Engine runtime at the interface between the runtime and services like the datastore. The collected statistics can be browsed in a rich UI which allows drilling down to various levels of detail. The talk will also discuss common optimizations to address typical findings. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 19 0 ratings Time: 59:31 More in Science & Technology

    Read the article

  • Improving the performance of JDeveloper11g (part 2) and JVMs in general

    - by asantaga
    Just received an email from one of our JVM developers who read my blog entry on Performance tuning JDeveloper11g and he's confirmed that all of the above parameters are totally supported :-) He's also provided a description of the parameters so we can learn what magic is actually being applied. - -XX:+AggressiveOpts -- this enables the latest and greatest JVM optimizations. It will likely help most Java applications. It's fully supported. The downside of it is that because it has the latest and greatest optimizations, there is some small probability that it may not offer as good of an experience. As those features enabled with this command line option have "matured", they are made the default in a future JDK release. So, you can think of this command line option as the place where the newest optimizations get introduced. Some time later they are moved out from under AggressiveOpts to become default behavior. -XX:+OptimizeStringConcat -- only works with the -server JVM. It may be enabled by the default in a future JDK 7 update release. This option delays the construction of a StringBuilder/StringBuffer and attempts to avoid re-sizing the underlying char[] by attempting to detect the size of the char[] to allocate based on what's being appended to the StringBuilder/StringBuffer. -XX:+UseStringCache -- I would not suggest using this unless you knew that JDeveloper allocated the same string over and over again. And, the string that's allocated over and over again is one of the first 100,000 allocated strings. In short, I'd recommend against using it. And, in fact, in Java 7 (currently) does not include this feature. -XX:+UseCompressedOops -- applicable to 64-bit JVMs. And, if you're using a 64-bit JVM, I'd suggest you use it. It's auto enabled in JDK 7 64-bit JVMs and later JDK 6 64-bit JVMs enable it by default too. -XX:+UseGCOverheadLimit -- by default this option is already enabled. One other command line option to consider is -XX:+TieredCompilation for a JDK 6 Update 25 or later, or JDK 7. This gives you the startup of a -client JVM and the peak performance of a -server JVM. Awesome-ness!  Finally, Charlies also pointed out to me a "new" book he's just published where he goes into the details of JVM tuning, a must for all Fusion Middleware tuning exercises..  (click the book)  Thanks Charlie!

    Read the article

  • Switching from Debug into Release Mode with VS2010 as IDE and Intel C++ Compiler 13

    - by Drazick
    I have a code of a Plug In from an SDK. The code is in Debug Mode. I use Intel Compiler which only applies optimizations in Release Mode. Under configuration manager of the project only "Debug" mode is defined. How could I switch to "Release" mode and enable all Intel Compiler's optimizations? If I enable them on debug mode nothing is applied (Empty Report). I couldn't find the trick to do so. Thank You.

    Read the article

  • Why is shrink_to_fit non-binding?

    - by Roger Pate
    The C++0x FCD states in 23.3.6.2 vector capacity: void shrink_to_fit(); Remarks: shrink_to_fit is a non-binding request to reduce capacity() to size(). [Note: The request is non-binding to allow latitude for implementation-specific optimizations. —end note] Why is it non-binding, and what optimizations are intended to be allowed?

    Read the article

  • SQLAuthority News – A Successful Performance Tuning Seminar at Pune – Dec 4-5, 2010

    - by pinaldave
    This is report to my third of very successful seminar event on SQL Server Performance Tuning. SQL Server Performance Tuning Seminar in Colombo was oversubscribed with total of 35 attendees. You can read the details over here SQLAuthority News – SQL Server Performance Optimizations Seminar – Grand Success – Colombo, Sri Lanka – Oct 4 – 5, 2010. SQL Server Performance Tuning Seminar in Hyderabad was oversubscribed with total of 25 attendees. You can read the details over here SQL SERVER – A Successful Performance Tuning Seminar – Hyderabad – Nov 27-28, 2010. The same Seminar was offered in Pune on December 4,-5, 2010. We had another successful seminar with lots of performance talk. This seminar was attended by 30 attendees. The best part of the seminar was that along with the our agenda, we have talked about following very interesting concepts. Deadlocks Detection and Removal Dynamic SQL and Inline Code SQL Optimizations Multiple OR conditions and performance tuning Dynamic Search Condition Building and Improvement Memory Cache and Improvement Bottleneck Detections – Memory, CPU and IO Beginning Performance Tuning on Production Parametrization Improving already Super Fast Queries Convenience vs. Performance Proper way to create Indexes Hints and Disadvantages I had great time doing the seminar and sharing my performance tricks with all. The highlight of this seminar was I have explained the attendees, how I begin doing performance tuning when I go for Performance Tuning Consultations.   Pinal Dave at SQL Performance Tuning Seminar SQL Server Performance Tuning Seminar Pinal Dave at SQL Performance Tuning Seminar Pinal Dave at SQL Performance Tuning Seminar SQL Server Performance Tuning Seminar SQL Server Performance Tuning Seminar This seminar series are 100% demo oriented and no usual PowerPoint talk. They are created from my experiences of various organizations for performance tuning. I am not planning any more seminar this year as it was great but I am booked currently for next 60 days at various performance tuning engagements. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLAuthority News, T SQL, Technology

    Read the article

  • in HFT trading should I upgrade from Windows Server 2008 R2 to Windows Server 2012?

    - by javapowered
    I'm using HP DL360p Gen8 + Windows Server 2008 R2 for HFT trading. That means that every 10 microseconds is important for me. I do understand that if I need everything to be so fast I probably should consider using Linux. But in this post I want to compare only Windows Server 2008 R2 and Windows Server 2012. I've found in internet couple articles that suggest how to tune Windows Server 2012 for low latency http://msdn.microsoft.com/library/windows/hardware/jj248719 http://technet.microsoft.com/en-us/library/hh831415.aspx Most part of optimizations from these articles apply only to Windows Server 2012 and can not be used on Windows Server 2008 R2. So now I think that as I can optimize Windows Server 2012 for low latency, probaly I should upgrade? After optimizations how much faster windows server 2012 would be (ideally in microseconds :)?

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >