Search Results

Search found 7065 results on 283 pages for 'cpu sockets'.

Page 152/283 | < Previous Page | 148 149 150 151 152 153 154 155 156 157 158 159  | Next Page >

  • Understanding and Controlling Parallel Query Processing in SQL Server

    Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them.

    Read the article

  • SQL SERVER – Database Dynamic Caching by Automatic SQL Server Performance Acceleration

    - by pinaldave
    My second look at SafePeak’s new version (2.1) revealed to me few additional interesting features. For those of you who hadn’t read my previous reviews SafePeak and not familiar with it, here is a quick brief: SafePeak is in business of accelerating performance of SQL Server applications, as well as their scalability, without making code changes to the applications or to the databases. SafePeak performs database dynamic caching, by caching in memory result sets of queries and stored procedures while keeping all those cache correct and up to date. Cached queries are retrieved from the SafePeak RAM in microsecond speed and not send to the SQL Server. The application gets much faster results (100-500 micro seconds), the load on the SQL Server is reduced (less CPU and IO) and the application or the infrastructure gets better scalability. SafePeak solution is hosted either within your cloud servers, hosted servers or your enterprise servers, as part of the application architecture. Connection of the application is done via change of connection strings or adding reroute line in the c:\windows\system32\drivers\etc\hosts file on all application servers. For those who would like to learn more on SafePeak architecture and how it works, I suggest to read this vendor’s webpage: SafePeak Architecture. More interesting new features in SafePeak 2.1 In my previous review of SafePeak new I covered the first 4 things I noticed in the new SafePeak (check out my article “SQLAuthority News – SafePeak Releases a Major Update: SafePeak version 2.1 for SQL Server Performance Acceleration”): Cache setup and fine-tuning – a critical part for getting good caching results Database templates Choosing which database to cache Monitoring and analysis options by SafePeak Since then I had a chance to play with SafePeak some more and here is what I found. 5. Analysis of SQL Performance (present and history): In SafePeak v.2.1 the tools for understanding of performance became more comprehensive. Every 15 minutes SafePeak creates and updates various performance statistics. Each query (or a procedure execute) that arrives to SafePeak gets a SQL pattern, and after it is used again there are statistics for such pattern. An important part of this product is that it understands the dependencies of every pattern (list of tables, views, user defined functions and procs). From this understanding SafePeak creates important analysis information on performance of every object: response time from the database, response time from SafePeak cache, average response time, percent of traffic and break down of behavior. One of the interesting things this behavior column shows is how often the object is actually pdated. The break down analysis allows knowing the above information for: queries and procedures, tables, views, databases and even instances level. The data is show now on all arriving queries, both read queries (that can be cached), but also any types of updates like DMLs, DDLs, DCLs, and even session settings queries. The stats are being updated every 15 minutes and SafePeak dashboard allows going back in time and investigating what happened within any time frame. 6. Logon trigger, for making sure nothing corrupts SafePeak cache data If you have an application with many parts, many servers many possible locations that can actually update the database, or the SQL Server is accessible to many DBAs or software engineers, each can access some database directly and do some changes without going thru SafePeak – this can create a potential corruption of the data stored in SafePeak cache. To make sure SafePeak cache is correct it needs to get all updates to arrive to SafePeak, and if a DBA will access the database directly and do some changes, for example, then SafePeak will simply not know about it and will not clean SafePeak cache. In the new version, SafePeak brought a new feature called “Logon Trigger” to solve the above challenge. By special click of a button SafePeak can deploy a special server logon trigger (with a CLR object) on your SQL Server that actually monitors all connections and informs SafePeak on any connection that is coming not from SafePeak. In SafePeak dashboard there is an interface that allows to control which logins can be ignored based on login names and IPs, while the rest will invoke cache cleanup of SafePeak and actually locks SafePeak cache until this connection will not be closed. Important to note, that this does not interrupt any logins, only informs SafePeak on such connection. On the Dashboard screen in SafePeak you will be able to see those connections and then decide what to do with them. Configuration of this feature in SafePeak dashboard can be done here: Settings -> SQL instances management -> click on instance -> Logon Trigger tab. Other features: 7. User management ability to grant permissions to someone without changing its configuration and only use SafePeak as performance analysis tool. 8. Better reports for analysis of performance using 15 minute resolution charts. 9. Caching of client cursors 10. Support for IPv6 Summary SafePeak is a great SQL Server performance acceleration solution for users who want immediate results for sites with performance, scalability and peak spikes challenges. Especially if your apps are packaged or 3rd party, since no code changes are done. SafePeak can significantly increase response times, by reducing network roundtrip to the database, decreasing CPU resource usage, eliminating I/O and storage access. SafePeak team provides a free fully functional trial www.safepeak.com/download and actually provides a one-on-one assistance during such trial. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

    Read the article

  • What a Performance! MySQL 5.5 and InnoDB 1.1 running on Oracle Linux

    - by zeynep.koch(at)oracle.com
    The MySQL performance team in Oracle has recently completed a series of benchmarks comparing Read / Write and Read-Only performance of MySQL 5.5 with the InnoDB and MyISAM storage engines. Compared to MyISAM, InnoDB delivered 35x higher throughput on the Read / Write test and 5x higher throughput on the Read-Only test, with 90% scalability across 36 CPU cores. A full analysis of results and MySQL configuration parameters are documented in a new whitepaperIn addition to the benchmark, the new whitepaper, also includes:- A discussion of the use-cases for each storage engine- Best practices for users considering the migration of existing applications from MyISAM to InnoDB- A summary of the performance and scalability enhancements introduced with MySQL 5.5 and InnoDB 1.1.The benchmark itself was based on Sysbench, running on AMD Opteron "Magny-Cours" processors, and Oracle Linux with the Unbreakable Enterprise Kernel You can learn more about MySQL 5.5 and InnoDB 1.1 from here and download it from here to test whether you witness performance gains in your real-world applications.  By Mat Keep

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Ubuntu 10.10 Mouse and Keyboard Freeze

    - by Kev
    I installed Ubuntu 10.10 today and have had mouse problem since. Symptoms: At some arbitrary point in time (frequency: 2-3 times per hour), the mouse and keyboard stops working for ever(may be). I start System monitor, I found out network was shutdown just before mouse freeze. Some time my keyboard keep typing one key. For example:77777777777777777777777777777777777777777777777777777.....(it keep typing for 20 sec) I found out a script just solve the freeze problem:(I hit Powerbutton) -----------------/etc/acpi/powerbtn.sh------------------------ event=button[ /]power action=/usr/sbin/fix_mouse.sh -----------------/usr/sbin/fix_mouse.sh------------------------ rmmod psmouse modprobe psmouse Yesterday I install Ubuntu 10.04 FAILED also have mouse problem. When I switch back to Windows XP. The network card is down. It kept connecting and disconnecting 1 time per sec. CPU: i5 Motherboard: ASUS P7P55D OS: Windows XP + Ubuntu 10.10 Video Card: ATI 5770 Mouse,Keyboard: PS/2

    Read the article

  • Dell Latitude E6520 overheating

    - by Wu Yi Han
    I'm a newcomer to Ubuntu 11.10. My laptop is a Dell Latitude E6520, Sandy Bridge platform. The system cooling fan is crazy all the time. I don't do any intensive tasks. I really hope my laptop doesn't become a mushroom cloud. I suppose there's no perfect way to solve this... Can I lower the CPU frequency? Jupiter 0.0.51 was installed (power save mode). Cooling worked in my Windows 7 system until I deleted it. (I won't go back to Windows 7.)

    Read the article

  • Ubuntu 12.04 installation aborts without giving any errors on Sony Vaio

    - by Guilherme Simoni
    I'm not able to install the release ubuntu-12.04-desktop-i386 on the laptop below: Sony Vaio VGN-FE21H CPU: Intel Core Duo T2300 1.66GHz Memory: 2GB DDR2 533MHz HDD: 100GB Graphics: NVIDIA GeForce 7400 256MB I'm using the ISO "ubuntu-12.04-desktop-i386.iso" burned into a DVD. I know the ISO is OK because I used it to successfully install on Virtualbox. Live DVD boots and runs OK, but I cannot install from it or directly from the boot menu. The installation goes through all the steps until the final part where is asked the Name, Name of PC and password. The problem is in the next step where it should start copying files and present some screens and features of Ubuntu. In this part the installation just close without any error message. If I am running the installation inside the live DVD it closes and returns to the home screen of the Live. If I am running straight from the boot it closes the graphic interface and restarts the PC. Does anybody know or faced the same problem?

    Read the article

  • Looking for VPS Hosting for a LAMP Web Application

    - by Ali
    Hi guys, Trying to find a Managed VPS Hosting Solution for a LAMP Web Application. The more CPU, RAM, and Disk space the better Don't need a huge amount of bandwidth for now Would like to be able to easily grow into a stronger server Have really responsive, dedicated, smart support staff -- our current hosting is just terrible My main problem is that I can't even find a non-biased website out there that does a proper comparison of VPS Hosting providers. Can anybody either suggest a reviews/ranking site or a hosting with proven record? How would you go about finding the best hosting service? Thanks a lot! Ali

    Read the article

  • ASP.NET Connections Fall 2011 Slides and Code

    - by Stephen Walther
    Thanks everyone who came to my talks at ASP.NET Connections in Las Vegas!  There was a definite theme to my talks this year…taking advantage of JavaScript to build a rich presentation layer. I gave the following three talks: JsRender Templates – Originally, I was scheduled to give a talk on jQuery Templates.  However, jQuery Templates has been deprecated and JsRender is the new technology which replaces jQuery Templates. In the talk, I give plenty of code samples of using JsRender.  You can download the slides and code samples RIGHT HERE   HTML5 – In this talk, I focused on the features of HTML5 which are the most interesting to developers building database-driven Web applications. In particular, I discussed Web Sockets,  Web workers, Web Storage, Indexed DB, and the Offline Application Cache. All of these features are supported by Safari, Chrome, and Firefox today and they will be supported by Internet Explorer 10. You can download the slides and code samples RIGHT HERE   Ajax Control Toolkit – My company, Superexpert, is responsible for developing and maintaining the Ajax Control Toolkit. In this talk, I discuss all of the bug fixes and new features which the developers on the Superexpert team have added to the Ajax Control Toolkit over the previous six months. We also had a good discussion of the features which people want in future releases of the Ajax Control Toolkit. The slides and code samples for this talk can be downloaded RIGHT HERE   I had a great time in Las Vegas!  Good questions, friendly audience, and lots of opportunities for me to learn new things!      -- Stephen

    Read the article

  • Ubuntu 12.04 install DVD-RW recorder PATA as SCSI

    - by Alexandre Gatelli
    Ubuntu 12.04 64-bit installed DVD-RW recorder PATA as SCSI. My DVD-RW recorder is in /dev/sr0 as SCSI. I opened Disk Management and my IDE PATA drive is installed as SCSI. I can't use this drive because it hangs computer (I need press reset button on CPU to back to Ubuntu). What do I do for the drive back to function with the correct drive (IDE mode)? With the first kernel version of the Ubuntu 12.04 64-bit was all functioning. Help me please.

    Read the article

  • Tolkien’s Rivendell Rendered in LEGO

    - by Jason Fitzpatrick
    If you’re a fan of all things geeky rendered in LEGO–and we certainly are–you’ll want to take a moment to appreciate this incredible model of the mythical Rivendell from the Lord of the Rings universe. Courtesy of builders Blake Baer and Jake Bittner, the behemoth model measures nearly 4×3 ft. in size, weighs 120 pounds, and required over 50,000 LEGO bricks. Hit up the link below to check out the full set of photos. Rivendell in LEGO [via Geeks Are Sexy] How To Switch Webmail Providers Without Losing All Your Email How To Force Windows Applications to Use a Specific CPU HTG Explains: Is UPnP a Security Risk?

    Read the article

  • Online scoreboard in Python?

    - by CorundumGames
    So my friend and I are working on an arcade-style game in Python and Pygame. We're beginning to look at the feasibility of an online leaderboard, given our current programming backgrounds. Such a leaderboard would have the following requirements/features; The ability to search through demographics like region, country, platform, game mode, recentness ("best scores this month") and difficulty. (e.g. to make it possible for someone to say "I'm the best player in Italy!" or "I'm the best Linux player in South America!") Our game will not have online multiplayer, so no need to worry about that. We don't expect the game to be a million-dollar hit. We want the scores to be accessible both from in-game and the website. We would like some semblance of security to make sure no one plugs fake scores into the system. This is our present situation; Neither I nor my friend have any network programming background. All I really know is that sockets are low-level, HTTP is high-level. I happen to know that the Google App Engine might be useful for something like this, and I'm really thinking about going with that. We're not sure how we would store all the high score data. Our game will be free and open source (though we might keep the components that submit the high scores closed-source). Aside from all of this, we don't really have any idea where to begin. Any thoughts?

    Read the article

  • Mouse and Keyboard Freeze

    - by Kev
    I installed Ubuntu 10.10 today and have had mouse problem since. Symptoms: At some arbitrary point in time (frequency: 2-3 times per hour), the mouse and keyboard stops working for ever(may be). I start System monitor, I found out network was shutdown just before mouse freeze. Some time my keyboard keep typing one key. For example:77777777777777777777777777777777777777777777777777777.....(it keep typing for 20 sec) I found out a script just solve the freeze problem:(I hit Powerbutton) -----------------/etc/acpi/powerbtn.sh------------------------ event=button[ /]power action=/usr/sbin/fix_mouse.sh -----------------/usr/sbin/fix_mouse.sh------------------------ rmmod psmouse modprobe psmouse Yesterday I install Ubuntu 10.04 FAILED also have mouse problem. When I switch back to Windows XP. The network card is down. It kept connecting and disconnecting 1 time per sec. CPU: i5 Motherboard: ASUS P7P55D OS: Windows XP + Ubuntu 10.10 Video Card: ATI 5770 Mouse,Keyboard: PS/2

    Read the article

  • ERRNO 5 Input/Output Error

    - by CCarey
    Going up for my first ubuntu installation and encountered a critical error. Mind that I am installing on my Macbook Pro, and have already removed all other partitions. (I'm installing with a CD) My Ubuntu version is: "ubuntu-12.10-desktop-i386" So once the installation gets to something around "Finishing copying files", a great big "[ERRNO5] Input/Output Error" pops up on the screen. Obviously this halts and crashes the whole installation. Now, I've already run a disk check, memtest, and cpu load test, and all came up green. I have also redownloaded ubuntu twice, md5 match both times, and burnt four disks. None got past this error. If anyone could help me out, that would be greatly appreciated! Cheers!

    Read the article

  • World Record Batch Rate on Oracle JD Edwards Consolidated Workload with SPARC T4-2

    - by Brian
    Oracle produced a World Record batch throughput for single system results on Oracle's JD Edwards EnterpriseOne Day-in-the-Life benchmark using Oracle's SPARC T4-2 server running Oracle Solaris Containers and consolidating JD Edwards EnterpriseOne, Oracle WebLogic servers and the Oracle Database 11g Release 2. The workload includes both online and batch workload. The SPARC T4-2 server delivered a result of 8,000 online users while concurrently executing a mix of JD Edwards EnterpriseOne Long and Short batch processes at 95.5 UBEs/min (Universal Batch Engines per minute). In order to obtain this record benchmark result, the JD Edwards EnterpriseOne, Oracle WebLogic and Oracle Database 11g Release 2 servers were executed each in separate Oracle Solaris Containers which enabled optimal system resources distribution and performance together with scalable and manageable virtualization. One SPARC T4-2 server running Oracle Solaris Containers and consolidating JD Edwards EnterpriseOne, Oracle WebLogic servers and the Oracle Database 11g Release 2 utilized only 55% of the available CPU power. The Oracle DB server in a Shared Server configuration allows for optimized CPU resource utilization and significant memory savings on the SPARC T4-2 server without sacrificing performance. This configuration with SPARC T4-2 server has achieved 33% more Users/core, 47% more UBEs/min and 78% more Users/rack unit than the IBM Power 770 server. The SPARC T4-2 server with 2 processors ran the JD Edwards "Day-in-the-Life" benchmark and supported 8,000 concurrent online users while concurrently executing mixed batch workloads at 95.5 UBEs per minute. The IBM Power 770 server with twice as many processors supported only 12,000 concurrent online users while concurrently executing mixed batch workloads at only 65 UBEs per minute. This benchmark demonstrates more than 2x cost savings by consolidating the complete solution in a single SPARC T4-2 server compared to earlier published results of 10,000 users and 67 UBEs per minute on two SPARC T4-2 and SPARC T4-1. The Oracle DB server used mirrored (RAID 1) volumes for the database providing high availability for the data without impacting performance. Performance Landscape JD Edwards EnterpriseOne Day in the Life (DIL) Benchmark Consolidated Online with Batch Workload System Rack Units BatchRate(UBEs/m) Online Users Users /Units Users /Core Version SPARC T4-2 (2 x SPARC T4, 2.85 GHz) 3 95.5 8,000 2,667 500 9.0.2 IBM Power 770 (4 x POWER7, 3.3 GHz, 32 cores) 8 65 12,000 1,500 375 9.0.2 Batch Rate (UBEs/m) — Batch transaction rate in UBEs per minute Configuration Summary Hardware Configuration: 1 x SPARC T4-2 server with 2 x SPARC T4 processors, 2.85 GHz 256 GB memory 4 x 300 GB 10K RPM SAS internal disk 2 x 300 GB internal SSD 2 x Sun Storage F5100 Flash Arrays Software Configuration: Oracle Solaris 10 Oracle Solaris Containers JD Edwards EnterpriseOne 9.0.2 JD Edwards EnterpriseOne Tools (8.98.4.2) Oracle WebLogic Server 11g (10.3.4) Oracle HTTP Server 11g Oracle Database 11g Release 2 (11.2.0.1) Benchmark Description JD Edwards EnterpriseOne is an integrated applications suite of Enterprise Resource Planning (ERP) software. Oracle offers 70 JD Edwards EnterpriseOne application modules to support a diverse set of business operations. Oracle's Day in the Life (DIL) kit is a suite of scripts that exercises most common transactions of JD Edwards EnterpriseOne applications, including business processes such as payroll, sales order, purchase order, work order, and manufacturing processes, such as ship confirmation. These are labeled by industry acronyms such as SCM, CRM, HCM, SRM and FMS. The kit's scripts execute transactions typical of a mid-sized manufacturing company. The workload consists of online transactions and the UBE – Universal Business Engine workload of 61 short and 4 long UBEs. LoadRunner runs the DIL workload, collects the user’s transactions response times and reports the key metric of Combined Weighted Average Transaction Response time. The UBE processes workload runs from the JD Enterprise Application server. Oracle's UBE processes come as three flavors: Short UBEs < 1 minute engage in Business Report and Summary Analysis, Mid UBEs > 1 minute create a large report of Account, Balance, and Full Address, Long UBEs > 2 minutes simulate Payroll, Sales Order, night only jobs. The UBE workload generates large numbers of PDF files reports and log files. The UBE Queues are categorized as the QBATCHD, a single threaded queue for large and medium UBEs, and the QPROCESS queue for short UBEs run concurrently. Oracle's UBE process performance metric is Number of Maximum Concurrent UBE processes at transaction rate, UBEs/minute. Key Points and Best Practices Two JD Edwards EnterpriseOne Application Servers, two Oracle WebLogic Servers 11g Release 1 coupled with two Oracle Web Tier HTTP server instances and one Oracle Database 11g Release 2 database on a single SPARC T4-2 server were hosted in separate Oracle Solaris Containers bound to four processor sets to demonstrate consolidation of multiple applications, web servers and the database with best resource utilizations. Interrupt fencing was configured on all Oracle Solaris Containers to channel the interrupts to processors other than the processor sets used for the JD Edwards Application server, Oracle WebLogic servers and the database server. A Oracle WebLogic vertical cluster was configured on each WebServer Container with twelve managed instances each to load balance users' requests and to provide the infrastructure that enables scaling to high number of users with ease of deployment and high availability. The database log writer was run in the real time RT class and bound to a processor set. The database redo logs were configured on the raw disk partitions. The Oracle Solaris Container running the Enterprise Application server completed 61 Short UBEs, 4 Long UBEs concurrently as the mixed size batch workload. The mixed size UBEs ran concurrently from the Enterprise Application server with the 8,000 online users driven by the LoadRunner. See Also SPARC T4-2 Server oracle.com OTN JD Edwards EnterpriseOne oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Oracle Fusion Middleware oracle.com OTN Disclosure Statement Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 09/30/2012.

    Read the article

  • How to Avoid Your Next 12-Month Science Project

    - by constant
    While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack. After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades? On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea. Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company. Engineering Systems is Hard Work! The backbone of Exalogic is its InfiniBand network: 4 times better bandwidth than even 10 Gigabit Ethernet, and only about a tenth of its latency. What a potential for increased scalability and throughput across the middleware and database layers! But InfiniBand is a beast that needs to be tamed: It is true that Exalogic uses a standard, open-source Open Fabrics Enterprise Distribution (OFED) InfiniBand driver stack. Unfortunately, this software has been developed by the HPC community with fastest speed in mind (which is good) but, despite the name, not many other enterprise-class requirements are included (which is less good). Here are some of the improvements that Oracle's InfiniBand development team had to add to the OFED stack to make it enterprise-ready, simply because typical HPC users didn't have the need to implement them: More than 100 bug fixes in the pieces that were not related to the Message Passing Interface Protocol (MPI), which is the protocol that HPC users use most of the time, but which is less useful in the enterprise. Performance optimizations and tuning across the whole IB stack: From Switches, Host Channel Adapters (HCAs) and drivers to low-level protocols, middleware and applications. Yes, even the standard HPC IB stack could be improved in terms of performance. Ethernet over IB (EoIB): Exalogic uses InfiniBand internally to reach high performance, but it needs to play nicely with datacenters around it. That's why Oracle added Ethernet over InfiniBand technology to it that allows for creating many virtual 10GBE adapters inside Exalogic's nodes that are aggregated and connected to Exalogic's IB gateway switches. While this is an open standard, it's up to the vendor to implement it. In this case, Oracle integrated the EoIB stack with Oracle's own IB to 10GBE gateway switches, and made it fully virtualized from the beginning. This means that Exalogic customers can completely rewire their server infrastructure inside the rack without having to physically pull or plug a single cable - a must-have for every cloud deployment. Anybody who wants to match this level of integration would need to add an InfiniBand switch development team to their project. Or just buy Oracle's gateway switches, which are conveniently shipped with a whole server infrastructure attached! IPv6 support for InfiniBand's Sockets Direct Protocol (SDP), Reliable Datagram Sockets (RDS), TCP/IP over IB (IPoIB) and EoIB protocols. Because no IPv6 = not very enterprise-class. HA capability for SDP. High Availability is not a big requirement for HPC, but for enterprise-class application servers it is. Every node in Exalogic's InfiniBand network is connected twice for redundancy. If any cable or port or HCA fails, there's always a replacement link ready to take over. This requires extra magic at the protocol level to work. So in addition to Weblogic's failover capabilities, Oracle implemented IB automatic path migration at the SDP level to avoid unnecessary failover operations at the middleware level. Security, for example spoof-protection. Another feature that is less important for traditional users of InfiniBand, but very important for enterprise customers. InfiniBand Partitioning and Quality-of-Service (QoS): One of the first questions we get from customers about Exalogic is: “How can we implement multi-tenancy?” The answer is to partition your IB network, which effectively creates many networks that work independently and that are protected at the lowest networking layer possible. In addition to that, QoS allows administrators to prioritize traffic flow in multi-tenancy environments so they can keep their service levels where it matters most. Resilient IB Fabric Management: InfiniBand is a self-managing network, so a lot of the magic lies in coming up with the right topology and in teaching the subnet manager how to properly discover and manage the network. Oracle's Infiniband switches come with pre-integrated, highly available fabric management with seamless integration into Oracle Enterprise Manager Ops Center. In short: Oracle elevated the OFED InfiniBand stack into an enterprise-class networking infrastructure. Many years and multiple teams of manpower went into the above improvements - this is something you can only get from Oracle, because no other InfiniBand vendor can give you these features across the whole stack! Exabus: Because it's not About the Size of Your Network, it's How You Use it! So let's assume that you somehow were able to get your hands on an enterprise-class IB driver stack. Or maybe you don't care and are just happy with the standard OFED one? Anyway, the next step is to actually leverage that InfiniBand performance. Here are the choices: Use traditional TCP/IP on top of the InfiniBand stack, Develop your own integration between your middleware and the lower-level (but faster) InfiniBand protocols. While more bandwidth is always a good thing, it's actually the low latency that enables superior performance for your applications when running on any networking infrastructure: The lower the latency, the faster the response travels through the network and the more transactions you can close per second. The reason why InfiniBand is such a low latency technology is that it gets rid of most if not all of your traditional networking protocol stack: Data is literally beamed from one region of RAM in one server into another region of RAM in another server with no kernel/drivers/UDP/TCP or other networking stack overhead involved! Which makes option 1 a no-go: Adding TCP/IP on top of InfiniBand is like adding training wheels to your racing bike. It may be ok in the beginning and for development, but it's not quite the performance IB was meant to deliver. Which only leaves option 2: Integrating your middleware with fast, low-level InfiniBand protocols. And this is what Exalogic's "Exabus" technology is all about. Here are a few Exabus features that help applications leverage the performance of InfiniBand in Exalogic: RDMA and SDP integration at the JDBC driver level (SDP), for Oracle Weblogic (SDP), Oracle Coherence (RDMA), Oracle Tuxedo (RDMA) and the new Oracle Traffic Director (RDMA) on Exalogic. Using these protocols, middleware can communicate a lot faster with each other and the Oracle database than by using standard networking protocols, Seamless Integration of Ethernet over InfiniBand from Exalogic's Gateway switches into the OS, Oracle Weblogic optimizations for handling massive amounts of parallel transactions. Because if you have an 8-lane Autobahn, you also need to improve your ramps so you can feed it with many cars in parallel. Integration of Weblogic with Oracle Exadata for faster performance, optimized session management and failover. As you see, “Exabus” is Oracle's word for describing all the InfiniBand enhancements Oracle put into Exalogic: OFED stack enhancements, protocols for faster IB access, and InfiniBand support and optimizations at the virtualization and middleware level. All working together to deliver the full potential of InfiniBand performance. Who else has 100% control over their middleware so they can develop their own low-level protocol integration with InfiniBand? Even if you take an open source approach, you're looking at years of development work to create, test and support a whole new networking technology in your middleware! The Extras: Less Hassle, More Productivity, Faster Time to Market And then there are the other advantages of Engineered Systems that are true for Exalogic the same as they are for every other Engineered System: One simple purchasing process: No headaches due to endless RFPs and no “Will X work with Y?” uncertainties. Everything has been engineered together: All kinds of bugs and problems have been already fixed at the design level that would have only manifested themselves after you have built the system from scratch. Everything is built, tested and integrated at the factory level . Less integration pain for you, faster time to market. Every Exalogic machine world-wide is identical to Oracle's own machines in the lab: Instant replication of any problems you may encounter, faster time to resolution. Simplified patching, management and operations. One throat to choke: Imagine finger-pointing hell for systems that have been put together using several different vendors. Oracle's Engineered Systems have a single phone number that customers can call to get their problems solved. For more business-centric values, read The Business Value of Engineered Systems. Conclusion: Buy Exalogic, or get ready for a 6-12 Month Science Project And here's the reason why it's not easy to "build your own Exalogic": There's a lot of work required to make such a system fly. In fact, anybody who is starting to "just put together a bunch of servers and an InfiniBand network" is really looking at a 6-12 month science project. And the outcome is likely to not be very enterprise-class. And it won't have Exalogic's performance either. Because building an Engineered System is literally rocket science: It takes a lot of time, effort, resources and many iterations of design/test/analyze/fix to build such a system. That's why InfiniBand has been reserved for HPC scientists for such a long time. And only Oracle can bring the power of InfiniBand in an enterprise-class, ready-to use, pre-integrated version to customers, without the develop/integrate/support pain. For more details, check the new Exalogic overview white paper which was updated only recently. P.S.: Thanks to my colleagues Ola, Paul, Don and Andy for helping me put together this article! var flattr_uid = '26528'; var flattr_tle = 'How to Avoid Your Next 12-Month Science Project'; var flattr_dsc = 'While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack.After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades?On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea.Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company.'; var flattr_tag = 'Engineered Systems,Engineered Systems,Infiniband,Integration,latency,Oracle,performance'; var flattr_cat = 'text'; var flattr_url = 'http://constantin.glez.de/blog/2012/04/how-avoid-your-next-12-month-science-project'; var flattr_lng = 'en_GB'

    Read the article

  • Oracle Database 12c Spatial: Vector Performance Acceleration

    - by Okcan Yasin Saygili-Oracle
    Most business information has a location component, such as customer addresses, sales territories and physical assets. Businesses can take advantage of their geographic information by incorporating location analysis and intelligence into their information systems. This allows organizations to make better decisions, respond to customers more effectively, and reduce operational costs – increasing ROI and creating competitive advantage. Oracle Database, the industry’s most advanced database,  includes native location capabilities, fully integrated in the kernel, for fast, scalable, reliable and secure spatial and massive graph applications. It is a foundation for deploying enterprise-wide spatial information systems and locationenabled business applications. Developers can extend existing Oracle-based tools and applications, since they can easily incorporate location information directly in their applications, workflows, and services. Spatial Features The geospatial data features of Oracle Spatial and Graph option support complex geographic information systems (GIS) applications, enterprise applications and location services applications. Oracle Spatial and Graph option extends the spatial query and analysis features included in every edition of Oracle Database with the Oracle Locator feature, and provides a robust foundation for applications that require advanced spatial analysis and processing in the Oracle Database. It supports all major spatial data types and models, addressing challenging business-critical requirements from various industries, including transportation, utilities, energy, public sector, defense and commercial location intelligence. Network Data Model Graph Features The Network Data Model graph explicitly stores and maintains a persistent data model withnetwork connectivity and provides network analysis capability such as shortest path, nearest neighbors, within cost and reachability. It loads partitioned networks into memory on demand, overcomingthe limitations of in-memory analysis. Partitioning massive networks into manageable sub-networkssimplifies the network analysis. RDF Semantic Graph Features RDF Semantic Graph has native support for World Wide Web Consortium standards. It has open, scalable, and secure features for storing RDF/OWL ontologies anddata; native inference with OWL 2, SKOS and user-defined rules; and querying RDF/OWL data withSPARQL 1.1, Java APIs, and SPARQLgraph patterns in SQL. Video: Oracle Spatial and Graph Overview Oracle spatial is embeded on oracle database product. So ,we can use oracle installer (OUI).The Oracle Universal Installer (OUI) is used to install Oracle Database software. OUI is a graphical user interface utility that enables you to view the Oracle software that is installed on your machine, install new Oracle Database software, and delete Oracle software that you no longer need to use. Online Help is available to guide you through the installation process. One of the installation options is to create a database. If you select database creation, OUI automatically starts Oracle Database Configuration Assistant (DBCA) to guide you through the process of creating and configuring a database. If you do not create a database during installation, you must invoke DBCA after you have installed the software to create a database. You can also use DBCA to create additional databases. For installing Oracle Database 12c you may check the Installing Oracle Database Software and Creating a Database tutorial under the Oracle Database 12c 2-Day DBA Series.You can always check if spatial is available in your database using  "select comp_id, version, status, comp_name from dba_registry where comp_id='SDO';"   One of the most notable improvements with Oracle Spatial and Graph 12c can be seen in performance increases in vector data operations. Enabling the Spatial Vector Acceleration feature (available with the Spatial option) dramatically improves the performance of commonly used vector data operations, such as sdo_distance, sdo_aggr_union, and sdo_inside. With 12c, these operations also run more efficiently in parallel than in prior versions through the use of metadata caching. For organizations that have been facing processing limitations, these enhancements enable developers to make a small set of configuration changes and quickly realize significant performance improvements. Results include improved index performance, enhanced geometry engine performance, optimized secondary filter optimizations for Spatial operators, and improved CPU and memory utilization for many advanced vector functions. Vector performance acceleration is especially beneficial when using Oracle Exadata Database Machine and other large-scale systems. Oracle Spatial and Graph vector performance acceleration builds on general improvements available to all SDO_GEOMETRY operations in these areas: Caching of index metadata, Concurrent update mechanisms, and Optimized spatial predicate selectivity and cost functions. These optimizations enable more efficient use of: CPU, Memory, and Partitioning Resulting in substantial query performance improvements.UsageTo accelerate the performance of spatial operators, it is recommended that you set the SPATIAL_VECTOR_ACCELERATION database system parameter to the value TRUE. (This parameter is authorized for use only by licensed Oracle Spatial users, and its default value is FALSE.) You can set this parameter for the whole system or for a single session. To set the value for the whole system, do either of the following:Enter the following statement from a suitably privileged account:   ALTER SYSTEM SET SPATIAL_VECTOR_ACCELERATION = TRUE;Add the following to the database initialization file (xxxinit.ora):   SPATIAL_VECTOR_ACCELERATION = TRUE;To set the value for the current session, enter the following statement from a suitably privileged account:   ALTER SESSION SET SPATIAL_VECTOR_ACCELERATION = TRUE; Checkout the complete list of new features on Oracle.com @ http://www.oracle.com/technetwork/database/options/spatialandgraph/overview/index.html Spatial and Graph Data Sheet (PDF) Spatial and Graph White Paper (PDF)

    Read the article

  • Invalid Opcode 0000

    - by Mr47
    At random times (usually when watching a movie in XBMC), the computer locks up. I can still sometimes SSH in and get the 'dmesg' output before that locks up too. A hard reboot is usually required to get things going again. I have cut out the date/time/server columns for easier reading, please do ask if these seem relevant omissions... System: Ubuntu 11.04 (2.6.38-8-server) x64 X11 installed with IceWm (and XBMC) Core 2 Duo E8400 @ 3.00GHz 8 GB RAM Asus P5Q premium motherboard Primary harddrive: OCZ Vertex 2 60 GB (SSD) Other harddrives: various 750GB, 1TB, 1.5TB & 2TB (WD & samsung) Any important information I am not supplying is purely a sign of my incompetence in these matters, so please do ask and excuse me for my inabilities... invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map CPU 0 Modules linked in: parport_pc ppdev vesafb snd_hda_codec_analog tuner_simple tuner_types wm8775 tda9887 tda8290 tea5767 tuner cx25840 ir_lirc_codec lirc_dev snd_hda_intel snd_hda_codec snd_hwdep snd_pcm ir_sony_decoder snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq rc_rc6_mce ivtv ir_jvc_decoder cx2341x i2c_algo_bit v4l2_common mceusb videodev ir_rc6_decoder ir_rc5_decoder snd_timer ir_nec_decoder nvidia(P) btusb bluetooth rc_core v4l2_compat_ioctl32 tveeprom snd_seq_device pata_marvell psmouse shpchp serio_raw snd asus_atk0110 soundcore snd_page_alloc lp parport firewire_ohci firewire_core crc_itu_t r8169 sky2 ahci libahci Pid: 4597, comm: xbmc.bin Tainted: P 2.6.38-8-server #42-Ubuntu System manufacturer P5Q Premium/P5Q Premium RIP: 0010:[<ffffffff8119bc4a>] [<ffffffff8119bc4a>] do_mpage_readpage+0x9a/0x510 RSP: 0018:ffff88021f5a59d8 EFLAGS: 00210246 RAX: 0000000000000020 RBX: ffff88021f5a5ac8 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000015e36fe0 RDI: 0000000000000000 RBP: ffff88021f5a5a98 R08: ffff88021f5a5ac8 R09: ffff88021f5a5b38 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 000000000000b148 R14: 0000000000000001 R15: ffff8802067034b8 FS: 00007f3f34eb1700(0000) GS:ffff8800cfc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007feb8d515000 CR3: 000000021d744000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process xbmc.bin (pid: 4597, threadinfo ffff88021f5a4000, task ffff88021f63db80) Stack: ffff88021f5a5a28 ffffffff8116019d ffff88021f5a5b40 0000002000000003 ffff88021f5a5b38 ffff880206703370 ffffea0006d800f8 0000000000000000 ffffea0006d800f8 0000000c811270b5 ffff88021f5a5a68 ffffffff8110bdba Call Trace: [<ffffffff8116019d>] ? mem_cgroup_cache_charge+0xed/0x130 [<ffffffff8110bdba>] ? add_to_page_cache_locked+0xea/0x160 [<ffffffff8119c232>] mpage_readpages+0x102/0x150 [<ffffffff812063e0>] ? ext4_get_block+0x0/0x20 [<ffffffff812063e0>] ? ext4_get_block+0x0/0x20 [<ffffffff81149475>] ? alloc_pages_current+0xa5/0x110 [<ffffffff8120157d>] ext4_readpages+0x1d/0x20 [<ffffffff81116a9b>] __do_page_cache_readahead+0x14b/0x220 [<ffffffff81116ed1>] ra_submit+0x21/0x30 [<ffffffff81116ff5>] ondemand_readahead+0x115/0x230 [<ffffffff811171a0>] page_cache_async_readahead+0x90/0xc0 [<ffffffff8110b184>] ? file_read_actor+0xd4/0x170 [<ffffffff812de72e>] ? radix_tree_lookup_slot+0xe/0x10 [<ffffffff8110c521>] do_generic_file_read.clone.23+0x271/0x450 [<ffffffff8110d1ba>] generic_file_aio_read+0x1ca/0x240 [<ffffffff8100a82e>] ? __switch_to+0x20e/0x2f0 [<ffffffff81164c82>] do_sync_read+0xd2/0x110 [<ffffffff8108b61c>] ? hrtimer_try_to_cancel+0x4c/0xe0 [<ffffffff81279083>] ? security_file_permission+0x93/0xb0 [<ffffffff81164fa1>] ? rw_verify_area+0x61/0xf0 [<ffffffff81165463>] vfs_read+0xc3/0x180 [<ffffffff81165571>] sys_read+0x51/0x90 [<ffffffff8100bfc2>] system_call_fastpath+0x16/0x1b Code: ff ff 48 c7 85 78 ff ff ff 00 00 00 00 49 d3 ee b9 0c 00 00 00 2b 4d 8c 48 8b b2 c8 00 00 00 ba 01 00 00 00 41 0f af c6 49 d3 e5 <0f> 36 4d 8c 4c 01 e8 d3 e2 4c 8d 44 16 ff 48 8b 53 20 49 d3 f8 RIP [<ffffffff8119bc4a>] do_mpage_readpage+0x9a/0x510 RSP <ffff88021f5a59d8> ---[ end trace ac6cd2f4692205a3 ]--- Please note that the error is ALWAYS occuring at do_mpage_readpage+0x9a/0x510 with the same numbers after it. I've tried to come up with the possible meaning of these, but couldn't get any further. I've also noticed that the top block from the call trace is always the following with the exact same numbers: [<ffffffff8116019d>] ? mem_cgroup_cache_charge+0xed/0x130 [<ffffffff8110bdba>] ? add_to_page_cache_locked+0xea/0x160 [<ffffffff8119c232>] mpage_readpages+0x102/0x150 [<ffffffff812063e0>] ? ext4_get_block+0x0/0x20 [<ffffffff812063e0>] ? ext4_get_block+0x0/0x20 Could this indicate a hard drive issue, a RAM issue or something else entirely?

    Read the article

  • overheating when using flash

    - by Ben
    I am having overheating issues when watching flash videos. Here is all the relevant information that I could think of. Please help. --Often when playing flash videos it operates at 95 degrees C. --Only gets that hot when on full screen (but still in the 80s otherwise) --Sometimes reaches 100 degrees and shuts down. --fan runs at around 4500 RPM when hot (not to mention I can here it speed up) --Happens under Ubuntu 12.04 and 12.10 (did not use Ubuntu before then) --Overheating happens on more than one site (for example youtube and Comedy Central) --My computer does not overheat otherwise. --Usually runs around 50-60 degrees C. --Maybe if compiling or doing updates can it get to 70 or 75 degree C. --Have a dual boot, and under Windows I do not have this issue. --I use firefox, but have the same issue if using Chrome --I have a Lenovo T410, currently running Ubuntu 12.10 --Intel® Core™ i5 CPU M 540 @ 2.53GHz × 4 --4 GB of memory, 64 bit OS --Graphics: NVS 3100M/PCIe/SSE2 Thanks and let me know if you need more info, Ben

    Read the article

  • Will Ubuntu break my RAID 0 array?

    - by Chad
    I am upgrading an older machine today with new Motherboard, RAM, and CPU. Then I am going to do a fresh install of Ubuntu 64bit. Currently the old machine has an 80gb system drive, and a 4TB RAID 0 array. The old Motherboard has no SATA ports, so I used a SATA card. Ubuntu set up the old RAID array, will it still recognize the array on a newer machine? Are there any steps I should take to ensure the array isn't damaged? It's non-crucial data, but I would rather not start over if it can be avoided. Thanks.

    Read the article

  • What do you consider to be a high-level language and for what reason?

    - by Anto
    Traditionally, C was called a high-level language, but these days it is often referred to as a low-level language (it is high-level compared to Assembly, but it is very low-level compared to, for instance, Python, these days). Generally, everyone calls languages of the sort of Python and Java high-level languages nowadays. How would you judge whether a programming language really is a high-level language (or a low-level one)? Must you give direct instructions to the CPU while programming in a language to call it low-level (e.g. Assembly), or is C, which provides some abstraction from the hardware, a low-level language too? Has the meaning of "high-level" and "low-level" changed over the years?

    Read the article

  • Critical Patch Update for April 2010 Now Available

    - by Steven Chan
    The Critical Patch Update (CPU) for April 2010 was released on April 13, 2010. Oracle strongly recommends applying the patches as soon as possible.The Critical Patch Update Advisory is the starting point for relevant information. It includes a list of products affected, pointers to obtain the patches, a summary of the security vulnerabilities, and links to other important documents.Supported Products that are not listed in the "Supported Products and Components Affected" Section of the advisory do not require new patches to be applied.Also, it is essential to review the Critical Patch Update supporting documentation referenced in the Advisory before applying patches, as this is where you can find important pertinent information.The Critical Patch Update Advisory is available at the following location:Oracle Technology NetworkThe next four Critical Patch Update release dates are:July 13, 2010October 12, 2010January 18, 2011April 19, 2011

    Read the article

  • Please Help with ATI Radeon 4250 and Xinerama

    - by Luis Enrique
    I am using ubuntu 11.10 fresh install and I am having problems making the second screen to work, I remember before I was able to do it by extending the desktop to a larger number like 3840 X 1080 but since I am new I completely forgot how to do it, Now I have a philips 230 E monitor full 1920 X 1080 and a Toshiba tv HDMI and I want to activate the second monitor to be able to use Xinerama but I don't know how to go about this, I want to keep the Phillips as a primary monitor since it's the smaller one and use the TV from time to time to play videos movies and presentations. I installed all the additional drivers and don't know what else to do. I would prefer if I can use a list of exact commands to copy and paste onto the terminal since that is all I know how to use LOL . Thank you so much for your help. If you need more details of my mother board or video card or cpu just ask thank you. Luis

    Read the article

  • DIY Coffee Table Arcade Hides Retro Gaming Inside

    - by Jason Fitzpatrick
    Last week we showed you a nifty man-cave arcade-in-coffee-table build that was a bit, shall we say, exposed. If you’re looking for a sleek build that conceals its arcade-heart until it’s game time, this clean and concealed build is for you. Courtesy of IKEAHacker reader Sam Wang, the beauty of this build is that other than the rectangle of black glass in the center of the table–which could just as well be a design accent–there is no indication that the coffee table is a gaming machine when not in use. Slide out the drawers and boot it up, however, and you’re in business–full MAME arcade emulation at your finger tips. Hit up the link below to check out his full photo build guide. My DIY Arcade Machine Coffee Table [via IKEAHacker] How To Switch Webmail Providers Without Losing All Your Email How To Force Windows Applications to Use a Specific CPU HTG Explains: Is UPnP a Security Risk?

    Read the article

  • concurrency::accelerator

    - by Daniel Moth
    Overview An accelerator represents a "target" on which C++ AMP code can execute and where data can reside. Typically (but not necessarily) an accelerator is a GPU device. Accelerators are represented in C++ AMP as objects of the accelerator class. For many scenarios, you do not need to obtain an accelerator object, since the runtime has a notion of a default accelerator, which is what it thinks is the best one in the system. Examples where you need to deal with accelerator objects are if you need to pick your own accelerator (based on your specific criteria), or if you need to use more than one accelerators from your app. Construction and operator usage You can query and obtain a std::vector of all the accelerators on your system, which the runtime discovers on startup. Beyond enumerating accelerators, you can also create one directly by passing to the constructor a system-wide unique path to a device if you know it (i.e. the “Device Instance Path” property for the device in Device Manager), e.g. accelerator acc(L"PCI\\VEN_1002&DEV_6898&SUBSYS_0B001002etc"); There are some predefined strings (for predefined accelerators) that you can pass to the accelerator constructor (and there are corresponding constants for those on the accelerator class itself, so you don’t have to hardcode them every time). Examples are the following: accelerator::default_accelerator represents the default accelerator that the C++ AMP runtime picks for you if you don’t pick one (the heuristics of how it picks one will be covered in a future post). Example: accelerator acc; accelerator::direct3d_ref represents the reference rasterizer emulator that simulates a direct3d device on the CPU (in a very slow manner). This emulator is available on systems with Visual Studio installed and is useful for debugging. More on debugging in general in future posts. Example: accelerator acc(accelerator::direct3d_ref); accelerator::direct3d_warp represents a target that I will cover in future blog posts. Example: accelerator acc(accelerator::direct3d_warp); accelerator::cpu_accelerator represents the CPU. In this first release the only use of this accelerator is for using the staging arrays technique that I'll cover separately. Example: accelerator acc(accelerator::cpu_accelerator); You can also create an accelerator by shallow copying another accelerator instance (via the corresponding constructor) or simply assigning it to another accelerator instance (via the operator overloading of =). Speaking of operator overloading, you can also compare (for equality and inequality) two accelerator objects between them to determine if they refer to the same underlying device. Querying accelerator characteristics Given an accelerator object, you can access its description, version, device path, size of dedicated memory in KB, whether it is some kind of emulator, whether it has a display attached, whether it supports double precision, and whether it was created with the debugging layer enabled for extensive error reporting. Below is example code that accesses some of the properties; in your real code you'd probably be checking one or more of them in order to pick an accelerator (or check that the default one is good enough for your specific workload): void inspect_accelerator(concurrency::accelerator acc) { std::wcout << "New accelerator: " << acc.description << std::endl; std::wcout << "is_debug = " << acc.is_debug << std::endl; std::wcout << "is_emulated = " << acc.is_emulated << std::endl; std::wcout << "dedicated_memory = " << acc.dedicated_memory << std::endl; std::wcout << "device_path = " << acc.device_path << std::endl; std::wcout << "has_display = " << acc.has_display << std::endl; std::wcout << "version = " << (acc.version >> 16) << '.' << (acc.version & 0xFFFF) << std::endl; } accelerator_view In my next blog post I'll cover a related class: accelerator_view. Suffice to say here that each accelerator may have from 1..n related accelerator_view objects. You can get the accelerator_view from an accelerator via the default_view property, or create new ones by invoking the create_view method that creates an accelerator_view object for you (by also accepting a queuing_mode enum value of deferred or immediate that we'll also explore in the next blog post). Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

< Previous Page | 148 149 150 151 152 153 154 155 156 157 158 159  | Next Page >