Search Results

Search found 97855 results on 3915 pages for 'code performance'.

Page 50/3915 | < Previous Page | 46 47 48 49 50 51 52 53 54 55 56 57  | Next Page >

  • Does chunk size affect the read performance of a Linux md software RAID1 array?

    - by OldWolf
    This came up in relation to this question on determining chunk size of an existing RAID array. The general consensus seems to be that chunk size does not apply to RAID1 as it is not striped. On the other hand, the Linux RAID Wiki claims that it will have an affect on read performance. However, I cannot find any benchmarks testing/proving that. Can anyone point to conclusive documentation that it either does or does not affect read performance?

    Read the article

  • Performance boost for MacBook: Hybrid hard drive or 4GB RAM?

    - by user13572
    I have an aluminium 13" MacBook with 2GB or RAM and 5400RPM 500GB hard drive. The main tasks I perform are developing iPhone and Mac apps in Xcode and websites in Coda. I want to improve the performance so I am considering buying 4GB of RAM or a 500GB Seagate solid-state hybrid drive. What is likely to provide the biggest performance boost?

    Read the article

  • Performance boast for MacBook: Hybrid hard drive or 4GB RAM?

    - by user13572
    I have an aluminium 13" MacBook with 2GB or RAM and 5400RPM 500GB hard drive. The main tasks I perform are developing iPhone and Mac apps in Xcode and websites in Coda. I want to improve the performance so I am considering buying 4GB of RAM or a 500GB Seagate solid-state hybrid drive. What is likely to provide the biggest performance boast?

    Read the article

  • How to review code that you do not understand?

    - by John Isaacks
    I have been given the role to improve development in our company. The first thing I wanted to start was code reviews since that has never been done here before. There are 3 programmers in our company. I am a web programmer, my known languages are mainly PHP, ActionScript and JavaScript. The other 2 developers write internal applications in VB.net We have been doing code reviews for a couple weeks now. I find it hard to understand VB code. So when they say what its doing, for the most part I just have to take their word for it. If I do see something that looks wrong, I explain my opinion and I explain how I would address it in one of the languages I know. Sometimes my suggestions are welcomed but many times I am told things like "this is the best way of doing it in this language" or "that doesn't apply to this language" or similar things of that nature. This may be true, but without knowing the language I am not sure how to confirm or refute these claims. I know one possible solution would be to learn vb so I can do better code reviews. I really have no interest in learning vb (especially since I have a list of other technologies I am trying to learn for my own projects) and would like to keep this as a last resort but it is an option. Another idea that came to me is, they both have interest in C# and so do I. Its relative to them because its .net and relative to me because its more similar to the languages I know. Yet it is new to all of us. I thought about the benefits of us all collaborating on a pet C#.net project and reviewing each others code from that. I guess theres also the possibility hiring a consultant to come in and give us some code reviews. What would you recommend I do in this situation.

    Read the article

  • How do you use blank lines in your code ?

    - by Matthieu M.
    There has been a few remarks about white space already in discussion about curly braces placements. I myself tend to sprinkle my code with blank lines in an attempt to segregate things that go together in "logical" groups and hopefully make it easier for the next person to come by to read the code I just produced. In fact, I would say I structure my code like I write: I make paragraphs, no longer than a few lines (definitely shorter than 10), and try to make each paragraph self-contained. For example: in a class, I will group methods that go together, while separating them by a blank line from the next group. if I need to write a comment I'll usually put a blank line before the comment in a method, I make one paragraph per step of the process All in all, I rarely have more than 4/5 lines clustered together, meaning a very sparse code. I don't consider all this white space a waste because I actually use it to structure the code (as I use the indentation in fact), and therefore I feel it worth the screen estate it takes. For example: for (int i = 0; i < 10; ++i) { if (i % 3 == 0) continue; array[i] += 2; } I consider than the two statements have clear distinct purposes and thus deserve to be separated to make it obvious. So, how do you actually use (or not) blank lines in code ?

    Read the article

  • How do you structure your shared code so that it is "re-findable" for new developers?

    - by awmckinley
    I started working at my current job about 8 months ago, and its been one of the best experiences I've had as a young programmer. It's a small company, and both my co-developers are brilliant guys. One of the practices that they both have been encouraging is lots of code-reuse. Our code base is mainly C#, and we're using a centralized revision control system. The way the repository is currently structured, there is a single folder in which all shared class libraries are placed (along with unit tests for each library), and our revision control system allows for sharing or linking those libraries out to other projects. What I'm trying to understand at this point is how the current structure of the folder can be made more conducive for finding those libraries again. I've talked to the other developers about this, and they agree that it's gotten a little messy. I find that I am sometimes "reinventing the wheel" because I didn't realize that there was an existing piece of code that solved a particular problem. The issue is complicated further by the fact that we're sharing some code between ASP.NET MVC2, WinForms, and Windows CE projects, and sharing code between applications built against multiple versions of .NET. How do other people approach this? Is the answer in naming the libraries in a certain way or is it preferable to invest in some code-search software? Is the answer in doc comments? Should we be sharing libraries at all or should we simply branch the class libraries for re-use? Thanks for any and all help!

    Read the article

  • How to get feedback from the community on large chunks of code?

    - by MainMa
    Code Review.SE is great when you need feedback on a precise, short piece of code. But where to get similar feedback about the code itself when: you have thousands of LOC, don't have colleagues in your workplace ready or willing to review the code¹, don't have thousands of dollars to spend for a professional review by a third party developer?² Places like CodePlex are a good idea to get your project known³, but from what I've seen, the feedback you get on known projects are consumer feedback, i.e. concerns the bugs and feature requests, not the quality of the source code itself. What are the social way to get the community involved in the code review of the codebase of a certain size for an open source project which doesn't have the scale of Firefox or similar products? ¹ Which is the case for most personal and open source projects, or projects done in companies where the practice of regular and complete code review is nonexistent. ² Which is, again, the case for most personal and open source projects. ³ Even if too many projects published on CodePlex never get known, either because nobody cares or because they are presented not very well.

    Read the article

  • Does it make sense to write tests for legacy code when there is no time for a complete refactoring?

    - by is4
    I usually try to follow the advice of the book Working Effectively with Legacy Code. I break dependencies, move parts of the code to @VisibleForTesting public static methods and to new classes to make the code (or at least some part of it) testable. And I write tests to make sure that I don't break anything when I'm modifying or adding new functions. A colleague says that I shouldn't do this. His reasoning: The original code might not work properly in the first place. And writing tests for it makes future fixes and modifications harder since devs have to understand and modify the tests too. If it's GUI code with some logic (~12 lines, 2-3 if/else block, for example), a test isn't worth the trouble since the code is too trivial to begin with. Similar bad patterns could exist in other parts of the codebase, too (which I haven't seen yet, I'm rather new); it will be easier to clean them all up in one big refactoring. Extracting out logic could undermine this future possibility. Should I avoid extracting out testable parts and writing tests if we don't have time for complete refactoring? Is there any disadvantage to this that I should consider?

    Read the article

  • How to scale MySQL with multiple machines?

    - by erotsppa
    I have a web app running LAMP. We recently have an increase in load and is now looking at solutions to scale. Scaling apache is pretty easy we are just going to have multiple multiple machines hosting it and round robin the incoming traffic. However, each instance of apache will talk with MySQL and eventually MySQL will be overloaded. How to scale MySQL across multiple machines in this setup? I have already looked at this but specifically we need the updates from the DB available immediately so I don't think replication is a good strategy here? Also hopefully this can be done with minimal code change. PS. We have around a 1:1 read-write ratio.

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • 'echo' or drop out of 'programming' write HTML then start PHP code again

    - by thecoshman
    For the most part, when I want to display some HTML code to be actually rendered I would use a 'close PHP' tag, write the HTML, then open the PHP again. eg <?php // some php code ?> <p>HTML that I want displayed</p> <?php // more php code ?> But I have seen lots of people who would just use echo instead, so they would have done the above something like <?php // some php code echo("<p>HTML that I want displayed</p>"); // more php code ?> Is their any performance hit for dropping out and back in like that? I would assume not as the PHP engine would have to process the entire file either way. What about when you use the echo function in the way that dose not look like a function, eg echo "<p>HTML that I want displayed</p>" I would hope that this is purely a matter of taste, but I would like to know if I was missing out on something. I personally find the first way preferable (dropping out of PHP then back in) as it helps draw a clear distinction between PHP and HTML and also lets you make use of code highlighting and hinting for your HTML, which is always handy.

    Read the article

  • PHP: Profiling code and strict environment ~ Improving my coding

    - by DavidYell
    I would like to update my local working environment to be stricter in an effort to improve my code. I know that my code is okay, but as with most things there is always room for improvement. I use XAMPP on my local machine, for simplicities sake Apache Friends XAMPP (Basic Package) version 1.7.2 So I've updated my php.ini : error_reporting to be E_ALL | E_STRICT to help with the code standard. I've also enabled the XDebug extension zend_extension = "C:\xampp\php\ext\php_xdebug.dll" which seems to be working, having tested some broken code and got the nice standard orange error notice. However, having read this question, http://stackoverflow.com/questions/133686/what-is-the-best-way-to-profile-php-code and enabled the profiler, I cannot seem to create a cachegrind file. Many of the guides that I've looked at seem to think you need to install XDebug in XAMPP which leads me to think they are out of date, as XDebug is bundled with XAMPP these days. So I would appreciate it if anyone can help point me in the right direction with both configuring XDebug to output grind files, and or just a great set of default settings for the XDebug config in XAMPP. Seems there is very little documentation to go on. If people have tips on integrating these tools with Netbeans, that would be awesomesauce. I'm happy to get suggestions on other things that I can do to help tighten up my php code, both syntactically and performance wise Thanks, and apologies for the rambling question(s)! Ninja edit I should menion that I'm using named vhosts as my Apache configuration, which I think is why running XDebug on port 9000 isn't working for me. I guess I'd need to edit my vhost to include port 9000

    Read the article

  • Why is code quality not popular?

    - by Peter Kofler
    I like my code being in order, i.e. properly formatted, readable, designed, tested, checked for bugs, etc. In fact I am fanatic about it. (Maybe even more than fanatic...) But in my experience actions helping code quality are hardly implemented. (By code quality I mean the quality of the code you produce day to day. The whole topic of software quality with development processes and such is much broader and not the scope of this question.) Code quality does not seem popular. Some examples from my experience include Probably every Java developer knows JUnit, almost all languages implement xUnit frameworks, but in all companies I know, only very few proper unit tests existed (if at all). I know that it's not always possible to write unit tests due to technical limitations or pressing deadlines, but in the cases I saw, unit testing would have been an option. If a developer wanted to write some tests for his/her new code, he/she could do so. My conclusion is that developers do not want to write tests. Static code analysis is often played around in small projects, but not really used to enforce coding conventions or find possible errors in enterprise projects. Usually even compiler warnings like potential null pointer access are ignored. Conference speakers and magazines would talk a lot about EJB3.1, OSGI, Cloud and other new technologies, but hardly about new testing technologies or tools, new static code analysis approaches (e.g. SAT solving), development processes helping to maintain higher quality, how some nasty beast of legacy code was brought under test, ... (I did not attend many conferences and it propably looks different for conferences on agile topics, as unit testing and CI and such has a higer value there.) So why is code quality so unpopular/considered boring? EDIT: Thank your for your answers. Most of them concern unit testing (and has been discussed in a related question). But there are lots of other things that can be used to keep code quality high (see related question). Even if you are not able to use unit tests, you could use a daily build, add some static code analysis to your IDE or development process, try pair programming or enforce reviews of critical code.

    Read the article

  • Inheritance Mapping Strategies with Entity Framework Code First CTP5: Part 3 – Table per Concrete Type (TPC) and Choosing Strategy Guidelines

    - by mortezam
    This is the third (and last) post in a series that explains different approaches to map an inheritance hierarchy with EF Code First. I've described these strategies in previous posts: Part 1 – Table per Hierarchy (TPH) Part 2 – Table per Type (TPT)In today’s blog post I am going to discuss Table per Concrete Type (TPC) which completes the inheritance mapping strategies supported by EF Code First. At the end of this post I will provide some guidelines to choose an inheritance strategy mainly based on what we've learned in this series. TPC and Entity Framework in the Past Table per Concrete type is somehow the simplest approach suggested, yet using TPC with EF is one of those concepts that has not been covered very well so far and I've seen in some resources that it was even discouraged. The reason for that is just because Entity Data Model Designer in VS2010 doesn't support TPC (even though the EF runtime does). That basically means if you are following EF's Database-First or Model-First approaches then configuring TPC requires manually writing XML in the EDMX file which is not considered to be a fun practice. Well, no more. You'll see that with Code First, creating TPC is perfectly possible with fluent API just like other strategies and you don't need to avoid TPC due to the lack of designer support as you would probably do in other EF approaches. Table per Concrete Type (TPC)In Table per Concrete type (aka Table per Concrete class) we use exactly one table for each (nonabstract) class. All properties of a class, including inherited properties, can be mapped to columns of this table, as shown in the following figure: As you can see, the SQL schema is not aware of the inheritance; effectively, we’ve mapped two unrelated tables to a more expressive class structure. If the base class was concrete, then an additional table would be needed to hold instances of that class. I have to emphasize that there is no relationship between the database tables, except for the fact that they share some similar columns. TPC Implementation in Code First Just like the TPT implementation, we need to specify a separate table for each of the subclasses. We also need to tell Code First that we want all of the inherited properties to be mapped as part of this table. In CTP5, there is a new helper method on EntityMappingConfiguration class called MapInheritedProperties that exactly does this for us. Here is the complete object model as well as the fluent API to create a TPC mapping: public abstract class BillingDetail {     public int BillingDetailId { get; set; }     public string Owner { get; set; }     public string Number { get; set; } }          public class BankAccount : BillingDetail {     public string BankName { get; set; }     public string Swift { get; set; } }          public class CreditCard : BillingDetail {     public int CardType { get; set; }     public string ExpiryMonth { get; set; }     public string ExpiryYear { get; set; } }      public class InheritanceMappingContext : DbContext {     public DbSet<BillingDetail> BillingDetails { get; set; }              protected override void OnModelCreating(ModelBuilder modelBuilder)     {         modelBuilder.Entity<BankAccount>().Map(m =>         {             m.MapInheritedProperties();             m.ToTable("BankAccounts");         });         modelBuilder.Entity<CreditCard>().Map(m =>         {             m.MapInheritedProperties();             m.ToTable("CreditCards");         });                 } } The Importance of EntityMappingConfiguration ClassAs a side note, it worth mentioning that EntityMappingConfiguration class turns out to be a key type for inheritance mapping in Code First. Here is an snapshot of this class: namespace System.Data.Entity.ModelConfiguration.Configuration.Mapping {     public class EntityMappingConfiguration<TEntityType> where TEntityType : class     {         public ValueConditionConfiguration Requires(string discriminator);         public void ToTable(string tableName);         public void MapInheritedProperties();     } } As you have seen so far, we used its Requires method to customize TPH. We also used its ToTable method to create a TPT and now we are using its MapInheritedProperties along with ToTable method to create our TPC mapping. TPC Configuration is Not Done Yet!We are not quite done with our TPC configuration and there is more into this story even though the fluent API we saw perfectly created a TPC mapping for us in the database. To see why, let's start working with our object model. For example, the following code creates two new objects of BankAccount and CreditCard types and tries to add them to the database: using (var context = new InheritanceMappingContext()) {     BankAccount bankAccount = new BankAccount();     CreditCard creditCard = new CreditCard() { CardType = 1 };                      context.BillingDetails.Add(bankAccount);     context.BillingDetails.Add(creditCard);     context.SaveChanges(); } Running this code throws an InvalidOperationException with this message: The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges. The reason we got this exception is because DbContext.SaveChanges() internally invokes SaveChanges method of its internal ObjectContext. ObjectContext's SaveChanges method on its turn by default calls AcceptAllChanges after it has performed the database modifications. AcceptAllChanges method merely iterates over all entries in ObjectStateManager and invokes AcceptChanges on each of them. Since the entities are in Added state, AcceptChanges method replaces their temporary EntityKey with a regular EntityKey based on the primary key values (i.e. BillingDetailId) that come back from the database and that's where the problem occurs since both the entities have been assigned the same value for their primary key by the database (i.e. on both BillingDetailId = 1) and the problem is that ObjectStateManager cannot track objects of the same type (i.e. BillingDetail) with the same EntityKey value hence it throws. If you take a closer look at the TPC's SQL schema above, you'll see why the database generated the same values for the primary keys: the BillingDetailId column in both BankAccounts and CreditCards table has been marked as identity. How to Solve The Identity Problem in TPC As you saw, using SQL Server’s int identity columns doesn't work very well together with TPC since there will be duplicate entity keys when inserting in subclasses tables with all having the same identity seed. Therefore, to solve this, either a spread seed (where each table has its own initial seed value) will be needed, or a mechanism other than SQL Server’s int identity should be used. Some other RDBMSes have other mechanisms allowing a sequence (identity) to be shared by multiple tables, and something similar can be achieved with GUID keys in SQL Server. While using GUID keys, or int identity keys with different starting seeds will solve the problem but yet another solution would be to completely switch off identity on the primary key property. As a result, we need to take the responsibility of providing unique keys when inserting records to the database. We will go with this solution since it works regardless of which database engine is used. Switching Off Identity in Code First We can switch off identity simply by placing DatabaseGenerated attribute on the primary key property and pass DatabaseGenerationOption.None to its constructor. DatabaseGenerated attribute is a new data annotation which has been added to System.ComponentModel.DataAnnotations namespace in CTP5: public abstract class BillingDetail {     [DatabaseGenerated(DatabaseGenerationOption.None)]     public int BillingDetailId { get; set; }     public string Owner { get; set; }     public string Number { get; set; } } As always, we can achieve the same result by using fluent API, if you prefer that: modelBuilder.Entity<BillingDetail>()             .Property(p => p.BillingDetailId)             .HasDatabaseGenerationOption(DatabaseGenerationOption.None); Working With The Object Model Our TPC mapping is ready and we can try adding new records to the database. But, like I said, now we need to take care of providing unique keys when creating new objects: using (var context = new InheritanceMappingContext()) {     BankAccount bankAccount = new BankAccount()      {          BillingDetailId = 1                          };     CreditCard creditCard = new CreditCard()      {          BillingDetailId = 2,         CardType = 1     };                      context.BillingDetails.Add(bankAccount);     context.BillingDetails.Add(creditCard);     context.SaveChanges(); } Polymorphic Associations with TPC is Problematic The main problem with this approach is that it doesn’t support Polymorphic Associations very well. After all, in the database, associations are represented as foreign key relationships and in TPC, the subclasses are all mapped to different tables so a polymorphic association to their base class (abstract BillingDetail in our example) cannot be represented as a simple foreign key relationship. For example, consider the the domain model we introduced here where User has a polymorphic association with BillingDetail. This would be problematic in our TPC Schema, because if User has a many-to-one relationship with BillingDetail, the Users table would need a single foreign key column, which would have to refer both concrete subclass tables. This isn’t possible with regular foreign key constraints. Schema Evolution with TPC is Complex A further conceptual problem with this mapping strategy is that several different columns, of different tables, share exactly the same semantics. This makes schema evolution more complex. For example, a change to a base class property results in changes to multiple columns. It also makes it much more difficult to implement database integrity constraints that apply to all subclasses. Generated SQLLet's examine SQL output for polymorphic queries in TPC mapping. For example, consider this polymorphic query for all BillingDetails and the resulting SQL statements that being executed in the database: var query = from b in context.BillingDetails select b; Just like the SQL query generated by TPT mapping, the CASE statements that you see in the beginning of the query is merely to ensure columns that are irrelevant for a particular row have NULL values in the returning flattened table. (e.g. BankName for a row that represents a CreditCard type). TPC's SQL Queries are Union Based As you can see in the above screenshot, the first SELECT uses a FROM-clause subquery (which is selected with a red rectangle) to retrieve all instances of BillingDetails from all concrete class tables. The tables are combined with a UNION operator, and a literal (in this case, 0 and 1) is inserted into the intermediate result; (look at the lines highlighted in yellow.) EF reads this to instantiate the correct class given the data from a particular row. A union requires that the queries that are combined, project over the same columns; hence, EF has to pad and fill up nonexistent columns with NULL. This query will really perform well since here we can let the database optimizer find the best execution plan to combine rows from several tables. There is also no Joins involved so it has a better performance than the SQL queries generated by TPT where a Join is required between the base and subclasses tables. Choosing Strategy GuidelinesBefore we get into this discussion, I want to emphasize that there is no one single "best strategy fits all scenarios" exists. As you saw, each of the approaches have their own advantages and drawbacks. Here are some rules of thumb to identify the best strategy in a particular scenario: If you don’t require polymorphic associations or queries, lean toward TPC—in other words, if you never or rarely query for BillingDetails and you have no class that has an association to BillingDetail base class. I recommend TPC (only) for the top level of your class hierarchy, where polymorphism isn’t usually required, and when modification of the base class in the future is unlikely. If you do require polymorphic associations or queries, and subclasses declare relatively few properties (particularly if the main difference between subclasses is in their behavior), lean toward TPH. Your goal is to minimize the number of nullable columns and to convince yourself (and your DBA) that a denormalized schema won’t create problems in the long run. If you do require polymorphic associations or queries, and subclasses declare many properties (subclasses differ mainly by the data they hold), lean toward TPT. Or, depending on the width and depth of your inheritance hierarchy and the possible cost of joins versus unions, use TPC. By default, choose TPH only for simple problems. For more complex cases (or when you’re overruled by a data modeler insisting on the importance of nullability constraints and normalization), you should consider the TPT strategy. But at that point, ask yourself whether it may not be better to remodel inheritance as delegation in the object model (delegation is a way of making composition as powerful for reuse as inheritance). Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. EF acts as a buffer between the domain and relational models, but that doesn’t mean you can ignore persistence concerns when designing your classes. SummaryIn this series, we focused on one of the main structural aspect of the object/relational paradigm mismatch which is inheritance and discussed how EF solve this problem as an ORM solution. We learned about the three well-known inheritance mapping strategies and their implementations in EF Code First. Hopefully it gives you a better insight about the mapping of inheritance hierarchies as well as choosing the best strategy for your particular scenario. Happy New Year and Happy Code-Firsting! References ADO.NET team blog Java Persistence with Hibernate book a { color: #5A99FF; } a:visited { color: #5A99FF; } .title { padding-bottom: 5px; font-family: Segoe UI; font-size: 11pt; font-weight: bold; padding-top: 15px; } .code, .typeName { font-family: consolas; } .typeName { color: #2b91af; } .padTop5 { padding-top: 5px; } .padTop10 { padding-top: 10px; } .exception { background-color: #f0f0f0; font-style: italic; padding-bottom: 5px; padding-left: 5px; padding-top: 5px; padding-right: 5px; }

    Read the article

  • Oracle Data Integration 12c: Simplified, Future-Ready, High-Performance Solutions

    - by Thanos Terentes Printzios
    In today’s data-driven business environment, organizations need to cost-effectively manage the ever-growing streams of information originating both inside and outside the firewall and address emerging deployment styles like cloud, big data analytics, and real-time replication. Oracle Data Integration delivers pervasive and continuous access to timely and trusted data across heterogeneous systems. Oracle is enhancing its data integration offering announcing the general availability of 12c release for the key data integration products: Oracle Data Integrator 12c and Oracle GoldenGate 12c, delivering Simplified and High-Performance Solutions for Cloud, Big Data Analytics, and Real-Time Replication. The new release delivers extreme performance, increase IT productivity, and simplify deployment, while helping IT organizations to keep pace with new data-oriented technology trends including cloud computing, big data analytics, real-time business intelligence. With the 12c release Oracle becomes the new leader in the data integration and replication technologies as no other vendor offers such a complete set of data integration capabilities for pervasive, continuous access to trusted data across Oracle platforms as well as third-party systems and applications. Oracle Data Integration 12c release addresses data-driven organizations’ critical and evolving data integration requirements under 3 key themes: Future-Ready Solutions : Supporting Current and Emerging Initiatives Extreme Performance : Even higher performance than ever before Fast Time-to-Value : Higher IT Productivity and Simplified Solutions  With the new capabilities in Oracle Data Integrator 12c, customers can benefit from: Superior developer productivity, ease of use, and rapid time-to-market with the new flow-based mapping model, reusable mappings, and step-by-step debugger. Increased performance when executing data integration processes due to improved parallelism. Improved productivity and monitoring via tighter integration with Oracle GoldenGate 12c and Oracle Enterprise Manager 12c. Improved interoperability with Oracle Warehouse Builder which enables faster and easier migration to Oracle Data Integrator’s strategic data integration offering. Faster implementation of business analytics through Oracle Data Integrator pre-integrated with Oracle BI Applications’ latest release. Oracle Data Integrator also integrates simply and easily with Oracle Business Analytics tools, including OBI-EE and Oracle Hyperion. Support for loading and transforming big and fast data, enabled by integration with big data technologies: Hadoop, Hive, HDFS, and Oracle Big Data Appliance. Only Oracle GoldenGate provides the best-of-breed real-time replication of data in heterogeneous data environments. With the new capabilities in Oracle GoldenGate 12c, customers can benefit from: Simplified setup and management of Oracle GoldenGate 12c when using multiple database delivery processes via a new Coordinated Delivery feature for non-Oracle databases. Expanded heterogeneity through added support for the latest versions of major databases such as Sybase ASE v 15.7, MySQL NDB Clusters 7.2, and MySQL 5.6., as well as integration with Oracle Coherence. Enhanced high availability and data protection via integration with Oracle Data Guard and Fast-Start Failover integration. Enhanced security for credentials and encryption keys using Oracle Wallet. Real-time replication for databases hosted on public cloud environments supported by third-party clouds. Tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c and other Oracle technologies, such as Oracle Database 12c and Oracle Applications, provides a number of benefits for organizations: Tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c enables developers to leverage Oracle GoldenGate’s low overhead, real-time change data capture completely within the Oracle Data Integrator Studio without additional training. Integration with Oracle Database 12c provides a strong foundation for seamless private cloud deployments. Delivers real-time data for reporting, zero downtime migration, and improved performance and availability for Oracle Applications, such as Oracle E-Business Suite and ATG Web Commerce . Oracle’s data integration offering is optimized for Oracle Engineered Systems and is an integral part of Oracle’s fast data, real-time analytics strategy on Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine. Oracle Data Integrator 12c and Oracle GoldenGate 12c differentiate the new offering on data integration with these many new features. This is just a quick glimpse into Oracle Data Integrator 12c and Oracle GoldenGate 12c. Find out much more about the new release in the video webcast "Introducing 12c for Oracle Data Integration", where customer and partner speakers, including SolarWorld, BT, Rittman Mead will join us in launching the new release. Resource Kits Meet Oracle Data Integration 12c  Discover what's new with Oracle Goldengate 12c  Oracle EMEA DIS (Data Integration Solutions) Partner Community is available for all your questions, while additional partner focused webcasts will be made available through our blog here, so stay connected. For any questions please contact us at partner.imc-AT-beehiveonline.oracle-DOT-com Stay Connected Oracle Newsletters

    Read the article

  • SQL SERVER – What is Page Life Expectancy (PLE) Counter

    - by pinaldave
    During performance tuning consultation there are plenty of counters and values, I often come across. Today we will quickly talk about Page Life Expectancy counter, which is commonly known as PLE as well. You can find the value of the PLE by running following query. SELECT [object_name], [counter_name], [cntr_value] FROM sys.dm_os_performance_counters WHERE [object_name] LIKE '%Manager%' AND [counter_name] = 'Page life expectancy' The recommended value of the PLE counter is 300 seconds. I have seen on busy system this value to be as low as even 45 seconds and on unused system as high as 1250 seconds. Page Life Expectancy is number of seconds a page will stay in the buffer pool without references. In simple words, if your page stays longer in the buffer pool (area of the memory cache) your PLE is higher, leading to higher performance as every time request comes there are chances it may find its data in the cache itself instead of going to hard drive to read the data. Now check your system and post back what is this counter value for you during various time of the day. Is this counter any way relates to performance issues for your system? Note: There are various other counters which are important to discuss during the performance tuning and this counter is not everything. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • EPM 11.1.1 - EPM Infrastructure Tuning Guide v11.1.1.3

    - by Ahmed Awan
    This edition applies to EPM 9.3.1, 11.1.1.1, 11.1.1.2 & 11.1.1.3 only. INTRODUCTION:One of the most challenging aspects of performance tuning is knowing where to begin. To maximize Oracle EPM System performance, all components need to be monitored, analyzed, and tuned. This guide describe the techniques used to monitor performance and the techniques for optimizing the performance of EPM components. Click to Download the EPM 11.1.1.3 Infrastructure Tuning Whitepaper (Right click or option-click the link and choose "Save As..." to download this file)

    Read the article

  • SQL Profiler: Read/Write units

    - by Ian Boyd
    i've picked a query out of SQL Server Profiler that says it took 1,497 reads: EventClass: SQL:BatchCompleted TextData: SELECT Transactions.... CPU: 406 Reads: 1497 Writes: 0 Duration: 406 So i've taken this query into Query Analyzer, so i may try to reduce the number of reads. But when i turn on SET STATISTICS IO ON to see the IO activity for the query, i get nowhere close to one thousand reads: Table Scan Count Logical Reads =================== ========== ============= FintracTransactions 4 20 LCDs 2 4 LCTs 2 4 FintracTransacti... 0 0 Users 1 2 MALs 0 0 Patrons 0 0 Shifts 1 2 Cages 1 1 Windows 1 3 Logins 1 3 Sessions 1 6 Transactions 1 7 Which if i do my math right, there is a total of 51 reads; not 1,497. So i assume Reads in SQL Profiler is an arbitrary metric. Does anyone know the conversion of SQL Server Profiler Reads to IO Reads? See also SQL Profiler CPU / duration unit Query Analyzer VS. Query Profiler Reads, Writes, and Duration Discrepencies

    Read the article

  • Using the Static Code Analysis feature of Visual Studio (Premium/Ultimate) to find memory leakage problems

    - by terje
    Memory for managed code is handled by the garbage collector, but if you use any kind of unmanaged code, like native resources of any kind, open files, streams and window handles, your application may leak memory if these are not properly handled.  To handle such resources the classes that own these in your application should implement the IDisposable interface, and preferably implement it according to the pattern described for that interface. When you suspect a memory leak, the immediate impulse would be to start up a memory profiler and start digging into that.   However, before you follow that impulse, do a Static Code Analysis run with a ruleset tuned to finding possible memory leaks in your code.  If you get any warnings from this, fix them before you go on with the profiling. How to use a ruleset In Visual Studio 2010 (Premium and Ultimate editions) you can define your own rulesets containing a list of Static Code Analysis checks.   I have defined the memory checks as shown in the lists below as ruleset files, which can be downloaded – see bottom of this post.  When you get them, you can easily attach them to every project in your solution using the Solution Properties dialog. Right click the solution, and choose Properties at the bottom, or use the Analyze menu and choose “Configure Code Analysis for Solution”: In this dialog you can now choose the Memorycheck ruleset for every project you want to investigate.  Pressing Apply or Ok opens every project file and changes the projects code analysis ruleset to the one we have specified here. How to define your own ruleset  (skip this if you just download my predefined rulesets) If you want to define the ruleset yourself, open the properties on any project, choose Code Analysis tab near the bottom, choose any ruleset in the drop box and press Open Clear out all the rules by selecting “Source Rule Sets” in the Group By box, and unselect the box Change the Group By box to ID, and select the checks you want to include from the lists below. Note that you can change the action for each check to either warning, error or none, none being the same as unchecking the check.   Now go to the properties window and set a new name and description for your ruleset. Then save (File/Save as) the ruleset using the new name as its name, and use it for your projects as detailed above. It can also be wise to add the ruleset to your solution as a solution item. That way it’s there if you want to enable Code Analysis in some of your TFS builds.   Running the code analysis In Visual Studio 2010 you can either do your code analysis project by project using the context menu in the solution explorer and choose “Run Code Analysis”, you can define a new solution configuration, call it for example Debug (Code Analysis), in for each project here enable the Enable Code Analysis on Build   In Visual Studio Dev-11 it is all much simpler, just go to the Solution root in the Solution explorer, right click and choose “Run code analysis on solution”.     The ruleset checks The following list is the essential and critical memory checks.  CheckID Message Can be ignored ? Link to description with fix suggestions CA1001 Types that own disposable fields should be disposable No  http://msdn.microsoft.com/en-us/library/ms182172.aspx CA1049 Types that own native resources should be disposable Only if the pointers assumed to point to unmanaged resources point to something else  http://msdn.microsoft.com/en-us/library/ms182173.aspx CA1063 Implement IDisposable correctly No  http://msdn.microsoft.com/en-us/library/ms244737.aspx CA2000 Dispose objects before losing scope No  http://msdn.microsoft.com/en-us/library/ms182289.aspx CA2115 1 Call GC.KeepAlive when using native resources See description  http://msdn.microsoft.com/en-us/library/ms182300.aspx CA2213 Disposable fields should be disposed If you are not responsible for release, of if Dispose occurs at deeper level  http://msdn.microsoft.com/en-us/library/ms182328.aspx CA2215 Dispose methods should call base class dispose Only if call to base happens at deeper calling level  http://msdn.microsoft.com/en-us/library/ms182330.aspx CA2216 Disposable types should declare a finalizer Only if type does not implement IDisposable for the purpose of releasing unmanaged resources  http://msdn.microsoft.com/en-us/library/ms182329.aspx CA2220 Finalizers should call base class finalizers No  http://msdn.microsoft.com/en-us/library/ms182341.aspx Notes: 1) Does not result in memory leak, but may cause the application to crash   The list below is a set of optional checks that may be enabled for your ruleset, because the issues these points too often happen as a result of attempting to fix up the warnings from the first set.   ID Message Type of fault Can be ignored ? Link to description with fix suggestions CA1060 Move P/invokes to NativeMethods class Security No http://msdn.microsoft.com/en-us/library/ms182161.aspx CA1816 Call GC.SuppressFinalize correctly Performance Sometimes, see description http://msdn.microsoft.com/en-us/library/ms182269.aspx CA1821 Remove empty finalizers Performance No http://msdn.microsoft.com/en-us/library/bb264476.aspx CA2004 Remove calls to GC.KeepAlive Performance and maintainability Only if not technically correct to convert to SafeHandle http://msdn.microsoft.com/en-us/library/ms182293.aspx CA2006 Use SafeHandle to encapsulate native resources Security No http://msdn.microsoft.com/en-us/library/ms182294.aspx CA2202 Do not dispose of objects multiple times Exception (System.ObjectDisposedException) No http://msdn.microsoft.com/en-us/library/ms182334.aspx CA2205 Use managed equivalents of Win32 API Maintainability and complexity Only if the replace doesn’t provide needed functionality http://msdn.microsoft.com/en-us/library/ms182365.aspx CA2221 Finalizers should be protected Incorrect implementation, only possible in MSIL coding No http://msdn.microsoft.com/en-us/library/ms182340.aspx   Downloadable ruleset definitions I have defined three rulesets, one called Inmeta.Memorycheck with the rules in the first list above, and Inmeta.Memorycheck.Optionals containing the rules in the second list, and the last one called Inmeta.Memorycheck.All containing the sum of the two first ones.  All three rulesets can be found in the  zip archive  “Inmeta.Memorycheck” downloadable from here.   Links to some other resources relevant to Static Code Analysis MSDN Magazine Article by Mickey Gousset on Static Code Analysis in VS2010 MSDN :  Analyzing Managed Code Quality by Using Code Analysis, root of the documentation for this Preventing generated code from being analyzed using attributes Online training course on Using Code Analysis with VS2010 Blogpost by Tatham Oddie on custom code analysis rules How to write custom rules, from Microsoft Code Analysis Team Blog Microsoft Code Analysis Team Blog

    Read the article

  • LINQ To objects: Quicker ideas?

    - by SDReyes
    Do you see a better approach to obtain and concatenate item.Number in a single string? Current: var numbers = new StringBuilder( ); // group is the result of a previous group by var basenumbers = group.Select( item => item.Number ); basenumbers.Aggregate ( numbers, ( res, element ) => res.AppendFormat( "{0:00}", element ) );

    Read the article

  • Logging library for (c++) games

    - by Klaim
    I know a lot of logging libraries but didn't test a lot of them. (GoogleLog, Pantheios, the coming boost::log library...) In games, especially in remote multiplayer and multithreaded games, logging is vital to debugging, even if you remove all logs in the end. Let's say I'm making a PC game (not console) that needs logs (multiplayer and multithreaded and/or multiprocess) and I have good reasons for looking for a library for logging (like, I don't have time or I'm not confident in my ability to write one correctly for my case). Assuming that I need : performance ease of use (allow streaming or formating or something like that) reliable (don't leak or crash!) cross-platform (at least Windows, MacOSX, Linux/Ubuntu) Wich logging library would you recommand? Currently, I think that boost::log is the most flexible one (you can even log to remotely!), but have not good performance update: is for high performance, but isn't released yet. Pantheios is often cited but I don't have comparison points on performance and usage. I've used my own lib for a long time but I know it don't manage multithreading so it's a big problem, even if it's fast enough. Google Log seems interesting, I just need to test it but if you already have compared those libs and more, your advice might be of good use. Games are often performance demanding while complex to debug so it would be good to know logging libraries that, in our specific case, have clear advantages.

    Read the article

  • Data in two databases, eager spool resulting in query

    - by Valkyrie
    I have two databases in SQL2k5: one that holds a large amount of static data (SQL Database 1) (never updated but frequently inserted into) and one that holds relational data (SQL Database 2) related to the static data. They're separated mainly because of corporate guidelines and business requirements: assume for the following problem that combining them is not practical. There are places in SQLDB2 that PKs in SQLDB1 are referenced; triggers control the referential integrity, since cross-database relationships are troublesome in SQL Server. BUT, because of the large amount of data in SQLDB1, I'm getting eager spools on queries that join from the Id in SQLDB2 that references the data in SQLDB1. (With me so far? Maybe an example will help:) SELECT t.Id, t.Name, t2.Company FROM SQLDB1.table t INNER JOIN SQLDB2.table t2 ON t.Id = t2.FKId This query results in a eager spool that's 84% of the load of the query; the table in SQLDB1 has 35M rows, so it's completely choking this query. I can't create a view on the table in SQLDB1 and use that as my FK/index; it doesn't want me to create a constraint based on a view. Anyone have any idea how I can fix this huge bottleneck? (Short of putting the static data in the first db: believe me, I've argued that one until I'm blue in the face to no avail.) Thanks! valkyrie Edit: also can't create an indexed view because you can't put schemabinding on a view that references a table outside the database where the view resides. Dang it.

    Read the article

< Previous Page | 46 47 48 49 50 51 52 53 54 55 56 57  | Next Page >