Search Results

Search found 9542 results on 382 pages for 'row'.

Page 205/382 | < Previous Page | 201 202 203 204 205 206 207 208 209 210 211 212  | Next Page >

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • setCurrentRowWithKey and setCurrentRowWithKeyValue

    - by raghu.yadav
    Good demo demo by shay by shay on how to use setCurrentRowWithKeyValue to synchronize the master and details of same table. Example : Employee master table list and EmployeeDetail form to edit records. However there are many ways to achieve the same usecase. You can use the clicktoEdit property on table row to edit records in place within in table and there is also a feature to show more for few columns in table tools. But objective here is to see how and where and all we can make use of setCurrentRowWithKey and setCurrentRowWithKeyValue. Here is the link about this explains how we can make use of this. link more to be added.

    Read the article

  • /sbin/getty process causing 100% CPU utilization

    - by scrrr
    I have an instance of Ubuntu 12.04 LTS (GNU/Linux 3.2.0-25-virtual i686) running as a KVM-VM on a host-machine that runs one more VM beside it. I deploy a Ruby on Rails application using the Capistrano deployment-gem. However, if I deploy twice in a row in a short time, the CPU usage jumps to 100% because of the /sbin/getty process. How can this be? I believe getty is a rather simple program that passes a login-name from a terminal to a login-process. Also: In my Capfile (Capistrano configuration file) I am running certain commands after the Rails application is deployed including a call to sudo /sbin/restart <APPNAME> which is an upstart task. Could this be related somehow? I can always kill the getty process and the problem is gone until the next deployment, but I would rather understand and fix the problem. Any help is appreciated. Attached is a screenshot of my problem.

    Read the article

  • New .NET Library for Accessing the Survey Monkey API

    - by Ben Emmett
    I’ve used Survey Monkey’s API for a while, and though it’s pretty powerful, there’s a lot of boilerplate each time it’s used in a new project, and the json it returns needs a bunch of processing to be able to use the raw information. So I’ve finally got around to releasing a .NET library you can use to consume the API more easily. The main advantages are: Only ever deal with strongly-typed .NET objects, making everything much more robust and a lot faster to get going Automatically handles things like rate-limiting and paging through results Uses combinations of endpoints to get all relevant data for you, and processes raw response data to map responses to questions To start, either install it using NuGet with PM> Install-Package SurveyMonkeyApi (easier option), or grab the source from https://github.com/bcemmett/SurveyMonkeyApi if you prefer to build it yourself. You’ll also need to have signed up for a developer account with Survey Monkey, and have both your API key and an OAuth token. A simple usage would be something like: string apiKey = "KEY"; string token = "TOKEN"; var sm = new SurveyMonkeyApi(apiKey, token); List<Survey> surveys = sm.GetSurveyList(); The surveys object is now a list of surveys with all the information available from the /surveys/get_survey_list API endpoint, including the title, id, date it was created and last modified, language, number of questions / responses, and relevant urls. If there are more than 1000 surveys in your account, the library pages through the results for you, making multiple requests to get a complete list of surveys. All the filtering available in the API can be controlled using .NET objects. For example you might only want surveys created in the last year and containing “pineapple” in the title: var settings = new GetSurveyListSettings { Title = "pineapple", StartDate = DateTime.Now.AddYears(-1) }; List<Survey> surveys = sm.GetSurveyList(settings); By default, whenever optional fields can be requested with a response, they will all be fetched for you. You can change this behaviour if for some reason you explicitly don’t want the information, using var settings = new GetSurveyListSettings { OptionalData = new GetSurveyListSettingsOptionalData { DateCreated = false, AnalysisUrl = false } }; Survey Monkey’s 7 read-only endpoints are supported, and the other 4 which make modifications to data might be supported in the future. The endpoints are: Endpoint Method Object returned /surveys/get_survey_list GetSurveyList() List<Survey> /surveys/get_survey_details GetSurveyDetails() Survey /surveys/get_collector_list GetCollectorList() List<Collector> /surveys/get_respondent_list GetRespondentList() List<Respondent> /surveys/get_responses GetResponses() List<Response> /surveys/get_response_counts GetResponseCounts() Collector /user/get_user_details GetUserDetails() UserDetails /batch/create_flow Not supported Not supported /batch/send_flow Not supported Not supported /templates/get_template_list Not supported Not supported /collectors/create_collector Not supported Not supported The hierarchy of objects the library can return is Survey List<Page> List<Question> QuestionType List<Answer> List<Item> List<Collector> List<Response> Respondent List<ResponseQuestion> List<ResponseAnswer> Each of these classes has properties which map directly to the names of properties returned by the API itself (though using PascalCasing which is more natural for .NET, rather than the snake_casing used by SurveyMonkey). For most users, Survey Monkey imposes a rate limit of 2 requests per second, so by default the library leaves at least 500ms between requests. You can request higher limits from them, so if you want to change the delay between requests just use a different constructor: var sm = new SurveyMonkeyApi(apiKey, token, 200); //200ms delay = 5 reqs per sec There’s a separate cap of 1000 requests per day for each API key, which the library doesn’t currently enforce, so if you think you’ll be in danger of exceeding that you’ll need to handle it yourself for now.  To help, you can see how many requests the current instance of the SurveyMonkeyApi object has made by reading its RequestsMade property. If the library encounters any errors, including communicating with the API, it will throw a SurveyMonkeyException, so be sure to handle that sensibly any time you use it to make calls. Finally, if you have a survey (or list of surveys) obtained using GetSurveyList(), the library can automatically fill in all available information using sm.FillMissingSurveyInformation(surveys); For each survey in the list, it uses the other endpoints to fill in the missing information about the survey’s question structure, respondents, and responses. This results in at least 5 API calls being made per survey, so be careful before passing it a large list. It also joins up the raw response information to the survey’s question structure, so that for each question in a respondent’s set of replies, you can access a ProcessedAnswer object. For example, a response to a dropdown question (from the /surveys/get_responses endpoint) might be represented in json as { "answers": [ { "row": "9384627365", } ], "question_id": "615487516" } Separately, the question’s structure (from the /surveys/get_survey_details endpoint) might have several possible answers, one of which might look like { "text": "Fourth item in dropdown list", "visible": true, "position": 4, "type": "row", "answer_id": "9384627365" } The library understands how this mapping works, and uses that to give you the following ProcessedAnswer object, which first describes the family and type of question, and secondly gives you the respondent’s answers as they relate to the question. Survey Monkey has many different question types, with 11 distinct data structures, each of which are supported by the library. If you have suggestions or spot any bugs, let me know in the comments, or even better submit a pull request .

    Read the article

  • SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

    - by Pinal Dave
    Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it. Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time. A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database. SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID; User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts Use only column in the select statements as much as you want in the application Let us look at the query pattern and data workload and find out the optimal index for it Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query. Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions. Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning. I have previously explained how it works over here. First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job. Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram. Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan. The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read. If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it. Here is the query: SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID OPTION (MERGE JOIN) If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before. The entire process takes less than 60 seconds. Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided. Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Edit Media Center TV Recordings with Windows Live Movie Maker

    - by DigitalGeekery
    Have you ever wanted to take a TV program you’ve recorded in Media Center and remove the commercials or save clips of favorite scenes? Today we’ll take a look at editing WTV and DVR-MS files with Windows Live Movie Maker. Download and Install Windows Live Movie Maker. The download link can be found at the end of the article. WLMM is part of Windows Live Essentials, but you can choose to install only the applications you want. You’ll also want to be sure to uncheck any unwanted settings like settings Bing as default search provider or MSN as your browser home page.   Add your recorded TV file to WLMM by clicking the Add videos and photos button, or by dragging and dropping it onto the storyboard.   You’ll see your video displayed in the Preview window on the left and on the storyboard. Adjust the Zoom Time Scale slider at the lower right to change the level of detail displayed on the storyboard. You may want to start zoomed out and zoom in for more detailed edits.   Removing Commercials or Unwanted Sections Note: Changes and edits made in Windows Live Movie Maker do not change or effect the original video file. To accomplish this, we will makes cuts, or “splits,” and the beginning and end of the section we want to remove, and then we will delete that section from our project. Click and drag the slider bar along the the storyboard to scroll through the video. When you get to the end of a row in on the storyboard, drag the slider down to the beginning of the next row. We’ve found it easiest and most accurate to get close to the end of the commercial break and then use the Play button and the Previous Frame and Next Frame buttons underneath the Preview window to fine tune your cut point. When you find the right place to make your first cut, click the split button on the Edit tab on the ribbon. You will see your video “split” into two sections. Now, repeat the process of scrolling through the storyboard to find the end of the section you wish to cut. When you are at the proper point, click the Split button again.   Now we’ll delete that section by selecting it and pressing the Delete key, selecting remove on the Home tab, or by right clicking on the section and selecting Remove.   Trim Tool This tool allows you to select a portion of the video to keep while trimming away the rest.   Click and drag the sliders in the preview windows to select the area you want to keep. The area outside the sliders will be trimmed away. The area inside is the section that is kept in the movie. You can also adjust the Start and End points manually on the ribbon.   Delete any additional clips you don’t want in the final output. You can also accomplish this by using the Set start point and Set end point buttons. Clicking Set start point will eliminate everything before the start point. Set end point will eliminate everything after the end point. And you’re left with only the clip you want to keep.   Output your Video Select the icon at the top left, then select Save movie. All of these settings will output your movie as a WMV file, but file size and quality will vary by setting. The Burn to DVD option also outputs a WMV file, but then opens Windows DVD Maker and prompts you to create and burn a DVD.   Conclusion WLMM is one of the few applications that can edit WTV files, and it’s the only one we’re aware of that’s free. We should note only WTV and DVR-MS files will appear in the Recorded TV library in Media Center, so if you want to view your WMV output file in WMC you’ll need to add it to the Video or Movie library. Would you like to learn more about Windows Live Movie Maker? Check out are article on how to turn photos and home videos into movies with Windows Live Movie Maker. Need to add videos from a network location? WLMM doesn’t allow this by default, but you check out how to add network support to Windows Live Move Maker. Download Windows Live Similar Articles Productive Geek Tips Rotate a Video 90 degrees with VLC or Windows Live Movie MakerHow to Make/Edit a movie with Windows Movie Maker in Windows VistaFamily Fun: Share Photos with Photo Gallery and Windows Live SpacesAutomatically Mount and View ISO files in Windows 7 Media CenterAutomatically Start Windows 7 Media Center in Live TV Mode TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Xobni Plus for Outlook All My Movies 5.9 CloudBerry Online Backup 1.5 for Windows Home Server Snagit 10 Get a free copy of WinUtilities Pro 2010 World Cup Schedule Boot Snooze – Reboot and then Standby or Hibernate Customize Everything Related to Dates, Times, Currency and Measurement in Windows 7 Google Earth replacement Icon (Icons we like) Build Great Charts in Excel with Chart Advisor

    Read the article

  • Microsoft MVP for the Third Time

    - by shiju
    I just received an email from Microsoft which stating that I have been awarded as Microsoft MVP again for 2012!! Now I became a Microsoft MVP for the third time in a row. Here's the email below: Dear Shiju Varghese, Congratulations! We are pleased to present you with the 2012 Microsoft® MVP Award! This award is given to exceptional technical community leaders who actively share their high quality, real world expertise with others. We appreciate your outstanding contributions in ASP.NET/IIS technical communities during the past year. On this occasion, I would like to thank Microsoft, community leaders, fellow MVPs, my blog readers, my employer Marlabs and finally big thanks to my lovely wife Rosmi and to my daughter Irene Rose.

    Read the article

  • I&rsquo;m speaking at Software Architect 2010 in October

    - by Eric Nelson
    I’m very pleased to report I have managed to slip past the quality police and get to speak for the third year in a row at the excellent Software Architect conference in London. Which makes it the only “long running” conference that I have a 100% record on speaking at year on year which gives it an extra special significance. How much longer before I am found out :) This conference attracts some great speakers including the likes of Kevlin Henney, Neal Ford and Tim Ewald (oh – and me). If you are a software/solution architect then I would definitely recommend you check out whether the sessions this year are something that would help you grow and make great technology/architecture choices in your organisation. I am delivering a brand new session - which means I need to create it :-) 10 things every architect needs to know about Windows Azure In this session we will look at the 10 most architecturally significant features of the Windows Azure platform which directly impact how you architect solutions if you plan to deploy in the Cloud. Maybe see you there…

    Read the article

  • Simple Excel Export with EPPlus

    - by Jesse Taber
    Originally posted on: http://geekswithblogs.net/GruffCode/archive/2013/10/30/simple-excel-export-with-epplus.aspxAnyone I’ve ever met who works with an application that sits in front of a lot of data loves it when they can get that data exported to an Excel file for them to mess around with offline. As both developer and end user of a little website project that I’ve been working on, I found myself wanting to be able to get a bunch of the data that the application was collecting into an Excel file. The great thing about being both an end user and a developer on a project is that you can build the features that you really want! While putting this feature together I came across the fantastic EPPlus library. This library is certainly very well known and popular, but I was so impressed with it that I thought it was worth a quick blog post. This library is extremely powerful; it lets you create and manipulate Excel 2007/2010 spreadsheets in .NET code with a high degree of flexibility. My only gripe with the project is that they are not touting how insanely easy it is to build a basic Excel workbook from a simple data source. If I were running this project the approach I’m about to demonstrate in this post would be front and center on the landing page for the project because it shows how easy it really is to get started and serves as a good way to ease yourself in to some of the more advanced features. The website in question uses RavenDB, which means that we’re dealing with POCOs to model the data throughout all layers of the application. I love working like this so when it came time to figure out how to export some of this data to an Excel spreadsheet I wanted to find a way to take an IEnumerable<T> and just have it dumped to Excel with each item in the collection being modeled as a single row in the Excel worksheet. Consider the following class: public class Employee { public int Id { get; set; } public string Name { get; set; } public decimal HourlyRate { get; set; } public DateTime HireDate { get; set; } } Now let’s say we have a collection of these represented as an IEnumerable<Employee> and we want to be able to output it to an Excel file for offline querying/manipulation. As it turns out, this is dead simple to do with EPPlus. Have a look: public void ExportToExcel(IEnumerable<Employee> employees, FileInfo targetFile) { using (var excelFile = new ExcelPackage(targetFile)) { var worksheet = excelFile.Workbook.Worksheets.Add("Sheet1"); worksheet.Cells["A1"].LoadFromCollection(Collection: employees, PrintHeaders: true); excelFile.Save(); } } That’s it. Let’s break down what’s going on here: Create a ExcelPackage to model the workbook (Excel file). Note that the ‘targetFile’ value here is a FileInfo object representing the location on disk where I want the file to be saved. Create a worksheet within the workbook. Get a reference to the top-leftmost cell (addressed as A1) and invoke the ‘LoadFromCollection’ method, passing it our collection of Employee objects. Behind the scenes this is reflecting over the properties of the type provided and pulling out any public members to become columns in the resulting Excel output. The ‘PrintHeaders’ parameter tells EPPlus to grab the name of the property and put it in the first row. Save the Excel file All of the heavy lifting here is being done by the ‘LoadFromCollection’ method, and that’s a good thing. Now, this was really easy to do, but it has some limitations. Using this approach you get a very plain, un-styled Excel worksheet. The column widths are all set to the default. The number format for all cells is ‘General’ (which proves particularly interesting if you have a DateTime property in your data source). I’m a “no frills” guy, so I wasn’t bothered at all by trading off simplicity for style and formatting. That said, EPPlus has tons of samples that you can download that illustrate how to apply styles and formatting to cells and a ton of other advanced features that are way beyond the scope of this post.

    Read the article

  • Silverlight Grid Layout is pain

    - by brainbox
     I think one of the biggest mistake of Silverlight and WPF is its Grid layout.Imagine you have a data form with 2 columns and 5 rows. You need to place new row after the first one. As a result you need to rewrite Grid.Rows and Grid.Columns in all rows belows. But the worst thing of such approach is that it is static. So you need predefine all your rows and columns. As a result creating of simple dynamic datagrid or dataform become impossible... So the question if why best practices of HTML and Adobe Flex were dropped????If anybody have tried to port Flex Grid layout to silverlight please mail me or drop a comment.

    Read the article

  • SQL SERVER – Solution – Puzzle – SELECT * vs SELECT COUNT(*)

    - by pinaldave
    Earlier I have published Puzzle Why SELECT * throws an error but SELECT COUNT(*) does not. This question have received many interesting comments. Let us go over few of the answers, which are valid. Before I start the same, let me acknowledge Rob Farley who has not only answered correctly very first but also started interesting conversation in the same thread. The usual question will be what is the right answer. I would like to point to official Microsoft Connect Items which discusses the same. RGarvao https://connect.microsoft.com/SQLServer/feedback/details/671475/select-test-where-exists-select tiberiu utan http://connect.microsoft.com/SQLServer/feedback/details/338532/count-returns-a-value-1 Rob Farley count(*) is about counting rows, not a particular column. It doesn’t even look to see what columns are available, it’ll just count the rows, which in the case of a missing FROM clause, is 1. “select *” is designed to return columns, and therefore barfs if there are none available. Even more odd is this one: select ‘blah’ where exists (select *) You might be surprised at the results… Koushik The engine performs a “Constant scan” for Count(*) where as in the case of “SELECT *” the engine is trying to perform either Index/Cluster/Table scans. amikolaj When you query ‘select * from sometable’, SQL replaces * with the current schema of that table. With out a source for the schema, SQL throws an error. so when you query ‘select count(*)’, you are counting the one row. * is just a constant to SQL here. Check out the execution plan. Like the description states – ‘Scan an internal table of constants.’ You could do ‘select COUNT(‘my name is adam and this is my answer’)’ and get the same answer. Netra Acharya SELECT * Here, * represents all columns from a table. So it always looks for a table (As we know, there should be FROM clause before specifying table name). So, it throws an error whenever this condition is not satisfied. SELECT COUNT(*) Here, COUNT is a Function. So it is not mandetory to provide a table. Check it out this: DECLARE @cnt INT SET @cnt = COUNT(*) SELECT @cnt SET @cnt = COUNT(‘x’) SELECT @cnt Naveen Select 1 / Select ‘*’ will return 1/* as expected. Select Count(1)/Count(*) will return the count of result set of select statement. Count(1)/Count(*) will have one 1/* for each row in the result set of select statement. Select 1 or Select ‘*’ result set will contain only 1 result. so count is 1. Where as “Select *” is a sysntax which expects the table or equauivalent to table (table functions, etc..). It is like compilation error for that query. Ramesh Hi Friends, Count is an aggregate function and it expects the rows (list of records) for a specified single column or whole rows for *. So, when we use ‘select *’ it definitely give and error because ‘*’ is meant to have all the fields but there is not any table and without table it can only raise an error. So, in the case of ‘Select Count(*)’, there will be an error as a record in the count function so you will get the result as ’1'. Try using : Select COUNT(‘RAMESH’) and think there is an error ‘Must specify table to select from.’ in place of ‘RAMESH’ Pinal : If i am wrong then please clarify this. Sachin Nandanwar Any aggregate function expects a constant or a column name as an expression. DO NOT be confused with * in an aggregate function.The aggregate function does not treat it as a column name or a set of column names but a constant value, as * is a key word in SQL. You can replace any value instead of * for the COUNT function.Ex Select COUNT(5) will result as 1. The error resulting from select * is obvious it expects an object where it can extract the result set. I sincerely thank you all for wonderful conversation, I personally enjoyed it and I am sure all of you have the same feeling. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • How big can my SharePoint 2010 installation be?

    - by Sahil Malik
    Ad:: SharePoint 2007 Training in .NET 3.5 technologies (more information). 3 years ago, I had published “How big can my SharePoint 2007 installation be?” Well, SharePoint 2010 has significant under the covers improvements. So, how big can your SharePoint 2010 installation be? There are three kinds of limits you should know about Hard limits that cannot be exceeded by design. Configurable that are, well configurable – but the default values are set for a pretty good reason, so if you need to tweak, plan and understand before you tweak. Soft limits, you can exceed them, but it is not recommended that you do. Before you read any of the limits, read these two important disclaimers - 1. The limit depends on what you’re doing. So, don’t take the below as gospel, the reality depends on your situation. 2. There are many additional considerations in planning your SharePoint solution scalability and performance, besides just the below. So with those in mind, here goes.   Hard Limits - Zones per web app 5 RBS NAS performance Time to first byte of any response from NAS must be less than 20 milliseconds List row size 8000 bytes driven by how SP stores list items internally Max file size 2GB (default is 50MB, configurable). RBS does not increase this limit. Search metadata properties 10,000 per item crawled (pretty damn high, you’ll never need to worry about it). Max # of concurrent in-memory enterprise content types 5000 per web server, per tenant Max # of external system connections 500 per web server PerformancePoint services using Excel services as a datasource No single query can fetch more than 1 million excel cells Office Web Apps Renders One doc per second, per CPU core, per Application server, limited to a maximum of 8 cores.   Configurable Limits - Row Size Limit 6, configurable via SPWebApplication.MaxListItemRowStorage property List view lookup 8 join operations per query Max number of list items that a single operation can process at one time in normal hours 5000 Configurable via SPWebApplication.MaxItemsPerThrottledOperation   Also you get a warning at 3000, which is configurable via SPWebApplication.MaxItemsPerThrottledOperationWarningLevel   In addition, throttle overrides can be requested, throttle overrides can be disabled, and time windows can be set when throttle is disabled. Max number of list items for administrators that a single operation can process at one time in normal hours 20000 Configurable via SPWebApplication.MaxItemsPerThrottledOperationOverride Enumerating subsites 2000 Word and Powerpoint co-authoring simultaneous editors 10 (Hard limit is 99). # of webparts on a page 25 Search Crawl DBs per search service app 10 Items per crawl db 25 million Search Keywords 200 per site collection. There is a max limit of 5000, which can then be modified by editing the web.config/client.config. Concurrent # of workflows on a content db 15. Workflows running in the timer service are not counted in this limit. Further workflows are queued. Can be configured via the Set-SPFarmConfig powershell commandlet. Number of events picked by the workflow timer job and delivered to workflows 100. You can increase this limit by running additional instances of the workflow timer service. Visio services file size 50MB Visio web drawing recalculation timeout 120 seconds Configurable via – Powershell commandlet Set-SPVisioPerformance Visio services minimum and maximum cache age for data connected diagrams 0 to 24 hours. Default is 60 minutes. Configurable via – Powershell commandlet Set-SPVisioPerformance   Soft Limits - Content Databases 300 per web app Application Pools 10 per web server Managed Paths 20 per web app Content Database Size 200GB per Content DB Size of 1 site collection 100GB # of sites in a site collection 250,000 Documents in a library 30 Million, with nesting. Depends heavily on type and usage and size of documents. Items 30 million. Depends heavily on usage of items. SPGroups one SPUser can be in 5000 Users in a site collection 2 million, depends on UI, nesting, containers and underlying user store AD Principals in a SPGroup 5000 SPGroups in a site collection 10000 Search Service Instances 20 Indexed Items in Search 100 million Crawl Log entries 100 million Search Alerts 1 million per search application Search Crawled Properties 1/2 million URL removals in search 100 removals per operation User Profiles 2 million per service application Social Tags 500 million per social database Comment on the article ....

    Read the article

  • Importing data from text file to specific columns using BULK INSERT

    - by Dinesh Asanka
    Bulk insert is much faster than using other techniques such as  SSIS. However, when you are using bulk insert you can’t insert to specific columns. If, for example, there are five columns in a table you should have five values for each record in the text file you are importing from. This is an issue when you are expecting default values to be inserted into tables. Let us say you have table as below: In this table, you are expecting ID, Status and CreatedDate to be updated automatically, so your text file may only have   FirstName  LastName  values as below: Dinesh,Asanka Saman,Liyanage Ruwan,Silva Susantha,Bathige Jude,Peires Sanjeewa,Jayawickrama If you use bulk insert to this table like follows, You will be returned an error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (ID). To avoid this you will need to create a view with the columns you are expecting to fill and use bulk insert against it. If you check the table now, you will see table with values in the text file and the default values.

    Read the article

  • How-to dynamically filter model-driven LOV

    - by Frank Nimphius
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Often developers need to filter a LOV query with information obtained from an ADF Faces form or other where. The sample below shows how to define a launch popup listener configured on the launchPopupListener property of the af:inputListOfValues component to filter a list of values. <af:inputListOfValues id="departmentIdId"    value="#{bindings.DepartmentId.inputValue}"                                          model="#{bindings.DepartmentId.listOfValuesModel}"    launchPopupListener="#{PopupLauncher.onPopupLaunch}" … >         … </af:inputListOfValues> A list of values is queried using a search binding that gets created in the PageDef file of a view when a lis of value component gets added. The managed bean code below looks this search binding up to then add a view criteria that filters the query. Note: There is no public API yet available for the FacesCtrlLOVBinding class, which is why I use the internal package class it in the example. public void onPopupLaunch(LaunchPopupEvent launchPopupEvent) {   BindingContext bctx = BindingContext.getCurrent();   BindingContainer bindings = bctx.getCurrentBindingsEntry();   FacesCtrlLOVBinding lov =        (FacesCtrlLOVBinding)bindings.get("DepartmentId");   ViewCriteriaManager vcm =   lov.getListIterBinding().getViewObject().getViewCriteriaManager();             //make sure the view criteria is cleared   vcm.removeViewCriteria(vcm.DFLT_VIEW_CRITERIA_NAME);   //create a new view criteria   ViewCriteria vc =          new ViewCriteria(lov.getListIterBinding().getViewObject());   //use the default view criteria name   //"__DefaultViewCriteria__"   vc.setName(vcm.DFLT_VIEW_CRITERIA_NAME);   //create a view criteria row for all queryable attributes   ViewCriteriaRow vcr = new ViewCriteriaRow(vc);   //for this sample I set the query filter to DepartmentId 60.   //You may determine it at runtime by reading it from a managed bean   //or binding layer   vcr.setAttribute("DepartmentId", 60);   //also note that the view criteria row consists of all attributes   //that belong to the LOV list view object, which means that you can   //filter on multiple attributes   vc.addRow(vcr);             lov.getListIterBinding().getViewObject().applyViewCriteria(vc); }  Note: Instead of using the vcm.DFLT_VIEW_CRITERIA_NAME name you can also define a custom name for the view criteria.

    Read the article

  • Why does my domain not show up in Google anymore?

    - by Earlz
    So I have had a website since about 2006. It's http://earlz.biz.tm . Recently I've noticed that no results will show up for it in google. I do have a secondary domain(that I plan on getting rid of) pointing to it but I don't understand why google would suddenly not show my site. I believe it was showing up a few months ago and my website is hardly ever down, like one or two days I believe has been the most it's been down in a row in this time period. Is there something wrong with my DNS or other configuration that would make google not index me? For reference I've tried: earlz.biz.tm site:earlz.biz.tm and the heading from my site "Earlz.biz.tm -- The reasoning is bacon" A few show up with the therusticstone.com domain(the one I plan to point somewhere else) but none show up directly linking to earlz.biz.tm.

    Read the article

  • DBCC CHECKDB (BatmanDb, REPAIR_ALLOW_DATA_LOSS) &ndash; Are you Feeling Lucky?

    - by David Totzke
    I’m currently working for a client on a PowerBuilder to WPF migration.  It’s one of those “I could tell you, but I’d have to kill you” kind of clients and the quick-lime pits are currently occupied by the EMC tech…but I’ve said too much already. At approximately 3 or 4 pm that day users of the Batman[1] application here in Gotham[1] started to experience problems accessing the application.  Batman[2] is a document management system here that also integrates with the ERP system.  Very little goes on here that doesn’t involve Batman in some way.  The errors being received seemed to point to network issues (TCP protocol error, connection forcibly closed by the remote host etc…) but the real issue was much more insidious. Connecting to the database via SSMS and performing selects on certain tables underlying the application areas that were having problems started to reveal the issue.  You couldn’t do a SELECT * FROM MyTable without it bombing and giving the same error noted above.  A run of DBCC CHECKDB revealed 14 tables with corruption.  One of the tables with issues was the Document table.  Pretty central to a “document management” system.  Information was obtained from IT that a single drive in the SAN went bad in the night.  A new drive was in place and was working fine.  The partition that held the Batman database is configured for RAID Level 5 so a single drive failure shouldn’t have caused any trouble and yet, the database is corrupted.  They do hourly incremental backups here so the first thing done was to try a restore.  A restore of the most recent backup failed so they worked backwards until they hit a good point.  This successful restore was for a backup at 3AM – a full day behind.  This time also roughly corresponds with the time the SAN started to report the drive failure.  The plot thickens… I got my hands on the output from DBCC CHECKDB and noticed a pattern.  What’s sad is that nobody that should have noticed the pattern in the DBCC output did notice.  There was a rush to do things to try and recover the data before anybody really understood what was wrong with it in the first place.  Cooler heads must prevail in these circumstances and some investigation should be done and a plan of action laid out or you could end up making things worse[3].  DBCC CHECKDB also told us that: repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB Yikes.  That means that the database is so messed up that you’re definitely going to lose some stuff when you repair it to get it back to a consistent state.  All the more reason to do a little more investigation into the problem.  Rescuing this database is preferable to having to export all of the data possible from this database into a new one.  This is a fifteen year old application with about seven hundred tables.  There are TRIGGERS everywhere not to mention the referential integrity constraints to deal with.  Only fourteen of the tables have an issue.  We have a good backup that is missing the last 24 hours of business which means we could have a “do-over” of yesterday but that’s not a very palatable option either. All of the affected tables had TEXT columns and all of the errors were about LOB data types and orphaned off-row data which basically means TEXT, IMAGE or NTEXT columns.  If we did a SELECT on an affected table and excluded those columns, we got all of the rows.  We exported that data into a separate database.  Things are looking up.  Working on a copy of the production database we then ran DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS and that “fixed” everything up.   The allow data loss option will delete the bad rows.  This isn’t too horrible as we have all of those rows minus the text fields from out earlier export.  Now I could LEFT JOIN to the exported data to find the missing rows and INSERT them minus the TEXT column data. We had the restored data from the good 3AM backup that we could now JOIN to and, with fingers crossed, recover the missing TEXT column information.  We got lucky in that all of the affected rows were old and in the end we didn’t lose anything.  :O  All of the row counts along the way worked out and it looks like we dodged a major bullet here. We’ve heard back from EMC and it turns out the SAN firmware that they were running here is apparently buggy.  This thing is only a couple of months old.  Grrr…. They dispatched a technician that night to come and update it .  That explains why RAID didn’t save us. All-in-all this could have been a lot worse.  Given the root cause here, they basically won the lottery in not losing anything. Here are a few links to some helpful posts on the SQL Server Engine blog.  I love the title of the first one: Which part of 'REPAIR_ALLOW_DATA_LOSS' isn't clear? CHECKDB (Part 8): Can repair fix everything? (in fact, read the whole series) Ta da! Emergency mode repair (we didn’t have to resort to this one thank goodness)   Dave Just because I can…   [1] Names have been changed to protect the guilty. [2] I'm Batman. [3] And if I'm the coolest head in the room, you've got even bigger problems...

    Read the article

  • Toughest Developer Puzzle Ever

    - by Josh Holmes
    For the second year in a row, my friend and colleague Jeff Blankenburg has created what is quickly proving to live up to it’s namesake – the Toughest Developer Puzzle Ever. Some of the puzzles are technical, some are not but all require that you understand the web, development and technology to solve. Even if you don’t get in on the fantastic prizes that Jeff has lined up, there’s great bragging rights in being able to solve the Toughest Developer Puzzle Ever. This year, I was honored enough to get to create three of the puzzles myself – let me know what you think of them. I’m not going to tell you which ones I created now and definitely don’t ask me for hints – Jeff has threatened me if I give any of the puzzle away… ;) All I can say now is “Good luck!”

    Read the article

  • Talend Enterprise Data Integration overperforms on Oracle SPARC T4

    - by Amir Javanshir
    The SPARC T microprocessor, released in 2005 by Sun Microsystems, and now continued at Oracle, has a good track record in parallel execution and multi-threaded performance. However it was less suited for pure single-threaded workloads. The new SPARC T4 processor is now filling that gap by offering a 5x better single-thread performance over previous generations. Following our long-term relationship with Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, we decided to test some of their integration components with the T4 chip, more precisely on a T4-1 system, in order to verify first hand if this new processor stands up to its promises. Several tests were performed, mainly focused on: Single-thread performance of the new SPARC T4 processor compared to an older SPARC T2+ processor Overall throughput of the SPARC T4-1 server using multiple threads The tests consisted in reading large amounts of data --ten's of gigabytes--, processing and writing them back to a file or an Oracle 11gR2 database table. They are CPU, memory and IO bound tests. Given the main focus of this project --CPU performance--, bottlenecks were removed as much as possible on the memory and IO sub-systems. When possible, the data to process was put into the ZFS filesystem cache, for instance. Also, two external storage devices were directly attached to the servers under test, each one divided in two ZFS pools for read and write operations. Multi-thread: Testing throughput on the Oracle T4-1 The tests were performed with different number of simultaneous threads (1, 2, 4, 8, 12, 16, 32, 48 and 64) and using different storage devices: Flash, Fibre Channel storage, two stripped internal disks and one single internal disk. All storage devices used ZFS as filesystem and volume management. Each thread read a dedicated 1GB-large file containing 12.5M lines with the following structure: customerID;FirstName;LastName;StreetAddress;City;State;Zip;Cust_Status;Since_DT;Status_DT 1;Ronald;Reagan;South Highway;Santa Fe;Montana;98756;A;04-06-2006;09-08-2008 2;Theodore;Roosevelt;Timberlane Drive;Columbus;Louisiana;75677;A;10-05-2009;27-05-2008 3;Andrew;Madison;S Rustle St;Santa Fe;Arkansas;75677;A;29-04-2005;09-02-2008 4;Dwight;Adams;South Roosevelt Drive;Baton Rouge;Vermont;75677;A;15-02-2004;26-01-2007 […] The following graphs present the results of our tests: Unsurprisingly up to 16 threads, all files fit in the ZFS cache a.k.a L2ARC : once the cache is hot there is no performance difference depending on the underlying storage. From 16 threads upwards however, it is clear that IO becomes a bottleneck, having a good IO subsystem is thus key. Single-disk performance collapses whereas the Sun F5100 and ST6180 arrays allow the T4-1 to scale quite seamlessly. From 32 to 64 threads, the performance is almost constant with just a slow decline. For the database load tests, only the best IO configuration --using external storage devices-- were used, hosting the Oracle table spaces and redo log files. Using the Sun Storage F5100 array allows the T4-1 server to scale up to 48 parallel JVM processes before saturating the CPU. The final result is a staggering 646K lines per second insertion in an Oracle table using 48 parallel threads. Single-thread: Testing the single thread performance Seven different tests were performed on both servers. Given the fact that only one thread, thus one file was read, no IO bottleneck was involved, all data being served from the ZFS cache. Read File ? Filter ? Write File: Read file, filter data, write the filtered data in a new file. The filter is set on the “Status” column: only lines with status set to “A” are selected. This limits each output file to about 500 MB. Read File ? Load Database Table: Read file, insert into a single Oracle table. Average: Read file, compute the average of a numeric column, write the result in a new file. Division & Square Root: Read file, perform a division and square root on a numeric column, write the result data in a new file. Oracle DB Dump: Dump the content of an Oracle table (12.5M rows) into a CSV file. Transform: Read file, transform, write the result data in a new file. The transformations applied are: set the address column to upper case and add an extra column at the end, which is the concatenation of two columns. Sort: Read file, sort a numeric and alpha numeric column, write the result data in a new file. The following table and graph present the final results of the tests: Throughput unit is thousand lines per second processed (K lines/second). Improvement is the % of improvement between the T5140 and T4-1. Test T4-1 (Time s.) T5140 (Time s.) Improvement T4-1 (Throughput) T5140 (Throughput) Read/Filter/Write 125 806 645% 100 16 Read/Load Database 195 1111 570% 64 11 Average 96 557 580% 130 22 Division & Square Root 161 1054 655% 78 12 Oracle DB Dump 164 945 576% 76 13 Transform 159 1124 707% 79 11 Sort 251 1336 532% 50 9 The improvement of single-thread performance is quite dramatic: depending on the tests, the T4 is between 5.4 to 7 times faster than the T2+. It seems clear that the SPARC T4 processor has gone a long way filling the gap in single-thread performance, without sacrifying the multi-threaded capability as it still shows a very impressive scaling on heavy-duty multi-threaded jobs. Finally, as always at Oracle ISV Engineering, we are happy to help our ISV partners test their own applications on our platforms, so don't hesitate to contact us and let's see what the SPARC T4-based systems can do for your application! "As describe in this benchmark, Talend Enterprise Data Integration has overperformed on T4. I was generally happy to see that the T4 gave scaling opportunities for many scenarios like complex aggregations. Row by row insertion in Oracle DB is faster with more than 650,000 rows per seconds without using any bulk Oracle capabilities !" Cedric Carbone, Talend CTO.

    Read the article

  • Matrix Pattern Recognition Algorithm

    - by Andres
    I am designing a logic analyzer and I would like to implement some Matrix Algorithm. I have several channels each one represented by a row in the matrix and every element in the column would be the state, for example: Channel 1 1 0 0 1 0 1 1 0 1 Channel 2 1 1 0 1 1 0 0 1 1 Channel 3 0 1 0 1 1 0 1 0 0 Channel 4 0 0 1 0 0 1 0 0 1 I would like to detect a pattern inside my matrix for example, detect if exist and where the sub-matrix or pattern: 1 0 1 1 I think it can be accomplished testing element by element but I think there should be a better way of doing it. Is there any Java API or any way to do it ? If there is a API ARM optimized for NEON instructions would be great also but not mandatory. Thank you very much in advance.

    Read the article

  • Sesame Data Browser: filtering, sorting, selecting and linking

    I have deferred the post about how Sesame is built in favor of publishing a new update.This new release offers major features such as the ability to quickly filter and sort data, select columns, and create hyperlinks to OData. Filtering, sorting, selecting In order to filter data, you just have to use the filter row, which becomes available when you click on the funnel button: You can then type some text and select an operator: The data grid will be refreshed immediately after you apply a filter. It...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Speaking at the Great Indian Developer Summit this week

    This week I will be speaking at the Great Indian Developer Summit in Bangalore, India for the second year in a row. My sessions are: GIDS.Net (Tuesday) Data Warehousing Made Easy What's New in SQL Server 2008 R2 (PowerPivot is shown here as well!) Sharing Business Logic between Silverlight and .NET GIDS.Web (Wednesday) Building Line of Business Applications with Silverlight 4.0 (RIA Services) GIDS.Seminar (Friday) Agile Tools and Teams In addition to my talks, Telerik will also staff a booth and have demos of all of our products and some tee shirts to give away. I will also be at the booth all day to answer your questions on my talks. See you at GIDS! Technorati Tags: Telerik,Agile,Silverlight,PowerPivot Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Visual Studio Little Wonders: Box Selection

    - by James Michael Hare
    So this week I decided I’d do a Little Wonder of a different kind and focus on an underused IDE improvement: Visual Studio’s Box Selection capability. This is a handy feature that many people still don’t realize was made available in Visual Studio 2010 (and beyond).  True, there have been other editors in the past with this capability, but now that it’s fully part of Visual Studio we can enjoy it’s goodness from within our own IDE. So, for those of you who don’t know what box selection is and what it allows you to do, read on! Sometimes, we want to select beyond the horizontal… The problem with traditional text selection in many editors is that it is horizontally oriented.  Sure, you can select multiple rows, but if you do you will pull in the entire row (at least for the middle rows).  Under the old selection scheme, if you wanted to select a portion of text from each row (a “box” of text) you were out of luck.  Box selection rectifies this by allowing you to select a box of text that bounded by a selection rectangle that you can grow horizontally or vertically.  So let’s think a situation that could occur where this comes in handy. Let’s say, for instance, that we are defining an enum in our code that we want to be able to translate into some string values (possibly to be stored in a database, output to screen, etc.). Perhaps such an enum would look like this: 1: public enum OrderType 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: } 8:  Now, let’s say we are in the process of creating a Dictionary<K,V> to translate our OrderType: 1: var translator = new Dictionary<OrderType, string> 2: { 3: // do I really want to retype all this??? 4: }; Yes the example above is contrived so that we will pull some garbage if we do a multi-line select. I could select the lines above using the traditional multi-line selection: And then paste them into the translator code, which would result in this: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: }; But I have a lot of junk there, sure I can manually clear it out, or use some search and replace magic, but if this were hundreds of lines instead of just a few that would quickly become cumbersome. The Box Selection Now that we have the ability to create box selections, we can select the box of text to delete!  Most of us are familiar with the fact we can drag the mouse (or hold [Shift] and use the arrow keys) to create a selection that can span multiple rows: Box selection, however, actually allows us to select a box instead of the typical horizontal lines: Then we can press the [delete] key and the pesky comments are all gone! You can do this either by holding down [Alt] while you select with your mouse, or by holding down [Alt+Shift] and using the arrow keys on the keyboard to grow the box horizontally or vertically. So now we have: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, 4: Sell, 5: Exchange, 6: Cancel, 7: }; Which is closer, but we still need an opening curly, the string to translate to, and the closing curly and comma. Fortunately, again, this is easy with box selections due to the fact box selection can even work for a zero-width selection! That is, hold down [Alt] and either drag down with no width, or hold down [Alt+Shift] and arrow down and you will define a selection range with no width, essentially, a vertical line selection: Notice the faint selection line on the right? So why is this useful? Well, just like with any selected range, we can type and it will replace the selection. What does this mean for box selections? It means that we can insert the same text all the way down on each line! If we have the same selection above, and type a curly and a space, we’d get: Imagine doing this over hundreds of lines and think of what a time saver it could be! Now make a zero-width selection on the other side: And type a curly and a comma, and we’d get: So close! Now finally, imagine we’ve already defined these strings somewhere and want to paste them in: 1: const private string BuyText = "Buy Shares"; 2: const private string SellText = "Sell Shares"; 3: const private string ExchangeText = "Exchange"; 4: const private string CancelText = "Cancel"; We can, again, use our box selection to pull out the constant names: And clicking copy (or [CTRL+C]) and then selecting a range to paste into: And finally clicking paste (or [CTRL+V]) to get the final result: 1: var translator = new Dictionary<OrderType, string> 2: { 3: { Buy, BuyText }, 4: { Sell, SellText }, 5: { Exchange, ExchangeText }, 6: { Cancel, CancelText }, 7: };   Sure, this was a contrived example, but I’m sure you’ll agree that it adds myriad possibilities of new ways to copy and paste vertical selections, as well as inserting text across a vertical slice. Summary: While box selection has been around in other editors, we finally get to experience it in VS2010 and beyond. It is extremely handy for selecting columns of information for cutting, copying, and pasting. In addition, it allows you to create a zero-width vertical insertion point that can be used to enter the same text across multiple rows. Imagine the time you can save adding repetitive code across multiple lines!  Try it, the more you use it, the more you’ll love it! Technorati Tags: C#,CSharp,.NET,Visual Studio,Little Wonders,Box Selection

    Read the article

  • Getting started with Oracle Database In-Memory Part III - Querying The IM Column Store

    - by Maria Colgan
    In my previous blog posts, I described how to install, enable, and populate the In-Memory column store (IM column store). This weeks post focuses on how data is accessed within the IM column store. Let’s take a simple query “What is the most expensive air-mail order we have received to date?” SELECT Max(lo_ordtotalprice) most_expensive_order FROM lineorderWHERE  lo_shipmode = 5; The LINEORDER table has been populated into the IM column store and since we have no alternative access paths (indexes or views) the execution plan for this query is a full table scan of the LINEORDER table. You will notice that the execution plan has a new set of keywords “IN MEMORY" in the access method description in the Operation column. These keywords indicate that the LINEORDER table has been marked for INMEMORY and we may use the IM column store in this query. What do I mean by “may use”? There are a small number of cases were we won’t use the IM column store even though the object has been marked INMEMORY. This is similar to how the keyword STORAGE is used on Exadata environments. You can confirm that the IM column store was actually used by examining the session level statistics, but more on that later. For now let's focus on how the data is accessed in the IM column store and why it’s faster to access the data in the new column format, for analytical queries, rather than the buffer cache. There are four main reasons why accessing the data in the IM column store is more efficient. 1. Access only the column data needed The IM column store only has to scan two columns – lo_shipmode and lo_ordtotalprice – to execute this query while the traditional row store or buffer cache has to scan all of the columns in each row of the LINEORDER table until it reaches both the lo_shipmode and the lo_ordtotalprice column. 2. Scan and filter data in it's compressed format When data is populated into the IM column it is automatically compressed using a new set of compression algorithms that allow WHERE clause predicates to be applied against the compressed formats. This means the volume of data scanned in the IM column store for our query will be far less than the same query in the buffer cache where it will scan the data in its uncompressed form, which could be 20X larger. 3. Prune out any unnecessary data within each column The fastest read you can execute is the read you don’t do. In the IM column store a further reduction in the amount of data accessed is possible due to the In-Memory Storage Indexes(IM storage indexes) that are automatically created and maintained on each of the columns in the IM column store. IM storage indexes allow data pruning to occur based on the filter predicates supplied in a SQL statement. An IM storage index keeps track of minimum and maximum values for each column in each of the In-Memory Compression Unit (IMCU). In our query the WHERE clause predicate is on the lo_shipmode column. The IM storage index on the lo_shipdate column is examined to determine if our specified column value 5 exist in any IMCU by comparing the value 5 to the minimum and maximum values maintained in the Storage Index. If the value 5 is outside the minimum and maximum range for an IMCU, the scan of that IMCU is avoided. For the IMCUs where the value 5 does fall within the min, max range, an additional level of data pruning is possible via the metadata dictionary created when dictionary-based compression is used on IMCU. The dictionary contains a list of the unique column values within the IMCU. Since we have an equality predicate we can easily determine if 5 is one of the distinct column values or not. The combination of the IM storage index and dictionary based pruning, enables us to only scan the necessary IMCUs. 4. Use SIMD to apply filter predicates For the IMCU that need to be scanned Oracle takes advantage of SIMD vector processing (Single Instruction processing Multiple Data values). Instead of evaluating each entry in the column one at a time, SIMD vector processing allows a set of column values to be evaluated together in a single CPU instruction. The column format used in the IM column store has been specifically designed to maximize the number of column entries that can be loaded into the vector registers on the CPU and evaluated in a single CPU instruction. SIMD vector processing enables the Oracle Database In-Memory to scan billion of rows per second per core versus the millions of rows per second per core scan rate that can be achieved in the buffer cache. I mentioned earlier in this post that in order to confirm the IM column store was used; we need to examine the session level statistics. You can monitor the session level statistics by querying the performance views v$mystat and v$statname. All of the statistics related to the In-Memory Column Store begin with IM. You can see the full list of these statistics by typing: display_name format a30 SELECT display_name FROM v$statname WHERE  display_name LIKE 'IM%'; If we check the session statistics after we execute our query the results would be as follow; SELECT Max(lo_ordtotalprice) most_expensive_order FROM lineorderWHERE lo_shipmode = 5; SELECT display_name FROM v$statname WHERE  display_name IN ('IM scan CUs columns accessed',                        'IM scan segments minmax eligible',                        'IM scan CUs pruned'); As you can see, only 2 IMCUs were accessed during the scan as the majority of the IMCUs (44) in the LINEORDER table were pruned out thanks to the storage index on the lo_shipmode column. In next weeks post I will describe how you can control which queries use the IM column store and which don't. +Maria Colgan

    Read the article

  • Free Virtual Developer Day - Oracle Fusion Development

    - by Grant Ronald
    You know, there is no reason for you or your developers not to be top notch Fusion developers.  This is my third blog in a row telling you about some sort of free training.  In this case its a whole on line conference! In this on line conference you can learn about the various components that make up the Oracle Fusion Middleware development platform including Oracle ADF, Oracle WebCenter, Business Intelligence, BPM and more!  The online conference will include seminars, hands-on lab and live chats with our technical staff including me!!  And the best bit, it doesn't cost you a single penny/cent.  Its free and available right on your desktop. You have to register to attend so click here to confirm your place.

    Read the article

  • DNS question and Google PageRank from domains

    - by Beck
    I'm not so good at dns at all :) Just basics. A while ago i have noticed, that my blog have different Page Ranks, PR 3 for domain www.example.com and PR 1 for domain example.com. In dns records i have this setup: A - IP - www.example.com A - same IP - example.com Should i replace this record "A - same IP - example.com" with row with CNAME instead of A? Like that: CNAME - same IP - example.com - alias www.example.com Will this combine Page Rank value of both domains? Or i can just create 302 redirection inside .htaccess file, verify example.com(without www) inside Google Webmaster Tools as my domain and inside www.example.com options set as main domain? Thanks ;)

    Read the article

< Previous Page | 201 202 203 204 205 206 207 208 209 210 211 212  | Next Page >