Search Results

Search found 13955 results on 559 pages for 'date comparison'.

Page 514/559 | < Previous Page | 510 511 512 513 514 515 516 517 518 519 520 521 | Next Page >

SQL SERVER – Database Dynamic Caching by Automatic SQL Server Performance Acceleration

- by pinaldave

My second look at SafePeak’s new version (2.1) revealed to me few additional interesting features. For those of you who hadn’t read my previous reviews SafePeak and not familiar with it, here is a quick brief: SafePeak is in business of accelerating performance of SQL Server applications, as well as their scalability, without making code changes to the applications or to the databases. SafePeak performs database dynamic caching, by caching in memory result sets of queries and stored procedures while keeping all those cache correct and up to date. Cached queries are retrieved from the SafePeak RAM in microsecond speed and not send to the SQL Server. The application gets much faster results (100-500 micro seconds), the load on the SQL Server is reduced (less CPU and IO) and the application or the infrastructure gets better scalability. SafePeak solution is hosted either within your cloud servers, hosted servers or your enterprise servers, as part of the application architecture. Connection of the application is done via change of connection strings or adding reroute line in the c:\windows\system32\drivers\etc\hosts file on all application servers. For those who would like to learn more on SafePeak architecture and how it works, I suggest to read this vendor’s webpage: SafePeak Architecture. More interesting new features in SafePeak 2.1 In my previous review of SafePeak new I covered the first 4 things I noticed in the new SafePeak (check out my article “SQLAuthority News – SafePeak Releases a Major Update: SafePeak version 2.1 for SQL Server Performance Acceleration”): Cache setup and fine-tuning – a critical part for getting good caching results Database templates Choosing which database to cache Monitoring and analysis options by SafePeak Since then I had a chance to play with SafePeak some more and here is what I found. 5. Analysis of SQL Performance (present and history): In SafePeak v.2.1 the tools for understanding of performance became more comprehensive. Every 15 minutes SafePeak creates and updates various performance statistics. Each query (or a procedure execute) that arrives to SafePeak gets a SQL pattern, and after it is used again there are statistics for such pattern. An important part of this product is that it understands the dependencies of every pattern (list of tables, views, user defined functions and procs). From this understanding SafePeak creates important analysis information on performance of every object: response time from the database, response time from SafePeak cache, average response time, percent of traffic and break down of behavior. One of the interesting things this behavior column shows is how often the object is actually pdated. The break down analysis allows knowing the above information for: queries and procedures, tables, views, databases and even instances level. The data is show now on all arriving queries, both read queries (that can be cached), but also any types of updates like DMLs, DDLs, DCLs, and even session settings queries. The stats are being updated every 15 minutes and SafePeak dashboard allows going back in time and investigating what happened within any time frame. 6. Logon trigger, for making sure nothing corrupts SafePeak cache data If you have an application with many parts, many servers many possible locations that can actually update the database, or the SQL Server is accessible to many DBAs or software engineers, each can access some database directly and do some changes without going thru SafePeak – this can create a potential corruption of the data stored in SafePeak cache. To make sure SafePeak cache is correct it needs to get all updates to arrive to SafePeak, and if a DBA will access the database directly and do some changes, for example, then SafePeak will simply not know about it and will not clean SafePeak cache. In the new version, SafePeak brought a new feature called “Logon Trigger” to solve the above challenge. By special click of a button SafePeak can deploy a special server logon trigger (with a CLR object) on your SQL Server that actually monitors all connections and informs SafePeak on any connection that is coming not from SafePeak. In SafePeak dashboard there is an interface that allows to control which logins can be ignored based on login names and IPs, while the rest will invoke cache cleanup of SafePeak and actually locks SafePeak cache until this connection will not be closed. Important to note, that this does not interrupt any logins, only informs SafePeak on such connection. On the Dashboard screen in SafePeak you will be able to see those connections and then decide what to do with them. Configuration of this feature in SafePeak dashboard can be done here: Settings -> SQL instances management -> click on instance -> Logon Trigger tab. Other features: 7. User management ability to grant permissions to someone without changing its configuration and only use SafePeak as performance analysis tool. 8. Better reports for analysis of performance using 15 minute resolution charts. 9. Caching of client cursors 10. Support for IPv6 Summary SafePeak is a great SQL Server performance acceleration solution for users who want immediate results for sites with performance, scalability and peak spikes challenges. Especially if your apps are packaged or 3rd party, since no code changes are done. SafePeak can significantly increase response times, by reducing network roundtrip to the database, decreasing CPU resource usage, eliminating I/O and storage access. SafePeak team provides a free fully functional trial www.safepeak.com/download and actually provides a one-on-one assistance during such trial. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

Read the article
Creating packages in code – Execute SQL Task

The Execute SQL Task is for obvious reasons very well used, so I thought if you are building packages in code the chances are you will be using it. Using the task basic features of the task are quite straightforward, add the task and set some properties, just like any other. When you start interacting with variables though it can be a little harder to grasp so these samples should see you through. Some of these more advanced features are explained in much more detail in our ever popular post The Execute SQL Task, here I’ll just be showing you how to implement them in code. The abbreviated code blocks below demonstrate the different features of the task. The complete code has been encapsulated into a sample class which you can download (ExecSqlPackage.cs). Each feature described has its own method in the sample class which is mentioned after the code block. This first sample just shows adding the task, setting the basic properties for a connection and of course an SQL statement. Package package = new Package(); // Add the SQL OLE-DB connection ConnectionManager sqlConnection = AddSqlConnection(package, "localhost", "master"); // Add the SQL Task package.Executables.Add("STOCK:SQLTask"); // Get the task host wrapper TaskHost taskHost = package.Executables[0] as TaskHost; // Set required properties taskHost.Properties["Connection"].SetValue(taskHost, sqlConnection.ID); taskHost.Properties["SqlStatementSource"].SetValue(taskHost, "SELECT * FROM sysobjects"); For the full version of this code, see the CreatePackage method in the sample class. The AddSqlConnection method is a helper method that adds an OLE-DB connection to the package, it is of course in the sample class file too. Returning a single value with a Result Set The following sample takes a different approach, getting a reference to the ExecuteSQLTask object task itself, rather than just using the non-specific TaskHost as above. Whilst it means we need to add an extra reference to our project (Microsoft.SqlServer.SQLTask) it makes coding much easier as we have compile time validation of any property and types we use. For the more complex properties that is very valuable and saves a lot of time during development. The query has also been changed to return a single value, one row and one column. The sample shows how we can return that value into a variable, which we also add to our package in the code. To do this manually you would set the Result Set property on the General page to Single Row and map the variable on the Result Set page in the editor. Package package = new Package(); // Add the SQL OLE-DB connection ConnectionManager sqlConnection = AddSqlConnection(package, "localhost", "master"); // Add the SQL Task package.Executables.Add("STOCK:SQLTask"); // Get the task host wrapper TaskHost taskHost = package.Executables[0] as TaskHost; // Add variable to hold result value package.Variables.Add("Variable", false, "User", 0); // Get the task object ExecuteSQLTask task = taskHost.InnerObject as ExecuteSQLTask; // Set core properties task.Connection = sqlConnection.Name; task.SqlStatementSource = "SELECT id FROM sysobjects WHERE name = 'sysrowsets'"; // Set single row result set task.ResultSetType = ResultSetType.ResultSetType_SingleRow; // Add result set binding, map the id column to variable task.ResultSetBindings.Add(); IDTSResultBinding resultBinding = task.ResultSetBindings.GetBinding(0); resultBinding.ResultName = "id"; resultBinding.DtsVariableName = "User::Variable"; For the full version of this code, see the CreatePackageResultVariable method in the sample class. The other types of Result Set behaviour are just a variation on this theme, set the property and map the result binding as required. Parameter Mapping for SQL Statements This final example uses a parameterised SQL statement, with the coming from a variable. The syntax varies slightly between connection types, as explained in the Working with Parameters and Return Codes in the Execute SQL Taskhelp topic, but OLE-DB is the most commonly used, for which a question mark is the parameter value placeholder. Package package = new Package(); // Add the SQL OLE-DB connection ConnectionManager sqlConnection = AddSqlConnection(package, ".", "master"); // Add the SQL Task package.Executables.Add("STOCK:SQLTask"); // Get the task host wrapper TaskHost taskHost = package.Executables[0] as TaskHost; // Get the task object ExecuteSQLTask task = taskHost.InnerObject as ExecuteSQLTask; // Set core properties task.Connection = sqlConnection.Name; task.SqlStatementSource = "SELECT id FROM sysobjects WHERE name = ?"; // Add variable to hold parameter value package.Variables.Add("Variable", false, "User", "sysrowsets"); // Add input parameter binding task.ParameterBindings.Add(); IDTSParameterBinding parameterBinding = task.ParameterBindings.GetBinding(0); parameterBinding.DtsVariableName = "User::Variable"; parameterBinding.ParameterDirection = ParameterDirections.Input; parameterBinding.DataType = (int)OleDBDataTypes.VARCHAR; parameterBinding.ParameterName = "0"; parameterBinding.ParameterSize = 255; For the full version of this code, see the CreatePackageParameterVariable method in the sample class. You’ll notice the data type has to be specified for the parameter IDTSParameterBinding .DataType Property, and these type codes are connection specific too. My enumeration I wrote several years ago is shown below was probably done by reverse engineering a package and also the API header file, but I recently found a very handy post that covers more connections as well for exactly this, Setting the DataType of IDTSParameterBinding objects (Execute SQL Task). /// <summary> /// Enumeration of OLE-DB types, used when mapping OLE-DB parameters. /// </summary> private enum OleDBDataTypes { BYTE = 0x11, CURRENCY = 6, DATE = 7, DB_VARNUMERIC = 0x8b, DBDATE = 0x85, DBTIME = 0x86, DBTIMESTAMP = 0x87, DECIMAL = 14, DOUBLE = 5, FILETIME = 0x40, FLOAT = 4, GUID = 0x48, LARGE_INTEGER = 20, LONG = 3, NULL = 1, NUMERIC = 0x83, NVARCHAR = 130, SHORT = 2, SIGNEDCHAR = 0x10, ULARGE_INTEGER = 0x15, ULONG = 0x13, USHORT = 0x12, VARCHAR = 0x81, VARIANT_BOOL = 11 } Download Sample code ExecSqlPackage.cs (10KB)

Read the article
SQL SERVER – Simple Example of Snapshot Isolation – Reduce the Blocking Transactions

- by pinaldave

To learn any technology and move to a more advanced level, it is very important to understand the fundamentals of the subject first. Today, we will be talking about something which has been quite introduced a long time ago but not properly explored when it comes to the isolation level. Snapshot Isolation was introduced in SQL Server in 2005. However, the reality is that there are still many software shops which are using the SQL Server 2000, and therefore cannot be able to maintain the Snapshot Isolation. Many software shops have upgraded to the later version of the SQL Server, but their respective developers have not spend enough time to upgrade themselves with the latest technology. “It works!” is a very common answer of many when they are asked about utilizing the new technology, instead of backward compatibility commands. In one of the recent consultation project, I had same experience when developers have “heard about it” but have no idea about snapshot isolation. They were thinking it is the same as Snapshot Replication – which is plain wrong. This is the same demo I am including here which I have created for them. In Snapshot Isolation, the updated row versions for each transaction are maintained in TempDB. Once a transaction has begun, it ignores all the newer rows inserted or updated in the table. Let us examine this example which shows the simple demonstration. This transaction works on optimistic concurrency model. Since reading a certain transaction does not block writing transaction, it also does not block the reading transaction, which reduced the blocking. First, enable database to work with Snapshot Isolation. Additionally, check the existing values in the table from HumanResources.Shift. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO Now, we will need two different sessions to prove this example. First Session: Set Transaction level isolation to snapshot and begin the transaction. Update the column “ModifiedDate” to today’s date. -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO Please note that we have not yet been committed to the transaction. Now, open the second session and run the following “SELECT” statement. Then, check the values of the table. Please pay attention on setting the Isolation level for the second one as “Snapshot” at the same time when we already start the transaction using BEGIN TRAN. -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values in the table are still original values. They have not been modified yet. Once again, go back to session 1 and begin the transaction. -- Session 1 COMMIT After that, go back to Session 2 and see the values of the table. -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values are yet not changed and they are still the same old values which were there right in the beginning of the session. Now, let us commit the transaction in the session 2. Once committed, run the same SELECT statement once more and see what the result is. -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that it now reflects the new updated value. I hope that this example is clear enough as it would give you good idea how the Snapshot Isolation level works. There is much more to write about an extra level, READ_COMMITTED_SNAPSHOT, which we will be discussing in another post soon. If you wish to use this transaction’s Isolation level in your production database, I would appreciate your comments about their performance on your servers. I have included here the complete script used in this example for your quick reference. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 COMMIT -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Transaction Isolation

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
A Taxonomy of Numerical Methods v1

- by JoshReuben

Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

Read the article
Change the User Interface Language in Ubuntu

- by Matthew Guay

Would you like to use your Ubuntu computer in another language? Here’s how you can easily change your interface language in Ubuntu. Ubuntu’s default install only includes a couple languages, but it makes it easy to find and add a new interface language to your computer. To get started, open the System menu, select Administration, and then click Language Support. Ubuntu may ask if you want to update or add components to your current default language when you first open the dialog. Click Install to go ahead and install the additional components, or you can click Remind Me Later to wait as these will be installed automatically when you add a new language. Now we’re ready to find and add an interface language to Ubuntu. Click Install / Remove Languages to add the language you want. Find the language you want in the list, and click the check box to install it. Ubuntu will show you all the components it will install for the language; this often includes spellchecking files for OpenOffice as well. Once you’ve made your selection, click Apply Changes to install your new language. Make sure you’re connected to the internet, as Ubuntu will have to download the additional components you’ve selected. Enter your system password when prompted, and then Ubuntu will download the needed languages files and install them. Back in the main Language & Text dialog, we’re now ready to set our new language as default. Find your new language in the list, and then click and drag it to the top of the list. Notice that Thai is the first language listed, and English is the second. This will make Thai the default language for menus and windows in this account. The tooltip reminds us that this setting does not effect system settings like currency or date formats. To change these, select the Text Tab and pick your new language from the drop-down menu. You can preview the changes in the bottom Example box. The changes we just made will only affect this user account; the login screen and startup will not be affected. If you wish to change the language in the startup and login screens also, click Apply System-Wide in both dialogs. Other user accounts will still retain their original language settings; if you wish to change them, you must do it from those accounts. Once you have your new language settings all set, you’ll need to log out of your account and log back in to see your new interface language. When you re-login, Ubuntu may ask you if you want to update your user folders’ names to your new language. For example, here Ubuntu is asking if we want to change our folders to their Thai equivalents. If you wish to do so, click Update or its equivalents in your language. Now your interface will be almost completely translated into your new language. As you can see here, applications with generic names are translated to Thai but ones with specific names like Shutter keep their original name. Even the help dialogs are translated, which makes it easy for users around to world to get started with Ubuntu. Once again, you may notice some things that are still in English, but almost everything is translated. Adding a new interface language doesn’t add the new language to your keyboard, so you’ll still need to set that up. Check out our article on adding languages to your keyboard to get this setup. If you wish to revert to your original language or switch to another new language, simply repeat the above steps, this time dragging your original or new language to the top instead of the one you chose previously. Conclusion Ubuntu has a large number of supported interface languages to make it user-friendly to people around the globe. And since you can set the language for each user account, it’s easy for multi-lingual individuals to share the same computer. Or, if you’re using Windows, check out our article on how you can Change the User Interface Language in Vista or Windows 7, too! Similar Articles Productive Geek Tips Restart the Ubuntu Gnome User Interface QuicklyChange the User Interface Language in Vista or Windows 7Create a Samba User on UbuntuInstall Samba Server on UbuntuSee Which Groups Your Linux User Belongs To TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips VMware Workstation 7 Acronis Online Backup DVDFab 6 Revo Uninstaller Pro FetchMp3 Can Download Videos & Convert Them to Mp3 Use Flixtime To Create Video Slideshows Creating a Password Reset Disk in Windows Bypass Waiting Time On Customer Service Calls With Lucyphone MELTUP – "The Beginning Of US Currency Crisis And Hyperinflation" Enable or Disable the Task Manager Using TaskMgrED

Read the article
Entity Framework 4.0: Creating objects of correct type when using lazy loading

- by DigiMortal

In my posting about Entity Framework 4.0 and POCOs I introduced lazy loading in EF applications. EF uses proxy classes for lazy loading and this means we have new types in that come and go dynamically in runtime. We don’t have these types available when we write code but we cannot forget that EF may expect us to use dynamically generated types. In this posting I will give you simple hint how to use correct types in your code. The background of lazy loading and proxy classes As a first thing I will explain you in short what is proxy class. Business classes when designed correctly have no knowledge about their birth and death – they don’t know how they are created and they don’t know how their data is persisted. This is the responsibility of object runtime. When we use lazy loading we need a little bit different classes that know how to load data for properties when code accesses the property first time. As we cannot add this functionality to our business classes (they may be stored through more than one data access technology or by more than one Data Access Layer (DAL)) we create proxy classes that extend our business classes. If we have class called Product and product has lazy loaded property called Customer then we need proxy class, let’s say ProductProxy, that has same public signature as Product so we can use it INSTEAD OF product in our code. ProductProxy overrides Customer property. If customer is not asked then customer is null. But if we ask for Customer property then overridden property of ProductProxy loads it from database. This is how lazy loading works. Problem – two types for same thing As lazy loading may introduce dynamically generated proxy types we don’t know in our application code which type is returned. We cannot be sure that we have Product not ProductProxy returned. This leads us to the following question: how can we create Product of correct type if we don’t know the correct type? In EF solution is simple. Solution – use factory methods If you are using repositories and you are not using factories (imho it is pretty pointless with mapper) you can add factory methods to your EF based repositories. Take a look at this class. public class Event { public int ID { get; set; } public string Title { get; set; } public string Location { get; set; } public virtual Party Organizer { get; set; } public DateTime Date { get; set; } } We have virtual member called Organizer. This property is virtual because we want to use lazy loading on this class so Organizer is loaded only when we ask it. EF provides us with method called CreateObject<T>(). CreateObject<T>() is member of ObjectContext class and it creates the object based on given type. In runtime proxy type for Event is created for us automatically and when we call CreateObject<T>() for Event it returns as object of Event proxy type. The factory method for events repository is as follows. public Event CreateEvent() { var evt = _context.CreateObject<Event>(); return evt; } And we are done. Instead of creating factory classes we created factory methods that guarantee that created objects are of correct type. Conclusion Although lazy loading introduces some new objects we cannot use at design time because they live only in runtime we can write code without worrying about exact implementation type of object. This holds true until we have clean code and we don’t make any decisions based on object type. EF4.0 provides us with very simple factory method that create and return objects of correct type. All we had to do was adding factory methods to our repositories.

Read the article
8 Mac System Features You Can Access in Recovery Mode

- by Chris Hoffman

A Mac’s Recovery Mode is for more than just reinstalling Mac OS X. You’ll find many other useful troubleshooting utilities here — you can use these even if your Mac can’t boot normally. To access Recovery Mode, restart your Mac and press and hold the Command + R keys during the boot-up process. This is one of several hidden startup options on a Mac. Reinstall Mac OS X Most people know Recovery Mode as the place you go to reinstall OS X on your Mac. Recovery Mode will download the OS X installer files from teh Intenret if you don’t have them locally, so they don’t take up space on your disk and you’ll never have to hunt for an opearign system disc. Better yet, it will download up-to-date installation files so you don’t have to spend hours installing operating system updates later. Microsoft could learn a lot from Apple here. Restore From a Time Machine Backup Instead of reinstalling OS X, you can choose to restore your Mac from a time machine backup. This is like restoring a system image on another operating system. You’ll need an external disk containing a backup image created on the current computer to do this. Browse the Web The Get Help Online link opens the Safari web browser to Apple’s documentation site. It’s not limited to Apple’s website, though — you can navigate to any website you like. This feature allows you to access and use a browser on your Mac even if it isn’t booting properly. It’s ideal for looking up troubleshooting information. Manage Your Disks The Disk Utility option opens the same Disk Utility you can access from within Mac OS X. It allows you to partition disks, format them, scan disks for problems, wipe drives, and set up drives in a RAID configuration. If you need to edit partitions from outside your operating system, you can just boot into the recovery environment — you don’t have to download a special partitioning tool and boot into it. Choose the Default Startup Disk Click the Apple menu on the bar at the top of your screen and select Startup Disk to access the Choose Startup Disk tool. Use this tool to choose your computer’s default startup disk and reboot into another operating system. For example, it’s useful if you have Windows installed alongside Mac OS X with Boot Camp. Add or Remove an EFI Firmware Password You can also add a firmware password to your Mac. This works like a BIOS password or UEFI password on a Windows or Linux PC. Click the Utilities menu on the bar at the top of your screen and select Firmware Password Utility to open this tool. Use the tool to turn on a firmware password, which will prevent your computer from starting up from a different hard disk, CD, DVD, or USB drive without the password you provide. This prevents people form booting up your Mac with an unauthorized operating system. If you’ve already enabled a firmware password, you can remove it from here. Use Network Tools to Troubleshoot Your Connection Select Utilities > Network Utility to open a network diagnostic tool. This utility provides a graphical way to view your network connection information. You can also use the netstat, ping, lookup, traceroute, whois, finger, and port scan utilities from here. These can be helpful to troubleshoot Internet connection problems. For example, the ping command can demonstrate whether you can communicate with a remote host and show you if you’re experiencing packet loss, while the traceroute command can show you where a connection is failing if you can’t connect to a remote server. Open a Terminal If you’d like to get your hands dirty, you can select Utilities > Terminal to open a terminal from here. This terminal allows you to do more advanced troubleshooting. Mac OS X uses the bash shell, just as typical Linux distributions do. Most people will just need to use the Reinstall Mac OS X option here, but there are many other tools you can benefit from. If the Recovery Mode files on your Mac are damaged or unavailable, your Mac will automatically download them from Apple so you can use the full recovery environment.

Read the article
Create an iTunes Account without a credit card

- by Matthew Guay

iTunes Store offers a large variety of free content, but to download it you have to have an account. Usually you have to enter your credit card information to sign up, but here’s an easy way to get an iTunes account for free downloads without entering any payment info. Although iTunes Store is known for paid downloads of movies, music, and more, it also has a treasure trove of free media. Some of it, including Podcasts and iTunes U educational content do not require an account to download. However, any other free content, including free iPhone/iPod Touch apps and free or promotional music, videos, and TV Shows all require an account to download. If you try to download a free movie or music download, you will be required to enter payment information. Even though your card will not be charged, it will be kept on file so you can be charged if you download a for-pay item. However, if you only plan to download free items, it may be preferable to not have your account linked to a credit card. The following steps will get you an account without entering your credit card info. Getting Started First, make sure you have iTunes installed. If you don’t already have it, download and install it (link below) with the default settings. Now open iTunes, and click the iTunes Store link on the left. Click the App Store link on the top of this page. Select a free app to download. A simple way to do this is to scroll down to the Top Free Apps box on the right side, hover your mouse over the first item, and click on the Free button that appears when you hover over it. A popup will open asking you to sign in with your Apple ID. Click “Create New Account”. Click Continue to create your account. Check the box to accept the Store Terms and Conditions, and click Continue. Enter your email address, password, security question, and date of birth, and uncheck the boxes to get email if you don’t want it…then click Continue. Now, you will be asked to provide a payment method. Notice now that the last option says None! Click that bullet option… Then enter your billing address. Simply enter your normal billing address, even though you are not entering a payment method. Click Continue and your account will be created! If you get the Address Verification screen just verify your county and click Done. An email will be sent to you to verify your account… Click on the link in your email to verify your account, iTunes will launch and you’re prompted to enter in the Apple ID and Password you just created. Your account is successfully created! Now you can easily download any free media from iTunes. Keep an eye on the Free on iTunes box on the bottom of the iTunes Store page for interesting downloads, or if you have an iPhone or iPod Touch, watch the popular Free downloads on the Apps page. And of course there is always great content on iTunes U to grab free as well. Purchasing for-pay media If you want to purchase an item on the iTunes store later, simply click on the item to download as normal. Click Buy to proceed with the purchase. iTunes will prompt you that you need to enter payment information to complete the purchase. Enter your Apple ID email and password, and then add the payment information as prompted. Remove Payment Information from an iTunes Account If you’ve already entered payment information into your iTunes account, and would like to remove it, click Store in the top iTunes menu, and select View My Account. Enter your Apple ID email and password, and click View Account. This will open your account information. Click the Edit Payment Information button. Now, click the None button to remove your payment information. Click Done to save the changes. Your account will now prompt you to enter payment information if you try to make a purchase. You could repeat these steps after making a purchase if you do not want iTunes to keep your payment info on file. Conclusion This is a great way to make an iTunes account without entering your credit card, or to remove your credit card info from your account. Parents may especially enjoy this tip, as they can have an iTunes account on their kids computer or iPod Touch without worrying about them spending money with it. Links Download iTunes Similar Articles Productive Geek Tips Quick Tip: Switch Between Signatures in Outlook 2007 the Easy WayRedeem Pre-paid Zune Card Points for Zune Marketplace MediaCreate An Electronic Business Card In Outlook 2007Understanding Windows Vista Aero Glass RequirementsSpeed up Your Windows Vista Computer with ReadyBoost TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional Draw Online using Harmony How to Browse Privately in Firefox Kill Processes Quickly with Process Assassin Need to Come Up with a Good Name? Try Wordoid StockFox puts a Lightweight Stock Ticker in your Statusbar Explore Google Public Data Visually

Read the article
Pure Server-Side Filtering with RadGridView and WCF RIA Services

Those of you who are familiar with WCF RIA Services know that the DomainDataSource control provides a FilterDescriptors collection that enables you to filter data returned by the query on the server. We have been using this DomainDataSource feature in our RIA Services with DomainDataSource online example for almost an year now. In the example, we are listening for RadGridViews Filtering event in order to intercept any filtering that is performed on the client and translate it to something that the DomainDataSource will understand, in this case a System.Windows.Data.FilterDescriptor being added or removed from its FilterDescriptors collection. Think of RadGridView.FilterDescriptors as client-side filtering and of DomainDataSource.FilterDescriptors as server-side filtering. We no longer need the client-side one. With the introduction of the Custom Filtering Controls feature many new possibilities have opened. With these custom controls we no longer need to do any filtering on the client. I have prepared a very small project that demonstrates how to filter solely on the server by using a custom filtering control. As I have already mentioned filtering on the server is done through the FilterDescriptors collection of the DomainDataSource control. This collection holds instances of type System.Windows.Data.FilterDescriptor. The FilterDescriptor has three important properties: PropertyPath: Specifies the name of the property that we want to filter on (the left operand). Operator: Specifies the type of comparison to use when filtering. An instance of FilterOperator Enumeration. Value: The value to compare with (the right operand). An instance of the Parameter Class. By adding filters, you can specify that only entities which meet the condition in the filter are loaded from the domain context. In case you are not familiar with these concepts you might find Brad Abrams blog interesting. Now, our requirements are to create some kind of UI that will manipulate the DomainDataSource.FilterDescriptors collection. When it comes to collections, my first choice of course would be RadGridView. If you are not familiar with the Custom Filtering Controls concept I would strongly recommend getting acquainted with my step-by-step tutorial Custom Filtering with RadGridView for Silverlight and checking the online example out. I have created a simple custom filtering control that contains a RadGridView and several buttons. This control is aware of the DomainDataSource instance, since it is operating on its FilterDescriptors collection. In fact, the RadGridView that is inside it is bound to this collection. In order to display filters that are relevant for the current column only, I have applied a filter to the grid. This filter is a Telerik.Windows.Data.FilterDescriptor and is used to filter the little grid inside the custom control. It should not be confused with the DomainDataSource.FilterDescriptors collection that RadGridView is actually bound to. These are the RIA filters. Additionally, I have added several other features. For example, if you have specified a DataFormatString on your original column, the Value column inside the custom control will pick it up and format the filter values accordingly. Also, I have transferred the data type of the column that you are filtering to the Value column of the custom control. This will help the little RadGridView determine what kind of editor to show up when you begin edit, for example a date picker for DateTime columns. Finally, I have added four buttons two of them can be used to add or remove filters and the other two will communicate the changes you have made to the server. Here is the full source code of the DomainDataSourceFilteringControl. The XAML: <UserControl x:Class="PureServerSideFiltering.DomainDataSourceFilteringControl" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:telerikGrid="clr-namespace:Telerik.Windows.Controls;assembly=Telerik.Windows.Controls.GridView" xmlns:telerik="clr-namespace:Telerik.Windows.Controls;assembly=Telerik.Windows.Controls" Width="300"> <Border x:Name="LayoutRoot" BorderThickness="1" BorderBrush="#FF8A929E" Padding="5" Background="#FFDFE2E5"> <Grid> <Grid.RowDefinitions> <RowDefinition Height="Auto"/> <RowDefinition Height="150"/> <RowDefinition Height="Auto"/> </Grid.RowDefinitions> <StackPanel Grid.Row="0" Margin="2" Orientation="Horizontal" HorizontalAlignment="Center"> <telerik:RadButton Name="addFilterButton" Click="OnAddFilterButtonClick" Content="Add Filter" Margin="2" Width="96"/> <telerik:RadButton Name="removeFilterButton" Click="OnRemoveFilterButtonClick" Content="Remove Filter" Margin="2" Width="96"/> </StackPanel> <telerikGrid:RadGridView Name="filtersGrid" Grid.Row="1" Margin="2" ItemsSource="{Binding FilterDescriptors}" AddingNewDataItem="OnFilterGridAddingNewDataItem" ColumnWidth="*" ShowGroupPanel="False" AutoGenerateColumns="False" CanUserResizeColumns="False" CanUserReorderColumns="False" CanUserFreezeColumns="False" RowIndicatorVisibility="Collapsed" IsFilteringAllowed="False" CanUserSortColumns="False"> <telerikGrid:RadGridView.Columns> <telerikGrid:GridViewComboBoxColumn DataMemberBinding="{Binding Operator}" UniqueName="Operator"/> <telerikGrid:GridViewDataColumn Header="Value" DataMemberBinding="{Binding Value.Value}" UniqueName="Value"/> </telerikGrid:RadGridView.Columns> </telerikGrid:RadGridView> <StackPanel Grid.Row="2" Margin="2" Orientation="Horizontal" HorizontalAlignment="Center"> <telerik:RadButton Name="filterButton" Click="OnApplyFiltersButtonClick" Content="Apply Filters" Margin="2" Width="96"/> <telerik:RadButton Name="clearButton" Click="OnClearFiltersButtonClick" Content="Clear Filters" Margin="2" Width="96"/> </StackPanel> </Grid> </Border> </UserControl> And the code-behind: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Windows; using System.Windows.Controls; using System.Windows.Documents; using System.Windows.Input; using System.Windows.Media; using System.Windows.Media.Animation; using System.Windows.Shapes; using Telerik.Windows.Controls.GridView; using System.Windows.Data; using Telerik.Windows.Controls; using Telerik.Windows.Data; namespace PureServerSideFiltering { /// <summary> /// A custom filtering control capable of filtering purely server-side. /// </summary> public partial class DomainDataSourceFilteringControl : UserControl, IFilteringControl { // The main player here. DomainDataSource domainDataSource; // This is the name of the property that this column displays. private string dataMemberName; // This is the type of the property that this column displays. private Type dataMemberType; /// <summary> /// Identifies the <see cref="IsActive"/> dependency property. /// </summary> /// <remarks> /// The state of the filtering funnel (i.e. full or empty) is bound to this property. /// </remarks> public static readonly DependencyProperty IsActiveProperty = DependencyProperty.Register( "IsActive", typeof(bool), typeof(DomainDataSourceFilteringControl), new PropertyMetadata(false)); /// <summary> /// Gets or sets a value indicating whether the filtering is active. /// </summary> /// <remarks> /// Set this to true if you want to lit-up the filtering funnel. /// </remarks> public bool IsActive { get { return (bool)GetValue(IsActiveProperty); } set { SetValue(IsActiveProperty, value); } } /// <summary> /// Gets or sets the domain data source. /// We need this in order to work on its FilterDescriptors collection. /// </summary> /// <value>The domain data source.</value> public DomainDataSource DomainDataSource { get { return this.domainDataSource; } set { this.domainDataSource = value; } } public System.Windows.Data.FilterDescriptorCollection FilterDescriptors { get { return this.DomainDataSource.FilterDescriptors; } } public DomainDataSourceFilteringControl() { InitializeComponent(); } public void Prepare(GridViewBoundColumnBase column) { this.LayoutRoot.DataContext = this; if (this.DomainDataSource == null) { // Sorry, but we need a DomainDataSource. Can't do anything without it. return; } // This is the name of the property that this column displays. this.dataMemberName = column.GetDataMemberName(); // This is the type of the property that this column displays. // We need this in order to see which FilterOperators to feed to the combo-box column. this.dataMemberType = column.DataType; // We will use our magic Type extension method to see which operators are applicable for // this data type. You can go to the extension method body and see what it does. ((GridViewComboBoxColumn)this.filtersGrid.Columns["Operator"]).ItemsSource = this.dataMemberType.ApplicableFilterOperators(); // This is very nice as well. We will tell the Value column its data type. In this way // RadGridView will pick up the best editor according to the data type. For example, // if the data type of the value is DateTime, you will be editing it with a DatePicker. // Nice! ((GridViewDataColumn)this.filtersGrid.Columns["Value"]).DataType = this.dataMemberType; // Yet another nice feature. We will transfer the original DataFormatString (if any) to // the Value column. In this way if you have specified a DataFormatString for the original // column, you will see all filter values formatted accordingly. ((GridViewDataColumn)this.filtersGrid.Columns["Value"]).DataFormatString = column.DataFormatString; // This is important. Since our little filtersGrid will be bound to the entire collection // of this.domainDataSource.FilterDescriptors, we need to set a Telerik filter on the // grid so that it will display FilterDescriptor which are relevane to this column ONLY! Telerik.Windows.Data.FilterDescriptor columnFilter = new Telerik.Windows.Data.FilterDescriptor("PropertyPath" , Telerik.Windows.Data.FilterOperator.IsEqualTo , this.dataMemberName); this.filtersGrid.FilterDescriptors.Add(columnFilter); // We want to listen for this in order to activate and de-activate the UI funnel. this.filtersGrid.Items.CollectionChanged += this.OnFilterGridItemsCollectionChanged; } /// <summary> // Since the DomainDataSource is a little bit picky about adding uninitialized FilterDescriptors // to its collection, we will prepare each new instance with some default values and then // the user can change them later. Go to the event handler to see how we do this. /// </summary> void OnFilterGridAddingNewDataItem(object sender, GridViewAddingNewEventArgs e) { // We need to initialize the new instance with some values and let the user go on from here. System.Windows.Data.FilterDescriptor newFilter = new System.Windows.Data.FilterDescriptor(); // This is a must. It should know what member it is filtering on. newFilter.PropertyPath = this.dataMemberName; // Initialize it with one of the allowed operators. // TypeExtensions.ApplicableFilterOperators method for more info. newFilter.Operator = this.dataMemberType.ApplicableFilterOperators().First(); if (this.dataMemberType == typeof(DateTime)) { newFilter.Value.Value = DateTime.Now; } else if (this.dataMemberType == typeof(string)) { newFilter.Value.Value = "<enter text>"; } else if (this.dataMemberType.IsValueType) { // We need something non-null for all value types. newFilter.Value.Value = Activator.CreateInstance(this.dataMemberType); } // Let the user edit the new filter any way he/she likes. e.NewObject = newFilter; } void OnFilterGridItemsCollectionChanged(object sender, System.Collections.Specialized.NotifyCollectionChangedEventArgs e) { // We are active only if we have any filters define. In this case the filtering funnel will lit-up. this.IsActive = this.filtersGrid.Items.Count > 0; } private void OnApplyFiltersButtonClick(object sender, RoutedEventArgs e) { if (this.DomainDataSource.IsLoadingData) { return; } // Comment this if you want the popup to stay open after the button is clicked. this.ClosePopup(); // Since this.domainDataSource.AutoLoad is false, this will take into // account all filtering changes that the user has made since the last // Load() and pull the new data to the client. this.DomainDataSource.Load(); } private void OnClearFiltersButtonClick(object sender, RoutedEventArgs e) { if (this.DomainDataSource.IsLoadingData) { return; } // We want to remove ONLY those filters from the DomainDataSource // that this control is responsible for. this.DomainDataSource.FilterDescriptors .Where(fd => fd.PropertyPath == this.dataMemberName) // Only "our" filters. .ToList() .ForEach(fd => this.DomainDataSource.FilterDescriptors.Remove(fd)); // Bye-bye! // Comment this if you want the popup to stay open after the button is clicked. this.ClosePopup(); // After we did our housekeeping, get the new data to the client. this.DomainDataSource.Load(); } private void OnAddFilterButtonClick(object sender, RoutedEventArgs e) { if (this.DomainDataSource.IsLoadingData) { return; } // Let the user enter his/or her requirements for a new filter. this.filtersGrid.BeginInsert(); this.filtersGrid.UpdateLayout(); } private void OnRemoveFilterButtonClick(object sender, RoutedEventArgs e) { if (this.DomainDataSource.IsLoadingData) { return; } // Find the currently selected filter and destroy it. System.Windows.Data.FilterDescriptor filterToRemove = this.filtersGrid.SelectedItem as System.Windows.Data.FilterDescriptor; if (filterToRemove != null && this.DomainDataSource.FilterDescriptors.Contains(filterToRemove)) { this.DomainDataSource.FilterDescriptors.Remove(filterToRemove); } } private void ClosePopup() { System.Windows.Controls.Primitives.Popup popup = this.ParentOfType<System.Windows.Controls.Primitives.Popup>(); if (popup != null) { popup.IsOpen = false; } } } } Finally, we need to tell RadGridViews Columns to use this custom control instead of the default one. Here is how to do it: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Windows; using System.Windows.Controls; using System.Windows.Documents; using System.Windows.Input; using System.Windows.Media; using System.Windows.Media.Animation; using System.Windows.Shapes; using System.Windows.Data; using Telerik.Windows.Data; using Telerik.Windows.Controls; using Telerik.Windows.Controls.GridView; namespace PureServerSideFiltering { public partial class MainPage : UserControl { public MainPage() { InitializeComponent(); this.grid.AutoGeneratingColumn += this.OnGridAutoGeneratingColumn; // Uncomment this if you want the DomainDataSource to start pre-filtered. // You will notice how our custom filtering controls will correctly read this information, // populate their UI with the respective filters and lit-up the funnel to indicate that // filtering is active. Go ahead and try it. this.employeesDataSource.FilterDescriptors.Add(new System.Windows.Data.FilterDescriptor("Title", System.Windows.Data.FilterOperator.Contains, "Assistant")); this.employeesDataSource.FilterDescriptors.Add(new System.Windows.Data.FilterDescriptor("HireDate", System.Windows.Data.FilterOperator.IsGreaterThan, new DateTime(1998, 12, 31))); this.employeesDataSource.FilterDescriptors.Add(new System.Windows.Data.FilterDescriptor("HireDate", System.Windows.Data.FilterOperator.IsLessThanOrEqualTo, new DateTime(1999, 12, 31))); this.employeesDataSource.Load(); } /// <summary> /// First of all, we will need to replace the default filtering control /// of each column with out custom filtering control DomainDataSourceFilteringControl /// </summary> private void OnGridAutoGeneratingColumn(object sender, GridViewAutoGeneratingColumnEventArgs e) { GridViewBoundColumnBase dataColumn = e.Column as GridViewBoundColumnBase; if (dataColumn != null) { // We do not like ugly dates. if (dataColumn.DataType == typeof(DateTime)) { dataColumn.DataFormatString = "{0:d}"; // Short date pattern. // Notice how this format will be later transferred to the Value column // of the grid that we have inside the DomainDataSourceFilteringControl. } // Replace the default filtering control with our. dataColumn.FilteringControl = new DomainDataSourceFilteringControl() { // Let the control know about the DDS, after all it will work directly on it. DomainDataSource = this.employeesDataSource }; // Finally, lit-up the filtering funnel through the IsActive dependency property // in case there are some filters on the DDS that match our column member. string dataMemberName = dataColumn.GetDataMemberName(); dataColumn.FilteringControl.IsActive = this.employeesDataSource.FilterDescriptors .Where(fd => fd.PropertyPath == dataMemberName) .Count() > 0; } } } } The best part is that we are not only writing filters for the DomainDataSource we can read and load them. If the DomainDataSource has some pre-existing filters (like I have created in the code above), our control will read them and will populate its UI accordingly. Even the filtering funnel will light-up! Remember, the funnel is controlled by the IsActive property of our control. While this is just a basic implementation, the source code is absolutely yours and you can take it from here and extend it to match your specific business requirements. Below the main grid there is another debug grid. With its help you can monitor what filter descriptors are added and removed to the domain data source. Download Source Code. (You will have to have the AdventureWorks sample database installed on the default SQLExpress instance in order to run it.) Enjoy!Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
SQL SERVER – Disabled Index and Update Statistics

- by pinaldave

When we try to update the statistics, it throws an error as if the clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. Have you ever come across the situation where a conversation never gets over and it continues even though original point of discussion has passed. I am facing the same situation in the case of Disabled Index. Here is the link to original conversations. SQL SERVER – Disable Clustered Index and Data Insert – Reader had a issue here with Disabled Index SQL SERVER – Understanding ALTER INDEX ALL REBUILD with Disabled Clustered Index – Reader asked the effect of Rebuilding Indexes The same reader asked me today – “I understood what the disabled indexes do; what is their effect on statistics. Is it true that even though indexes are disabled, they continue updating the statistics?“ The answer is very interesting: If you have disabled clustered index, you will be not able to update the statistics at all for any index. If you have enabled clustered index and disabled non clustered index when you update the statistics of the table, it automatically updates the statistics of the ALL (disabled and enabled – both) the indexes on the table. If you are not satisfied with the answer, let us go over a simple example. I have written necessary comments in the code itself to have a clear idea. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Now let us update the statistics of the table and check the statistics update date. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO Now let us disable the indexes and check if they are disabled using sys.indexes. -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Let us try to update the statistics of the table. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ When we try to update the statistics it throws an error as it clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO We can clearly see that even though the nonclustered index is disabled it is also updated. If you do not need a nonclustered index, I suggest you to drop it as keeping them disabled is an overhead on your system. This is because every time the statistics are updated for system all the statistics for disabled indexesare also updated. -- Clean up DROP TABLE [TableName] GO The complete script is given below for easy reference. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Clean up DROP TABLE [TableName] GO Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

Read the article
Using Oracle ADF Data Visualization Tools (DVT) Line Graphs to Display Weather Information

- by Christian David Straub

OverviewA guest post by Jeanne Waldman.I have a simple JDeveloper Fusion application that retrieves weather data. I wanted to compare the week's temperatures of different locations in a graph. I decided to check out the dvt:lineGraph component, and it took me a few minutes to add it to my jspx page and supply it with data.Drag and Drop the dvt:lineGraph onto your pageI opened my .jspx page in design modeIn the Component Palette, I selected ADF Data Visualization.Then I dragged 'Line' onto my page.A dialog popped up giving me options of the type of line graph. I chose the default.A lineGraph displayed with some default data. Hook up your weather dataNow I wanted to hook up my own data. I browsed the tagdoc, and I found the tabularData attribute.Attribute: tabularDataType: java.util.ListTagDoc:Specifies a list of data that the graph uses to create a grid and populate itself. The List consists of a three-member Object array for each data value to be passed to the graph. The members of each array must be organized as follows: The first member (index 0) is the column label, in the grid, of the data value. This is generally a String. If the graph has a time axis, then this should be a Java Date. Column labels typically identify groups in the graph. The second member (index 1) is the row label, in the grid, of the data value. This is generally a String. Row labels appear as series labels in the graph (usually in the legend). The third member (index 2) is the data value, which is usually a Double.The first member is the column label of the data value. This would be the day of the week.The second member is the row label of the data value. This would be the location name.The third member is the data value, usually a Double. This would be the temperature. I already had all this information, I just needed to put it in a List with a three-member Object array for each data value. /** * This is used for the lineGraph to show the data for each location. */ public List<Object[]> getTabularData() { List<Object[]> tabularData = new ArrayList<Object []>(); List<WeatherForecast> weatherForecastList = getWeatherForecastList(); // loop through the list and build up the tabular data. Then cache it. for(WeatherForecast wf : weatherForecastList) { List<ForecastDay> forecastDayList = wf.getForecastDayList(); String location = wf.getLocation(); for (ForecastDay fday : forecastDayList) { String day = fday.getPrettyDate(); String highTemp = fday.getHighF(); tabularData.add(new Object[]{day, location, Double.valueOf(highTemp)}); } } return tabularData; } Now I bound the lineGraph to this method by setting tabularData to#{weatherForAllLocationsBean.tabularData}weatherForAllLocationsBean is my bean that is defined in faces-config.xml. Adding a barGraphIn about 30 seconds, I added a barGraph with the same data. I dragged and dropped a bar graph onto the page, used the same tabularData as I did in the line graph. The page looks like this: ConclusionI was very happy how fast it was to hook up my weather data to these graphs. They look great, and they have built in functionality. For instance, I can hide/show a location by clicking on the name of the location in the legend.

Read the article
Quick guide to Oracle IRM 11g: Classification design

- by Simon Thorpe

Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

Read the article
My Interview with Microsoft

- by Victor Hurdugaci

This post is for those who want to apply or have already applied (but not finished the interview) for a Microsoft Job. The recruitment process is quite similar for everyone and consists of a few steps. Application E-Mail Interview Phone Interview On Site Interview I will tell you my story and how I went through the four phases. 1. Application My blog's title (Ex Nihilo Nihil Fit) means "Nothing Comes Out of Nothing". You can't get a job at Microsoft by not doing anything - this is true for anything else. The first step you need to complete is the application process. For this, many options are available. You can... ... apply online on Microsoft's Careers website as I did ... send your CV to different e-mail addresses (there are some dedicated e-mails for different positions) ... apply through some 3rd party organization (job shop, campus recruitment, job agency, etc) On MS Careers you just have to post your CV and choose the job you want. That's all! No recommendation letter, no cover letter, no nothing. Of course, not every CV passes the selection process. Here are some tips for improving your resume (worked for me): Don't write it just before applying! Write a draft version, wait a few days and then review it. This way you will find a lot of mistakes and stupid things you wrote initially. If you review it immediately after writing, your mind will not be criticism oriented and will just ignore mistakes. Repeat the write-wait-review process as many times as necessary, until you find that the review revealed no mistakes. After you did the final review and the CV is bullet-proof, ask others to review it. They will definitely find inconsistencies and mistakes and this will make you feel stupid. This is good because will open your eyes will make you go into an 'I want to improve' mode. You'll try to correct everything. After you come up with a modified version go again through steps 1 and 2. Repeat this as many times as necessary. [Special thanks to Lucian Sasu, Nadia Comanici, Andrei Ciobanu, Monica Balan and Lavinia Tanase for reviewing my CV!] Make it short and give only relevant facts. Initially, I come up with a 5 pages CV because I wrote every single technology with which I worked. There were a lot irrelevant things, I wrote Windows Workflow Foundation just because I played with it for a few days. I added extensive descriptions for every project, made a personal details section (name, birth date, address, etc) of 1/2 page. Others suggested to cut everything that was not necessary. You don't need to give extensive descriptions, just add a few words. For example, I wrote "VS Image Visualizer - Visual Studio 2008 debug visualizer for images" and added a link to the project's page - you submit formatted andcan embed links. Add something that makes it different. I don't know if this makes a difference, but I added some lines to separate items just like in the picture below. Definitely Microsoft gets thousands of CVs per day. You need something special. Don't lie! Tell exactly what you did and what is the proficiency level of your skills. For example, don't write "Advanced" for UML if you don't know the difference between composition and aggregation. Be realistic and don't under/over estimate yourself. Use the spell chick. Make sure everything is written in correct English and there are no grammar/spelling mistakes. Noddy likes a WC with grammar mi takes. You mght fail just because of that. Once you completed your CV, choose the job that suits best your needs, apply and wait... The waiting is a problem because all these big companies like Microsoft, Google, Mozilla, Apple, etc. will contact you only if they find something interesting in your application. If you're not suitable, then no rejection is sent. I applied for an Intern Software Development Engineer position at Microsoft Redmond. I cannot apply for a full time position because I want to finish the master program on time, in the next summer - an internship is just what I need. 2. E-Mail Interview January 20, 2010. Two months since I submitted the CV. I wasn't hoping anymore that MS will contact me, when I got an e-mail titled: "Victor Hurdugaci ES DK" from Holly Peterson saying: Read more >>

Read the article
SQL SERVER – Simple Example of Snapshot Isolation – Reduce the Blocking Transactions

- by pinaldave

To learn any technology and move to a more advanced level, it is very important to understand the fundamentals of the subject first. Today, we will be talking about something which has been quite introduced a long time ago but not properly explored when it comes to the isolation level. Snapshot Isolation was introduced in SQL Server in 2005. However, the reality is that there are still many software shops which are using the SQL Server 2000, and therefore cannot be able to maintain the Snapshot Isolation. Many software shops have upgraded to the later version of the SQL Server, but their respective developers have not spend enough time to upgrade themselves with the latest technology. “It works!” is a very common answer of many when they are asked about utilizing the new technology, instead of backward compatibility commands. In one of the recent consultation project, I had same experience when developers have “heard about it” but have no idea about snapshot isolation. They were thinking it is the same as Snapshot Replication – which is plain wrong. This is the same demo I am including here which I have created for them. In Snapshot Isolation, the updated row versions for each transaction are maintained in TempDB. Once a transaction has begun, it ignores all the newer rows inserted or updated in the table. Let us examine this example which shows the simple demonstration. This transaction works on optimistic concurrency model. Since reading a certain transaction does not block writing transaction, it also does not block the reading transaction, which reduced the blocking. First, enable database to work with Snapshot Isolation. Additionally, check the existing values in the table from HumanResources.Shift. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO Now, we will need two different sessions to prove this example. First Session: Set Transaction level isolation to snapshot and begin the transaction. Update the column “ModifiedDate” to today’s date. -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO Please note that we have not yet been committed to the transaction. Now, open the second session and run the following “SELECT” statement. Then, check the values of the table. Please pay attention on setting the Isolation level for the second one as “Snapshot” at the same time when we already start the transaction using BEGIN TRAN. -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values in the table are still original values. They have not been modified yet. Once again, go back to session 1 and begin the transaction. -- Session 1 COMMIT After that, go back to Session 2 and see the values of the table. -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values are yet not changed and they are still the same old values which were there right in the beginning of the session. Now, let us commit the transaction in the session 2. Once committed, run the same SELECT statement once more and see what the result is. -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that it now reflects the new updated value. I hope that this example is clear enough as it would give you good idea how the Snapshot Isolation level works. There is much more to write about an extra level, READ_COMMITTED_SNAPSHOT, which we will be discussing in another post soon. If you wish to use this transaction’s Isolation level in your production database, I would appreciate your comments about their performance on your servers. I have included here the complete script used in this example for your quick reference. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 COMMIT -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Transaction Isolation

Read the article
Troubleshooting Application Timeouts in SQL Server

- by Tara Kizer

I recently received the following email from a blog reader: "We are having an OLTP database instance, using SQL Server 2005 with little to moderate traffic (10-20 requests/min). There are also bulk imports that occur at regular intervals in this DB and the import duration ranges between 10secs to 1 min, depending on the data size. Intermittently (2-3 times in a week), we face an issue, where queries get timed out (default of 30 secs set in application). On analyzing, we found two stored procedures, having queries with multiple table joins inside them of taking a long time (5-10 mins) in getting executed, when ideally the execution duration ranges between 5-10 secs. Execution plan of the same displayed Clustered Index Scan happening instead of Clustered Index Seek. All required Indexes are found to be present and Index fragmentation is also minimal as we Rebuild Indexes regularly alongwith Updating Statistics. With no other alternate options occuring to us, we restarted SQL server and thereafter the performance was back on track. But sometimes it was still giving timeout errors for some hits and so we also restarted IIS and that stopped the problem as of now." Rather than respond directly to the blog reader, I thought it would be more interesting to share my thoughts on this issue in a blog. There are a few things that I can think of that could cause abnormal timeouts: Blocking Bad plan in cache Outdated statistics Hardware bottleneck To determine if blocking is the issue, we can easily run sp_who/sp_who2 or a query directly on sysprocesses (select * from master..sysprocesses where blocking <> 0). If blocking is present and consistent, then you'll need to determine whether or not to kill the parent blocking process. Killing a process will cause the transaction to rollback, so you need to proceed with caution. Killing the parent blocking process is only a temporary solution, so you'll need to do more thorough analysis to figure out why the blocking was present. You should look into missing indexes and perhaps consider changing the database's isolation level to READ_COMMITTED_SNAPSHOT. The blog reader mentions that the execution plan shows a clustered index scan when a clustered index seek is normal for the stored procedure. A clustered index scan might have been chosen either because that is what is in cache already or because of out of date statistics. The blog reader mentions that bulk imports occur at regular intervals, so outdated statistics is definitely something that could cause this issue. The blog reader may need to update statistics after imports are done if the imports are changing a lot of data (greater than 10%). If the statistics are good, then the query optimizer might have chosen to scan rather than seek in a previous execution because the scan was determined to be less costly due to the value of an input parameter. If this parameter value is rare, then its execution plan in cache is what we call a bad plan. You want the best plan in cache for the most frequent parameter values. If a bad plan is a recurring problem on your system, then you should consider rewriting the stored procedure. You might want to break up the code into multiple stored procedures so that each can have a different execution plan in cache. To remove a bad plan from cache, you can recompile the stored procedure. An alternative method is to run DBCC FREEPROCACHE which drops the procedure cache. It is better to recompile stored procedures rather than dropping the procedure cache as dropping the procedure cache affects all plans in cache rather than just the ones that were bad, so there will be a temporary performance penalty until the plans are loaded into cache again. To determine if there is a hardware bottleneck occurring such as slow I/O or high CPU utilization, you will need to run Performance Monitor on the database server. Hopefully you already have a baseline of the server so you know what is normal and what is not. Be on the lookout for I/O requests taking longer than 12 milliseconds and CPU utilization over 90%. The servers that I support typically are under 30% CPU utilization, but your baseline could be higher and be within a normal range. If restarting the SQL Server service fixes the problem, then the problem was most likely due to blocking or a bad plan in the procedure cache. Rather than restarting the SQL Server service, which causes downtime, the blog reader should instead analyze the above mentioned things. Proceed with caution when restarting the SQL Server service as all transactions that have not completed will be rolled back at startup. This crash recovery process could take longer than normal if there was a long-running transaction running when the service was stopped. Until the crash recovery process is completed on the database, it is unavailable to your applications. If restarting IIS fixes the problem, then the problem might not have been inside SQL Server. Prior to taking this step, you should do analysis of the above mentioned things. If you can think of other reasons why the blog reader is facing this issue a few times a week, I'd love to hear your thoughts via a blog comment.

Read the article
Basic Spatial Data with SQL Server and Entity Framework 5.0

- by Rick Strahl

In my most recent project we needed to do a bit of geo-spatial referencing. While spatial features have been in SQL Server for a while using those features inside of .NET applications hasn't been as straight forward as could be, because .NET natively doesn't support spatial types. There are workarounds for this with a few custom project like SharpMap or a hack using the Sql Server specific Geo types found in the Microsoft.SqlTypes assembly that ships with SQL server. While these approaches work for manipulating spatial data from .NET code, they didn't work with database access if you're using Entity Framework. Other ORM vendors have been rolling their own versions of spatial integration. In Entity Framework 5.0 running on .NET 4.5 the Microsoft ORM finally adds support for spatial types as well. In this post I'll describe basic geography features that deal with single location and distance calculations which is probably the most common usage scenario. SQL Server Transact-SQL Syntax for Spatial Data Before we look at how things work with Entity framework, lets take a look at how SQL Server allows you to use spatial data to get an understanding of the underlying semantics. The following SQL examples should work with SQL 2008 and forward. Let's start by creating a test table that includes a Geography field and also a pair of Long/Lat fields that demonstrate how you can work with the geography functions even if you don't have geography/geometry fields in the database. Here's the CREATE command:CREATE TABLE [dbo].[Geo]( [id] [int] IDENTITY(1,1) NOT NULL, [Location] [geography] NULL, [Long] [float] NOT NULL, [Lat] [float] NOT NULL ) Now using plain SQL you can insert data into the table using geography::STGeoFromText SQL CLR function:insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.527200 45.712113)', 4326), -121.527200, 45.712113 ) insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.517265 45.714240)', 4326), -121.517265, 45.714240 ) insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.511536 45.714825)', 4326), -121.511536, 45.714825) The STGeomFromText function accepts a string that points to a geometric item (a point here but can also be a line or path or polygon and many others). You also need to provide an SRID (Spatial Reference System Identifier) which is an integer value that determines the rules for how geography/geometry values are calculated and returned. For mapping/distance functionality you typically want to use 4326 as this is the format used by most mapping software and geo-location libraries like Google and Bing. The spatial data in the Location field is stored in binary format which looks something like this: Once the location data is in the database you can query the data and do simple distance computations very easily. For example to calculate the distance of each of the values in the database to another spatial point is very easy to calculate. Distance calculations compare two points in space using a direct line calculation. For our example I'll compare a new point to all the points in the database. Using the Location field the SQL looks like this:-- create a source point DECLARE @s geography SET @s = geography:: STGeomFromText('POINT(-121.527200 45.712113)' , 4326); --- return the ids select ID, Location as Geo , Location .ToString() as Point , @s.STDistance( Location) as distance from Geo order by distance The code defines a new point which is the base point to compare each of the values to. You can also compare values from the database directly, but typically you'll want to match a location to another location and determine the difference for which you can use the geography::STDistance function. This query produces the following output: The STDistance function returns the straight line distance between the passed in point and the point in the database field. The result for SRID 4326 is always in meters. Notice that the first value passed was the same point so the difference is 0. The other two points are two points here in town in Hood River a little ways away - 808 and 1256 meters respectively. Notice also that you can order the result by the resulting distance, which effectively gives you results that are ordered radially out from closer to further away. This is great for searches of points of interest near a central location (YOU typically!). These geolocation functions are also available to you if you don't use the Geography/Geometry types, but plain float values. It's a little more work, as each point has to be created in the query using the string syntax, but the following code doesn't use a geography field but produces the same result as the previous query.--- using float fields select ID, geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326), geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326). ToString(), @s.STDistance( geography::STGeomFromText ('POINT(' + STR(long ,15, 7) + ' ' + Str(lat ,15, 7) + ')' , 4326)) as distance from geo order by distance Spatial Data in the Entity Framework Prior to Entity Framework 5.0 on .NET 4.5 consuming of the data above required using stored procedures or raw SQL commands to access the spatial data. In Entity Framework 5 however, Microsoft introduced the new DbGeometry and DbGeography types. These immutable location types provide a bunch of functionality for manipulating spatial points using geometry functions which in turn can be used to do common spatial queries like I described in the SQL syntax above. The DbGeography/DbGeometry types are immutable, meaning that you can't write to them once they've been created. They are a bit odd in that you need to use factory methods in order to instantiate them - they have no constructor() and you can't assign to properties like Latitude and Longitude. Creating a Model with Spatial Data Let's start by creating a simple Entity Framework model that includes a Location property of type DbGeography: public class GeoLocationContext : DbContext { public DbSet<GeoLocation> Locations { get; set; } } public class GeoLocation { public int Id { get; set; } public DbGeography Location { get; set; } public string Address { get; set; } } That's all there's to it. When you run this now against SQL Server, you get a Geography field for the Location property, which looks the same as the Location field in the SQL examples earlier. Adding Spatial Data to the Database Next let's add some data to the table that includes some latitude and longitude data. An easy way to find lat/long locations is to use Google Maps to pinpoint your location, then right click and click on What's Here. Click on the green marker to get the GPS coordinates. To add the actual geolocation data create an instance of the GeoLocation type and use the DbGeography.PointFromText() factory method to create a new point to assign to the Location property:[TestMethod] public void AddLocationsToDataBase() { var context = new GeoLocationContext(); // remove all context.Locations.ToList().ForEach( loc => context.Locations.Remove(loc)); context.SaveChanges(); var location = new GeoLocation() { // Create a point using native DbGeography Factory method Location = DbGeography.PointFromText( string.Format("POINT({0} {1})", -121.527200,45.712113) ,4326), Address = "301 15th Street, Hood River" }; context.Locations.Add(location); location = new GeoLocation() { Location = CreatePoint(45.714240, -121.517265), Address = "The Hatchery, Bingen" }; context.Locations.Add(location); location = new GeoLocation() { // Create a point using a helper function (lat/long) Location = CreatePoint(45.708457, -121.514432), Address = "Kaze Sushi, Hood River" }; context.Locations.Add(location); location = new GeoLocation() { Location = CreatePoint(45.722780, -120.209227), Address = "Arlington, OR" }; context.Locations.Add(location); context.SaveChanges(); } As promised, a DbGeography object has to be created with one of the static factory methods provided on the type as the Location.Longitude and Location.Latitude properties are read only. Here I'm using PointFromText() which uses a "Well Known Text" format to specify spatial data. In the first example I'm specifying to create a Point from a longitude and latitude value, using an SRID of 4326 (just like earlier in the SQL examples). You'll probably want to create a helper method to make the creation of Points easier to avoid that string format and instead just pass in a couple of double values. Here's my helper called CreatePoint that's used for all but the first point creation in the sample above:public static DbGeography CreatePoint(double latitude, double longitude) { var text = string.Format(CultureInfo.InvariantCulture.NumberFormat, "POINT({0} {1})", longitude, latitude); // 4326 is most common coordinate system used by GPS/Maps return DbGeography.PointFromText(text, 4326); } Using the helper the syntax becomes a bit cleaner, requiring only a latitude and longitude respectively. Note that my method intentionally swaps the parameters around because Latitude and Longitude is the common format I've seen with mapping libraries (especially Google Mapping/Geolocation APIs with their LatLng type). When the context is changed the data is written into the database using the SQL Geography type which looks the same as in the earlier SQL examples shown. Querying Once you have some location data in the database it's now super easy to query the data and find out the distance between locations. A common query is to ask for a number of locations that are near a fixed point - typically your current location and order it by distance. Using LINQ to Entities a query like this is easy to construct:[TestMethod] public void QueryLocationsTest() { var sourcePoint = CreatePoint(45.712113, -121.527200); var context = new GeoLocationContext(); // find any locations within 5 kilometers ordered by distance var matches = context.Locations .Where(loc => loc.Location.Distance(sourcePoint) < 5000) .OrderBy( loc=> loc.Location.Distance(sourcePoint) ) .Select( loc=> new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) }); Assert.IsTrue(matches.Count() > 0); foreach (var location in matches) { Console.WriteLine("{0} ({1:n0} meters)", location.Address, location.Distance); } } This example produces: 301 15th Street, Hood River (0 meters)The Hatchery, Bingen (809 meters)Kaze Sushi, Hood River (1,074 meters) The first point in the database is the same as my source point I'm comparing against so the distance is 0. The other two are within the 5 mile radius, while the Arlington location which is 65 miles or so out is not returned. The result is ordered by distance from closest to furthest away. In the code, I first create a source point that is the basis for comparison. The LINQ query then selects all locations that are within 5km of the source point using the Location.Distance() function, which takes a source point as a parameter. You can either use a pre-defined value as I'm doing here, or compare against another database DbGeography property (say when you have to points in the same database for things like routes). What's nice about this query syntax is that it's very clean and easy to read and understand. You can calculate the distance and also easily order by the distance to provide a result that shows locations from closest to furthest away which is a common scenario for any application that places a user in the context of several locations. It's now super easy to accomplish this. Meters vs. Miles As with the SQL Server functions, the Distance() method returns data in meters, so if you need to work with miles or feet you need to do some conversion. Here are a couple of helpers that might be useful (can be found in GeoUtils.cs of the sample project):/// <summary> /// Convert meters to miles /// </summary> /// <param name="meters"></param> /// <returns></returns> public static double MetersToMiles(double? meters) { if (meters == null) return 0F; return meters.Value * 0.000621371192; } /// <summary> /// Convert miles to meters /// </summary> /// <param name="miles"></param> /// <returns></returns> public static double MilesToMeters(double? miles) { if (miles == null) return 0; return miles.Value * 1609.344; } Using these two helpers you can query on miles like this:[TestMethod] public void QueryLocationsMilesTest() { var sourcePoint = CreatePoint(45.712113, -121.527200); var context = new GeoLocationContext(); // find any locations within 5 miles ordered by distance var fiveMiles = GeoUtils.MilesToMeters(5); var matches = context.Locations .Where(loc => loc.Location.Distance(sourcePoint) <= fiveMiles) .OrderBy(loc => loc.Location.Distance(sourcePoint)) .Select(loc => new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) }); Assert.IsTrue(matches.Count() > 0); foreach (var location in matches) { Console.WriteLine("{0} ({1:n1} miles)", location.Address, GeoUtils.MetersToMiles(location.Distance)); } } which produces: 301 15th Street, Hood River (0.0 miles)The Hatchery, Bingen (0.5 miles)Kaze Sushi, Hood River (0.7 miles) Nice 'n simple. .NET 4.5 Only Note that DbGeography and DbGeometry are exclusive to Entity Framework 5.0 (not 4.4 which ships in the same NuGet package or installer) and requires .NET 4.5. That's because the new DbGeometry and DbGeography (and related) types are defined in the 4.5 version of System.Data.Entity which is a CLR assembly and is only updated by major versions of .NET. Why this decision was made to add these types to System.Data.Entity rather than to the frequently updated EntityFramework assembly that would have possibly made this work in .NET 4.0 is beyond me, especially given that there are no native .NET framework spatial types to begin with. I find it also odd that there is no native CLR spatial type. The DbGeography and DbGeometry types are specific to Entity Framework and live on those assemblies. They will also work for general purpose, non-database spatial data manipulation, but then you are forced into having a dependency on System.Data.Entity, which seems a bit silly. There's also a System.Spatial assembly that's apparently part of WCF Data Services which in turn don't work with Entity framework. Another example of multiple teams at Microsoft not communicating and implementing the same functionality (differently) in several different places. Perplexed as a I may be, for EF specific code the Entity framework specific types are easy to use and work well. Working with pre-.NET 4.5 Entity Framework and Spatial Data If you can't go to .NET 4.5 just yet you can also still use spatial features in Entity Framework, but it's a lot more work as you can't use the DbContext directly to manipulate the location data. You can still run raw SQL statements to write data into the database and retrieve results using the same TSQL syntax I showed earlier using Context.Database.ExecuteSqlCommand(). Here's code that you can use to add location data into the database:[TestMethod] public void RawSqlEfAddTest() { string sqlFormat = @"insert into GeoLocations( Location, Address) values ( geography::STGeomFromText('POINT({0} {1})', 4326),@p0 )"; var sql = string.Format(sqlFormat,-121.527200, 45.712113); Console.WriteLine(sql); var context = new GeoLocationContext(); Assert.IsTrue(context.Database.ExecuteSqlCommand(sql,"301 N. 15th Street") > 0); } Here I'm using the STGeomFromText() function to add the location data. Note that I'm using string.Format here, which usually would be a bad practice but is required here. I was unable to use ExecuteSqlCommand() and its named parameter syntax as the longitude and latitude parameters are embedded into a string. Rest assured it's required as the following does not work:string sqlFormat = @"insert into GeoLocations( Location, Address) values ( geography::STGeomFromText('POINT(@p0 @p1)', 4326),@p2 )";context.Database.ExecuteSqlCommand(sql, -121.527200, 45.712113, "301 N. 15th Street") Explicitly assigning the point value with string.format works however. There are a number of ways to query location data. You can't get the location data directly, but you can retrieve the point string (which can then be parsed to get Latitude and Longitude) and you can return calculated values like distance. Here's an example of how to retrieve some geo data into a resultset using EF's and SqlQuery method:[TestMethod] public void RawSqlEfQueryTest() { var sqlFormat = @" DECLARE @s geography SET @s = geography:: STGeomFromText('POINT({0} {1})' , 4326); SELECT Address, Location.ToString() as GeoString, @s.STDistance( Location) as Distance FROM GeoLocations ORDER BY Distance"; var sql = string.Format(sqlFormat, -121.527200, 45.712113); var context = new GeoLocationContext(); var locations = context.Database.SqlQuery<ResultData>(sql); Assert.IsTrue(locations.Count() > 0); foreach (var location in locations) { Console.WriteLine(location.Address + " " + location.GeoString + " " + location.Distance); } } public class ResultData { public string GeoString { get; set; } public double Distance { get; set; } public string Address { get; set; } } Hopefully you don't have to resort to this approach as it's fairly limited. Using the new DbGeography/DbGeometry types makes this sort of thing so much easier. When I had to use code like this before I typically ended up retrieving data pks only and then running another query with just the PKs to retrieve the actual underlying DbContext entities. This was very inefficient and tedious but it did work. Summary For the current project I'm working on we actually made the switch to .NET 4.5 purely for the spatial features in EF 5.0. This app heavily relies on spatial queries and it was worth taking a chance with pre-release code to get this ease of integration as opposed to manually falling back to stored procedures or raw SQL string queries to return spatial specific queries. Using native Entity Framework code makes life a lot easier than the alternatives. It might be a late addition to Entity Framework, but it sure makes location calculations and storage easy. Where do you want to go today? ;-) Resources Download Sample Project© Rick Strahl, West Wind Technologies, 2005-2012Posted in ADO.NET Sql Server .NET Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
AuthnRequest Settings in OIF / SP

- by Damien Carru

In this article, I will list the various OIF/SP settings that affect how an AuthnRequest message is created in OIF in a Federation SSO flow. The AuthnRequest message is used by an SP to start a Federation SSO operation and to indicate to the IdP how the operation should be executed: How the user should be challenged at the IdP Whether or not the user should be challenged at the IdP, even if a session already exists at the IdP for this user Which NameID format should be requested in the SAML Assertion Which binding (Artifact or HTTP-POST) should be requested from the IdP to send the Assertion Which profile should be used by OIF/SP to send the AuthnRequest message Enjoy the reading! Protocols The SAML 2.0, SAML 1.1 and OpenID 2.0 protocols define different message elements and rules that allow an administrator to influence the Federation SSO flows in different manners, when the SP triggers an SSO operation: SAML 2.0 allows extensive customization via the AuthnRequest message SAML 1.1 does not allow any customization, since the specifications do not define an authentication request message OpenID 2.0 allows for some customization, mainly via the OpenID 2.0 extensions such as PAPE or UI SAML 2.0 OIF/SP allows the customization of the SAML 2.0 AuthnRequest message for the following elements: ForceAuthn: Boolean indicating whether or not the IdP should force the user for re-authentication, even if the user has still a valid session By default set to false IsPassive Boolean indicating whether or not the IdP is allowed to interact with the user as part of the Federation SSO operation. If false, the Federation SSO operation might result in a failure with the NoPassive error code, because the IdP will not have been able to identify the user By default set to false RequestedAuthnContext Element indicating how the user should be challenged at the IdP If the SP requests a Federation Authentication Method unknown to the IdP or for which the IdP is not configured, then the Federation SSO flow will result in a failure with the NoAuthnContext error code By default missing NameIDPolicy Element indicating which NameID format the IdP should include in the SAML Assertion If the SP requests a NameID format unknown to the IdP or for which the IdP is not configured, then the Federation SSO flow will result in a failure with the InvalidNameIDPolicy error code If missing, the IdP will generally use the default NameID format configured for this SP partner at the IdP By default missing ProtocolBinding Element indicating which SAML binding should be used by the IdP to redirect the user to the SP with the SAML Assertion Set to Artifact or HTTP-POST By default set to HTTP-POST OIF/SP also allows the administrator to configure the server to: Set which binding should be used by OIF/SP to redirect the user to the IdP with the SAML 2.0 AuthnRequest message: Redirect or HTTP-POST By default set to Redirect Set which binding should be used by OIF/SP to redirect the user to the IdP during logout with SAML 2.0 Logout messages: Redirect or HTTP-POST By default set to Redirect SAML 1.1 The SAML 1.1 specifications do not define a message for the SP to send to the IdP when a Federation SSO operation is started. As such, there is no capability to configure OIF/SP on how to affect the start of the Federation SSO flow. OpenID 2.0 OpenID 2.0 defines several extensions that can be used by the SP/RP to affect how the Federation SSO operation will take place: OpenID request: mode: String indicating if the IdP/OP can visually interact with the user checkid_immediate does not allow the IdP/OP to interact with the user checkid_setup allows user interaction By default set to checkid_setup PAPE Extension: max_auth_age : Integer indicating in seconds the maximum amount of time since when the user authenticated at the IdP. If MaxAuthnAge is bigger that the time since when the user last authenticated at the IdP, then the user must be re-challenged. OIF/SP will set this attribute to 0 if the administrator configured ForceAuthn to true, otherwise this attribute won't be set Default missing preferred_auth_policies Contains a Federation Authentication Method Element indicating how the user should be challenged at the IdP By default missing Only specified in the OpenID request if the IdP/OP supports PAPE in XRDS, if OpenID discovery is used. UI Extension Popup mode Boolean indicating the popup mode is enabled for the Federation SSO By default missing Language Preference String containing the preferred language, set based on the browser's language preferences. By default missing Icon: Boolean indicating if the icon feature is enabled. In that case, the IdP/OP would look at the SP/RP XRDS to determine how to retrieve the icon By default missing Only specified in the OpenID request if the IdP/OP supports UI Extenstion in XRDS, if OpenID discovery is used. ForceAuthn and IsPassive WLST Command OIF/SP provides the WLST configureIdPAuthnRequest() command to set: ForceAuthn as a boolean: In a SAML 2.0 AuthnRequest, the ForceAuthn field will be set to true or false In an OpenID 2.0 request, if ForceAuthn in the configuration was set to true, then the max_auth_age field of the PAPE request will be set to 0, otherwise, max_auth_age won't be set IsPassive as a boolean: In a SAML 2.0 AuthnRequest, the IsPassive field will be set to true or false In an OpenID 2.0 request, if IsPassive in the configuration was set to true, then the mode field of the OpenID request will be set to checkid_immediate, otherwise set to checkid_setup Test In this test, OIF/SP is integrated with a remote SAML 2.0 IdP Partner, with the OOTB configuration. Based on this setup, when OIF/SP starts a Federation SSO flow, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/></samlp:AuthnRequest> Let's configure OIF/SP for that IdP Partner, so that the SP will require the IdP to re-challenge the user, even if the user is already authenticated: Enter the WLST environment by executing:$IAM_ORACLE_HOME/common/bin/wlst.sh Connect to the WLS Admin server:connect() Navigate to the Domain Runtime branch:domainRuntime() Execute the configureIdPAuthnRequest() command:configureIdPAuthnRequest(partner="AcmeIdP", forceAuthn="true") Exit the WLST environment:exit() After the changes, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ForceAuthn="true" ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/></samlp:AuthnRequest> To display or delete the ForceAuthn/IsPassive settings, perform the following operatons: Enter the WLST environment by executing:$IAM_ORACLE_HOME/common/bin/wlst.sh Connect to the WLS Admin server:connect() Navigate to the Domain Runtime branch:domainRuntime() Execute the configureIdPAuthnRequest() command: To display the ForceAuthn/IsPassive settings on the partnerconfigureIdPAuthnRequest(partner="AcmeIdP", displayOnly="true") To delete the ForceAuthn/IsPassive settings from the partnerconfigureIdPAuthnRequest(partner="AcmeIdP", delete="true") Exit the WLST environment:exit() Requested Fed Authn Method In my earlier "Fed Authentication Method Requests in OIF / SP" article, I discussed how OIF/SP could be configured to request a specific Federation Authentication Method from the IdP when starting a Federation SSO operation, by setting elements in the SSO request message. WLST Command The OIF WLST commands that can be used are: setIdPPartnerProfileRequestAuthnMethod() which will configure the requested Federation Authentication Method in a specific IdP Partner Profile, and accepts the following parameters: partnerProfile: name of the IdP Partner Profile authnMethod: the Federation Authentication Method to request displayOnly: an optional parameter indicating if the method should display the current requested Federation Authentication Method instead of setting it delete: an optional parameter indicating if the method should delete the current requested Federation Authentication Method instead of setting it setIdPPartnerRequestAuthnMethod() which will configure the specified IdP Partner entry with the requested Federation Authentication Method, and accepts the following parameters: partner: name of the IdP Partner authnMethod: the Federation Authentication Method to request displayOnly: an optional parameter indicating if the method should display the current requested Federation Authentication Method instead of setting it delete: an optional parameter indicating if the method should delete the current requested Federation Authentication Method instead of setting it This applies to SAML 2.0 and OpenID 2.0 protocols. See the "Fed Authentication Method Requests in OIF / SP" article for more information. Test In this test, OIF/SP is integrated with a remote SAML 2.0 IdP Partner, with the OOTB configuration. Based on this setup, when OIF/SP starts a Federation SSO flow, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/></samlp:AuthnRequest> Let's configure OIF/SP for that IdP Partner, so that the SP will request the IdP to use a mechanism mapped to the urn:oasis:names:tc:SAML:2.0:ac:classes:X509 Federation Authentication Method to authenticate the user: Enter the WLST environment by executing:$IAM_ORACLE_HOME/common/bin/wlst.sh Connect to the WLS Admin server:connect() Navigate to the Domain Runtime branch:domainRuntime() Execute the setIdPPartnerRequestAuthnMethod() command:setIdPPartnerRequestAuthnMethod("AcmeIdP", "urn:oasis:names:tc:SAML:2.0:ac:classes:X509") Exit the WLST environment:exit() After the changes, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/> <samlp:RequestedAuthnContext Comparison="minimum"> <saml:AuthnContextClassRef xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"> urn:oasis:names:tc:SAML:2.0:ac:classes:X509 </saml:AuthnContextClassRef> </samlp:RequestedAuthnContext></samlp:AuthnRequest> NameID Format The SAML 2.0 protocol allows for the SP to request from the IdP a specific NameID format to be used when the Assertion is issued by the IdP. Note: SAML 1.1 and OpenID 2.0 do not provide such a mechanism Configuring OIF The administrator can configure OIF/SP to request a NameID format in the SAML 2.0 AuthnRequest via: The OAM Administration Console, in the IdP Partner entry The OIF WLST setIdPPartnerNameIDFormat() command that will modify the IdP Partner configuration OAM Administration Console To configure the requested NameID format via the OAM Administration Console, perform the following steps: Go to the OAM Administration Console: http(s)://oam-admin-host:oam-admin-port/oamconsole Navigate to Identity Federation -> Service Provider Administration Open the IdP Partner you wish to modify In the Authentication Request NameID Format dropdown box with one of the values None The NameID format will be set Default Email Address The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress X.509 Subject The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:X509SubjectName Windows Name Qualifier The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:WindowsDomainQualifiedName Kerberos The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:kerberos Transient The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:transient Unspecified The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified Custom In this case, a field would appear allowing the administrator to indicate the custom NameID format to use The NameID format will be set to the specified format Persistent The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:persistent I selected Email Address in this example Save WLST Command To configure the requested NameID format via the OIF WLST setIdPPartnerNameIDFormat() command, perform the following steps: Enter the WLST environment by executing:$IAM_ORACLE_HOME/common/bin/wlst.sh Connect to the WLS Admin server:connect() Navigate to the Domain Runtime branch:domainRuntime() Execute the setIdPPartnerNameIDFormat() command:setIdPPartnerNameIDFormat("PARTNER", "FORMAT", customFormat="CUSTOM") Replace PARTNER with the IdP Partner name Replace FORMAT with one of the following: orafed-none The NameID format will be set Default orafed-emailaddress The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress orafed-x509 The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:X509SubjectName orafed-windowsnamequalifier The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:WindowsDomainQualifiedName orafed-kerberos The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:kerberos orafed-transient The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:transient orafed-unspecified The NameID format will be set urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified orafed-custom In this case, a field would appear allowing the administrator to indicate the custom NameID format to use The NameID format will be set to the specified format orafed-persistent The NameID format will be set urn:oasis:names:tc:SAML:2.0:nameid-format:persistent customFormat will need to be set if the FORMAT is set to orafed-custom An example would be:setIdPPartnerNameIDFormat("AcmeIdP", "orafed-emailaddress") Exit the WLST environment:exit() Test In this test, OIF/SP is integrated with a remote SAML 2.0 IdP Partner, with the OOTB configuration. Based on this setup, when OIF/SP starts a Federation SSO flow, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/></samlp:AuthnRequest> After the changes performed either via the OAM Administration Console or via the OIF WLST setIdPPartnerNameIDFormat() command where Email Address would be requested as the NameID Format, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ForceAuthn="false" IsPassive="false" ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress" AllowCreate="true"/></samlp:AuthnRequest> Protocol Binding The SAML 2.0 specifications define a way for the SP to request which binding should be used by the IdP to redirect the user to the SP with the SAML 2.0 Assertion: the ProtocolBinding attribute indicates the binding the IdP should use. It is set to: Either urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST for HTTP-POST Or urn:oasis:names:tc:SAML:2.0:bindings:Artifact for Artifact The SAML 2.0 specifications also define different ways to redirect the user from the SP to the IdP with the SAML 2.0 AuthnRequest message, as the SP can send the message: Either via HTTP Redirect Or HTTP POST (Other bindings can theoretically be used such as Artifact, but these are not used in practice) Configuring OIF OIF can be configured: Via the OAM Administration Console or the OIF WLST configureSAMLBinding() command to set the Assertion Response binding to be used Via the OIF WLST configureSAMLBinding() command to indicate how the SAML AuthnRequest message should be sent Note: the binding for sending the SAML 2.0 AuthnRequest message will also be used to send the SAML 2.0 LogoutRequest and LogoutResponse messages. OAM Administration Console To configure the SSO Response/Assertion Binding via the OAM Administration Console, perform the following steps: Go to the OAM Administration Console: http(s)://oam-admin-host:oam-admin-port/oamconsole Navigate to Identity Federation -> Service Provider Administration Open the IdP Partner you wish to modify Check the "HTTP POST SSO Response Binding" box to request the IdP to return the SSO Response via HTTP POST, otherwise uncheck it to request artifact Save WLST Command To configure the SSO Response/Assertion Binding as well as the AuthnRequest Binding via the OIF WLST configureSAMLBinding() command, perform the following steps: Enter the WLST environment by executing:$IAM_ORACLE_HOME/common/bin/wlst.sh Connect to the WLS Admin server:connect() Navigate to the Domain Runtime branch:domainRuntime() Execute the configureSAMLBinding() command:configureSAMLBinding("PARTNER", "PARTNER_TYPE", binding, ssoResponseBinding="httppost") Replace PARTNER with the Partner name Replace PARTNER_TYPE with the Partner type (idp or sp) Replace binding with the binding to be used to send the AuthnRequest and LogoutRequest/LogoutResponse messages (should be httpredirect in most case; default) httppost for HTTP-POST binding httpredirect for HTTP-Redirect binding Specify optionally ssoResponseBinding to indicate how the SSO Assertion should be sent back httppost for HTTP-POST binding artifactfor for Artifact binding An example would be:configureSAMLBinding("AcmeIdP", "idp", "httpredirect", ssoResponseBinding="httppost") Exit the WLST environment:exit() Test In this test, OIF/SP is integrated with a remote SAML 2.0 IdP Partner, with the OOTB configuration which requests HTTP-POST from the IdP to send the SSO Assertion. Based on this setup, when OIF/SP starts a Federation SSO flow, the following SAML 2.0 AuthnRequest would be generated: <samlp:AuthnRequest ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ID="id-E4BOT7lwbYK56lO57dBaqGUFq01WJSjAHiSR60Q4" Version="2.0" IssueInstant="2014-04-01T21:39:14Z" Destination="https://acme.com/saml20/sso"> <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://sp.com/oam/fed</saml:Issuer> <samlp:NameIDPolicy AllowCreate="true"/></samlp:AuthnRequest> In the next article, I will cover the various crypto configuration properties in OIF that are used to affect the Federation SSO exchanges.Cheers,Damien Carru

Read the article
Gartner PCC Follow-up: Interview with Chaeny Emanavin, Usability Lead - Office of Information Develo

- by [email protected]

Last week at the Gartner Portals, Content and Collaboration conference in Baltimore, Chaeny and I co-presented on Oracle Enterprise 2.0 and BIA's Citizen Portal. Chaeny's presentation about the BIA solution was very well received and I wanted to do a follow-up interview with Chaeny to discuss more details about their solution and its Enterprise 2.0 features. Ajay: What were the main objectives for the BIA Citizen Portal? Chaeny: The BIA Citizen Portal is designed to provide all the services of the Bureau of Indian Affairs to the community of 564 federally recognized tribes that include over 1.9 million American Indians and Alaska Natives. The BIA provides the same breadth of services that the entire U.S. Federal Government provides in one small Bureau. So, we needed a solution that was flexible enough to handle content ranging from law enforcement to housing to education. Key objectives for external users was to use the Web as a communications channel and keep them informed on what services are available. We also wanted to build an internal web presence and community for BIA's 5000 employees to ensure that they update their content, leverage internal experts and create single sources of truth for key policy documents. Ajay: How is the project being implemented? Chaeny: We are using a phased approach. In phases 1 & 2, interim internal and external sites were built to ensure usability and functional requirements are being met. In Phases 3 & 4, we built out a modern internal and external presence using Oracle WebCenter Suite and Oracle Universal Content Management (UCM), including enabling delegated content management for our internal business units. Phase 4 was completed in January 2010. Phase 5 will add deeper Enterprise 2.0 collaboration capabilities to the solution. Ajay: Are you integrating any existing sites into the new solution? Chaeny: Yes, we have a SharePoint implementation that we are using for document management. We needed more precise functionality however. We found that SharePoint would let individual administrators of a SharePoint site actually create new sites. In a 3 months span, we had over 200 new sites created and most were not being used. So, we had an enormous sprawl problem. Our requirements mandated increased governance and more granular control over the creation of sites and flexible user access to content. In SharePoint this required custom code and was very time-intensive which was unfeasible given our tight deadlines. We are piloting Oracle WebCenter Spaces as our collaboration solution to mitigate these issues. However, we must integrate our existing SharePoint investment which we can do easily by using the SharePoint connectors available in Oracle WebCenter and UCM. Ajay: What were the key design parameters for your solution? Chaeny: We wanted everything driven by standards and policies. We created a cross-functional steering group called the Indian Affairs Web Council to codify policies that were baked into the system. Other key design areas were focused on security/governance, self-service content management, ease of use, integration with legacy applications and seamless single sign-on. We are using Dublin Core as our metadata standard. We also are using Java, APEX, and ADF as our development standards. Ajay: Why was it important to standardize on a platform? Chaeny: We initially looked at best-of-breed solutions, but we faced a lot of issues getting the different solutions to work together. Going with an integrated solution was more economical, easier to learn and faster to deliver the solution. Ajay: What type of legacy applications are you integrating into the portal? Chaeny: Initially we are starting with administrative apps such as people directory and user admin and then we will integrate HR and Financial applications among others. Ajay: Can you describe some of the E20 collaboration features you are putting into the solution? Chaeny: We are adding Enterprise 2.0 using Oracle WebCenter Spaces to deliver different collaboration tools such as wikis, blogs and discussion forums. Wikis to create rapid, ad hoc monthly roll-up reports; discussion forums to provide context-specific help; blogs to capture tacit organization knowledge from experts, identify gurus and turn tacit knowledge into explicit knowledge. Ajay: Are you doing anything specifically to spur adoption and usage? Chaeny: Yes, we did several things that I think helped us ramp quickly. First, we met our commitments for the new system launch date and also provided extra resources for a customer support "hotline" during the launch period. Prior to launch, we did exhaustive usability studies to capture user requirements around functionality, navigation and other key interaction areas. We also created extensive training programs so that the content managers in each business unit were comfortable using the content management tools and knew the best practices for usage. Finally, to launch the Enterprise 2.0 collaboration capabilities, we are working with a pilot group from the Division of Forestry and Wildland Fire Management of BIA. This group of people in the past have been willing early adopters and they have a strong business need to collaborate with many agencies both internal and external across State, County and other Federal jurisdictions. Their feedback is key to helping us launch Enterprise 2.0 successfully in our broader organization. Ajay: What were the biggest benefits to internal BIA employees and to the external community of users? Chaeny: For our employees, the new Enterprise 2.0-based solution will make it easier to find information; enhance employee productivity by embedding standard business processes into the system and create more of a community by creating connections with experts via social collaboration to ultimately provide better services more quickly. For the external American Indian and Alaska Native communities, we have a better relationship with the users and the new site has improved BIA's perception as a more responsive and customer-centric organization.

Read the article
Protecting offline IRM rights and the error "Unable to Connect to Offline database"

- by Simon Thorpe

One of the most common problems I get asked about Oracle IRM is in relation to the error message "Unable to Connect to Offline database". This error message is a result of how Oracle IRM is protecting the cached rights on the local machine and if that cache has become invalid in anyway, this error is thrown. Offline rights and security First we need to understand how Oracle IRM handles offline use. The way it is implemented is one of the main reasons why Oracle IRM is the leading document security solution and demonstrates our methodology to ensure that solutions address both security and usability and puts the balance of these two in your control. Each classification has a set of predefined roles that the manager of the classification can assign to users. Each role has an offline period which determines the amount of time a user can access content without having to communicate with the IRM server. By default for the context model, which is the classification system that ships out of the box with Oracle IRM, the offline period for each role is 3 days. This is easily changed however and can be as low as under an hour to as long as years. It is also possible to switch off the ability to access content offline which can be useful when content is very sensitive and requires a tight leash. So when a user is online, transparently in the background, the Oracle IRM Desktop communicates with the server and updates the users rights and offline periods. This transparent synchronization period is determined by the server and communicated to all IRM Desktops and allows for users rights to be kept up to date without their intervention. This allows us to support some very important scenarios which are key to a successful IRM solution. A user doesn't have to make any decision when going offline, they simply unplug their laptop and they already have their offline periods synchronized to the maximum values. Any solution that requires a user to make a decision at the point of going offline isn't going to work because people forget to do this and will therefore be unable to legitimately access their content offline. If your rights change to REMOVE your access to content, this also happens in the background. This is very useful when someone has an offline duration of a week and they happen to make a connection to the internet 3 days into that offline period, the Oracle IRM Desktop detects this online state and automatically updates all rights for the user. This means the business risk is reduced when setting long offline periods, because of the daily transparent sync, you can reflect changes as soon as the user is online. Of course, if they choose not to come online at all during that week offline period, you cannot effect change, but you take that risk in giving the 7 day offline period in the first place. If you are added to a NEW classification during the day, this will automatically be synchronized without the user even having to open a piece of content secured against that classification. This is very important, consider the scenario where a senior executive downloads all their email but doesn't open any of it. Disconnects the laptop and then gets on a plane. During the flight they attempt to open a document attached to a downloaded email which has been secured against an IRM classification the user was not even aware they had access to. Because their new role in this classification was automatically synchronized their experience is a good one and the document opens. More information on how the Oracle IRM classification model works can be found in this article by Martin Abrahams. So what about problems accessing the offline rights database? So onto the core issue... when these rights are cached to your machine they are stored in an encrypted database. The encryption of this offline database is keyed to the instance of the installation of the IRM Desktop and the Windows user account. Why? Well what you do not want to happen is for someone to get their rights for content and then copy these files across hundreds of other machines, therefore getting access to sensitive content across many environments. The IRM server has a setting which controls how many times you can cache these rights on unique machines. This is because people typically access IRM content on more than one computer. Their work desktop, a laptop and often a home computer. So Oracle IRM allows for the usability of caching rights on more than one computer whilst retaining strong security over this cache. So what happens if these files are corrupted in someway? That's when you will see the error, Unable to Connect to Offline database. The most common instance of seeing this is when you are using virtual machines and copy them from one computer to the next. The virtual machine software, VMWare Workstation for example, makes changes to the unique information of that virtual machine and as such invalidates the offline database. How do you solve the problem? Resolution is however simple. You just delete all of the offline database files on the machine and they will be recreated with working encryption when the Oracle IRM Desktop next starts. However this does mean that the IRM server will think you have your rights cached to more than one computer and you will need to rerequest your rights, even though you are only going to be accessing them on one. Because it still thinks the old cache is valid. So be aware, it is good practice to increase the server limit from the default of 1 to say 3 or 4. This is done using the Enterprise Manager instance of IRM. So to delete these offline files I have a simple .bat file you can use; Download DeleteOfflineDBs.bat Note that this uses pskillto stop the irmBackground.exe from running. This is part of the IRM Desktop and holds open a lock to the offline database. Either kill this from task manager or use pskillas part of the script.

Read the article
Seven Worlds will collide…. High Availability BI is not such a Distant Sun.

- by Testas

Over the last 5 years I have observed Microsoft persevere with the notion of Self Service BI over a series of conferences as far back as SQLBits V in Newport. The release of SQL Server 2012, improvements in Excel and the integration with SharePoint 2010 is making this a reality. Business users are now empowered to create their own BI reports through a number of different technologies such as PowerPivot, PowerView and Report Builder. This opens up a whole new way of working; improving staff productivity, promoting efficient decision making and delivering timely business reports. There is, however; a serious question to answer. What happens should any of these applications become unavailable? More to the point, how would the business react should key business users be unable to fulfil reporting requests for key management meetings when they require it? While the introduction of self-service BI will provide instant access to the creation of management information reports, it will also cause instant support calls should the access to the data become unavailable. These are questions that are often overlooked when a business evaluates the need for self-service BI. But as I have written in other blog posts, the thirst for information is unquenchable once the business users have access to the data. When they are unable to access the information, you will be the first to know about it and will be expected to have a resolution to the downtime as soon as possible. The world of self-service BI is pushing reporting and analytical databases to the tier 1 application level for some of Coeo’s customers. A level that is traditionally associated with mission critical OLTP environments. There is recognition that by making BI readily available to the business user, provisions also need to be made to ensure that the solution is highly available so that there is minimal disruption to the business. This is where High Availability BI infrastructures provide a solution. As there is a convergence of technologies to support a self-service BI culture, there is also a convergence of technologies that need to be understood in order to provide the high availability architecture required to support the self-service BI infrastructure. While you may not be the individual that implements these components, understanding the concepts behind these components will empower you to have meaningful discussions with the right people should you put this infrastructure in place. There are 7 worlds that you will have to understand to successfully implement a highly available BI infrastructure 1. Server/Virtualised server hardware/software 2. DNS 3. Network Load Balancing 4. Active Directory 5. Kerberos 6. SharePoint 7. SQL Server I have found myself over the last 6 months reaching out to knowledge that I learnt years ago when I studied for the Windows 2000 and 2003 (MCSE) Microsoft Certified System Engineer. (To the point that I am resuming my studies for the Windows Server 2008 equivalent to be up to date with newer technologies) This knowledge has proved very useful in the numerous engagements I have undertaken since being at Coeo, particularly when dealing with High Availability Infrastructures. As a result of running my session at SQLBits X and SQL Saturday in Dublin, the feedback I have received has been that many individuals desire to understand more of the concepts behind the first 6 “worlds” in the list above. Over the coming weeks, a series of blog posts will be put on this site to help understand the key concepts of each area as it pertains to a High Availability BI Infrastructure. Each post will not provide exhaustive coverage of the topic. For example DNS can be a book in its own right when you consider that there are so many different configuration options with Forward Lookup, Reverse Lookups, AD Integrated Zones and DNA forwarders to name some examples. What I want to do is share the pertinent points as it pertains to the BI infrastructure that you build so that you are equipped with the knowledge to have the right discussion when planning this infrastructure. Next, we will focus on the server infrastructure that will be required to support the High Availability BI Infrastructure, from both a physical box and virtualised perspective. Thanks Chris

Read the article
Creating an ASP.NET report using Visual Studio 2010 - Part 2

- by rajbk

We continue building our report in this three part series. Creating an ASP.NET report using Visual Studio 2010 - Part 1 Creating an ASP.NET report using Visual Studio 2010 - Part 3 Creating the Client Report Definition file (RDLC) Add a folder called “RDLC”. This will hold our RDLC report. Right click on the RDLC folder, select “Add new item..” and add an “RDLC” name of “Products”. We will use the “Report Wizard” to walk us through the steps of creating the RDLC. In the next dialog, give the dataset a name called “ProductDataSet”. Change the data source to “NorthwindReports.DAL” and select “ProductRepository(GetProductsProjected)”. The fields that are returned from the method are shown on the right. Click next. Drag and drop the ProductName, CategoryName, UnitPrice and Discontinued into the Values container. Note that you can create much more complex grouping using this UI. Click Next. Most of the selections on this screen are grayed out because we did not choose a grouping in the previous screen. Click next. Choose a style for your report. Click next. The report graphic design surface is now visible. Right click on the report and add a page header and page footer. With the report design surface active, drag and drop a TextBox from the tool box to the page header. Drag one more textbox to the page header. We will use the text boxes to add some header text as shown in the next figure. You can change the font size and other properties of the textboxes using the formatting tool bar (marked in red). You can also resize the columns by moving your cursor in between columns and dragging. Adding Expressions Add two more text boxes to the page footer. We will use these to add the time the report was generated and page numbers. Right click on the first textbox in the page footer and select “Expression”. Add the following expression for the print date (note the = sign at the left of the expression in the dialog below) "© Northwind Traders " & Format(Now(),"MM/dd/yyyy hh:mm tt") Right click on the second text box and add the following for the page count. Globals.PageNumber & " of " & Globals.TotalPages Formatting the page footer is complete. We are now going to format the “Unit Price” column so it displays the number in currency format. Right click on the [UnitPrice] column (not header) and select “Text Box Properties..” Under “Number”, select “Currency”. Hit OK. Adding a chart With the design surface active, go to the toolbox and drag and drop a chart control. You will need to move the product list table down first to make space for the chart contorl. The document can also be resized by dragging on the corner or at the page header/footer separator. In the next dialog, pick the first chart type. This can be changed later if needed. Click OK. The chart gets added to the design surface. Click on the blue bars in the chart (not legend). This will bring up drop locations for dropping the fields. Drag and drop the UnitPrice and CategoryName into the top (y axis) and bottom (x axis) as shown below. This will give us the total unit prices for a given category. That is the best I could come up with as far as what report to render, sorry :-) Delete the legend area to get more screen estate. Resize the chart to your liking. Change the header, x axis and y axis text by double clicking on those areas. We made it this far. Let’s impress the client by adding a gradient to the bar graph :-) Right click on the blue bar and select “Series properties”. Under “Fill”, add a color and secondary color and select the Gradient style. We are done designing our report. In the next section you will see how to add the report to the report viewer control, bind to the data and make it refresh when the filter criteria are changed. Creating an ASP.NET report using Visual Studio 2010 - Part 3

Read the article
New .NET Library for Accessing the Survey Monkey API

- by Ben Emmett

I’ve used Survey Monkey’s API for a while, and though it’s pretty powerful, there’s a lot of boilerplate each time it’s used in a new project, and the json it returns needs a bunch of processing to be able to use the raw information. So I’ve finally got around to releasing a .NET library you can use to consume the API more easily. The main advantages are: Only ever deal with strongly-typed .NET objects, making everything much more robust and a lot faster to get going Automatically handles things like rate-limiting and paging through results Uses combinations of endpoints to get all relevant data for you, and processes raw response data to map responses to questions To start, either install it using NuGet with PM> Install-Package SurveyMonkeyApi (easier option), or grab the source from https://github.com/bcemmett/SurveyMonkeyApi if you prefer to build it yourself. You’ll also need to have signed up for a developer account with Survey Monkey, and have both your API key and an OAuth token. A simple usage would be something like: string apiKey = "KEY"; string token = "TOKEN"; var sm = new SurveyMonkeyApi(apiKey, token); List<Survey> surveys = sm.GetSurveyList(); The surveys object is now a list of surveys with all the information available from the /surveys/get_survey_list API endpoint, including the title, id, date it was created and last modified, language, number of questions / responses, and relevant urls. If there are more than 1000 surveys in your account, the library pages through the results for you, making multiple requests to get a complete list of surveys. All the filtering available in the API can be controlled using .NET objects. For example you might only want surveys created in the last year and containing “pineapple” in the title: var settings = new GetSurveyListSettings { Title = "pineapple", StartDate = DateTime.Now.AddYears(-1) }; List<Survey> surveys = sm.GetSurveyList(settings); By default, whenever optional fields can be requested with a response, they will all be fetched for you. You can change this behaviour if for some reason you explicitly don’t want the information, using var settings = new GetSurveyListSettings { OptionalData = new GetSurveyListSettingsOptionalData { DateCreated = false, AnalysisUrl = false } }; Survey Monkey’s 7 read-only endpoints are supported, and the other 4 which make modifications to data might be supported in the future. The endpoints are: Endpoint Method Object returned /surveys/get_survey_list GetSurveyList() List<Survey> /surveys/get_survey_details GetSurveyDetails() Survey /surveys/get_collector_list GetCollectorList() List<Collector> /surveys/get_respondent_list GetRespondentList() List<Respondent> /surveys/get_responses GetResponses() List<Response> /surveys/get_response_counts GetResponseCounts() Collector /user/get_user_details GetUserDetails() UserDetails /batch/create_flow Not supported Not supported /batch/send_flow Not supported Not supported /templates/get_template_list Not supported Not supported /collectors/create_collector Not supported Not supported The hierarchy of objects the library can return is Survey List<Page> List<Question> QuestionType List<Answer> List<Item> List<Collector> List<Response> Respondent List<ResponseQuestion> List<ResponseAnswer> Each of these classes has properties which map directly to the names of properties returned by the API itself (though using PascalCasing which is more natural for .NET, rather than the snake_casing used by SurveyMonkey). For most users, Survey Monkey imposes a rate limit of 2 requests per second, so by default the library leaves at least 500ms between requests. You can request higher limits from them, so if you want to change the delay between requests just use a different constructor: var sm = new SurveyMonkeyApi(apiKey, token, 200); //200ms delay = 5 reqs per sec There’s a separate cap of 1000 requests per day for each API key, which the library doesn’t currently enforce, so if you think you’ll be in danger of exceeding that you’ll need to handle it yourself for now. To help, you can see how many requests the current instance of the SurveyMonkeyApi object has made by reading its RequestsMade property. If the library encounters any errors, including communicating with the API, it will throw a SurveyMonkeyException, so be sure to handle that sensibly any time you use it to make calls. Finally, if you have a survey (or list of surveys) obtained using GetSurveyList(), the library can automatically fill in all available information using sm.FillMissingSurveyInformation(surveys); For each survey in the list, it uses the other endpoints to fill in the missing information about the survey’s question structure, respondents, and responses. This results in at least 5 API calls being made per survey, so be careful before passing it a large list. It also joins up the raw response information to the survey’s question structure, so that for each question in a respondent’s set of replies, you can access a ProcessedAnswer object. For example, a response to a dropdown question (from the /surveys/get_responses endpoint) might be represented in json as { "answers": [ { "row": "9384627365", } ], "question_id": "615487516" } Separately, the question’s structure (from the /surveys/get_survey_details endpoint) might have several possible answers, one of which might look like { "text": "Fourth item in dropdown list", "visible": true, "position": 4, "type": "row", "answer_id": "9384627365" } The library understands how this mapping works, and uses that to give you the following ProcessedAnswer object, which first describes the family and type of question, and secondly gives you the respondent’s answers as they relate to the question. Survey Monkey has many different question types, with 11 distinct data structures, each of which are supported by the library. If you have suggestions or spot any bugs, let me know in the comments, or even better submit a pull request .

Read the article
Debugging Tips for Skinning

- by Christian David Straub

Another guest post by Jeanne Waldman.If you are developing a skin for your Fusion Application in JDeveloper you should know these tips. 1. Firebug is your friend 2. Uncompress the css style classes 3. CHECK_FILE_MODIFICATION so that you see your skinning changes right away 4. View the generated CSS File 1. Firebug is your friend Install Firebug (http://getfirebug.com/layout) into Firefox and use it to view your rendered jspx page in the browser. You can select the HTML dom nodes on your page and you can see the css styles applied to each dom node. 2. Uncompress the css style classes By default the styleclasses that are rendered are compressed. You may see style classes like class="x10" and class="x2e". But in your skin css file you have styleclasses like: af|inputText::content or af|panelBox::header It is easier for you to develop a skin and debug a skin with Firebug if you see the uncompressed styleclasses. To do this, a. open web.xml b. add <context-param> <param-name>org.apache.myfaces.trinidad.DISABLE_CONTENT_COMPRESSION</param-name> <param-value>true</param-value> </context-param> c. save d. restart the server and re-run your page. 3. CHECK_FILE_MODIFICATION so that you see your skinning changes right away For performance sake the ADF Faces framework does not check if you skin .css file has changed on every render. But this is exactly what you want to happen when you are developing or debugging a skin. You want your changes to get noticed right away, without restarting the server. To do this, a. open web.xml b. add <context-param> <description>If this parameter is true, there will be an automatic check of the modification date of your JSPs, and saved state will be discarded when JSP's change. It will also automatically check if your skinning css files have changed without you having to restart the server. This makes development easier, but adds overhead. For this reason this parameter should be set to false when your application is deployed.</description> <param-name>org.apache.myfaces.trinidad.CHECK_FILE_MODIFICATION</param-name> <param-value>false</param-value> </context-param> c. save d. restart the server and re-run your page. e. from then on, you can change your skin's .css file, save it and refresh your page and you should see the changes right away 4. View the generated CSS File There are different ways to view the generated CSS File which is your skin's css file merged in with all the skins it extends and processed and generated to the filesystem and linked to your generated html page. One way is to view it with Firebug. The problem with this approach is you might see something that is a little different than the actual css file because Firebug may do some extra processing. I like to view the generated css file by: Right click on your page in the browser

Read the article
SQL SERVER – Solution – Puzzle – Statistics are not Updated but are Created Once

- by pinaldave

Earlier I asked puzzle why statistics are not updated. Read the complete details over here: Statistics are not Updated but are Created Once In the question I have demonstrated even though statistics should have been updated after lots of insert in the table are not updated.(Read the details SQL SERVER – When are Statistics Updated – What triggers Statistics to Update) In this example I have created following situation: Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated Auto Update Statistics and Auto Create Statistics for database is TRUE Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue? I have many answers – here is the how I fixed it which has resolved the issue for me. NOTE: There are multiple answers to this problem and I will do my best to list all. Solution: Create nonclustered Index on column City Here is the working example for the same. Let us understand this script and there is added explanation at the end. -- Execution Plans Difference -- Estimated Execution Plan Vs Actual Execution Plan -- Create Sample Database CREATE DATABASE SampleDB GO USE SampleDB GO -- Create Table CREATE TABLE ExecTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO CREATE NONCLUSTERED INDEX IX_ExecTable1 ON ExecTable (City); GO -- Insert One Thousand Records -- INSERT 1 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -------------------------------------------------------------- -- Round 2 -- Insert One Thousand Records -- INSERT 2 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -- Clean up Database DROP TABLE ExecTable GO When I created non clustered index on the column city, it also created statistics on the same column with same name as index. When we populate the data in the column the index is update – resulting execution plan to be invalided – this leads to the statistics to be updated in next execution of SELECT. This behavior does not happen on Heap or column where index is auto created. If you explicitly update the index, often you can see the statistics are updated as well. You can see this is for sure happening if you follow the tell of John Sansom. John Sansom‘s suggestion: That was fun! Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused. It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics. You can see this in action by forcing the second statement to recompile. SELECT FirstName, LastName, City FROM ExecTable WHERE City = ‘New York’ option(RECOMPILE) GO Kevin Cross also have another suggestion: I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work. i.e., run this before round 2: ————————————————————– – Clear execution plan cache before next test DBCC FREEPROCCACHE WITH NO_INFOMSGS; ————————————————————– Nice puzzle! Kevin As this was puzzle John and Kevin both got the correct answer, there was no condition for answer to be part of best practices. I know John and he is finest DBA around – his tremendous knowledge has always impressed me. John and Kevin both will agree that clearing cache either using DBCC FREEPROCCACHE and recompiling each query every time is for sure not good advice on production server. It is correct answer but not best practice. By the way, if you have better solution or have better suggestion please advise. I am open to change my answer and publish further improvement to this solution. On very separate note, I like to have clustered index on my Primary Key, which I have not mentioned here as it is out of the scope of this puzzle. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Index, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Statistics

Read the article

Search Results

Search found 13955 results on 559 pages for 'date comparison'.

Page 514/559 | < Previous Page | 510 511 512 513 514 515 516 517 518 519 520 521 | Next Page >

- by pinaldave

- by pinaldave

- by Paul White

- by JoshReuben

- by Matthew Guay

- by DigiMortal

- by Chris Hoffman

- by Matthew Guay

- by pinaldave

- by Christian David Straub

- by Simon Thorpe

- by Victor Hurdugaci

- by pinaldave

- by Tara Kizer

- by Rick Strahl

- by Damien Carru

- by [email protected]

- by Simon Thorpe

- by Testas

- by rajbk

- by Ben Emmett

- by Christian David Straub

- by pinaldave

< Previous Page | 510 511 512 513 514 515 516 517 518 519 520 521 | Next Page >