Search Results

Search found 7127 results on 286 pages for 'calculated columns'.

Page 97/286 | < Previous Page | 93 94 95 96 97 98 99 100 101 102 103 104 | Next Page >

PowerPivot and the Slowly Changing Dimensions

- by AlbertoFerrari

Slowly changing dimensions are very common in the data warehouses and, basically, they store many versions of the same entity whenever a change happens in the columns for which history needs to be maintained. For example, the AdventureWorks data warehouse has a type 2 SCD in the DimProduct table. It can be easily checked for the product code “FR-M94S-38” which shows three different versions of itself, with changing product cost and list price. This is exactly what we can expect to find in any data...(read more)

Read the article
Solaris X86 AESNI OpenSSL Engine

- by danx

Solaris X86 AESNI OpenSSL Engine Cryptography is a major component of secure e-commerce. Since cryptography is compute intensive and adds a significant load to applications, such as SSL web servers (https), crypto performance is an important factor. Providing accelerated crypto hardware greatly helps these applications and will help lead to a wider adoption of cryptography, and lower cost, in e-commerce and other applications. The Intel Westmere microprocessor has six new instructions to acclerate AES encryption. They are called "AESNI" for "AES New Instructions". These are unprivileged instructions, so no "root", other elevated access, or context switch is required to execute these instructions. These instructions are used in a new built-in OpenSSL 1.0 engine available in Solaris 11, the aesni engine. Previous Work Previously, AESNI instructions were introduced into the Solaris x86 kernel and libraries. That is, the "aes" kernel module (used by IPsec and other kernel modules) and the Solaris pkcs11 library (for user applications). These are available in Solaris 10 10/09 (update 8) and above, and Solaris 11. The work here is to add the aesni engine to OpenSSL. X86 AESNI Instructions Intel's Xeon 5600 is one of the processors that support AESNI. This processor is used in the Sun Fire X4170 M2 As mentioned above, six new instructions acclerate AES encryption in processor silicon. The new instructions are: aesenc performs one round of AES encryption. One encryption round is composed of these steps: substitute bytes, shift rows, mix columns, and xor the round key. aesenclast performs the final encryption round, which is the same as above, except omitting the mix columns (which is only needed for the next encryption round). aesdec performs one round of AES decryption aesdeclast performs the final AES decryption round aeskeygenassist Helps expand the user-provided key into a "key schedule" of keys, one per round aesimc performs an "inverse mixed columns" operation to convert the encryption key schedule into a decryption key schedule pclmulqdq Not a AESNI instruction, but performs "carryless multiply" operations to acclerate AES GCM mode. Since the AESNI instructions are implemented in hardware, they take a constant number of cycles and are not vulnerable to side-channel timing attacks that attempt to discern some bits of data from the time taken to encrypt or decrypt the data. Solaris x86 and OpenSSL Software Optimizations Having X86 AESNI hardware crypto instructions is all well and good, but how do we access it? The software is available with Solaris 11 and is used automatically if you are running Solaris x86 on a AESNI-capable processor. AESNI is used internally in the kernel through kernel crypto modules and is available in user space through the PKCS#11 library. For OpenSSL on Solaris 11, AESNI crypto is available directly with a new built-in OpenSSL 1.0 engine, called the "aesni engine." This is in lieu of the extra overhead of going through the Solaris OpenSSL pkcs11 engine, which accesses Solaris crypto and digest operations. Instead, AESNI assembly is included directly in the new aesni engine. Instead of including the aesni engine in a separate library in /lib/openssl/engines/, the aesni engine is "built-in", meaning it is included directly in OpenSSL's libcrypto.so.1.0.0 library. This reduces overhead and the need to manually specify the aesni engine. Since the engine is built-in (that is, in libcrypto.so.1.0.0), the openssl -engine command line flag or API call is not needed to access the engine—the aesni engine is used automatically on AESNI hardware. Ciphers and Digests supported by OpenSSL aesni engine The Openssl aesni engine auto-detects if it's running on AESNI hardware and uses AESNI encryption instructions for these ciphers: AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CFB128, AES-192-CFB128, AES-256-CFB128, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-OFB, AES-192-OFB, and AES-256-OFB. Implementation of the OpenSSL aesni engine The AESNI assembly language routines are not a part of the regular Openssl 1.0.0 release. AESNI is a part of the "HEAD" ("development" or "unstable") branch of OpenSSL, for future release. But AESNI is also available as a separate patch provided by Intel to the OpenSSL project for OpenSSL 1.0.0. A minimal amount of "glue" code in the aesni engine works between the OpenSSL libcrypto.so.1.0.0 library and the assembly functions. The aesni engine code is separate from the base OpenSSL code and requires patching only a few source files to use it. That means OpenSSL can be more easily updated to future versions without losing the performance from the built-in aesni engine. OpenSSL aesni engine Performance Here's some graphs of aesni engine performance I measured by running openssl speed -evp $algorithm where $algorithm is aes-128-cbc, aes-192-cbc, and aes-256-cbc. These are using the 64-bit version of openssl on the same AESNI hardware, a Sun Fire X4170 M2 with a Intel Xeon E5620 @2.40GHz, running Solaris 11 FCS. "Before" is openssl without the aesni engine and "after" is openssl with the aesni engine. The numbers are MBytes/second. OpenSSL aesni engine performance on Sun Fire X4170 M2 (Xeon E5620 @2.40GHz) (Higher is better; "before"=OpenSSL on AESNI without AESNI engine software, "after"=OpenSSL AESNI engine) As you can see the speedup is dramatic for all 3 key lengths and for data sizes from 16 bytes to 8 Kbytes—AESNI is about 7.5-8x faster over hand-coded amd64 assembly (without aesni instructions). Verifying the OpenSSL aesni engine is present The easiest way to determine if you are running the aesni engine is to type "openssl engine" on the command line. No configuration, API, or command line options are needed to use the OpenSSL aesni engine. If you are running on Intel AESNI hardware with Solaris 11 FCS, you'll see this output indicating you are using the aesni engine: intel-westmere $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support If you are running on Intel without AESNI hardware you'll see this output indicating the hardware can't support the aesni engine: intel-nehalem $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support For Solaris on SPARC or older Solaris OpenSSL software, you won't see any aesni engine line at all. Third-party OpenSSL software (built yourself or from outside Oracle) will not have the aesni engine either. Solaris 11 FCS comes with OpenSSL version 1.0.0e. The output of typing "openssl version" should be "OpenSSL 1.0.0e 6 Sep 2011". 64- and 32-bit OpenSSL OpenSSL comes in both 32- and 64-bit binaries. 64-bit executable is now the default, at /usr/bin/openssl, and OpenSSL 64-bit libraries at /lib/amd64/libcrypto.so.1.0.0 and libssl.so.1.0.0 The 32-bit executable is at /usr/bin/i86/openssl and the libraries are at /lib/libcrytpo.so.1.0.0 and libssl.so.1.0.0. Availability The OpenSSL AESNI engine is available in Solaris 11 x86 for both the 64- and 32-bit versions of OpenSSL. It is not available with Solaris 10. You must have a processor that supports AESNI instructions, otherwise OpenSSL will fallback to the older, slower AES implementation without AESNI. Processors that support AESNI include most Westmere and Sandy Bridge class processor architectures. Some low-end processors (such as for mobile/laptop platforms) do not support AESNI. The easiest way to determine if the processor supports AESNI is with the isainfo -v command—look for "amd64" and "aes" in the output: $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu Conclusion The Solaris 11 OpenSSL aesni engine provides easy access to powerful Intel AESNI hardware cryptography, in addition to Solaris userland PKCS#11 libraries and Solaris crypto kernel modules.

Read the article
SQL SERVER – Find Referenced or Referencing Object in SQL Server using sys.sql_expression_dependencies

- by pinaldave

A very common question which I often receive are: How do I find all the tables used in a particular stored procedure? How do I know which stored procedures are using a particular table? Both are valid question but before we see the answer of this question – let us understand two small concepts – Referenced and Referencing. Here is the sample stored procedure. CREATE PROCEDURE mySP AS SELECT * FROM Sales.Customer GO Reference: The table Sales.Customer is the reference object as it is being referenced in the stored procedure mySP. Referencing: The stored procedure mySP is the referencing object as it is referencing Sales.Customer table. Now we know what is referencing and referenced object. Let us run following queries. I am using AdventureWorks2012 as a sample database. If you do not have SQL Server 2012 here is the way to get SQL Server 2012 AdventureWorks database. Find Referecing Objects of a particular object Here we are finding all the objects which are using table Customer in their object definitions (regardless of the schema). USE AdventureWorks GO SELECT referencing_schema_name = SCHEMA_NAME(o.SCHEMA_ID), referencing_object_name = o.name, referencing_object_type_desc = o.type_desc, referenced_schema_name, referenced_object_name = referenced_entity_name, referenced_object_type_desc = o1.type_desc, referenced_server_name, referenced_database_name --,sed.* -- Uncomment for all the columns FROM sys.sql_expression_dependencies sed INNER JOIN sys.objects o ON sed.referencing_id = o.[object_id] LEFT OUTER JOIN sys.objects o1 ON sed.referenced_id = o1.[object_id] WHERE referenced_entity_name = 'Customer' The above query will return all the objects which are referencing the table Customer. Find Referenced Objects of a particular object Here we are finding all the objects which are used in the view table vIndividualCustomer. USE AdventureWorks GO SELECT referencing_schema_name = SCHEMA_NAME(o.SCHEMA_ID), referencing_object_name = o.name, referencing_object_type_desc = o.type_desc, referenced_schema_name, referenced_object_name = referenced_entity_name, referenced_object_type_desc = o1.type_desc, referenced_server_name, referenced_database_name --,sed.* -- Uncomment for all the columns FROM sys.sql_expression_dependencies sed INNER JOIN sys.objects o ON sed.referencing_id = o.[object_id] LEFT OUTER JOIN sys.objects o1 ON sed.referenced_id = o1.[object_id] WHERE o.name = 'vIndividualCustomer' The above query will return all the objects which are referencing the table Customer. I am just glad to write above query. There are more to write to this subject. In future blog post I will write more in depth about other DMV which also aids in finding referenced data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL DMV, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

Read the article
Hidden Formatting Troubles with STR() (SQL Spackle)

Fill in another bit of your T-SQL knowledge about STR(). It right justifies, rounds, and controls the output width of columns. Sounds perfect but here's why you might not want to use it. Join SQL Backup’s 35,000+ customers to compress and strengthen your backups "SQL Backup will be a REAL boost to any DBA lucky enough to use it." Jonathan Allen. Download a free trial now.

Read the article
Generate DROP statements for all extended properties

- by jamiet

This evening I have been attempting to migrate an existing on-premise database to SQL Azure using the wizard that is built-in to SQL Server Management Studio (SSMS). When I did so I received the following error: The following objects are not supported = [MS_Description] = Extended Property Evidently databases containing extended properties can not be migrated using this particular wizard so I set about removing all of the extended properties – unfortunately there were over a thousand of them so I needed a better way than simply deleting each and every one of them manually. I found a couple of resources online that went some way toward this: Drop all extended properties in a MSSQL database by Angelo Hongens Modifying and deleting extended properties by Adam Aspin Unfortunately neither provided a script that exactly suited my needs. Angelo’s covered extended properties on tables and columns however I had other objects that had extended properties on them. Adam’s looked more complete but when I ran it I got an error: Msg 468, Level 16, State 9, Line 78 Cannot resolve the collation conflict between "Latin1_General_100_CS_AS" and "Latin1_General_CI_AS" in the equal to operation. So, both great resources but I wasn’t able to use either on their own to get rid of all of my extended properties. Hence, I combined the excellent work that Angelo and Adam had provided in order to manufacture my own script which did successfully manage to generate calls to sp_dropextendedproperty for all of my extended properties. If you think you might be able to make use of such a script then feel free to download it from https://skydrive.live.com/redir.aspx?cid=550f681dad532637&resid=550F681DAD532637!16707&parid=550F681DAD532637!16706&authkey=!APxPIQCatzC7BQ8. This script will remove extended properties on tables, columns, check constraints, default constraints, views, sprocs, foreign keys, primary keys, table triggers, UDF parameters, sproc parameters, databases, schemas, database files and filegroups. If you have any object types with extended properties on them that are not in that list then consult Adam’s aforementioned article – it should prove very useful. I repeat here the message that I have placed at the top of the script: /* This script will generate calls to sp_dropextendedproperty for every extended property that exists in your database. Actually, a caveat: I don't promise that it will catch each and every extended property that exists, but I'm confident it will catch most of them! It is based on this: http://blog.hongens.nl/2010/02/25/drop-all-extended-properties-in-a-mssql-database/ by Angelo Hongens. Also had lots of help from this: http://www.sqlservercentral.com/articles/Metadata/72609/ by Adam Aspin Adam actually provides a script at that link to do something very similar but when I ran it I got an error: Msg 468, Level 16, State 9, Line 78 Cannot resolve the collation conflict between "Latin1_General_100_CS_AS" and "Latin1_General_CI_AS" in the equal to operation. So I put together this version instead. Use at your own risk. Jamie Thomson 2012-03-25 */ Hope this is useful to someone! @Jamiet

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
Refreshing imported MySQL data with MySQL for Excel

- by Javier Rivera

Welcome to another blog post from the MySQL for Excel Team. Today we're going to talk about a new feature included since MySQL for Excel 1.3.0, you can install the latest GA or maintenance version using the MySQL Installer or optionally you can download directly any GA or non-GA version from the MySQL Developer Zone.As some users suggested in our forums we should be maintaining the link between tables and Excel not only when editing data through the Edit MySQL Data option, but also when importing data via Import MySQL Data. Before 1.3.0 this process only provided you with an offline copy of the Table's data into Excel and you had no way to refresh that information from the DB later on. Now, with this new feature we'll show you how easy is to work with the latest available information at all times. This feature is transparent to you (it doesn't require additional steps to work as long as the users had the Create an Excel Table for the imported MySQL table data option enabled. To ensure you have this option checked, click over Advanced Options... after the Import Data dialog is displayed). The current blog post assumes you already know how to import data into excel, you could always take a look at our previous post How To - Guide to Importing Data from a MySQL Database to Excel using MySQL for Excel if you need further reference on that topic. After importing Data from a MySQL Table into Excel, you can refresh the data in 3 ways.1. Simply right click over the range of the imported data, to show the pop-up menu: Click over the Refresh button to obtain the latest copy of the data in the table. 2. Click the Refresh button on the Data ribbon: 3. Click the Refresh All button in the Data ribbon (beware this will refresh all Excel tables in the Workbook): Please take a note of a couple of details here, the first one is about the size of the table. If by the time you refresh the table new columns had been added to it, and you originally have imported all columns, the table will grow to the right. The same applies to rows, if the table has new rows and you did not limit the results , the table will grow to to the bottom of the sheet in Excel. The second detail you should take into account is this operation will overwrite any changes done to the cells after the table was originally imported or previously refreshed: Now with this new feature, imported data remains linked to the data source and is available to be updated at all times. It empowers the user to always be able to work with the latest version of the imported MySQL data. We hope you like this this new feature and give it a try! Remember that your feedback is very important for us, so drop us a message with your comments, suggestions for this or other features and follow us at our social media channels: MySQL on Windows (this) Blog: https://blogs.oracle.com/MySqlOnWindows/ MySQL for Excel forum: http://forums.mysql.com/list.php?172 Facebook: http://www.facebook.com/mysql YouTube channel: https://www.youtube.com/user/MySQLChannel Thanks!

Read the article
SQL Server SELECT INTO

- by Derek Dieter

The most efficient method of copying a result set into a new table is to use the SELECT INTO method. This method also follows a very simple syntax. [/sql] SELECT * INTO dbo.NewTableName FROM dbo.ExistingTable [sql] Once the query above is executed, all the columns and data in the table ExistingTable (along with their datatypes) will be copied into a [...]

Read the article
Copying & Pasting Rows Between Grids in SQL Developer

- by thatjeffsmith

Apologies for slacking on the blogging front here lately. Still mentally hung over from Open World, and lots of things going on behind the scenes here in Oracle-land. Whilst (love that word) blogging is part of my job, it’s not the ONLY part of my job So a super-quick and dirty ‘trick’ this morning. Copying Query Result Record as New Row in Table Copy and paste is something everyone ‘gets.’ I don’t know we have to thank for that, whether it’s Microsoft or Xerox, but it’s been ingrained in our way of dealing with all things computers. Almost to the detriment of some of our users – they’ll use Copy and Paste when perhaps our Export feature is superior, but I digress. Where it does work just fine is when you want to create a new row in your table that matches a row you have retrieved from an executed query. Just click in the gutter or row number to get the entire row selected Once you have your data selected, do your thing, i.e. ctrl+C or Command/Apple+C or whatever. Now open your view or table editor, go to the data page, and ask for a new row. New record, no data Paste in the data from the clipboard. It’s smart enough to paste the separate values out to the separate columns. The clipboard saves the day, again. If your columns orders are different, just change the order in the grids. If you have extra information, don’t copy the entire row. I know, I know – Jeff this is too simple, why are you wasting our time here? It seems intuitive, but how many of you actually tried this before reading it just now? I seem to get more positive feedback from the very basic user interface 101 tips than the esoteric click-click-click-ctrl-shift-click tricks I prefer to post. Lots of interesting stuff on tap, so stay tuned!

Read the article
Using CTAS & Exchange Partition Replace IAS for Copying Partition on Exadata

- by Bandari Huang

Usage Scenario: Copy data&index from one partition to another partition in a partitioned table. Solution: Create a partition definition Copy data from one partition to another partiton by 'Insert as select (IAS)' Create a nonpartitioned table by 'Create table as select (CTAS)' Convert a nonpartitioned table into a partition of partitoned table by exchangng their data segments. Rebuild unusable index Exchange Partition Convertion Mutual convertion between a partition (or subpartition) and a nonpartitioned table Mutual convertion between a hash-partitioned table and a partition of a composite *-hash partitioned table Mutual convertiton a [range | list]-partitioned table into a partition of a composite *-[range | list] partitioned table. Exchange Partition Usage Scenario High-speed data loading of new, incremental data into an existing partitioned table in DW environment Exchanging old data partitions out of a partitioned table, the data is purged from the partitioned table without actually being deleted and can be archived separately Exchange Partition Syntax ALTER TABLE schema.table EXCHANGE [PARTITION|SUBPARTITION] [partition|subprtition] WITH TABLE schema.table [INCLUDE|EXCLUDING] INDEX [WITH|WITHOUT] VALIDATION UPDATE [INDEXES|GLOBAL INDEXES] INCLUDING | EXCLUDING INDEXES Specify INCLUDING INDEXES if you want local index partitions or subpartitions to be exchanged with the corresponding table index (for a nonpartitioned table) or local indexes (for a hash-partitioned table). Specify EXCLUDING INDEXES if you want all index partitions or subpartitions corresponding to the partition and all the regular indexes and index partitions on the exchanged table to be marked UNUSABLE. If you omit this clause, then the default is EXCLUDING INDEXES. WITH | WITHOUT VALIDATION Specify WITH VALIDATION if you want Oracle Database to return an error if any rows in the exchanged table do not map into partitions or subpartitions being exchanged. Specify WITHOUT VALIDATION if you do not want Oracle Database to check the proper mapping of rows in the exchanged table. If you omit this clause, then the default is WITH VALIDATION. UPADATE INDEX|GLOBAL INDEX Unless you specify UPDATE INDEXES, the database marks UNUSABLE the global indexes or all global index partitions on the table whose partition is being exchanged. Global indexes or global index partitions on the table being exchanged remain invalidated. (You cannot use UPDATE INDEXES for index-organized tables. Use UPDATE GLOBAL INDEXES instead.) Exchanging Partitions&Subpartitions Notes Both tables involved in the exchange must have the same primary key, and no validated foreign keys can be referencing either of the tables unless the referenced table is empty. When exchanging partitioned index-organized tables: – The source and target table or partition must have their primary key set on the same columns, in the same order. – If key compression is enabled, then it must be enabled for both the source and the target, and with the same prefix length. – Both the source and target must be index organized. – Both the source and target must have overflow segments, or neither can have overflow segments. Also, both the source and target must have mapping tables, or neither can have a mapping table. – Both the source and target must have identical storage attributes for any LOB columns.

Read the article
MySQL for Excel 1.1.3 has been released

- by Javier Treviño

The MySQL Windows Experience Team is proud to announce the release of MySQL for Excel version 1.1.3, the latest addition to the MySQL Installer for Windows. MySQL for Excel is an application plug-in enabling data analysts to very easily access and manipulate MySQL data within Microsoft Excel. It enables you to directly work with a MySQL database from within Microsoft Excel so you can easily do tasks such as: Importing MySQL Data into Excel Exporting Excel data directly into MySQL to a new or existing table Editing MySQL data directly within Excel MySQL for Excel is installed using the MySQL Installer for Windows. The MySQL installer comes in 2 versions Full (150 MB) which includes a complete set of MySQL products with their binaries included in the download Web (1.5 MB - a network install) which will just pull MySQL for Excel over the web and install it when run. You can download MySQL Installer from our official Downloads page at http://dev.mysql.com/downloads/installer/. MySQL for Excel 1.1.3 introduces the following features: Upon saving a Workbook containing Worksheets in Edit Mode, the user is asked if he wants to exit the Edit Mode on all Worksheets before their parent Workbook is saved so the Worksheets are saved unprotected, otherwise the Worksheets will remain protected and the users will be able to unprotect them later retrieving the passkeys from the application log after closing MySQL for Excel. Added background coloring to the column names header row of an Import Data operation to have the same look as the one in an Edit Data operation (i.e. gray-ish background). Connection passwords can be stored securely just like MySQL Workbench does and these secured passwords are shared with Workbench in the same way connections are. Changed the way the MySQL for Excel ribbon toggle button works, instead of just showing or hiding the add-in it actually opens and closes it. Added a connection test before any operation against the database (schema creation, data import, append, export or edition) so the operation dialog is not shown and a friendlier error message is shown. Also this release contains the following bug fixes: Added a check on every connection test for an expired password, if the password has been expired a dialog is now shown to the user to reset the password. Bug #17354118 - DON'T HANDLE EXPIRED PASSWORDS Added code to escape text values to be imported to an Excel worksheet that start with an equals sign so Excel does not treat those values as formulas that will fail evaluation. This is an option turned on by default that can be turned off by users if they wish to import values to be treated as Excel formulas. Bug #17354102 - ERROR IMPORTING TEXT VALUES TO EXCEL STARTING WITH AN EQUALS SIGN Added code to properly check the reason for a failing connection, if it's a failing password the user gets a dialog to retry the connection with a different password until the connection succeeds, a connection error not related to the password is thrown or the user cancels. If the failing connection is not related to a bad password an error message is shown to the users indicating the reason of the failure. Bug #16239007 - CONNECTIONS TO MYSQL SERVICES NOT RUNNING DISPLAY A WRONG PASSWORD ERROR MESSAGE Added global options dialog that can be accessed from the Schema Selection and DB Object Selection panels where the timeouts for the connection to the DB Server and for the query commands can be changed from their default values (15 seconds for the connection timeout and 30 seconds for the query timeout). MySQL Bug #68732, Bug #17191646 - QUERY TIMEOUT CANNOT BE ADJUSTED IN MYSQL FOR EXCEL Changed the Varchar(65,535) data type shown in the Export Data data type combo box to Text since the maximum row size is 65,535 bytes and any autodetected column data type with a length greater than 4,000 should be set to Text actually for the table to be created successfully. MySQL Bug #69779, Bug #17191633 - EXPORT FAILS FOR EXCEL FILES CONTAINING > 4000 CHARACTERS OF TEXT PER CELL Removed code that was replacing all spaces typed by the user in an overriden data type for a new column in an Export Data operation, also improved the data type detection code to flag as invalid data types with parenthesis but without any text inside or where the contents inside the parenthesis are not valid for the specific data type. Bug #17260260 - EXPORT DATA SET TYPE NOT WORKING WITH MEMBER VALUES CONTAINING SPACES Added support for the year data type with a length of 2 or 4 and a validation that valid values are integers between 1901-2155 (for 4-digit years) or between 0-99 (for 2-digit years). Bug #17259915 - EXPORT DATA YEAR DATA TYPE NOT RECOGNIZED IF DECLARED WITH A DISPLAY WIDTH) Fixed code for Export Data operations where users overrode the data type for columns typing Text in the data type combobox, which is a valid data type but was not recognized as such. Bug #17259490 - EXPORT DATA TEXT DATA TYPE NOT RECOGNIZED AS A VALID DATA TYPE Changed the location of the registry where the MySQL for Excel add-in is installed to HKEY_LOCAL_MACHINE instead of HKEY_CURRENT_USER so the add-in is accessible by all users and not only to the user that installed it. For this to work with Excel 2007 a hotfix may be required (see http://support.microsoft.com/kb/976477). MySQL Bug #68746, Bug #16675992 - EXCEL-ADD-IN IS ONLY INSTALLED FOR USER ACCOUNT THAT THE INSTALLATION RUNS UNDER Added support for Excel 2013 Single Document Interface, now that Excel 2013 creates 1 window per workbook also the Excel Add-In maintains an independent custom task pane in each window. MySQL Bug #68792, Bug #17272087 - MYSQL FOR EXCEL SIDEBAR DOES NOT APPEAR IN EXCEL 2013 (WITH WORKAROUND) Included the latest MySQL Utility with a code fix for the COM exception thrown when attempting to open Workbench in the Manage Connections window. Bug #17258966 - MYSQL WORKBENCH NOT OPENED BY CLICKING MANAGE CONNECTIONS HOTLABEL Fixed code for Append Data operations that was not applying a calculated automatic mapping correctly when the source and target tables had different number of columns, some columns with the same name but some of those lying on column indexes beyond the limit of the other source/target table. MySQL Bug #69220, Bug #17278349 - APPEND DOESN'T AUTOMATICALLY DETECT EXCEL COL HEADER WITH SAME NAME AS SQL FIELD Fixed some code for Edit Data operations that was escaping special characters twice (during edition in Excel and then upon sending the query to the MySQL server). MySQL Bug #68669, Bug #17271693 - A BACKSLASH IS INSERTED BEFORE AN APOSTROPHE EDITING TABLE WITH MYSQL FOR EXCEL Upgraded MySQL Utility with latest version that encapsulates dialog base classes and introduces more classes to handle Workbench connections, and removed these from the Excel project. Bug #16500331 - CAN'T DELETE CONNECTIONS CREATED WITHIN ADDIN You can access the MySQL for Excel documentation at http://dev.mysql.com/doc/refman/5.6/en/mysql-for-excel.html You can find our team’s blog at http://blogs.oracle.com/MySQLOnWindows. You can also post questions on our MySQL for Excel forum found at http://forums.mysql.com/. Enjoy and thanks for the support!

Read the article
In defense of SELECT * in production code, in some limited cases?

- by Alexander Kuznetsov

It is well known that SELECT * is not acceptable in production code, with the exception of this pattern: IF EXISTS( SELECT * We all know that whenever we see code code like this: Listing 1. "Bad" SQL SELECT Column1 , Column2 FROM ( SELECT c. * , ROW_NUMBER () OVER ( PARTITION BY Column1 ORDER BY Column2 ) AS rn FROM data.SomeTable AS c ) AS c WHERE rn < 5 we are supposed to automatically replace * with an explicit list of columns, as follows: Listing 2. "Good" SQL SELECT Column1 , Column2 FROM...(read more)

Read the article
It's intellisense for SQL Server

- by Nick Harrison

It's intellisense for SQL Server Anyone who has ever worked with me, heard me speak, or read any of writings knows that I am a HUGE fan of Reflector. By extension, I am a big fan of Red - Gate I have recently begun exploring some of their other offerings and came across this jewel. SQL Prompt is a plug in for Visual Studio and SQL Server Management Studio. It provides several tools to make dealing with SQL a little easier for your friendly neighborhood developer. When you a query window in a database, the plugin kicks in and gathers the metadata for the database that you are in. As you type a query, you get handy feedback like a list of tables after you type select. You can select one of the tables, specify * and then tab to expand the select clause to include all of the columns from the selected table. As you are building up the where clause, you are prompted by the names of columns in the selected tables. If you spend any time writing ad hoc queries or building stored procedures by hand, this can save you substantial time. If you are learning a new data model, this can greatly cut down on your frustration level. The other really cool thing here is Format SQL. I have searched all over the place for a really good SQL formatter. Badly formatted SQL is so much harder to read than well formatted SQL. Unfortunately, management studio offers no support for keeping your SQL well formatted. There are many tools available to format your SQL. Some work better than others. Some don't work that well at all. Most will give you some measure of control over how the formatted SQL looks. SQL Prompt produces good results and is easy to configure. Sadly no tool is perfect, and what would we be without a wish list. There are some features that I would like to see: Make it easier to paste SQL in and out of code. Strip off string builder, etc Automate replacing hard coded values with bind variables or parameters In addition to reformatting SQL, which is a huge refactor, support for other SQL refactors would be nice. Convert join to sub query and vice versa come to mind Wish list a side, this is a wonderful tool that easily saves me an hour or more on most weeks.

Read the article
Determining distribution of NULL values

- by AaronBertrand

Today on the twitter hash tag #sqlhelp, @leenux_tux asked: How can I figure out the percentage of fields that don't have data ? After further clarification, it turns out he is after what proportion of columns are NULL. Some folks suggested using a data profiling task in SSIS . There may be some validity to that, but I'm still a fan of sticking to T-SQL when I can, so here is how I would approach it: Create a #temp table or @table variable to store the results. Create a cursor that loops through all...(read more)

Read the article
Ensure your view and function meta data is upto date.

- by simonsabin

You will see if you use views and functions that SQL Server holds the rowset metadata for this in system tables. This means that if you change the underlying tables, columns and data types your views and functions can be out of sync. This is especially the case with views and functions that use select * To get the metadata to be updated you need to use sp_refreshsqlmodule. This forces the object to be “re run” into the database and the meta data updated. Thomas mentioned sp_refreshview which is a...(read more)

Read the article
SSAS: Utility to check you have the correct data types and sizes in your cube definition

- by DrJohn

This blog describes a tool I developed which allows you to compare the data types and data sizes found in the cube’s data source view with the data types/sizes of the corresponding dimensional attribute. Why is this important? Well when creating named queries in a cube’s data source view, it is often necessary to use the SQL CAST or CONVERT operation to change the data type to something more appropriate for SSAS. This is particularly important when your cube is based on an Oracle data source or using custom SQL queries rather than views in the relational database. The problem with BIDS is that if you change the underlying SQL query, then the size of the data type in the dimension does not update automatically. This then causes problems during deployment whereby processing the dimension fails because the data in the relational database is wider than that allowed by the dimensional attribute. In particular, if you use some string manipulation functions provided by SQL Server or Oracle in your queries, you may find that the 10 character string you expect suddenly turns into an 8,000 character monster. For example, the SQL Server function REPLACE returns column with a width of 8,000 characters. So if you use this function in the named query in your DSV, you will get a column width of 8,000 characters. Although the Oracle REPLACE function is far more intelligent, the generated column size could still be way bigger than the maximum length of the data actually in the field. Now this may not be a problem when prototyping, but in your production cubes you really should clean up this kind of thing as these massive strings will add to processing times and storage space. Similarly, you do not want to forget to change the size of the dimension attribute if your database columns increase in size. Introducing CheckCubeDataTypes Utiltity The CheckCubeDataTypes application extracts all the data types and data sizes for all attributes in the cube and compares them to the data types and data sizes in the cube’s data source view. It then generates an Excel CSV file which contains all this metadata along with a flag indicating if there is a mismatch between the DSV and the dimensional attribute. Note that the app not only checks all the attribute keys but also the name and value columns for each attribute. Another benefit of having the metadata held in a CSV text file format is that you can place the file under source code control. This allows you to compare the metadata of the previous cube release with your new release to highlight problems introduced by new development. You can download the C# source code from here: CheckCubeDataTypes.zip A typical example of the output Excel CSV file is shown below - note that the last column shows a data size mismatch by TRUE appearing in the column

Read the article
Undocumented Query Plans: Equality Comparisons

- by Paul White

The diagram below shows two data sets, with differences highlighted: To find changed rows using TSQL, we might write a query like this: The logic is clear: join rows from the two sets together on the primary key column, and return rows where a change has occurred in one or more data columns. Unfortunately, this query only finds one of the expected four rows: The problem, of course, is that our query does not correctly handle NULLs. The ‘not equal to’ operators <> and != do not evaluate...(read more)

Read the article
SQL SERVER – Script to Update a Specific Column in Entire Database

- by Pinal Dave

Last week, I have received a very interesting question and I find in email and I really liked the question as I had to play around with SQL Script for a while to come up with the answer he was looking for. Please read the question and I believe that all of us face this kind of situation. “Pinal, In our database we have recently introduced ModifiedDate column in all of the tables. Now onwards any update happens in the row, we are updating current date and time to that field. Now here is the issue, when we added that field we did not update it with a default value because we were not sure when we will go live with the system so we let it be NULL. Now modification to the application went live yesterday and we are now updating this field. Here is where I need your help. We need to update all the tables in our database where we have column created ModifiedDate and now want to update with current datetime. As our system is already live since yesterday there are several thousands of the rows which are already updated with real world value so we do not want to update those values. Essentially, in our entire database where ever there is a ModifiedDate column and if it is NULL we want to update that with current date time? Do you have a script for it?” Honestly I did not have such a script. This is very specific required but I was able to come up with two different methods how he can use this method. Method 1 : Using INFORMATION_SCHEMA SELECT 'UPDATE ' + T.TABLE_SCHEMA + '.' + T.TABLE_NAME + ' SET ModifiedDate = GETDATE() WHERE ModifiedDate IS NULL;' FROM INFORMATION_SCHEMA.TABLES T INNER JOIN INFORMATION_SCHEMA.COLUMNS C ON T.TABLE_NAME = C.TABLE_NAME AND c.COLUMN_NAME ='ModifiedDate' WHERE T.TABLE_TYPE = 'BASE TABLE' ORDER BY T.TABLE_SCHEMA, T.TABLE_NAME; Method 2: Using DMV SELECT 'UPDATE ' + SCHEMA_NAME(t.schema_id) + '.' + t.name + ' SET ModifiedDate = GETDATE() WHERE ModifiedDate IS NULL;' FROM sys.tables AS t INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID WHERE c.name ='ModifiedDate' ORDER BY SCHEMA_NAME(t.schema_id), t.name; Above scripts will create an UPDATE script which will do the task which is asked. We can pretty much the update script to any other SELECT statement and retrieve any other data as well. Click to Download Scripts Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Deterministic/Consistent Unique Masking

- by Dinesh Rajasekharan-Oracle

One of the key requirements while masking data in large databases or multi database environment is to consistently mask some columns, i.e. for a given input the output should always be the same. At the same time the masked output should not be predictable. Deterministic masking also eliminates the need to spend enormous amount of time spent in identifying data relationships, i.e. parent and child relationships among columns defined in the application tables. In this blog post I will explain different ways of consistently masking the data across databases using Oracle Data Masking and Subsetting The readers of post should have minimal knowledge on Oracle Enterprise Manager 12c, Application Data Modeling, Data Masking concepts. For more information on these concepts, please refer to Oracle Data Masking and Subsetting document Oracle Data Masking and Subsetting 12c provides four methods using which users can consistently yet irreversibly mask their inputs. 1. Substitute 2. SQL Expression 3. Encrypt 4. User Defined Function SUBSTITUTE The substitute masking format replaces the original value with a value from a pre-created database table. As the method uses a hash based algorithm in the back end the mappings are consistent. For example consider DEPARTMENT_ID in EMPLOYEES table is replaced with FAKE_DEPARTMENT_ID from FAKE_TABLE. The substitute masking transformation that all occurrences of DEPARTMENT_ID say ‘101’ will be replaced with ‘502’ provided same substitution table and column is used , i.e. FAKE_TABLE.FAKE_DEPARTMENT_ID. The following screen shot shows the usage of the Substitute masking format with in a masking definition: Note that the uniqueness of the masked value depends on the number of columns being used in the substitution table i.e. if the original table contains 50000 unique values, then for the masked output to be unique and deterministic the substitution column should also contain 50000 unique values without which only consistency is maintained but not uniqueness. SQL EXPRESSION SQL Expression replaces an existing value with the output of a specified SQL Expression. For example while masking an EMPLOYEES table the EMAIL_ID of an employee has to be in the format EMPLOYEE’s [email protected] while FIRST_NAME and LAST_NAME are the actual column names of the EMPLOYEES table then the corresponding SQL Expression will look like %FIRST_NAME%||’.’||%LAST_NAME%||’@COMPANY.COM’. The advantage of this technique is that if you are masking FIRST_NAME and LAST_NAME of the EMPLOYEES table than the corresponding EMAIL ID will be replaced accordingly by the masking scripts. One of the interesting aspect’s of a SQL Expressions is that you can use sub SQL expressions, which means that you can write a nested SQL and use it as SQL Expression to address a complex masking business use cases. SQL Expression can also be used to consistently replace value with hashed value using Oracle’s PL/SQL function ORA_HASH. The following SQL Expression will help in the previous example for replacing the DEPARTMENT_IDs with a hashed number ORA_HASH (%DEPARTMENT_ID%, 1000) The following screen shot shows the usage of encrypt masking format with in the masking definition: ORA_HASH takes three arguments: 1. Expression which can be of any data type except LONG, LOB, User Defined Type [nested table type is allowed]. In the above example I used the Original value as expression. 2. Number of hash buckets which can be number between 0 and 4294967295. The default value is 4294967295. You can also co-relate the number of hash buckets to a range of numbers. In the above example above the bucket value is specified as 1000, so the end result will be a hashed number in between 0 and 1000. 3. Seed, can be any number which decides the consistency, i.e. for a given seed value the output will always be same. The default seed is 0. In the above SQL Expression a seed in not specified, so it to 0. If you have to use a non default seed then the function will look like. ORA_HASH (%DEPARTMENT_ID%, 1000, 1234 The uniqueness depends on the input and the number of hash buckets used. However as ORA_HASH uses a 32 bit algorithm, considering birthday paradox or pigeonhole principle there is a 0.5 probability of collision after 232-1 unique values. ENCRYPT Encrypt masking format uses a blend of 3DES encryption algorithm, hashing, and regular expression to produce a deterministic and unique masked output. The format of the masked output corresponds to the specified regular expression. As this technique uses a key [string] to encrypt the data, the same string can be used to decrypt the data. The key also acts as seed to maintain consistent outputs for a given input. The following screen shot shows the usage of encrypt masking format with in the masking definition: Regular Expressions may look complex for the first time users but you will soon realize that it’s a simple language. There are many resources in internet, oracle documentation, oracle learning library, my oracle support on writing a Regular Expressions, out of all the following My Oracle Support document helped me to get started with Regular Expressions: Oracle SQL Support for Regular Expressions[Video](Doc ID 1369668.1) USER DEFINED FUNCTION [UDF] User Defined Function or UDF provides flexibility for the users to code their own masking logic in PL/SQL, which can be called from masking Defintion. The standard format of an UDF in Oracle Data Masking and Subsetting is: Function udf_func (rowid varchar2, column_name varchar2, original_value varchar2) returns varchar2; Where • rowid is the row identifier of the column that needs to be masked • column_name is the name of the column that needs to be masked • original_value is the column value that needs to be masked You can achieve deterministic masking by using Oracle’s built in hash functions like, ORA_HASH, DBMS_CRYPTO.MD4, DBMS_CRYPTO.MD5, DBMS_UTILITY. GET_HASH_VALUE.Please refers to the Oracle Database Documentation for more information on the Oracle Hash functions. For example the following masking UDF generate deterministic unique hexadecimal values for a given string input: CREATE OR REPLACE FUNCTION RD_DUX (rid varchar2, column_name varchar2, orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2 (26); no_of_characters number(2); BEGIN no_of_characters:=6; stext:=substr(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(text),1)),0,no_of_characters); RETURN stext; END; The uniqueness depends on the input and length of the string and number of bits used by hash algorithm. In the above function MD4 hash is used [denoted by argument 1 in the DBMS_CRYPTO.HASH function which is a 128 bit algorithm which produces 2^128-1 unique hashed values , however this is limited by the length of the input string which is 6, so only 6^6 unique values will be generated. Also do not forget about the birthday paradox/pigeonhole principle mentioned earlier in this post. An another example is to consistently replace characters or numbers preserving the length and special characters as shown below: CREATE OR REPLACE FUNCTION RD_DUS(rid varchar2,column_name varchar2,orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2(26); BEGIN DBMS_RANDOM.SEED(orig_val); stext:=TRANSLATE(orig_val,'ABCDEFGHILKLMNOPQRSTUVWXYZ',DBMS_RANDOM.STRING('U',26)); stext:=TRANSLATE(stext,'abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.STRING('L',26)); stext:=TRANSLATE(stext,'0123456789',to_char(DBMS_RANDOM.VALUE(1,9))); stext:=REPLACE(stext,'.','0'); RETURN stext; END; The following screen shot shows the usage of an UDF with in a masking definition: To summarize, Oracle Data Masking and Subsetting helps you to consistently mask data across databases using one or all of the methods described in this post. It saves the hassle of identifying the parent-child relationships defined in the application table. Happy Masking

Read the article
Removing hard-coded values and defensive design vs YAGNI

- by Ben Scott

First a bit of background. I'm coding a lookup from Age - Rate. There are 7 age brackets so the lookup table is 3 columns (From|To|Rate) with 7 rows. The values rarely change - they are legislated rates (first and third columns) that have stayed the same for 3 years. I figured that the easiest way to store this table without hard-coding it is in the database in a global configuration table, as a single text value containing a CSV (so "65,69,0.05,70,74,0.06" is how the 65-69 and 70-74 tiers would be stored). Relatively easy to parse then use. Then I realised that to implement this I would have to create a new table, a repository to wrap around it, data layer tests for the repo, unit tests around the code that unflattens the CSV into the table, and tests around the lookup itself. The only benefit of all this work is avoiding hard-coding the lookup table. When talking to the users (who currently use the lookup table directly - by looking at a hard copy) the opinion is pretty much that "the rates never change." Obviously that isn't actually correct - the rates were only created three years ago and in the past things that "never change" have had a habit of changing - so for me to defensively program this I definitely shouldn't store the lookup table in the application. Except when I think YAGNI. The feature I am implementing doesn't specify that the rates will change. If the rates do change, they will still change so rarely that maintenance isn't even a consideration, and the feature isn't actually critical enough that anything would be affected if there was a delay between the rate change and the updated application. I've pretty much decided that nothing of value will be lost if I hard-code the lookup, and I'm not too concerned about my approach to this particular feature. My question is, as a professional have I properly justified that decision? Hard-coding values is bad design, but going to the trouble of removing the values from the application seems to violate the YAGNI principle. EDIT To clarify the question, I'm not concerned about the actual implementation. I'm concerned that I can either do a quick, bad thing, and justify it by saying YAGNI, or I can take a more defensive, high-effort approach, that even in the best case ultimately has low benefits. As a professional programmer does my decision to implement a design that I know is flawed simply come down to a cost/benefit analysis?

Read the article
SQL Server 2008 R2: StreamInsight changes at RTM: Access to grouping keys via explicit typing

- by Greg Low

One of the problems that existed in the CTP3 edition of StreamInsight was an error that occurred if you tried to access the grouping key from within your projection expression. That was a real issue as you always need access to the key. It's a bit like using a GROUP BY in TSQL and then not including the columns you're grouping by in the SELECT clause. You'd see the results but not be able to know which results are which. Look at the following code: var laneSpeeds = from e in vehicleSpeeds group e...(read more)

Read the article
SQL SERVER – A Brief Note on SET TEXTSIZE

- by pinaldave

Here is a small conversation I received. I thought though an old topic, indeed a thought provoking for the moment. Question: Is there any difference between LEFT function and SET TEXTSIZE? I really like this small but interesting question. The question does not specify the difference between usage or performance. Anyway we will quickly take a look at how TEXTSIZE works. You can run the following script to see how LEFT and SET TEXTSIZE works. USE TempDB GO -- Create TestTable CREATE TABLE MyTable (ID INT, MyText VARCHAR(MAX)) GO INSERT MyTable (ID, MyText) VALUES(1, REPLICATE('1234567890', 100)) GO -- Select Data SELECT ID, MyText FROM MyTable GO -- Using Left SELECT ID, LEFT(MyText, 10) MyText FROM MyTable GO -- Set TextSize SET TEXTSIZE 10; SELECT ID, MyText FROM MyTable; SET TEXTSIZE 2147483647 GO -- Clean up DROP TABLE MyTable GO Now let us see the usage result which we receive from both of the example. If you are going to ask what you should do – I really do not know. I can tell you where I will use either of the same. LEFT seems to be easy to use but again if you like to do extra work related to SET TEXTSIZE go for it. Here is how I will use SET TEXTSIZE. If I am selecting data from in my SSMS for testing or any other non production related work from a large table which has lots of columns with varchar data, I will consider using this statement to reduce the amount of the data which is retrieved in the result set. In simple word, for testing purpose I will use it. On the production server, there should be a specific reason to use the same. Here is my candid opinion – I do not think they can be directly comparable even though both of them give the exact same result. LEFT is applicable only on the column of a single SELECT statement. where it is used but it SET TEXTSIZE applies to all the columns in the SELECT and follow up SELECT statements till the SET TEXTSIZE is not modified again in the session. Uncomparable! I hope this sample example gives you idea how to use SET TEXTSIZE in your daily use. I would like to know your opinion about how and when do you use this feature. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Managing Slowly Changing Dimension with MERGE Statement in SQL Server

Slowly Changing Dimension (SCD) Transformation is a quick and easy way to manage smaller slowly changing dimensions but it has several limitations and does not perform well when the number of rows or columns gets larger. Arshad Ali explores some of the alternatives you can use for managing larger slowly changing dimensions. How to automate your .NET and SQL Server deploymentsDeploy .NET code and SQL Server databases in a single repeatable process with Red Gate Deployment Manager. Start deploying with a 28-day trial.

Read the article
Idera Compliance Manager 3.5 and SQL Server 2012 Release Candidate

Unlike most conventional database auditing solutions, SQL Compliance Manager places a blanket over data access with real-time auditing. Clients can pinpoint any malicious intent with sensitive column auditing. This feature gives specifics as to who has accessed information located within an audited table's sensitive columns. With transaction status auditing, database administrators can detect suspicious activity by auditing the status of transactions that execute DML statements on an audited database with the help of rollbacks and save-points. In addition, SQL Compliance Manager lives up t...

Read the article

Search Results

Search found 7127 results on 286 pages for 'calculated columns'.

Page 97/286 | < Previous Page | 93 94 95 96 97 98 99 100 101 102 103 104 | Next Page >

- by AlbertoFerrari

- by danx

- by pinaldave

- by jamiet

- by Rob Farley

- by Rob Farley

- by Javier Rivera

- by Derek Dieter

- by thatjeffsmith

- by Bandari Huang

- by Javier Treviño

- by Alexander Kuznetsov

- by Nick Harrison

- by AaronBertrand

- by simonsabin

- by DrJohn

- by Paul White

- by Pinal Dave

- by Dinesh Rajasekharan-Oracle

- by Ben Scott

- by Greg Low

- by pinaldave

< Previous Page | 93 94 95 96 97 98 99 100 101 102 103 104 | Next Page >