SQL Server Split() Function

Posted by HighAltitudeCoder on Geeks with Blogs See other posts from Geeks with Blogs or by HighAltitudeCoder
Published on Sun, 06 Jun 2010 18:33:50 GMT Indexed on 2010/06/08 2:12 UTC
Read the original article Hit count: 561

Filed under:

Title goes here

 

Ever wanted a dbo.Split() function, but not had the time to debug it completely? 

Let me guess - you are probably working on a stored procedure with 50 or more parameters; two or three of them are parameters of differing types, while the other 47 or so all of the same type (id1, id2, id3, id4, id5...).  Worse, you've found several other similar stored procedures with the ONLY DIFFERENCE being the number of like parameters taped to the end of the parameter list.

If this is the situation you find yourself in now, you may be wondering, "why am I working with three different copies of what is basically the same stored procedure, and why am I having to maintain changes in three different places?  Can't I have one stored procedure that accomplishes the job of all three?

My answer to you: YES! 

Here is the Split() function I've created. 

 
/******************************************************************************
 
                                    Split.sql
 
******************************************************************************/
/******************************************************************************
 
Split a delimited string into sub-components and return them as a table.
 
Parameter 1: Input string which is to be split into parts.
Parameter 2: Delimiter which determines the split points in input string.
Works with space or spaces as delimiter. Split() is apostrophe-safe.
 
SYNTAX:
SELECT * FROM Split('Dvorak,Debussy,Chopin,Holst', ',')
SELECT * FROM Split('Denver|Seattle|San Diego|New York', '|')
SELECT * FROM Split('Denver is the super-awesomest city of them all.', ' ')
 
******************************************************************************/
USE AdventureWorks
GO
 
IF EXISTS
      (SELECT *
      FROM sysobjects
      WHERE xtype = 'TF'
      AND name = 'Split'
      )
BEGIN
      DROP FUNCTION Split
END
GO
 
CREATE FUNCTION Split (
      @InputString                  VARCHAR(8000),
      @Delimiter                    VARCHAR(50)
)
 
RETURNS @Items TABLE (
      Item                          VARCHAR(8000)
)
 
AS
BEGIN
      IF @Delimiter = ' '
      BEGIN
            SET @Delimiter = ','
            SET @InputString = REPLACE(@InputString, ' ', @Delimiter)
      END
 
      IF (@Delimiter IS NULL OR @Delimiter = '')
            SET @Delimiter = ','
 
--INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic
--INSERT INTO @Items VALUES (@InputString) -- Diagnostic
 
      DECLARE @Item                 VARCHAR(8000)
      DECLARE @ItemList       VARCHAR(8000)
      DECLARE @DelimIndex     INT
 
      SET @ItemList = @InputString
      SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
      WHILE (@DelimIndex != 0)
      BEGIN
            SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex)
            INSERT INTO @Items VALUES (@Item)
 
            -- Set @ItemList = @ItemList minus one less item
            SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+1, LEN(@ItemList)-@DelimIndex)
            SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
      END -- End WHILE
 
      IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString
      BEGIN
            SET @Item = @ItemList
            INSERT INTO @Items VALUES (@Item)
      END
 
      -- No delimiters were encountered in @InputString, so just return @InputString
      ELSE INSERT INTO @Items VALUES (@InputString)
 
      RETURN
 
END -- End Function
GO
 
---- Set Permissions
--GRANT SELECT ON Split TO UserRole1
--GRANT SELECT ON Split TO UserRole2
--GO




































































































 

The syntax is basically as follows:

SELECT <fields>
FROM Table 1
JOIN Table 2
ON ...
JOIN Table 3
ON ...
WHERE
LOGICAL CONDITION A
AND LOGICAL CONDITION B
AND LOGICAL CONDITION C
AND TABLE2.Id IN (SELECT * FROM Split(@IdList, ','))

@IdList is a parameter passed into the stored procedure, and the comma (',') is the delimiter you have chosen to split the parameter list on.


You can also use it like this:

SELECT <fields>
FROM Table 1
JOIN Table 2
ON ...
JOIN Table 3
ON ...
WHERE
LOGICAL CONDITION A
AND LOGICAL CONDITION B
AND LOGICAL CONDITION C
HAVING COUNT(SELECT * FROM Split(@IdList, ',')


Similarly, it can be used in other aggregate functions at run-time:

SELECT MIN(SELECT * FROM Split(@IdList, ','), <fields>
FROM Table 1
JOIN Table 2
ON ...
JOIN Table 3
ON ...
WHERE
LOGICAL CONDITION A
AND LOGICAL CONDITION B
AND LOGICAL CONDITION C
GROUP BY <fields>


Now that I've (hopefully effectively) explained the benefits to using this function and implementing it in one or more of your database objects, let me warn you of a caveat that you are likely to encounter.  You may have a team member who waits until the right moment to ask you a pointed question: "Doesn't this function just do the same thing as using the IN function?  Why didn't you just use that instead?  In other words, why bother with this function?"

What's happening is, one or more team members has failed to understand the reason for implementing this kind of function in the first place.  (Note: this is THE MOST IMPORTANT ASPECT OF THIS POST).

Allow me to outline a few pros to implementing this function, so you may effectively parry this question.  Touche.

1) Code consolidation.  You don't have to maintain what is basically the same code and logic, but with varying numbers of the same parameter in several SQL objects. 

I'm not going to go into the cons related to using this function, because the afore mentioned team member is probably more than adept at pointing these out.  Remember, the real positive contribution is ou are decreasing the liklihood that your team fails to update all (x) duplicate copies of what are basically the same stored procedure, and so on...  This is the classic downside to duplicate code.  It is a virus, and you should kill it.

You might be better off rejecting your team member's question, and responding with your own: "Would you rather maintain the same logic in multiple different stored procedures, and hope that the team doesn't forget to always update all of them at the same time?".  In his head, he might be thinking "yes, I would like to maintain several different copies of the same stored procedure", although you probably will not get such a direct response. 

2) Added flexibility - you can use the Split function elsewhere, and for splitting your data in different ways.  Plus, you can use any kind of delimiter you wish.  How can you know today the ways in which you might want to examine your data tomorrow?  Segue to my next point.

3) Because the function takes a delimiter parameter, you can split the data in any number of ways.  This greatly increases the utility of such a function and enables your team to work with the data in a variety of different ways in the future.  You can split on a single char, symbol, word, or group of words.  You can split on spaces.  (The list goes on... test it out).

Finally, you can dynamically define the behavior of a stored procedure (or other SQL object) at run time, through the use of this function.  Rather than have several objects that accomplish almost the same thing, why not have only one instead?

 

© Geeks with Blogs or respective owner