Search Results

Search found 31546 results on 1262 pages for 'sql white papers'.


  • Sql Server 2005 cluster - unable to rename to old server name

    - by Paul2020
    We have a SQL Server 2005 cluster on a W2K8 cluster. It is a named instance, say SRV1\A. I then built a new W2K8 cluster (with a different cluster service name but the same service account) and installed a new SQL Server 2005 cluster on it, say SRV2\A. Now when I bring down the SQL Server resources on SRV1 and try to rename SRV2\A to SRV1\A through the cluster admin, I get an error saying the network name already exists. I have tried bringing down an old cluster and installing a new cluster with the same name, and that works. Why am I not able to rename it? Any advice would be very helpful.

    Read the article

  • An XEvent a Day (19 of 31) – Using Customizable Fields

    - by Jonathan Kehayias
    Today’s post will be somewhat short, but we’ll look at Customizable Fields on Events in Extended Events and how they are used to collect additional information.  Customizable Fields generally represent information of potential interest that may be expensive to collect, and is therefore made available for collection if specified by the Event Session.  In SQL Server 2008 and 2008 R2, there are 50 Events that have customizable columns in their payload.  In SQL Server Denali CTP1, there...(read more)

    Read the article

  • An XEvent a Day (5 of 31) - Targets Week – ring_buffer

    - by Jonathan Kehayias
    Yesterday’s post, Querying the Session Definition and Active Session DMV’s , showed how to find information about the Event Sessions that exist inside a SQL Server and how to find information about the Active Event Sessions that are running inside a SQL Server using the Session Definition and Active Session DMV’s.  With the background information now out of the way, and since this post falls on the start of a new week I’ve decided to make this Targets Week, where each day we’ll look at a different...(read more)

    Read the article

  • An XEvent a Day (4 of 31) – Querying the Session Definition and Active Session DMV’s

    - by Jonathan Kehayias
    Yesterday’s post, Managing Event Sessions, showed how to manage Event Sessions inside the Extended Events framework in SQL Server. In today’s post, we’ll take a look at how to find information about the defined Event Sessions that already exist inside a SQL Server using the Session Definition DMV’s and how to find information about the Active Event Sessions that exist using the Active Session DMV’s. Session Definition DMV’s: The Session Definition DMV’s provide information...(read more)

    Read the article

  • An XEvent a Day (1 of 31) – An Overview of Extended Events

    - by Jonathan Kehayias
    First introduced in SQL Server 2008, Extended Events provided a new mechanism for capturing information about events inside the Database Engine that was both highly performant and highly configurable. Designed from the ground up with performance as a primary focus, Extended Events may seem a bit odd at first look, especially when you compare it to SQL Trace. However, as you begin to work with Extended Events, you will most likely change how you think about tracing problems, and will find the power...(read more)

    Read the article

  • Feature pack for SQL Server 2005 SP4 - collection of standalone packages

    - by ssqa.net
    With the release of SQL Server 2005 SP4, DBAs and developers have an additional essential task: avoiding compatibility issues between existing code and an SP4 instance. The Feature Pack for SQL Server 2005 SP4 is available to download and contains standalone packages such as SQL Native Client, ADOMD, OLAPDM etc. As it states, the feature pack is built on the latest versions of add-on and backward compatibility contents for SQL Server 2005. The above link provides an individual file to download for each environment...(read more)

    Read the article

  • Cumulative Update #1 for SQL Server 2005 SP4

    - by AaronBertrand
    Well, much quicker than I would have suspected, the SQL Server Release Services team has incorporated all of the fixes in 2005 SP3's CU #12 into the first CU for SP4. Thanks to Chris Wood for the heads up. You can get the new Cumulative Update here: KB #2464079 : Cumulative update package 1 for SQL Server 2005 Service Pack 4 The nice round number of build 5000 didn't last long either; this CU will update you from 9.00.5000 to 9.00.5254....(read more)

    Read the article

  • An XEvent A Day: 31 days of Extended Events

    - by Jonathan Kehayias
    Back in April, Paul Randal ( Blog | Twitter ) did a 30 day series titled A SQL Server Myth a Day, where he covered a different myth about SQL Server every day of the month. At the same time Glenn Alan Berry ( Blog | Twitter ) did a 30 day series titled A DMV a Day, where he blogged about a different DMV every day of the month. Being so inspired by these two guys, I have decided to attempt a month long series on Extended Events that I am going to call An XEvent a Day. I originally wanted to do this...(read more)

    Read the article

  • Is SQL Server 2008 R2 a full release of SQL Server?

    - by AaronBertrand
    This has come up in conversations more than once in the past little while - recently on twitter I made the casual comment that later this year, SQL Server 2008 will be "two versions old." Well, not everyone agrees that that is technically true. So, I thought I'd put something out there that isn't limited to 140 characters. There are certainly some valid arguments on both sides, but my opinion - based both on these facts and on my memory that Microsoft has marketed it as such - is that SQL Server...(read more)

    Read the article

  • Managing SQL Server users via Active directory groups

    - by hyty
    I'm building a SQL Server instance for reporting purposes. My plan is to use AD groups for server and database logins. I have several groups with different roles (admin, developer, user etc.), and I would like to map these roles to SQL Server database roles (db_owner, db_datawriter etc.). What are the pros and cons of using AD groups for logins? What kinds of problems have you noticed?

    Read the article

  • Dump Microsoft SQL Server database to an SQL script

    - by Matt Sheppard
    Is there any way to export a Microsoft SQL Server database to an sql script? I'm looking for something which behaves similarly to mysqldump, taking a database name, and producing a single script which will recreate all the tables, stored procedures, reinsert all the data etc. I've seen http://vyaskn.tripod.com/code.htm#inserts, but I ideally want something to recreate everything (not just the data) which works in a single step to produce the final script.

    Read the article

  • Edit Top 200 Rows SQL Pane equivalent in Visual Studio 2012 SQL Server Tools "View Data"

    - by Johan Kronberg
    I've always used Edit Top 200 Rows and then edited the query in the SQL Pane of the 2008 Management Studio to find the rows I want to edit data for. Now I have the tools inside Visual Studio 2012 and want to be able to change the query after right-clicking a table and choosing "View Data", but I can't see that this is possible. Has the "SQL Pane" feature been removed, or am I not seeing something?

    Read the article

  • SQL Server Intellisense VS. Red Gate SQL Prompt

    Fabiano Amorim is hooked on today's Integrated Development Environments with built-in Intellisense, so he looked forward keenly to SQL Server 2008's native Intellisense. He was disappointed at how it turned out, so he turned instead to SQL Prompt. Fabiano explains why he prefers SQL Prompt, why he reckons it fits in with the way that database developers work, and goes on to describe some of the features he'd like to see in it.

    Read the article

  • SQL Server is now supported by phpBB!

    - by The Official Microsoft IIS Site
    Our team is really excited to announce the new release of phpBB 3.0.7-PL1 by the phpBB community that supports SQL Server, and one can download it from the Web Application Gallery for a very easy install!! But let’s step back for a moment and provide some background. Microsoft’s Interoperability team has been working with a few PHP projects to support SQL Server using our driver, phpBB was one of them. Although phpBB already had some support for SQL Server / Access, our 1.1 release driver offered...(read more)

    Read the article

  • Microsoft Sql Server 2008 R2 System Databases

    For the majority of software developers, little time is spent understanding the inner workings of the database management systems (DBMS) they use to store data for their applications. I personally place myself in this group. In my case, I have used various versions of Microsoft’s SQL Server (2000, 2005, and 2008 R2) and only recently learned how valuable the system databases really are, when I was preparing to deliver a lecture on "SQL Server 2008 R2, System Databases".

    So what are system databases in MS SQL Server, and why should I know them? Microsoft uses system databases to support the SQL Server DBMS, much like a developer uses config files or database tables to support an application. Each system database provides specific functionality that allows MS SQL Server to function.

    Name          Database File             Log File
    Master        master.mdf                mastlog.ldf
    Resource      mssqlsystemresource.mdf   mssqlsystemresource.ldf
    Model         model.mdf                 modellog.ldf
    MSDB          msdbdata.mdf              msdblog.ldf
    Distribution  distmdl.mdf               distmdl.ldf
    TempDB        tempdb.mdf                templog.ldf

    Master Database

    If you have used MS SQL Server then you should recognize the Master database, especially if you have used SQL Server Management Studio (SSMS) to connect to a user-created database. MS SQL Server requires the Master database in order for the DBMS to start, because of the information that it stores. Examples of data stored in the Master database:

    * User Logins
    * Linked Servers
    * Configuration information
    * Information on User Databases

    Resource Database

    Honestly, I never knew this database even existed until I started to research SQL Server system databases. This is largely because the Resource database is hidden from users. In fact, its database files are stored within the Binn folder instead of the standard MS SQL Server database folder path. This database contains all system objects that can be accessed by all other databases. In short, it contains all the system views and stored procedures that appear in every other database. One of the many benefits of storing system views and stored procedures in a single hidden database is that it simplifies upgrading SQL Server; maintenance is also reduced, since only one code base has to be maintained for all of the system views and procedures.

    Model Database

    The Model database, as the name implies, is the model for all new databases created by users. This allows default database objects to be predefined for all new databases within a MS SQL Server instance. For example, if every database created by a user needs to have an “Audit” table, then defining the “Audit” table in the Model database guarantees that the table will exist in every new database created after the model is altered.

    MSDB Database

    The MSDB database is used by SQL Server Agent, Database Mail, and Service Broker, as well as by SQL Server itself. SQL Server Agent uses this database to store job configurations and job schedules along with Alerts and Operators. In addition, this database stores all job parameters along with each job’s execution history. Finally, this database is also used to store database backup information and maintenance plans, as well as details pertaining to log shipping if it is being used.

    Distribution Database

    The Distribution database is only used during replication and stores metadata and history information pertaining to replicating data. Furthermore, when transactional replication is used, this database also stores information regarding each transaction. It is important to note that replication is not turned on by default in MS SQL Server and that the Distribution database is hidden from SSMS.

    Tempdb Database

    Tempdb, as the name implies, is used to store temporary data and data objects. Examples include temp tables and temp stored procedures. It is important to note that all data and data objects in this database are cleared when SQL Server restarts. This database is also used by SQL Server when it is performing some internal operations, typically large sort and index operations. Finally, this database is used to store row versions if row versioning or snapshot isolation transactions are being used.

    Additionally, I would love to hear from others about their experiences using system databases, tables, and objects in real-world environments.

    Read the article

  • Folders in SQL Server Data Tools

    - by jamiet
    Recently I have begun a new project in which I am using SQL Server Data Tools (SSDT) and SQL Server Integration Services (SSIS) 2012. Although I have been using SSDT & SSIS fairly extensively while SQL Server 2012 was in the beta phase, I usually find that you don’t learn about the capabilities and quirks of new products until you use them on a real project, hence I am hoping I’m going to have a lot of experiences to share on my blog over the coming few weeks. In this first such blog post I want to talk about file and folder organisation in SSDT.

    The predecessor to SSDT is Visual Studio Database Projects. When one created a new Visual Studio Database Project, a folder structure was provided with “Schema Objects” and “Scripts” in the root and a series of subfolders for each schema. Apparently a few customers were not too happy with the tool arbitrarily creating lots of folders in Solution Explorer, and hence SSDT has gone in completely the opposite direction; now no folders are created and new objects get created in the root – it is at your discretion where they get moved to.

    After using SSDT for a few weeks I can safely say that I preferred the older way, because I never used Solution Explorer to navigate my schema objects anyway, so it didn’t bother me how many folders it created. Having said that, the thought of a single long list of files in Solution Explorer without any folders makes me shudder, so on this project I have been manually creating folders in which to organise files, and I have tried to mimic the old way as much as possible by creating two folders in the root, one for all schema objects and another for Pre/Post deployment scripts.

    This works fine until different developers start to build their own different subfolder structures; if you are OCD-inclined like me this is going to grate on you eventually, and hence you are going to want to move stuff around so that you have consistent folder structures for each schema and (if you have multiple databases) each project. Moreover, new files get created with a filename of the object name + “.sql”, and often people like to have an extra identifier in the filename to indicate the object type.

    The overall point is this – files and folders in your solution are going to change. Some version control systems (VCSs) don’t take kindly to files being moved around or renamed, because they recognise the renamed/moved file simply as a new file, and when they do that you lose the revision history which, to my mind, is one of the key benefits of using a VCS in the first place. On this project we have been using Team Foundation Server (TFS), and while it pains me to say it (as I am no great fan of TFS’s version control system), it has proved invaluable when dealing with the SSDT problems that I outlined above because it is integrated right into the Visual Studio IDE. Thus the advice from this blog post is:

    If you are using SSDT, consider using a Visual-Studio-integrated VCS that can easily handle file renames and file moves.

    I suspect that fans of other VCSs will counter by saying that their VCS weapon of choice can handle renames/file moves quite satisfactorily, and if that’s the case…great…let me know about them in the comments. This blog post is not an attempt to make people use one particular VCS, only to make people aware of this issue that might arise when using SSDT. More to come in the coming few weeks! @jamiet

    Read the article

  • SQL Server 2005 Merge Replication to SQL Server CE 3.5

    - by user33067
    Hi, In my organization, we have a SQL Server 2005 database server (DBServer). Users of an application will normally be connected to DBServer but, occasionally, would like to disconnect and continue their work on a laptop using SQL Server Compact Edition 3.5 (SQLCE). Because of this, we have been looking into using Merge Replication between DBServer and SQLCE. From what I have read about this process, IIS must be installed on "the server"... yet I have found no indication of whether this means DBServer or SQLCE. I had assumed the documentation was referring to DBServer and proposed this to our networking staff. That idea was quickly put to rest, as it is not our policy to install IIS on an internal server. This is where our SQL Server 2005 web server (WebServer) entered the picture. The idea was that IIS would be installed on WebServer and would be the conduit for DBServer and SQLCE to communicate. This sounded like a good idea at first, until I started looking for documentation on this type of setup. Everything I have been able to find deals with a DBServer -- SQLCE -- DBServer setup... nothing on DBServer -- WebServer -- SQLCE -- WebServer -- DBServer.

    Questions:

    * Is going with a 3 server setup ideal?
    * Does anyone have documentation on this type of setup?
    * Does IIS even need to be running on one of the big servers, or can it just run off the laptop with SQLCE on it? (I'd really like this option ;))

    Read the article

  • How to connect to local instance of SQL Server 2008 Express

    - by Billy Logan
    I just installed SQL Server 2008 Express on my Windows 7 machine. I previously had 2005 on here and used it just fine with the old SQL Server Management Studio Express. I was able to connect with no problems to my PC-NAME\SQLEXPRESS instance. I uninstalled 2005 and SQL Server Management Studio Express. I then installed SQL Server 2008 Express on my machine and elected to have it install SQL Server Management Studio. Now, when I try to connect to PC-NAME\SQLEXPRESS (with Windows Authentication, like I always did), I get the following message:

    Cannot connect to PC-NAME\SQLEXPRESS. A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified) (Microsoft SQL Server, Error: -1) For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&EvtSrc=MSSQLServer&EvtID=-1&LinkId=20476

    When I went to the help link it mentions, the help page suggested the following:

    * Make sure that the SQL Server Browser service is started on the server.
    * Use the SQL Server Surface Area Configuration tool to enable SQL Server to accept remote connections. For more information about the SQL Server Surface Area Configuration Tool, see Surface Area Configuration for Services and Connections.

    I did try starting the SQL Server Browser, but I don't see that the Surface Area Configuration tool is installed with this Express version. I had seen another user with an almost identical issue whose install was missing the database engine. If that were the case, how could I test for that, and where would I go to download that install? Thanks in advance, Billy

    Read the article

  • SQL 2008 Replication corrupt data problem

    - by Jonathan K
    We took a SQL 2000 database, took a LiteSpeed backup, and restored it on a SQL 2008 active/passive cluster. Then we set up replication to replicate the data back to SQL 2000. So 2008 is the publisher/distributor, and 2000 is doing a pull subscription. Everything works well, except that we occasionally get corrupt data in varchar/text fields on the subscriber. For example, we have a table with 4500 records. When we run this statement:

        update MedstaffProvider set Notes = 'Cell Phone: 360.123.4567 Answering Service: 360.123.9876' where LastName = 'smith'

    the record in the 2008 database is updated as expected. But in the subscriber database we get gibberish in the Notes field: óPÌ[1] T $Oé[1] ð²ñ. K

    Here's what we know:

    * This is repeatable, meaning we can run that same query all day long and get the same gibberish.
    * If you alter the update statement slightly, the data gets replicated just fine.
    * The collation on both databases is the same.
    * So far we've only detected the problem with text/varchar fields. (The Notes field above is text.)
    * Only one or two records in a table are impacted.
    * The table structure looks identical in both 2000/2008. We haven't made any changes.

    We have found one solution that fixes the problem: recreate the table in 2008 (say as MedStaffProvider2), insert all the data, drop the original table, rename the new table to the original name, and set up replication again. When we then run the exact same update statement, it works as expected. Does anyone have any idea what might be happening here? Or are there any other techniques we can use to troubleshoot this? I've found a solution for this, but would really like to understand why this is happening.

    Read the article

  • Using a "white list" for extracting terms for Text Mining, Part 2

    - by [email protected]
    In my last post, we set the groundwork for extracting specific tokens from a white list using a CTXRULE index. In this post, we will populate a table with the extracted tokens and produce a case table suitable for clustering with Oracle Data Mining. Our corpus of documents will be stored in a database table that is defined as

        create table documents(id NUMBER, text VARCHAR2(4000));

    However, any suitable Oracle Text-accepted data type can be used for the text. We then create a table to contain the extracted tokens. The id column contains the unique identifier (or case id) of the document. The token column contains the extracted token. Note that a given document may have many tokens, so there will be one row per token for a given document.

        create table extracted_tokens (id NUMBER, token VARCHAR2(4000));

    The next step is to iterate over the documents, extract the matching tokens using the index, and insert them into our token table. We use the MATCHES function for matching the query_string from my_thesaurus_rules with the text.

        DECLARE
            cursor c2 is
              select id, text
              from documents;
        BEGIN
            for r_c2 in c2 loop
               insert into extracted_tokens
                 select r_c2.id id, main_term token
                 from my_thesaurus_rules
                 where matches(query_string,
                               r_c2.text)>0;
            end loop;
        END;

    Now that we have the tokens, we can compute the term frequency - inverse document frequency (TF-IDF) for each token of each document.

        create table extracted_tokens_tfidf as
          with num_docs as (select count(distinct id) doc_cnt
                            from extracted_tokens),
               tf       as (select a.id, a.token,
                                   a.token_cnt/b.num_tokens token_freq
                            from
                              (select id, token, count(*) token_cnt
                               from extracted_tokens
                               group by id, token) a,
                              (select id, count(*) num_tokens
                               from extracted_tokens
                               group by id) b
                            where a.id=b.id),
               doc_freq as (select token, count(*) overall_token_cnt
                            from extracted_tokens
                            group by token)
          select tf.id, tf.token,
                 token_freq * ln(doc_cnt/df.overall_token_cnt) tf_idf
          from num_docs,
               tf,
               doc_freq df
          where df.token=tf.token;

    From the WITH clause, the num_docs query simply counts the number of documents in the corpus. The tf query computes the term (token) frequency by counting the number of times each token appears in a document and dividing that by the number of tokens found in the document. The doc_freq query counts the number of times the token appears overall in the corpus. In the SELECT clause, we compute the tf_idf. Next, we create the nested table required to produce one record per case, where a case corresponds to an individual document. Here, we COLLECT all the tokens for a given document into the nested column extracted_tokens_tfidf_1.

        CREATE TABLE extracted_tokens_tfidf_nt
                     NESTED TABLE extracted_tokens_tfidf_1
                         STORE AS extracted_tokens_tfidf_tab AS
                     select id,
                            cast(collect(DM_NESTED_NUMERICAL(token,tf_idf)) as DM_NESTED_NUMERICALS) extracted_tokens_tfidf_1
                     from extracted_tokens_tfidf
                     group by id;

    To build the clustering model, we create a settings table and then insert the various settings.
    Most notable are the number of clusters (20), using cosine distance which is better for text, turning off auto data preparation since the values are ready for mining, the number of iterations (20) to get a better model, and the split criterion of size for clusters that are roughly balanced in number of cases assigned.

        CREATE TABLE km_settings (setting_name  VARCHAR2(30), setting_value VARCHAR2(30));

        BEGIN
          INSERT INTO km_settings (setting_name, setting_value)
            VALUES (dbms_data_mining.clus_num_clusters, 20);
          INSERT INTO km_settings (setting_name, setting_value)
            VALUES (dbms_data_mining.kmns_distance, dbms_data_mining.kmns_cosine);
          INSERT INTO km_settings (setting_name, setting_value)
            VALUES (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_off);
          INSERT INTO km_settings (setting_name, setting_value)
            VALUES (dbms_data_mining.kmns_iterations, 20);
          INSERT INTO km_settings (setting_name, setting_value)
            VALUES (dbms_data_mining.kmns_split_criterion, dbms_data_mining.kmns_size);
          COMMIT;
        END;

    With this in place, we can now build the clustering model.

        BEGIN
            DBMS_DATA_MINING.CREATE_MODEL(
            model_name          => 'TEXT_CLUSTERING_MODEL',
            mining_function     => dbms_data_mining.clustering,
            data_table_name     => 'extracted_tokens_tfidf_nt',
            case_id_column_name => 'id',
            settings_table_name => 'km_settings');
        END;

    To generate cluster names from this model, check out my earlier post on that topic.
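
    To make the TF-IDF arithmetic in the post concrete, here is a minimal Python sketch that mirrors the extracted_tokens_tfidf query as written: token frequency within a document, multiplied by ln(number of documents / overall count of the token). The function name tf_idf and the four sample rows are hypothetical and purely illustrative; they are not part of the original post.

```python
import math
from collections import Counter

def tf_idf(extracted_tokens):
    """extracted_tokens: list of (doc_id, token) rows, one row per extracted token."""
    # num_docs: number of distinct documents that produced at least one token
    doc_cnt = len({doc_id for doc_id, _ in extracted_tokens})
    # tf, inner query a: how often each token occurs in each document
    token_cnt = Counter(extracted_tokens)
    # tf, inner query b: total number of token rows per document
    num_tokens = Counter(doc_id for doc_id, _ in extracted_tokens)
    # doc_freq: how often each token occurs across the whole table
    overall_token_cnt = Counter(token for _, token in extracted_tokens)
    return {
        (doc_id, token): (cnt / num_tokens[doc_id])
        * math.log(doc_cnt / overall_token_cnt[token])
        for (doc_id, token), cnt in token_cnt.items()
    }

# Hypothetical sample rows standing in for the extracted_tokens table.
rows = [(1, "sql"), (1, "index"), (2, "sql"), (3, "cluster")]
scores = tf_idf(rows)
for key in sorted(scores):
    print(key, round(scores[key], 4))
```

    In this sample, "sql" occurs in two of the three documents, so it receives a lower weight in document 1 than "index", which occurs only once in the corpus.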

    Read the article

  • When is a SQL function not a function?

    - by Rob Farley
    Should SQL Server even have functions? (Oh yeah – this is a T-SQL Tuesday post, hosted this month by Brad Schulz)

    Functions serve an important part of programming, in almost any language. A function is a piece of code that is designed to return something, as opposed to a piece of code which isn’t designed to return anything (which is known as a procedure). SQL Server is no different. You can call stored procedures, even from within other stored procedures, and you can call functions and use these in other queries. Stored procedures might query something, and therefore ‘return data’, but a function in SQL is considered to have the type of the thing returned, and can be used accordingly in queries. Consider the internal GETDATE() function.

        SELECT GETDATE(), SomeDatetimeColumn FROM dbo.SomeTable;

    There’s no logical difference between the field that is being returned by the function and the field that’s being returned by the table column. Both are the datetime field – if you didn’t have inside knowledge, you wouldn’t necessarily be able to tell which was which. And so as developers, we find ourselves wanting to create functions that return all kinds of things – functions which look up values based on codes, functions which do string manipulation, and so on. But it’s rubbish. Ok, it’s not all rubbish, but it mostly is. And this isn’t even considering the SARGability impact. It’s far more significant than that. (When I say the SARGability aspect, I mean “because you’re unlikely to have an index on the result of some function that’s applied to a column, so try to invert the function and query the column in an unchanged manner”)

    I’m going to consider the three main types of user-defined functions in SQL Server:

    * Scalar
    * Inline Table-Valued
    * Multi-statement Table-Valued

    I could also look at user-defined CLR functions, including aggregate functions, but not today.
    I figure that most people don’t tend to get around to doing CLR functions, and I’m going to focus on the T-SQL-based user-defined functions.

    Most people split these types of function up into two types. So do I. Except that most people pick them based on ‘scalar or table-valued’. I’d rather go with ‘inline or not’. If it’s not inline, it’s rubbish. It really is.

    Let’s start by considering the two kinds of table-valued function, and compare them. These functions are going to return the sales for a particular salesperson in a particular year, from the AdventureWorks database.

        CREATE FUNCTION dbo.FetchSales_inline(@salespersonid int, @orderyear int)
        RETURNS TABLE AS
        RETURN (
            SELECT e.LoginID as EmployeeLogin, o.OrderDate, o.SalesOrderID
            FROM Sales.SalesOrderHeader AS o
            LEFT JOIN HumanResources.Employee AS e
            ON e.EmployeeID = o.SalesPersonID
            WHERE o.SalesPersonID = @salespersonid
            AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')
            AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')
        );
        GO

        CREATE FUNCTION dbo.FetchSales_multi(@salespersonid int, @orderyear int)
        RETURNS @results TABLE (
            EmployeeLogin nvarchar(512),
            OrderDate datetime,
            SalesOrderID int
            )
        AS
        BEGIN
            INSERT @results (EmployeeLogin, OrderDate, SalesOrderID)
            SELECT e.LoginID, o.OrderDate, o.SalesOrderID
            FROM Sales.SalesOrderHeader AS o
            LEFT JOIN HumanResources.Employee AS e
            ON e.EmployeeID = o.SalesPersonID
            WHERE o.SalesPersonID = @salespersonid
            AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')
            AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')
            ;
            RETURN
        END;
        GO

    You’ll notice that I’m being nice and responsible with the use of the DATEADD function, so that I have SARGability on the OrderDate filter. Regular readers will be hoping I’ll show what’s going on in the execution plans here. Here I’ve run two SELECT * queries with the “Show Actual Execution Plan” option turned on.
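The text of those two queries isn't given here. A plausible reconstruction, assuming the salesperson and year (281 and 2003) that the post uses in a later scalar example, might look like this:

```sql
-- Assumed parameter values; the original queries are not shown in the post.
-- Turn on the actual execution plan option in SSMS before running the batch
-- so that both plans appear side by side with their relative costs.
SELECT * FROM dbo.FetchSales_inline(281, 2003);
SELECT * FROM dbo.FetchSales_multi(281, 2003);
```

Running both statements in one batch is what makes the per-query percentages of the batch cost comparable.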
    Notice that the ‘Query cost’ of the multi-statement version is just 2% of the ‘Batch cost’. But also notice there’s trickery going on. And it’s nothing to do with that extra index that I have on the OrderDate column. Trickery. Look at it – clearly, the first plan is showing us what’s going on inside the function, but the second one isn’t. The second one is blindly running the function, and then scanning the results. There’s a Sequence operator which is calling the TVF operator, and then calling a Table Scan to get the results of that function for the SELECT operator. But surely it still has to do all the work that the first one is doing...

    To see what’s actually going on, let’s look at the Estimated plan. Now, we see the same plans (almost) that we saw in the Actuals, but we have an extra one – the one that was used for the TVF. Here’s where we see the inner workings of it. You’ll probably recognise the right-hand side of the TVF’s plan as looking very similar to the first plan – but it’s now being called by a stack of other operators, including an INSERT statement to be able to populate the table variable that the multi-statement TVF requires. And the cost of the TVF is 57% of the batch!

    But it gets worse. Let’s consider what happens if we don’t need all the columns. We’ll leave out the EmployeeLogin column. Here, we see that the inline function call has been simplified down. It doesn’t need the Employee table. The join is redundant and has been eliminated from the plan, making it even cheaper. But the multi-statement plan runs the whole thing as before, only removing the extra column when the Table Scan is performed.

    A multi-statement function is a lot more powerful than an inline one. An inline function can only be the result of a single sub-query. It’s essentially the same as a parameterised view, because views demonstrate this same behaviour of extracting the definition of the view and using it in the outer query.
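The column-pruning comparison and the view analogy above could be reproduced with something along these lines. This is a sketch under assumptions: the parameter values are borrowed from a later example in the post, and the view `dbo.SalesWithLogin` is a hypothetical name introduced only for illustration.

```sql
-- Leaving out EmployeeLogin: the inline version can drop the join to Employee,
-- while the multi-statement version still fills its whole table variable first.
SELECT OrderDate, SalesOrderID FROM dbo.FetchSales_inline(281, 2003);
SELECT OrderDate, SalesOrderID FROM dbo.FetchSales_multi(281, 2003);
GO

-- Hypothetical view with the same body as the inline function, minus the parameters.
CREATE VIEW dbo.SalesWithLogin AS
    SELECT e.LoginID AS EmployeeLogin, o.OrderDate, o.SalesOrderID, o.SalesPersonID
    FROM Sales.SalesOrderHeader AS o
    LEFT JOIN HumanResources.Employee AS e
        ON e.EmployeeID = o.SalesPersonID;
GO

-- The view definition is expanded into this query, so the unused join
-- is a candidate for elimination here as well.
SELECT OrderDate, SalesOrderID
FROM dbo.SalesWithLogin
WHERE SalesPersonID = 281
  AND OrderDate >= '20030101'
  AND OrderDate <  '20040101';
```

Comparing the plans for the first and last queries is one way to see that an inline function and a view are treated alike by the optimizer.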
    A multi-statement function is clearly more powerful because it can contain far more complex logic. But a multi-statement function isn’t really a function at all. It’s a stored procedure. It’s wrapped up like a function, but behaves like a stored procedure. It would be completely unreasonable to expect that a stored procedure could be simplified down to recognise that not all the columns might be needed, but yet this is part of the pain associated with this procedural function situation.

    The biggest clue that a multi-statement function is more like a stored procedure than a function is the “BEGIN” and “END” statements that surround the code. If you try to create a multi-statement function without these statements, you’ll get an error – they are very much required. When I used to present on this kind of thing, I even used to call it “The Dangers of BEGIN and END”, and yes, I’ve written about this type of thing before in a similarly-named post over at my old blog.

    Now how about scalar functions... Suppose we wanted a scalar function to return the count of these.

        CREATE FUNCTION dbo.FetchSales_scalar(@salespersonid int, @orderyear int)
        RETURNS int
        AS
        BEGIN
            RETURN (
                SELECT COUNT(*)
                FROM Sales.SalesOrderHeader AS o
                LEFT JOIN HumanResources.Employee AS e
                ON e.EmployeeID = o.SalesPersonID
                WHERE o.SalesPersonID = @salespersonid
                AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')
                AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')
            );
        END;
        GO

    Notice the evil words? They’re required. Try to remove them, you just get an error. That’s right – any scalar function is procedural, despite the fact that you wrap up a sub-query inside that RETURN statement. It’s as ugly as anything. Hopefully this will change in future versions.

    Let’s have a look at how this is reflected in an execution plan.
    Here’s a query, its Actual plan, and its Estimated plan:

        SELECT e.LoginID, y.year, dbo.FetchSales_scalar(p.SalesPersonID, y.year) AS NumSales
        FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)
        CROSS JOIN Sales.SalesPerson AS p
        LEFT JOIN HumanResources.Employee AS e
        ON e.EmployeeID = p.SalesPersonID;

    We see here that the cost of the scalar function is about twice that of the outer query. Nicely, the query optimizer has worked out that it doesn’t need the Employee table, but that’s a bit of a red herring here. There’s actually something way more significant going on.

    If I look at the properties of that UDF operator, it tells me that the Estimated Subtree Cost is 0.337999. If I just run the query SELECT dbo.FetchSales_scalar(281,2003); we see that the UDF cost is still unchanged. You see, this 0.337999 is the cost of running the scalar function ONCE. But when we ran that query with the CROSS JOIN in it, we returned quite a few rows. 68 in fact. Could’ve been a lot more, if we’d had more salespeople or more years.

    And so we come to the biggest problem. This procedure (I don’t want to call it a function) is getting called 68 times – each one being twice as expensive as the outer query. And because it’s calling it in a separate context, there is even more overhead that I haven’t considered here. The cheek of it, to say that the Compute Scalar operator here costs 0%! I know a number of IT projects that could’ve used that kind of costing method, but that’s another story that I’m not going to go into here.

    Let’s look at a better way. Suppose our scalar function had been implemented as an inline one. Then it could have been expanded out like a sub-query.
    It could’ve run something like this:

        SELECT e.LoginID, y.year,
            (SELECT COUNT(*)
            FROM Sales.SalesOrderHeader AS o
            LEFT JOIN HumanResources.Employee AS e
            ON e.EmployeeID = o.SalesPersonID
            WHERE o.SalesPersonID = p.SalesPersonID
            AND o.OrderDate >= DATEADD(year,y.year-2000,'20000101')
            AND o.OrderDate < DATEADD(year,y.year-2000+1,'20000101')
            ) AS NumSales
        FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)
        CROSS JOIN Sales.SalesPerson AS p
        LEFT JOIN HumanResources.Employee AS e
        ON e.EmployeeID = p.SalesPersonID;

    Don’t worry too much about the Scan of the SalesOrderHeader underneath a Nested Loop. If you remember from plenty of other posts on the matter, execution plans don’t push the data through. That Scan only runs once. The Index Spool sucks the data out of it and populates a structure that is used to feed the Stream Aggregate. The Index Spool operator gets called 68 times, but the Scan only once (the Number of Executions property demonstrates this).

    Here, the Query Optimizer has a full picture of what’s being asked, and can make the appropriate decision about how it accesses the data. It can simplify it down properly. To get this kind of behaviour from a function, we need it to be inline. But without inline scalar functions, we need to make our function be table-valued. Luckily, that’s ok.

        CREATE FUNCTION dbo.FetchSales_inline2(@salespersonid int, @orderyear int)
        RETURNS table AS
        RETURN (SELECT COUNT(*) as NumSales
            FROM Sales.SalesOrderHeader AS o
            LEFT JOIN HumanResources.Employee AS e
            ON e.EmployeeID = o.SalesPersonID
            WHERE o.SalesPersonID = @salespersonid
            AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')
            AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')
        );
        GO

    But we can’t use this as a scalar. Instead, we need to use it with the APPLY operator.
        SELECT e.LoginID, y.year, n.NumSales
        FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)
        CROSS JOIN Sales.SalesPerson AS p
        LEFT JOIN HumanResources.Employee AS e
        ON e.EmployeeID = p.SalesPersonID
        OUTER APPLY dbo.FetchSales_inline2(p.SalesPersonID, y.year) AS n;

    And now, we get the plan that we want for this query. All we’ve done is tell the function that it’s returning a table instead of a single value, and removed the BEGIN and END statements. We’ve had to name the column being returned, but what we’ve gained is an actual inline simplifiable function. And if we wanted it to return multiple columns, it could do that too. I really consider this function to be superior to the scalar function in every way.

    It does need to be handled differently in the outer query, but in many ways it’s a more elegant method there too. The function calls can be put amongst the FROM clause, where they can then be used in the WHERE or GROUP BY clauses without fear of calling the function multiple times (another horrible side effect of functions).

    So please. If you see BEGIN and END in a function, remember it’s not really a function, it’s a procedure. And then fix it.

    @rob_farley
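The closing point about referencing the function's result in WHERE or GROUP BY can be sketched as follows. This is an illustrative query rather than one from the post, and the threshold of zero is an arbitrary assumption; it reuses `dbo.FetchSales_inline2` as defined earlier.

```sql
-- The APPLY'd column n.NumSales is written once in the FROM clause and then
-- reused for filtering and sorting, instead of repeating a scalar function call.
SELECT p.SalesPersonID, y.year, n.NumSales
FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)
CROSS JOIN Sales.SalesPerson AS p
CROSS APPLY dbo.FetchSales_inline2(p.SalesPersonID, y.year) AS n
WHERE n.NumSales > 0
ORDER BY n.NumSales DESC;
```

Because the function body is a single aggregate with no GROUP BY, it always returns exactly one row, so CROSS APPLY and OUTER APPLY give the same result here.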
