Search Results

Search found 33585 results on 1344 pages for 'sql execution plan'.

Page 93/1344 | < Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100  | Next Page >

  • How to create a "copy" of SQL Server transaction log file

    - by Salman A
    I want a copy of SQL Server transaction log file for "raw" analysis. What is the safest way to get a copy of that file without shutting down the database and disturbing the existing log/backups/backup schedules and just about everything. FYI, Its a SQL Server 2000 database server and I can see the log file (its about 4GB in size) and I cannot copy it as is; I get the "access denied" error when copying from explorer or command line.

    Read the article

  • Best index(es) to use for an OR Statement in SQL Server

    - by Chuck Haines
    I have a table which has a bunch of columns but the two relevant ones are: Due_Amount MONEY Bounced_Due_Amount MONEY I have a SQL query like the following SELECT * FROM table WHERE (Due_Amount 0 OR Bounced_Due_Amount 0) Would the best index to put on this table for SQL Server 2008 be an index which includes both columns in the index, or should I put an separate index on each column?

    Read the article

  • New help with SQL Server select statement

    - by gio_333m
    I need help with select statement in SQL Server / T-SQL My table looks like this: Id (int) QuestionId (int) GenreId (int) I want to select random N rows from this table so that maximum number of same GenreId in the result set is less than X for all GenreId-s except one. For that one GenreId, I need row count with that GenreId to be equal to Y.

    Read the article

  • SQL to delete duplicate records in a table [closed]

    - by oo
    Possible Duplicate: Delete duplicate records from a SQL table without a primary key I have a table with the columns person_ID firstname lastname and I somehow ended up with a bunch of duplicates. Is there any way to look at all columns where firstname and lastname are the same and delete all except one of them (it doesn't matter which one is left as they are all the same.) EDIT: I just found a duplicate question and perfect answer: http://stackoverflow.com/questions/985384/delete-duplicate-records-from-a-sql-table-without-a-primary-key

    Read the article

  • I thought everyone did it like this – Training Session Code Management

    - by Fatherjack
    One of an occasional series of blogs about things that I do that perhaps others don’t. From very early on in my dealings with SQL Server Management Studio I started using Solutions and Projects. This means that I started using them when writing sessions and it wasn’t until speaking with someone at PASS Summit 2013 that I found out that this was a process that was unheard of by some people. So, here we go, a run through how I create and manage code and other documents that I use in presentations. For people unsure what solutions and projects are; • Solution – a container for one or more projects. • Project – a container for files, .sql files are grouped as Queries, all other files are stored as Misc. How do I start? Open Management Studio as normal, and then click File | New and select Project This will bring up the New Project dialog box and you can select/add details as necessary in the places indicated. If this is the first project you are creating then be sure to select the Create directory for solution check box (4). If know in advance that you are going to have more than one project in the solution then you may want to edit the Solution name (3) as by default it will take the name of the project that you enter at (2). This will lead you to the following folder structure (depending on the location that you chose in 3) above. In SSMS you need to turn on the Solution Explorer, either via the View menu or pressing Ctrl + Alt + L                   This will bring up a dockable window that will let you quickly access the files that you choose to include in the Solution.                     Can we get to work and write some code yet please? Yes, we can. As with many Microsoft products there are several ways to go about this, let’s look at the easiest way when creating new code. When writing a presentation I usually start from the position we are currently in – a brand new solution and project with no code. Later on we will look at incorporating existing code files into the Project where we need it. Right-click on the Project name and choose Add New Query           As soon as you click this you will be prompted to select the sql server that you want to connect to and once you have done that you will have your new query open in the text editor and the Solution Explorer will now look like this, showing your server connection and your new query.               And the Project folder will look like this         Now once you have written your code don’t press save, choose Save As and give the code a better name than QueryX.sql. SSMS will interpret this as a request to rename Query1 and your Project and the Project folder will show that SQLQuery1.sql no longer exists but there is now a file named as you requested. If you happen to click save in error then right-click the query in the project and choose rename.               You can then alter the name as you like, even when open in the SSMS text editor, and the file will be renamed. When creating a set of scripts for a presentation I name files with a numeric prefix so that when they are sorted by name they are in the order that I need to use them during the session. I love this idea but I’ve got loads of existing scripts I want to put in Projects Excellent, adding existing files to a project is easy, let’s consider that you have query files in your My Documents folder and you want to bring them into the Project we have just created. Right-click on the Project and choose Add | Existing Item           Navigate to the location of your chosen file and select it. The file will open in SSMS text editor and the Project will be updated to show that the selected query is now part of your project. If you look in Windows Explorer you will see that the query file has been copied into the Project folder, the original file still remains in your My Documents (or wherever it existed). I’ll leave it as an exercise for the reader to explore creating further Projects within a solution but will happily answer questions if you get into difficulties. What other advantages do I get from this? Well, as all your code is neatly in one Solution folder and the folder contains only files that are pertinent to the session you are presenting then it makes it very easy to share this code, simply copy the whole folder onto a USB stick, Blog, FTP location, wherever you choose and it’s all there in one self-contained parcel. You don’t have to limit yourself to .sql query files, you can add any sort of document via the Add Existing Item method, just try it out. Right-click on the protect and choose Add | Existing Item           Change the file type filter.                       You can multi select items here using Ctrl as you click each item you want. When you are done, click the Add button and the items will be brought into your project.                 Again, using this process means the files are copied into the project folder, leaving you original files untouched in their original location. Once they are here you can double click them in the SSMS Solution Explorer to open them, for files with a specific file type then the appropriate application will be launched – ie Word, Excel etc. However, if the files are something that the SSMS Text editor can display then they will open in a tab in SSMS. Try it out with a text file or even a PS1 file … This sounds excellent but what do I need to watch out for? One big thing to consider when working like this is the version of SSMS that you are using. There is something fundamentally different between the different versions in the way that the project (.ssmssqlproj) and solution (.sqlsuo and .ssmssln) files are formatted. If you create a solution in an older version of SSMS and then open it in a newer version you will be given the option to upgrade it. Once you do this upgrade then the older version of SSMS will not be able to open the solution any more. Now this ranks as more of an annoyance than disaster as the files within the projects are not affected in any way, you would just have to delete the files mentioned and recreate the solution in the older version again. Summary So, here we have seen how using SSMS Projects and Solutions can help keep related code files (and other document types) together in a neat structure so that they can be quickly navigated during a presentation and it also makes it incredibly simple to distribute your code and share it with others. I hope this is of use to you and helps you bring more order into your sql files, whether you are a person that does technical presentations or not, having your code grouped and managed can make for a lot of advantages as your code library expands.  

    Read the article

  • Network communications mechanisms for SQL Server

    - by Akshay Deep Lamba
    Problem I am trying to understand how SQL Server communicates on the network, because I'm having to tell my networking team what ports to open up on the firewall for an edge web server to communicate back to the SQL Server on the inside. What do I need to know? Solution In order to understand what needs to be opened where, let's first talk briefly about the two main protocols that are in common use today: TCP - Transmission Control Protocol UDP - User Datagram Protocol Both are part of the TCP/IP suite of protocols. We'll start with TCP. TCP TCP is the main protocol by which clients communicate with SQL Server. Actually, it is more correct to say that clients and SQL Server use Tabular Data Stream (TDS), but TDS actually sits on top of TCP and when we're talking about Windows and firewalls and other networking devices, that's the protocol that rules and controls are built around. So we'll just speak in terms of TCP. TCP is a connection-oriented protocol. What that means is that the two systems negotiate the connection and both agree to it. Think of it like a phone call. While one person initiates the phone call, the other person has to agree to take it and both people can end the phone call at any time. TCP is the same way. Both systems have to agree to the communications, but either side can end it at any time. In addition, there is functionality built into TCP to ensure that all communications can be disassembled and reassembled as necessary so it can pass over various network devices and be put together again properly in the right order. It also has mechanisms to handle and retransmit lost communications. Because of this functionality, TCP is the protocol used by many different network applications. The way the applications all can share is through the use of ports. When a service, like SQL Server, comes up on a system, it must listen on a port. For a default SQL Server instance, the default port is 1433. Clients connect to the port via the TCP protocol, the connection is negotiated and agreed to, and then the two sides can transfer information as needed until either side decides to end the communication. In actuality, both sides will have a port to use for the communications, but since the client's port is typically determined semi-randomly, when we're talking about firewalls and the like, typically we're interested in the port the server or service is using. UDP UDP, unlike TCP, is not connection oriented. A "client" can send a UDP communications to anyone it wants. There's nothing in place to negotiate a communications connection, there's nothing in the protocol itself to coordinate order of communications or anything like that. If that's needed, it's got to be handled by the application or by a protocol built on top of UDP being used by the application. If you think of TCP as a phone call, think of UDP as a postcard. I can put a postcard in the mail to anyone I want, and so long as it is addressed properly and has a stamp on it, the postal service will pick it up. Now, what happens it afterwards is not guaranteed. There's no mechanism for retransmission of lost communications. It's great for short communications that doesn't necessarily need an acknowledgement. Because multiple network applications could be communicating via UDP, it uses ports, just like TCP. The SQL Browser or the SQL Server Listener Service uses UDP. Network Communications - Talking to SQL Server When an instance of SQL Server is set up, what TCP port it listens on depends. A default instance will be set up to listen on port 1433. A named instance will be set to a random port chosen during installation. In addition, a named instance will be configured to allow it to change that port dynamically. What this means is that when a named instance starts up, if it finds something already using the port it normally uses, it'll pick a new port. If you have a named instance, and you have connections coming across a firewall, you're going to want to use SQL Server Configuration Manager to set a static port. This will allow the networking and security folks to configure their devices for maximum protection. While you can change the network port for a default instance of SQL Server, most people don't. Network Communications - Finding a SQL Server When just the name is specified for a client to connect to SQL Server, for instance, MySQLServer, this is an attempt to connect to the default instance. In this case the client will automatically attempt to communicate to port 1433 on MySQLServer. If you've switched the port for the default instance, you'll need to tell the client the proper port, usually by specifying the following syntax in the connection string: <server>,<port>. For instance, if you moved SQL Server to listen on 14330, you'd use MySQLServer,14330 instead of just MySQLServer. However, because a named instance sets up its port dynamically by default, the client never knows at the outset what the port is it should talk to. That's what the SQL Browser or the SQL Server Listener Service (SQL Server 2000) is for. In this case, the client sends a communication via the UDP protocol to port 1434. It asks, "Where is the named instance?" So if I was running a named instance called SQL2008R2, it would be asking the SQL Browser, "Hey, how do I talk to MySQLServer\SQL2008R2?" The SQL Browser would then send back a communications from UDP port 1434 back to the client telling the client how to talk to the named instance. Of course, you can skip all of this of you set that named instance's port statically. Then you can use the <server>,<port> mechanism to connect and the client won't try to talk to the SQL Browser service. It'll simply try to make the connection. So, for instance, is the SQL2008R2 instance was listening on port 20080, specifying MySQLServer,20080 would attempt a connection to the named instance. Network Communications - Named Pipes Named pipes is an older network library communications mechanism and it's generally not used any longer. It shouldn't be used across a firewall. However, if for some reason you need to connect to SQL Server with it, this protocol also sits on top of TCP. Named Pipes is actually used by the operating system and it has its own mechanism within the protocol to determine where to route communications. As far as network communications is concerned, it listens on TCP port 445. This is true whether we're talking about a default or named instance of SQL Server. The Summary Table To put all this together, here is what you need to know: Type of Communication Protocol Used Default Port Finding a SQL Server or SQL Server Named Instance UDP 1434 Communicating with a default instance of SQL Server TCP 1433 Communicating with a named instance of SQL Server TCP * Determined dynamically at start up Communicating with SQL Server via Named Pipes TCP 445

    Read the article

  • Sub query pass through

    - by SQL and the like
    Occasionally in forums and on client sites I see conditional subqueries in statements. This is where the developer has decided that it is only necessary to process some data under a certain condition.  By way of example, something like this : Create Procedure GetOrder @SalesOrderId integer, @CountDetails tinyint as Select SOH.salesorderid , case when @CountDetails = 1 then (Select count(*) from Sales.SalesOrderDetail SOD where SOH.SalesOrderID = SOD.SalesOrderID) end from sales.SalesOrderHeader...(read more)

    Read the article

  • Does query plan optimizer works well with joined/filtered table-valued functions?

    - by smoothdeveloper
    In SQLSERVER 2005, I'm using table-valued function as a convenient way to perform arbitrary aggregation on subset data from large table (passing date range or such parameters). I'm using theses inside larger queries as joined computations and I'm wondering if the query plan optimizer work well with them in every condition or if I'm better to unnest such computation in my larger queries. Does query plan optimizer unnest table-valued functions if it make sense? If it doesn't, what do you recommend to avoid code duplication that would occur by manually unnesting them? If it does, how do you identify that from the execution plan? code sample: create table dbo.customers ( [key] uniqueidentifier , constraint pk_dbo_customers primary key ([key]) ) go /* assume large amount of data */ create table dbo.point_of_sales ( [key] uniqueidentifier , customer_key uniqueidentifier , constraint pk_dbo_point_of_sales primary key ([key]) ) go create table dbo.product_ranges ( [key] uniqueidentifier , constraint pk_dbo_product_ranges primary key ([key]) ) go create table dbo.products ( [key] uniqueidentifier , product_range_key uniqueidentifier , release_date datetime , constraint pk_dbo_products primary key ([key]) , constraint fk_dbo_products_product_range_key foreign key (product_range_key) references dbo.product_ranges ([key]) ) go . /* assume large amount of data */ create table dbo.sales_history ( [key] uniqueidentifier , product_key uniqueidentifier , point_of_sale_key uniqueidentifier , accounting_date datetime , amount money , quantity int , constraint pk_dbo_sales_history primary key ([key]) , constraint fk_dbo_sales_history_product_key foreign key (product_key) references dbo.products ([key]) , constraint fk_dbo_sales_history_point_of_sale_key foreign key (point_of_sale_key) references dbo.point_of_sales ([key]) ) go create function dbo.f_sales_history_..snip.._date_range ( @accountingdatelowerbound datetime, @accountingdateupperbound datetime ) returns table as return ( select pos.customer_key , sh.product_key , sum(sh.amount) amount , sum(sh.quantity) quantity from dbo.point_of_sales pos inner join dbo.sales_history sh on sh.point_of_sale_key = pos.[key] where sh.accounting_date between @accountingdatelowerbound and @accountingdateupperbound group by pos.customer_key , sh.product_key ) go -- TODO: insert some data -- this is a table containing a selection of product ranges declare @selectedproductranges table([key] uniqueidentifier) -- this is a table containing a selection of customers declare @selectedcustomers table([key] uniqueidentifier) declare @low datetime , @up datetime -- TODO: set top query parameters . select saleshistory.customer_key , saleshistory.product_key , saleshistory.amount , saleshistory.quantity from dbo.products p inner join @selectedproductranges productrangeselection on p.product_range_key = productrangeselection.[key] inner join @selectedcustomers customerselection on 1 = 1 inner join dbo.f_sales_history_..snip.._date_range(@low, @up) saleshistory on saleshistory.product_key = p.[key] and saleshistory.customer_key = customerselection.[key] I hope the sample makes sense. Much thanks for your help!

    Read the article

  • How does C#'s DateTime.Now affect query plan caching in SQL Server?

    - by Bill Paetzke
    Given: Let's say we have a stored procedure. It reports data back to a user on a webpage. The user can set a date range. If the user sets today's date as the "end date," which includes today's data, the web app passes DateTime.Now to the sql proc. Let's say that one user runs a report--5/1/2010 to now--over and over several times. On the webpage, the user sees "5/1/2010" to "5/4/2010." But the web app passes DateTime.Now to the sql proc as the end date. So, the end date in the proc will always be different, although the user is querying a similar date range. Assume the number of records in the table and number of users are large. So any performance gains matter. Hence the importance of the question. Question: Does passing DateTime.Now as a parameter to a proc prevent SQL Server from caching the query plan? If so, then is the web app missing out on huge performance gains? Possible Solution: I thought DateTime.Today.AddDays(1) would be a possible solution. It would allow the user to get the latest data and always pass the same end date to the sql proc--"5/5/2010" in this case. Please speak to this as well. Sample proc and execution (if that helps to understand): CREATE PROCEDURE GetFooData @StartDate datetime @EndDate datetime AS SELECT * FROM Foo WHERE LogDate >= @StartDate AND LogDate < @EndDate Here's a sample execution using DateTime.Now: EXEC GetFooData '2010-05-01', '2010-05-04 15:41:27' -- passed in DateTime.Now Here's a sample execution using DateTime.Today.AddDays(1) EXEC GetFooData '2010-05-01', '2010-05-05' -- passed in DateTime.Today.AddDays(1) The same data is returned for both procs, since the current time is: 2010-05-04 15:41:27.

    Read the article

  • How does DateTime.Now affect query plan caching in SQL Server?

    - by Bill Paetzke
    Question: Does passing DateTime.Now as a parameter to a proc prevent SQL Server from caching the query plan? If so, then is the web app missing out on huge performance gains? Possible Solution: I thought DateTime.Today.AddDays(1) would be a possible solution. It would pass the same end-date to the sql proc (per day). And the user would still get the latest data. Please speak to this as well. Given Example: Let's say we have a stored procedure. It reports data back to a user on a webpage. The user can set a date range. If the user sets today's date as the "end date," which includes today's data, the web app passes DateTime.Now to the sql proc. Let's say that one user runs a report--5/1/2010 to now--over and over several times. On the webpage, the user sees 5/1/2010 to 5/4/2010. But the web app passes DateTime.Now to the sql proc as the end date. So, the end date in the proc will always be different, although the user is querying a similar date range. Assume the number of records in the table and number of users are large. So any performance gains matter. Hence the importance of the question. Example proc and execution (if that helps to understand): CREATE PROCEDURE GetFooData @StartDate datetime @EndDate datetime AS SELECT * FROM Foo WHERE LogDate >= @StartDate AND LogDate < @EndDate Here's a sample execution using DateTime.Now: EXEC GetFooData '2010-05-01', '2010-05-04 15:41:27' -- passed in DateTime.Now Here's a sample execution using DateTime.Today.AddDays(1) EXEC GetFooData '2010-05-01', '2010-05-05' -- passed in DateTime.Today.AddDays(1) The same data is returned for both procs, since the current time is: 2010-05-04 15:41:27.

    Read the article

  • SQL Server Log File Won't Shrink due cause "log are pending replication" on non replicated DB?

    - by user796466
    I have a non Mission Critial DB 9am-5pm SQL Server database that I have set up to do nightly full backups and log backups every 30 minutes during business hours. The database is in full recovery and normally I have no reason to truncate/shrink logs unless I do some heavy maintenance. Log backups manage the size with no issue. However I have not been at this client for several weeks and upon inspection I noticed that the log had grown to about 10 times the size of the .mdf file. I poked around backups had been running and I had not gotten any severity error alerts (SQL mail). I attempted to put DB in simple recovery and shrink the log, this was no good. I precede to try a log backup and I got: The log was not truncated because records at the beginning of the log are pending replication or Change Data Capture. Ensure the Log Reader Agent or capture job is running or use sp_repldone to mark transactions as distributed or captured. Restart SQL Server rinse repeat same thing ... I said ??? Replication is not nor ever has been set up on this DB or database /server ??? So the log backups have not been flushing the .ldf. So I did a couple hours of research and I found: http://www.sqlmonster.com/Uwe/Forum.aspx/sql-server/5445/Log-file-is-not-truncated-inspite-of-regular-log-backup http://www.eggheadcafe.com/software/aspnet/30708322/the-log-was-not-truncated-because-records-at-the-beginning-of-the-log-are-pending-replication.aspx seems to be some kind of poorly documented bug ?? The solution seems to have been to run exec sp_repldone, more precisley EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time= 0, @reset = 1 This procedure can be used in emergency situations to allow truncation of the transaction log when transactions pending replication are present. Using this procedure prevents Microsoft SQL Server 2000 from replicating the database until the database is unpublished and republished. ~ MSDN When I do that I get the following Msg 18757, Level 16, State 1, Procedure sp_repldone, Line 1 Unable to execute procedure. The database is not published. Execute the procedure in a database that is published for replication. Which makes sense Because the DB has never been published for replication. I have several questions: A) First and foremost is, WTF is going on ? What is causeing this, I am interested in knowing the why here ? Is this genuinley a bug or is there some aspect of the backup that is not functioning properly that cause's the DB to mimick a replicated state ? Someone please edify me on this. B) Second ... Do I really have to publish / replicate this DB to exec this SP to fix this ??? Sounds crazy or is there some T-SQL that I can put it in a published state exec the proc and be on my way ... C) Third, if I do indeed have to publish this database to exec the SP to release this unneeded mis replicated/intended log , to get my .ldf file and backup back on track. How do I publish the database without an online host that it is asking for ??? I don't generally do this kind of database administration and need some guidance. Sorry if this is too verbose but just voicing the question helps me clarify it ... Thank you in advance for your help

    Read the article

  • Investigation: Can different combinations of components effect Dataflow performance?

    - by jamiet
    Introduction The Dataflow task is one of the core components (if not the core component) of SQL Server Integration Services (SSIS) and often the most misunderstood. This is not surprising, its an incredibly complicated beast and we’re abstracted away from that complexity via some boxes that go yellow red or green and that have some lines drawn between them. Example dataflow In this blog post I intend to look under that facade and get into some of the nuts and bolts of the Dataflow Task by investigating how the decisions we make when building our packages can affect performance. I will do this by comparing the performance of three dataflows that all have the same input, all produce the same output, but which all operate slightly differently by way of having different transformation components. I also want to use this blog post to challenge a common held opinion that I see perpetuated over and over again on the SSIS forum. That is, that people assume adding components to a dataflow will be detrimental to overall performance. Its not surprising that people think this –it is intuitive to think that more components means more work- however this is not a view that I share. I have always been of the opinion that there are many factors affecting dataflow duration and the number of components is actually one of the less important ones; having said that I have never proven that assertion and that is one reason for this investigation. I have actually seen evidence that some people think dataflow duration is simply a function of number of rows and number of components. I’ll happily call that one out as a myth even without any investigation!  The Setup I have a 2GB datafile which is a list of 4731904 (~4.7million) customer records with various attributes against them and it contains 2 columns that I am going to use for categorisation: [YearlyIncome] [BirthDate] The data file is a SSIS raw format file which I chose to use because it is the quickest way of getting data into a dataflow and given that I am testing the transformations, not the source or destination adapters, I want to minimise external influences as much as possible. In the test I will split the customers according to month of birth (12 of those) and whether or not their yearly income is above or below 50000 (2 of those); in other words I will be splitting them into 24 discrete categories and in order to do it I shall be using different combinations of SSIS’ Conditional Split and Derived Column transformation components. The 24 datapaths that occur will each input to a rowcount component, again because this is the least resource intensive means of terminating a datapath. The test is being carried out on a Dell XPS Studio laptop with a quad core (8 logical Procs) Intel Core i7 at 1.73GHz and Samsung SSD hard drive. Its running SQL Server 2008 R2 on Windows 7. The Variables Here are the three combinations of components that I am going to test:     One Conditional Split - A single Conditional Split component CSPL Split by Month of Birth and income category that will use expressions on [YearlyIncome] & [BirthDate] to send each row to one of 24 outputs. This next screenshot displays the expression logic in use: Derived Column & Conditional Split - A Derived Column component DER Income Category that adds a new column [IncomeCategory] which will contain one of two possible text values {“LessThan50000”,”GreaterThan50000”} and uses [YearlyIncome] to determine which value each row should get. A Conditional Split component CSPL Split by Month of Birth and Income Category then uses that new column in conjunction with [BirthDate] to determine which of the same 24 outputs to send each row to. Put more simply, I am separating the Conditional Split of #1 into a Derived Column and a Conditional Split. The next screenshots display the expression logic in use: DER Income Category         CSPL Split by Month of Birth and Income Category       Three Conditional Splits - A Conditional Split component that produces two outputs based on [YearlyIncome], one for each Income Category. Each of those outputs will go to a further Conditional Split that splits the input into 12 outputs, one for each month of birth (identical logic in each). In this case then I am separating the single Conditional Split of #1 into three Conditional Split components. The next screenshots display the expression logic in use: CSPL Split by Income Category         CSPL Split by Month of Birth 1& 2       Each of these combinations will provide an input to one of the 24 rowcount components, just the same as before. For illustration here is a screenshot of the dataflow containing three Conditional Split components: As you can these dataflows have a fair bit of work to do and remember that they’re doing that work for 4.7million rows. I will execute each dataflow 10 times and use the average for comparison. I foresee three possible outcomes: The dataflow containing just one Conditional Split (i.e. #1) will be quicker There is no significant difference between any of them One of the two dataflows containing multiple transformation components will be quicker Regardless of which of those outcomes come to pass we will have learnt something and that makes this an interesting test to carry out. Note that I will be executing the dataflows using dtexec.exe rather than hitting F5 within BIDS. The Results and Analysis The table below shows all of the executions, 10 for each dataflow. It also shows the average for each along with a standard deviation. All durations are in seconds. I’m pasting a screenshot because I frankly can’t be bothered with the faffing about needed to make a presentable HTML table. It is plain to see from the average that the dataflow containing three conditional splits is significantly faster, the other two taking 43% and 52% longer respectively. This seems strange though, right? Why does the dataflow containing the most components outperform the other two by such a big margin? The answer is actually quite logical when you put some thought into it and I’ll explain that below. Before progressing, a side note. The standard deviation for the “Three Conditional Splits” dataflow is orders of magnitude smaller – indicating that performance for this dataflow can be predicted with much greater confidence too. The Explanation I refer you to the screenshot above that shows how CSPL Split by Month of Birth and salary category in the first dataflow is setup. Observe that there is a case for each combination of Month Of Date and Income Category – 24 in total. These expressions get evaluated in the order that they appear and hence if we assume that Month of Date and Income Category are uniformly distributed in the dataset we can deduce that the expected number of expression evaluations for each row is 12.5 i.e. 1 (the minimum) + 24 (the maximum) divided by 2 = 12.5. Now take a look at the screenshots for the second dataflow. We are doing one expression evaluation in DER Income Category and we have the same 24 cases in CSPL Split by Month of Birth and Income Category as we had before, only the expression differs slightly. In this case then we have 1 + 12.5 = 13.5 expected evaluations for each row – that would account for the slightly longer average execution time for this dataflow. Now onto the third dataflow, the quick one. CSPL Split by Income Category does a maximum of 2 expression evaluations thus the expected number of evaluations per row is 1.5. CSPL Split by Month of Birth 1 & CSPL Split by Month of Birth 2 both have less work to do than the previous Conditional Split components because they only have 12 cases to test for thus the expected number of expression evaluations is 6.5 There are two of them so total expected number of expression evaluations for this dataflow is 6.5 + 6.5 + 1.5 = 14.5. 14.5 is still more than 12.5 & 13.5 though so why is the third dataflow so much quicker? Simple, the conditional expressions in the first two dataflows have two boolean predicates to evaluate – one for Income Category and one for Month of Birth; the expressions in the Conditional Split in the third dataflow however only have one predicate thus they are doing a lot less work. To sum up, the difference in execution times can be attributed to the difference between: MONTH(BirthDate) == 1 && YearlyIncome <= 50000 and MONTH(BirthDate) == 1 In the first two dataflows YearlyIncome <= 50000 gets evaluated an average of 12.5 times for every row whereas in the third dataflow it is evaluated once and once only. Multiply those 11.5 extra operations by 4.7million rows and you get a significant amount of extra CPU cycles – that’s where our duration difference comes from. The Wrap-up The obvious point here is that adding new components to a dataflow isn’t necessarily going to make it go any slower, moreover you may be able to achieve significant improvements by splitting logic over multiple components rather than one. Performance tuning is all about reducing the amount of work that needs to be done and that doesn’t necessarily mean use less components, indeed sometimes you may be able to reduce workload in ways that aren’t immediately obvious as I think I have proven here. Of course there are many variables in play here and your mileage will most definitely vary. I encourage you to download the package and see if you get similar results – let me know in the comments. The package contains all three dataflows plus a fourth dataflow that will create the 2GB raw file for you (you will also need the [AdventureWorksDW2008] sample database from which to source the data); simply disable all dataflows except the one you want to test before executing the package and remember, execute using dtexec, not within BIDS. If you want to explore dataflow performance tuning in more detail then here are some links you might want to check out: Inequality joins, Asynchronous transformations and Lookups Destination Adapter Comparison Don’t turn the dataflow into a cursor SSIS Dataflow – Designing for performance (webinar) Any comments? Let me know! @Jamiet

    Read the article

  • PowerShell: Read Excel to Create Inserts

    - by BuckWoody
    I’m writing a series of articles on how to migrate “departmental” data into SQL Server. I also hold workshops on the entire process – from discovering that the data exists to the modeling process and then how to design the Extract, Transform and Load (ETL) process. Finally I write about (and teach) a few methods on actually moving the data. One of those options is to use PowerShell. There are a lot of ways even with that choice, but the one I show is to read two columns from the spreadsheet and output statements that would insert the data using a stored procedure. Of course, you could re-write this as INSERT statements, out to a text file for bcp, or even use a database connection in the script to move the data directly from Excel into SQL Server. This snippet won’t run on your system, of course – it assumes a Microsoft Office Excel 2007 spreadsheet located at c:\temp called VendorList.xlsx. It looks for a tab in that spreadsheet called Vendors. The statement that does the writing just uses one column: Vendor Code. Here’s the breakdown of what I’m doing: In the first block, I connect to Microsoft Office Excel. That connection string is specific to Excel 2007, so if you need a different version you’ll need to look that up. In the second block I set up a selection from the entire spreadsheet based on that tab. Note that if you’re only after certain data you shouldn’t get the whole spreadsheet – that’s just good practice. In the next block I create the text I want, inserting the Vendor Code field as I go. Finally I close the connection. Enjoy! $ExcelConnection= New-Object -com "ADODB.Connection" $ExcelFile="c:\temp\VendorList.xlsx" $ExcelConnection.Open("Provider=Microsoft.ACE.OLEDB.12.0;` Data Source=$ExcelFile;Extended Properties=Excel 12.0;") $strQuery="Select * from [Vendors$]" $ExcelRecordSet=$ExcelConnection.Execute($strQuery) do { Write-Host "EXEC sp_InsertVendors '" $ExcelRecordSet.Fields.Item("Vendor Code").Value "'" $ExcelRecordSet.MoveNext()} Until ($ExcelRecordSet.EOF) $ExcelConnection.Close() Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • MS SQL Server 2008 Developer Training Kit Released

    - by Aamir Hasan
    The SQL Server 2008 Developer Training Kit will help you understand how to build web applications which deeply exploit the rich data types, programming models and new development paradigms in SQL Server 2008.  http://www.microsoft.com/downloads/details.aspx?FamilyID=E9C68E1B-1E0E-4299-B498-6AB3CA72A6D7&displaylang=en

    Read the article

  • SQL Azure vs. SQL Server

    If youd like to know the differences between SQL Server and SQL Azure, check this white paper. This FAQ is also interesting. var addthis_pub="guybarrette";...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • HTTP Push from SQL Server Comet SQL

    Article provides example solution for presenting data in "real-time" from Microsoft SQL Server in HTML browser. Article presents how to implement Comet functionality in ASP.NET and how to connect Comet with Query Notification from SQL Server....Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Post SQL 2008 R2 Launch Thurs 15th London - UK SQL Server User Group is having a Social Event @ the

    - by tonyrogerson
    The UK SQL Server User Group is organising a Social event for SQL and SQL Server professionals, the event will be held after the SQL Server 2008 R2 launch event and is a short walk from that venue. See site for more information: http://sqlserverfaq.com/events/222/Social-for-SQL-and-SQL-Server-professionals-SQL-quiz-meet-your-peers-ask-the-group-Q-A.aspx We are putting some light bites on, if you are coming then do let us know through the site. Neil Hambly who is the London UK SQL Server User Group...(read more)

    Read the article

  • Non-English Character Display in Oracle SQL Developer

    - by thatjeffsmith
    I get a variation on this question at least once a week, if not more frequently. I’m from Israel, and the language on the databases is Hebrew. When I use the old and deprecated SQL*Plus (windows rich client) I can see the hebrew clearly, when I use the latest SQL Developer, I get gibberish. This question appears on the forums about every week or so as well. So what’s the deal? Well, it starts with a basic misunderstanding of NLS Client parameters. These should accurately reflect the language and locality setup on your LOCAL machine. DO NOT COPY what’s set in the database. The these parameters work together with the database so that information can be transferred back and forth correctly. Having the wrong NLS parameters locally can be bad. [ORACLE DOCS]Setting the NLS_LANG parameter properly is essential to proper data conversion. The character set that is specified by the NLS_LANG parameter should reflect the setting for the client operating system. Setting NLS_LANG correctly enables proper conversion from the client operating system character encoding to the database character set. When these settings are the same, Oracle Database assumes that the data being sent or received is encoded in the same character set as the database character set, so character set validation or conversion may not be performed. This can lead to corrupt data if conversions are necessary. OK, so what are you supposed to do? Set the Font! 9 times out of 10, this preference fixes the problem with display issues. Make sure you set a Font that supports the characters you’re trying to display. It’s as simple as that. This preference defines the font used to display characters in the editors and the data grids. If you have it set to a font that doesn’t have Hebrew character support – you’re not going to see Hebrew in SQL Developer. A few years ago…wow, like 15 years ago, I learned that the Tohama Font is pretty Unicode-friendly. Bad Font Selection A Font that’s not non-English friendly Good Font Selection Exact same text, except rendered with the Tahoma font Summary Having problems seeing non-English text in SQL Developer? Check the font! And do not start messing with NLS parameters without talking to your DBA first.

    Read the article

  • SQL Server CTE Basics

    The CTE was introduced into standard SQL in order to simplify various classes of SQL Queries for which a derived table just wasn't suitable. For some reason, it can be difficult to grasp the techniques of using it. Well, that's before Rob Sheldon explained it all so clearly for us.

    Read the article

< Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100  | Next Page >