Search Results

Search found 8340 results on 334 pages for 'merge join'.

Page 20/334 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27  | Next Page >

  • problem with join SQL Server 2000

    - by eyalb
    I have 3 tables - Items, Props, Items_To_Props i need to return all items that match all properties that i send example items 1 2 3 4 props T1 T2 T3 items_to_props 1 T1 1 T2 1 T3 2 T1 3 T1 when i send T1,T2 i need to get only item 1

    Read the article

  • LINQ Left Join And Right Join

    - by raja
    Hi, I need a help, I have two dataTable called A and B , i need all rows from A and matching row of B Ex: A: B: User | age| Data ID | age|Growth 1 |2 |43.5 1 |2 |46.5 2 |3 |44.5 1 |5 |49.5 3 |4 |45.6 1 |6 |48.5 I need Out Put: User | age| Data |Growth ------------------------ 1 |2 |43.5 |46.5 2 |3 |44.5 | 3 |4 |45.6 |

    Read the article

  • Grails - Need to restrict fetched rows based on condition on join table

    - by sector7
    Hi guys, I have these two domains Car and Driver which have many-to-many relationship. This association is defined in table tblCarsDrivers which has, not surprisingly, primary keys of both the tables BUT additionally also has a new boolean field deleted. Herein lies the problem. When I find/get query on domain Car, I am fetched all related drivers irrespective of their deleted status in tblCarsDrivers, which is expected. I need to put a clause/constraint to exclude the deleted drivers from the list of fetched records. PS: I tried using an association domain CarDriver in joinTable name but that seems not to work. Apparently it expects only table names, not maps. PPS: I know its unnatural to have any other fields besides the mapping keys in mapping table but this is how I got it and it cant be changed. Car domain is defined as such - class Car { Integer id String name static hasMany = [drivers:Driver] static mapping = { table 'tblCars' version false drivers joinTable:[name: 'tblCarsDrivers',column:'driverid',key:'carid'] } } Thanks!

    Read the article

  • Need help building SQL Query (simple JOIN)

    - by Newbie
    Hello! In my database, I have a "users", a "quests" and a "questings" table. A user can solve a quest. Solving a quest will save the "user_id" and the "quest_id" in my "questings" table. Now, I want to select all quests, a user has NOT solved (meaning there is no entry for this user and quest in "questings" table)! Let's say the user has the id 14. How to write this query? After solving this query, I want to filter the results, too. A quest and a user has a city, too. What to do for writing a query which returns all quests, a user has NOT solved yet, in the users city (user city == quest city)?

    Read the article

  • mysql join on two indexes takes long time!!

    - by Alaa
    Hi All I have a custom query in dripal, this query is: select count(distinct B.src) from node A, url_alias B where concat('node/',A.nid)= B.src; now, nid in node is primary key and i have made src as an index in url_alias table. after waiting for more than a minute i got this: +-----------------------+ | count(distinct B.src) | +-----------------------+ | 325715 | +-----------------------+ 1 row in set (1 min 24.37 sec) now my question is: why did this query take this long, and how to optimize it?? Thanks for your help

    Read the article

  • Merge two data frames together that have the same variable names and data types

    - by Brandon
    I have tried the merge function to merge two csv files that I imported. They both have the same variable names and data types but each time I run merge all that I get is an object that contains the names of the two data frames. I have tried the following: # ex1 obj <- merge(obj1, obj2, by=obj) # ex2 obj <- merge(obj1, obj2, all) and several other iterations of the above. Is merge the correct function? If so, what am I doing wrong?

    Read the article

  • Finding group maxes in SQL join result

    - by Gene
    Two SQL tables. One contestant has many entries: Contestants Entries Id Name Id Contestant_Id Score -- ---- -- ------------- ----- 1 Fred 1 3 100 2 Mary 2 3 22 3 Irving 3 1 888 4 Grizelda 4 4 123 5 1 19 6 3 50 Low score wins. Need to retrieve current best scores of all contestants ordered by score: Best Entries Report Name Entry_Id Score ---- -------- ----- Fred 5 19 Irving 2 22 Grizelda 4 123 I can certainly get this done with many queries. My question is whether there's a way to get the result with one, efficient SQL query. I can almost see how to do it with GROUP BY, but not quite. In case it's relevant, the environment is Rails ActiveRecord and PostgreSQL.

    Read the article

  • How to join 2 tables & display them correctly?

    - by steven
    http://img293.imageshack.us/img293/857/tablez.jpg Here is a picture of the 2 tables. The mybb_users table is the table that has the users that signed up for the forum. The mybb_userfields is the table that contain custom profile field data that they are able to customize & change in their profile. Now, all I want to do is display all users in rows with the custom profile field data that they provided in their profile(which is in the mybb_userfields table) How can I display these fields correctly together? For instance, p0gz is a male,lives in AZ,he owns a 360,does not know his bandwidth & Flip Side Phoenix is his team. How can it just be like "p0gz-male-az-360-dont know-flipside phoenix" in a row~???

    Read the article

  • mysql join 3 tables and count

    - by air
    Please look at this image here is 3 tables , and out i want is uid from table1 industry from table 3 of same uid count of fid from table 2 of same uid like in the sample example output will be 2 records Thanks

    Read the article

  • Fetch last item in a category that fits specific criteria

    - by Franz
    Let's assume I have a database with two tables: categories and articles. Every article belongs to a category. Now, let's assume I want to fetch the latest article of each category that fits a specific criteria (read: the article does). If it weren't for that extra criteria, I could just add a column called last_article_id or something similar to the categories table - even though that wouldn't be properly normalized. How can I do this though? I assume there's something using GROUP BY and HAVING?

    Read the article

  • Merging two arrays in PHP

    - by Industrial
    Hi everyone, I am trying to create a new array from two current arrays. Tried array_merge, but it will not give me what I want. $array1 is a list of keys that I pass to a function. $array2 holds the results from that function, but doesn't contain any non-available resuls for keys. So, I want to make sure that all requested keys comes out with 'null':ed values, as according to the shown $result array. It goes a little something like this: $array1 = array('item1', 'item2', 'item3', 'item4'); $array2 = array( 'item1' => 'value1', 'item2' => 'value2', 'item3' => 'value3' ); Here's the result I want: $result = array( 'item1' => 'value1', 'item2' => 'value2', 'item3' => 'value3', 'item4' => '' ); It can be done this way, but I don't think that it's a good solution - I really don't like to take the easy way out and suppress PHP errors by adding @:s in the code. This sample would obviously throw errors since 'item4' is not in $array2, based on the example. foreach ($keys as $k => $v){ @$array[$v] = $items[$v]; } So, what's the fastest (performance-wise) way to accomplish the same result?

    Read the article

  • Multiple Foreign keys to a single table and single key pointing to more than one table

    - by user1216775
    I need some suggestions from the database design experts here. I have around six foreign keys into a single table (defect) which all point to primary key in user table. It is like: defect (.....,assigned_to,created_by,updated_by,closed_by...) If I want to get information about the defect I can make six joins. Do we have any better way to do it? Another one is I have a states table which can store one of the user-defined set of values. I have defect table and task table and I want both of these tables to share the common state table (New, In Progress etc.). So I created: task (.....,state_id,type_id,.....) defect(.....,state_id,type_id,...) state(state_id,state_name,...) importance(imp_id,imp_name,...) There are many such common attributes along with state like importance(normal, urgent etc), priority etc. And for all of them I want to use same table. I am keeping one flag in each of the tables to differentiate task and defect. What is the best solution in such a case? If somebody is using this application in health domain, they would like to assign different types, states, importances for their defect or tasks. Moreover when a user selects any project I want to display all the types,states etc under configuration parameters section.

    Read the article

  • 4 table query / join. getting duplicate rows

    - by Horse
    So I have written a query that will grab an order (this is for an ecommerce type site), and from that order id it will get all order items (ecom_order_items), print options (c_print_options) and images (images). The eoi_p_id is currently a foreign key from the images table. This works fine and the query is: SELECT eoi_parentid, eoi_p_id, eoi_po_id, eoi_quantity, i_id, i_parentid, po_name, po_price FROM ecom_order_items, images, c_print_options WHERE eoi_parentid = '1' AND i_id = eoi_p_id AND po_id = eoi_po_id; The above would grab all the stuff I need for order #1 Now to complicate things I added an extra table (ecom_products), which needs to act in a similar way to the images table. The eoi_p_id can also point at a foreign key in this table too. I have added an extra field 'eoi_type' which will either have the value 'image', or 'product'. Now items in the order could be made up of a mix of items from images or ecom_products. Whatever I try it either ends up with too many records, wont actually output any with eoi_type = 'product', and just generally wont work. Any ideas on how to achieve what I am after? Can provide SQL samples if needed? SELECT eoi_id, eoi_parentid, eoi_p_id, eoi_po_id, eoi_po_id_2, eoi_quantity, eoi_type, i_id, i_parentid, po_name, po_price, po_id, ep_id FROM ecom_order_items, images, c_print_options, ecom_products WHERE eoi_parentid = '9' AND i_id = eoi_p_id AND po_id = eoi_po_id The above outputs duplicate rows and doesnt work as expected. Am I going about this the wrong way? Should I have seperate foreign key fields for the eoi_p_id depending it its an image or a product? Should I be using JOINs? Here is a mysql explain of the tables in question ecom_products +-------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------+--------------+------+-----+---------+----------------+ | ep_id | int(8) | NO | PRI | NULL | auto_increment | | ep_title | varchar(255) | NO | | NULL | | | ep_link | text | NO | | NULL | | | ep_desc | text | NO | | NULL | | | ep_imgdrop | text | NO | | NULL | | | ep_price | decimal(6,2) | NO | | NULL | | | ep_category | varchar(255) | NO | | NULL | | | ep_hide | tinyint(1) | NO | | 0 | | | ep_featured | tinyint(1) | NO | | 0 | | +-------------+--------------+------+-----+---------+----------------+ ecom_order_items +--------------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+-------------+------+-----+---------+----------------+ | eoi_id | int(8) | NO | PRI | NULL | auto_increment | | eoi_parentid | int(8) | NO | | NULL | | | eoi_type | varchar(32) | NO | | NULL | | | eoi_p_id | int(8) | NO | | NULL | | | eoi_po_id | int(8) | NO | | NULL | | | eoi_quantity | int(4) | NO | | NULL | | +--------------+-------------+------+-----+---------+----------------+ c_print_options +------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+--------------+------+-----+---------+----------------+ | po_id | int(8) | NO | PRI | NULL | auto_increment | | po_name | varchar(255) | NO | | NULL | | | po_price | decimal(6,2) | NO | | NULL | | +------------+--------------+------+-----+---------+----------------+ images +--------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+--------------+------+-----+---------+----------------+ | i_id | int(8) | NO | PRI | NULL | auto_increment | | i_filename | varchar(255) | NO | | NULL | | | i_data | longtext | NO | | NULL | | | i_parentid | int(8) | NO | | NULL | | +--------------+--------------+------+-----+---------+----------------+

    Read the article

  • How to join by column name

    - by Daniel Vaca
    I have a table T1 such that gsdv |nsdv |esdv ------------------- 228.90 |216.41|0.00 and a table T2 such that ds |nm -------------------------- 'Non-Revenue Sales'|'ESDV' 'Gross Sales' |'GSDV' 'Net Sales' |'NSDV' How do I get the following table? ds |nm |val --------------------------------- 'Non-Revenue Sales'|'ESDV'|0.00 'Gross Sales' |'GSDV'|228.90 'Net Sales' |'NSDV'|216.41 I know that I can this by doing the following SELECT ds,nm,esdv val FROM T1,T2 WHERE nm = 'esdv' UNION SELECT ds,nm,gsdv val FROM T1,T2 WHERE nm = 'gsdv' UNION SELECT ds,nm,nsdv val FROM T1,T2 WHERE nm = 'nsdv' but I am looking for a more generic/nicer solution. I am using Sybase, but if you can think of a way to do this with other DBMS, please let me know. Thanks.

    Read the article

  • SQL Full Outer Join

    - by Torment March
    I have a table named 'Logs' with the following values : CheckDate CheckType CheckTime ------------------------------------------- 2011-11-25 IN 14:40:00 2011-11-25 OUT 14:45:00 2011-11-25 IN 14:50:00 2011-11-25 OUT 14:55:00 2011-11-25 IN 15:00:00 2011-11-25 OUT 15:05:00 2011-11-25 IN 15:15:00 2011-11-25 OUT 15:20:00 2011-11-25 IN 15:25:00 2011-11-25 OUT 15:30:00 2011-11-25 OUT 15:40:00 2011-11-25 IN 15:45:00 I want to use the previous table to produce a result of: CheckDate CheckIn CheckOut ----------------------------------------- 2011-11-25 14:40:00 14:45:00 2011-11-25 14:50:00 14:55:00 2011-11-25 15:00:00 15:05:00 2011-11-25 15:15:00 15:20:00 2011-11-25 15:25:00 15:30:00 2011-11-25 NULL 15:40:00 2011-11-25 15:45:00 NULL So far I have come up with this result set : CheckDate CheckIn CheckOut ----------------------------------------- 2011-11-25 14:40:00 14:45:00 2011-11-25 14:50:00 14:55:00 2011-11-25 15:00:00 15:05:00 2011-11-25 15:15:00 15:20:00 2011-11-25 15:25:00 15:30:00 2011-11-25 15:45:00 NULL The problem is I cannot generate the log without CheckIns : CheckDate CheckIn CheckOut ----------------------------------------- 2011-11-25 NULL 15:40:00 The sequence of CheckIn - CheckOut pairing and order is in increasing time value.

    Read the article

  • Advance Query with Join

    - by user1462589
    I'm trying to convert a product table that contains all the detail of the product into separate tables in SQL. I've got everything done except for duplicated descriptor details. The problem I am having all the products have size/color/style/other that many other products contain. I want to only have one size or color descriptor for all the items and reuse the "ID" for all the product which I believe is a Parent key to the Product ID which is a ...Foreign Key. The only problem is that every descriptor would have multiple Foreign Keys assigned to it. So I was thinking on the fly just have it skip figuring out a Foreign Parent key for each descriptor and just check to see if that descriptor exist and if it does use its Key for the descriptor. Data Table PI Colo Sz OTHER 1 | Blue | 5 | Vintage 2 | Blue | 6 | Vintage 3 | Blac | 5 | Simple 4 | Blac | 6 | Simple =================================== Its destination table is this =================================== DI Description 1 | Blue 2 | Blac 3 | 5 4 | 6 6 | Vintage 7 | Simple ============================= Select Data.Table Unique.Data.Table.Colo Unique.Data.Table.Sz Unique.Data.Table.Other ======================================= Then the dual part of the questions after we create all the descriptors how to do a new query and assign the product ID to the descriptors. PI| DI 1 | 1 1 | 3 1 | 4 2 | 1 2 | 3 2 | 4 By figuring out how to do this I should be able to duplicate this pattern for all 300 + columns in the product. Some of these fields are 60+ characters large so its going to save a ton of space. Do I use a Array?

    Read the article

  • Heaps of Trouble?

    - by Paul White NZ
    If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material.  In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all.  The problem is, predictably, that performance is not very good.  The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found.  Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables.  A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings).  The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan.  The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate.  Every join in the plan shown above estimates that it will produce just a single row as output.  Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows.  It should not surprise you that this has rather dire consequences for the remainder of the query plan.  In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables.  Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each.  The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints.  A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query.  The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement!  As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there.  (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B).  Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear.  The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools.  If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on.  Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID.  The term ‘eager’ means that the spool consumes all of its input rows when it starts up.  The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index.  From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table.  The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself.  The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation.  Nested loops join preserves the order of rows on its outer input, so sorting early is safe.  (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input.  It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation.  It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins.  You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins.  The answer, again, is down to the poor information available to the optimizer.  Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs.  SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk.  The join process then continues, and may again run out of memory.  This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow.  The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion.  You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this).  Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it?  We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs.  The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk.  Even so, the query runs to completion in three or four seconds.  That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information.  We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world.  It’s there primarily for its educational and entertainment value.  I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details.  My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • In SQL, a Join is actually an Intersection? And it is also a linkage or a "Sideway Union"?

    - by Jian Lin
    I always thought of a Join in SQL as some kind of linkage between two tables. For example, select e.name, d.name from employees e, departments d where employees.deptID = departments.deptID In this case, it is linking two tables, to show each employee with a department name instead of a department ID. And kind of like a "linkage" or "Union" sideway". But, after learning about inner join vs outer join, it shows that a Join (Inner join) is actually an intersection. For example, when one table has the ID 1, 2, 7, 8, while another table has the ID 7 and 8 only, the way we get the intersection is: select * from t1, t2 where t1.ID = t2.ID to get the two records of "7 and 8". So it is actually an intersection. So we have the "Intersection" of 2 tables. Compare this with the "Union" operation on 2 tables. Can a Join be thought of as an "Intersection"? But what about the "linking" or "sideway union" aspect of it?

    Read the article

  • Merge method in MergeSort Algorithm .

    - by Tony
    I've seen many mergeSort implementations .Here is the version in Data Structures and Algorithms in Java (2nd Edition) by Robert Lafore : private void recMergeSort(long[] workSpace, int lowerBound,int upperBound) { if(lowerBound == upperBound) // if range is 1, return; // no use sorting else { // find midpoint int mid = (lowerBound+upperBound) / 2; // sort low half recMergeSort(workSpace, lowerBound, mid); // sort high half recMergeSort(workSpace, mid+1, upperBound); // merge them merge(workSpace, lowerBound, mid+1, upperBound); } // end else } // end recMergeSort() private void merge(long[] workSpace, int lowPtr, int highPtr, int upperBound) { int j = 0; // workspace index int lowerBound = lowPtr; int mid = highPtr-1; int n = upperBound-lowerBound+1; // # of items while(lowPtr <= mid && highPtr <= upperBound) if( theArray[lowPtr] < theArray[highPtr] ) workSpace[j++] = theArray[lowPtr++]; else workSpace[j++] = theArray[highPtr++]; while(lowPtr <= mid) workSpace[j++] = theArray[lowPtr++]; while(highPtr <= upperBound) workSpace[j++] = theArray[highPtr++]; for(j=0; j<n; j++) theArray[lowerBound+j] = workSpace[j]; } // end merge() One interesting thing about merge method is that , almost all the implementations didn't pass the lowerBound parameter to merge method . lowerBound is calculated in the merge . This is strange , since lowerPtr = mid + 1 ; lowerBound = lowerPtr -1 ; that means lowerBound = mid ; Why the author didn't pass mid to merge like merge(workSpace, lowerBound,mid, mid+1, upperBound); ? I think there must be a reason , otherwise I can't understand why an algorithm older than half a center ,and have all coincident in the such little detail.

    Read the article

  • How to prevent git merge to merge a specific file from trunk into a branch and vice versa

    - by svenn
    Hi, I am using git while developing VHDL code. I am doing development on a component in a git branch: comp_dev. The component interface does not change, just the code inside the component. Now, this component already exists in the master branch, but in a more stable version, enough for other developers to be able to use the component. The other developers also have branches for their work, and when their code is good they merge their branches back to master. At this stage I need to be able to merge all the changes from master back to my comp_dev branch, which is basically no problem, but sometimes the stable version of the component I am working on do change as a part of other designers work, but not the interface. I have to do manual git merge -s ours on that particular file every time I want to merge, otherwise I get a conflict that I need to solve manually, throwing out their work. The same happens if I want to merge changes in other files back to master. If I forget to do git merge -s ours src/rx/state_machine.vhd comp_dev before I do a git merge, then I end up with either a manual merge, or I accidentally merge an unstable version of the state machine on top of the stable one. Is there a way to temporarily exclude one file from merges?

    Read the article

  • MS Word 2007 Mail Merge fails on ZIP codes with leading Zeros (eg. 01234)

    - by Pretzel
    I have an Excel Spreadsheet with a ZIP code column. For some dumb reason the original spreadsheet I got had all the zip codes stored as numbers, so a ZIP code like 01234 was stored as 1234. Easy to fix with "Format Column" as "Special = ZIP Code". All values like 1234, show up as 01234. Great! When I import it into Word via Mail Merge (to print address labels), the ZIP codes on all the addresses starting with a leading zero (like 01234) revert to their old form (1234). How do I fix this?

    Read the article

  • How to efficiently merge a lot of vCard files for the same person?

    - by mihi
    I currently have contact information at several places: old PDA's address book mobile phone's phone book (primarily name, phone number) email client's address book (primarily name, email) web mailer's address book (primarily name, email) instant messenger's contact list (primarily name, im, email, birthday) And there are several social or business networking sites on the Internet where contacts provide information about themselves, like LinkedIn or XING. All those sources can export as vCard, but as you might imagine, I get a lot of vCards for the very same contact that way. Are there any tools where I can import them and then merge them (it may ask me which phone number is more current in case of field clashes of course)? Bonus points if it can track which information I have discarded so when I re-export all information from one of the sources I can't import to (networking sites), it won't ask me again if I want to overwrite phone number of person X with the same ancient number... I hope you understand what I try to accomplish, if not just ask :-)

    Read the article

  • Advice Needed: Developers blocked by waiting on code to merge from another branch using GitFlow

    - by fogwolf
    Our team just made the switch from FogBugz & Kiln/Mercurial to Jira & Stash/Git. We are using the Git Flow model for branching, adding subtask branches off of feature branches (relating to Jira subtasks of Jira features). We are using Stash to assign a reviewer when we create a pull request to merge back into the parent branch (usually develop but for subtasks back into the feature branch). The problem we're finding is that even with the best planning and breakdown of feature cases, when multiple developers are working together on the same feature, say on the front-end and back-end, if they are working on interdependent code that is in separate branches one developer ends up blocking the other. We've tried pulling between each others' branches as we develop. We've also tried creating local integration branches each developer can pull from multiple branches to test the integration as they develop. Finally, and this seems to work possibly the best for us so far, though with a bit more overhead, we have tried creating an integration branch off of the feature branch right off the bat. When a subtask branch (off of the feature branch) is ready for a pull request and code review, we also manually merge those change sets into this feature integration branch. Then all interested developers are able to pull from that integration branch into other dependent subtask branches. This prevents anyone from waiting for any branch they are dependent upon to pass code review. I know this isn't necessarily a Git issue - it has to do with working on interdependent code in multiple branches, mixed with our own work process and culture. If we didn't have the strict code-review policy for develop (true integration branch) then developer 1 could merge to develop for developer 2 to pull from. Another complication is that we are also required to do some preliminary testing as part of the code review process before handing the feature off to QA.This means that even if front-end developer 1 is pulling directly from back-end developer 2's branch as they go, if back-end developer 2 finishes and his/her pull request is sitting in code review for a week, then front-end developer 2 technically can't create his pull request/code review because his/her code reviewer can't test because back-end developer 2's code hasn't been merged into develop yet. Bottom line is we're finding ourselves in a much more serial rather than parallel approach in these instance, depending on which route we go, and would like to find a process to use to avoid this. Last thing I'll mention is we realize by sharing code across branches that haven't been code reviewed and finalized yet we are in essence using the beta code of others. To a certain extent I don't think we can avoid that and are willing to accept that to a degree. Anyway, any ideas, input, etc... greatly appreciated. Thanks!

    Read the article

  • SQL Table stored as a Heap - the dangers within

    - by MikeD
    Nearly all of the time I create a table, I include a primary key, and often that PK is implemented as a clustered index. Those two don't always have to go together, but in my world they almost always do. On a recent project, I was working on a data warehouse and a set of SSIS packages to import data from an OLTP database into my data warehouse. The data I was importing from the business database into the warehouse was mostly new rows, sometimes updates to existing rows, and sometimes deletes. I decided to use the MERGE statement to implement the insert, update or delete in the data warehouse, I found it quite performant to have a stored procedure that extracted all the new, updated, and deleted rows from the source database and dump it into a working table in my data warehouse, then run a stored proc in the warehouse that was the MERGE statement that took the rows from the working table and updated the real fact table. Use Warehouse CREATE TABLE Integration.MergePolicy (PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date, Operation varchar(5)) CREATE TABLE fact.Policy (PolicyKey int identity primary key, PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date) CREATE PROC Integration.MergePolicy as begin begin tran Merge fact.Policy as tgtUsing Integration.MergePolicy as SrcOn (tgt.PolicyId = Src.PolicyId) When not matched by Target then Insert (PolicyId, PolicyTypeKey, Premium, Deductible, EffectiveDate)values (src.PolicyId, src.PolicyTypeKey, src.Premium, src.Deductible, src.EffectiveDate) When matched and src.Operation = 'U' then Update set PolicyTypeKey = src.PolicyTypeKey,Premium = src.Premium,Deductible = src.Deductible,EffectiveDate = src.EffectiveDate When matched and src.Operation = 'D' then Delete ;delete from Integration.WorkPolicy commit end Notice that my worktable (Integration.MergePolicy) doesn't have any primary key or clustered index. I didn't think this would be a problem, since it was relatively small table and was empty after each time I ran the stored proc. For one of the work tables, during the initial loads of the warehouse, it was getting about 1.5 million rows inserted, processed, then deleted. Also, because of a bug in the extraction process, the same 1.5 million rows (plus a few hundred more each time) was getting inserted, processed, and deleted. This was being sone on a fairly hefty server that was otherwise unused, and no one was paying any attention to the time it was taking. This week I received a backup of this database and loaded it on my laptop to troubleshoot the problem, and of course it took a good ten minutes or more to run the process. However, what seemed strange to me was that after I fixed the problem and happened to run the merge sproc when the work table was completely empty, it still took almost ten minutes to complete. I immediately looked back at the MERGE statement to see if I had some sort of outer join that meant it would be scanning the target table (which had about 2 million rows in it), then turned on the execution plan output to see what was happening under the hood. Running the stored procedure again took a long time, and the plan output didn't show me much - 55% on the MERGE statement, and 45% on the DELETE statement, and table scans on the work table in both places. I was surprised at the relative cost of the DELETE statement, because there were really 0 rows to delete, but I was expecting to see the table scans. (I was beginning now to suspect that my problem was because the work table was being stored as a heap.) Then I turned on STATS_IO and ran the sproc again. The output was quite interesting.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Policy'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'MergePolicy'. Scan count 1, logical reads 433276, physical reads 60, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. I've reproduced the above from memory, the details aren't exact, but the essential bit was the very high number of logical reads on the table stored as a heap. Even just doing a SELECT Count(*) from Integration.MergePolicy incurred that sort of output, even though the result was always 0. I suppose I should research more on the allocation and deallocation of pages to tables stored as a heap, but I haven't, and my original assumption that a table stored as a heap with no rows would only need to read one page to answer any query was definitely proven wrong. It's likely that some sort of physical defragmentation of the table may have cleaned that up, but it seemed that the easiest answer was to put a clustered index on the table. After doing so, the execution plan showed a cluster index scan, and the IO stats showed only a single page read. (I aborted my first attempt at adding a clustered index on the table because it was taking too long - instead I ran TRUNCATE TABLE Integration.MergePolicy first and added the clustered index, both of which took very little time). I suspect I may not have noticed this if I had used TRUNCATE TABLE Integration.MergePolicy instead of DELETE FROM Integration.MergePolicy, since I'm guessing that the truncate operation does some rather quick releasing of pages allocated to the heap table. In the future, I will likely be much more careful to have a clustered index on every table I use, even the working tables. Mike  

    Read the article

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27  | Next Page >