sql optimization - Page 215

SQL Server 2008 R2 Launch Event - Montreal

- by guybarrette

If you’re into SQL Server, you may want to attend the free 2008 R2 launch event that will take place on May 26th, 2010 in Montreal. Agenda: 8:00 - 9:00am : Registration and Breakfast 9:00 – 9:15am: Welcome and Introductions 9:15 – 10:00am: Keynote Presentation 10:00 - 10:15am: Morning break 10:15 – 11:45am: SQL Server Presentation 11:45 – 12:45pm: Lunch 12:45 – 1:45pm: Track Session 1 1:45 – 2:45pm: Track Session 2 2:45 – 3:00pm: Afternoon break 3:00 - 4:00pm: Track Session 3 Track Descriptions DBA TRACK Session 1: Ensure Business Continuity with SQL Server 2008 R2, Windows Server 2008 & Hyper-V Live Migration Session 2: Simplify management of your SQL Server data platform with Multi-server Management Session 3: Deliver unprecedented access to business-critical data at a lower TCO with SQL Server 2008 R2 Parallel Data Warehouse BI TRACK Session1: Enable Managed Self-service BI with Power Pivot for Excel and SharePoint 2010 Session 2: Achieve Rapid Reporting with Reporting Services and Report Builder 3.0 Session 3: Importance of Master Data Management Dev - Visual Studio TRACK Session 1: Developing SQL Applications with Visual Studio 2010 Session 2:Managing Change for SQL Server applications using Team Foundation Server Session 3: Targeting SQL Azure using Visual Studio Register here var addthis_pub="guybarrette";

Read the article

sys.dm_exec_query_stats interaction with recompilation

- by Sam Saffron

We use sys.dm_exec_query_stats to track down slow queries and queries that are IO offenders. This works great, we get a lot of very insightful stats. It is clear this is not as accurate as running a profiler trace, as you have no idea when SQL Server will decide to chuck out a an execution plan. We have quite a few queries where the wrong execution plan is cached. For example queries like the following: SELECT TOP 30 a.Id FROM Posts a JOIN Posts q ON q.Id = a.ParentId JOIN PostTags pt ON q.Id = pt.PostId WHERE a.PostTypeId = 2 AND a.DeletionDate IS NULL AND a.CommunityOwnedDate IS NULL AND a.CreationDate @date AND LEN(a.Body) 300 AND pt.Tag = @tag AND a.Score 0 ORDER BY a.Score DESC The problem is that the ideal plan really depends on the date selected (screenshot of ideal plan): However if the wrong plan is cached, it totally chokes when the date range is big: (notice the big fat lines) To overcome this we were recommended to use either OPTION (OPTIMIZE FOR UNKNOWN) or OPTION (RECOMPILE) OPTIMIZE FOR UNKNOWN results in a slightly better plan, which is far from optimal. Executions are tracked in sys.dm_exec_query_stats. RECOMPILE results in the best plan being chosen, however no execution counts and stats are tracked in sys.dm_exec_query_stats. Is there another DMV we could use to track stats on queries with OPTION (RECOMPILE)? Is this behavior by-design? Is there another way we can for recompilation while keeping stats tracked in sys.dm_exec_query_stats? Note: the framework will always execute parameterized queries using sp_executesql

Read the article

SQL Server 2012 Service Pack 1 is available - this time for sure!

- by AaronBertrand

Last week I mentioned in passing that Service Pack 1 is now available, while I was blogging from the PASS Summit keynote . I wanted to put up an official post instead of having it appear as a footnote there (I also updated my April Fools' joke to point to the right place). Service Pack 1 Details Service Pack 1 is build # 11.0.3000 and includes 13 fixes to public KB items and 35 other internal (VSTS) items. You can see the list of fixes in KB #2674319 . You can also read about new features included...(read more)

Read the article

T-SQL Snack: How Much Free Storage Space is Available?

- by andyleonard

Introduction Ever have a need to calculate the total available storage space for a server? Recently I did. Here's a solution I came up with - I bet someone can do this better! xp_fixeddrives There's a handy stored procedure called xp_fixeddrives that reports the available storage space: exec xp_fixeddrives This returns: drive MB free ----- ----------- C 6998 E 201066 Problem solved right? Maybe. The Sum What I really want is the sum total of all available space presented to the server. I built this...(read more)

Read the article

SQL Server 2012 Service Pack 1 is available - this time for sure!

- by AaronBertrand

Last week I mentioned in passing that Service Pack 1 is now available, while I was blogging from the PASS Summit keynote . I wanted to put up an official post instead of having it appear as a footnote there (I also updated my April Fools' joke to point to the right place). Service Pack 1 Details Service Pack 1 is build # 11.0.3000 and includes 13 fixes to public KB items and 35 other internal (VSTS) items. You can see the list of fixes in KB #2674319 . You can also read about new features included...(read more)

Read the article

Learning MySQL Query optimization

- by recluze

I've been doing web/desktop/server development for a while and have worked with many databases (mysql mostly). I've come to the point in my career when I need to have someone look at my queries because they're 'kind of slow'. I believe it's now time to start learning query optimization. While I know the basics of index and joins etc., I'm not familiar with how to use, say, the EXPLAIN output to improve performance of my queries. I have not been able to find any online material that starts with the basics and takes me to application. Getting a book is not an option right now so I'm looking for tips about how to proceed with this. I hope this question is general enough not to get closed.

Read the article

Reporting SQL Vulnerability [migrated]

- by Ciaran87Bel

My first post here so i'll hopefully keep it simple. I have just finished building a CMS targeted at a certain industry and built a test site to see how everything works. Anyway I wrote a program to check for sql injection vulnerabilities and the program followed a blog link to an external website. The program discovered that the external site had a massive vulnerability that left it open to practically anyone who could then access every bit of data on their MYSQL Server and run queries etc. The thing is this external site is the brand leader in their industry and do millions upon millions of sales per annum. I have tried contacting them to let them know and even went as far as contacting the company that built their platform but I was pretty much brushed off and haven't heard back from them. Their database would contain the details of hundreds of thousands of customers and all their data. I could easily make myself site admin etc in a few seconds but they won't listen to me even though I have offered to share the vulnerability with them and help in anyway I can. Is there anything else I can do because it is one of the biggest security risks I have ever personally come across. Is there any other steps I should take to report this? Thanks

Read the article

Prevent full table scan for query with multiple where clauses

- by Dave Jarvis

A while ago I posted a message about optimizing a query in MySQL. I have since ported the data and query to PostgreSQL, but now PostgreSQL has the same problem. The solution in MySQL was to force the optimizer to not optimize using STRAIGHT_JOIN. PostgreSQL offers no such option. Here is the explain: Here is the query: SELECT avg(d.amount) AS amount, y.year FROM station s, station_district sd, year_ref y, month_ref m, daily d LEFT JOIN city c ON c.id = 10663 WHERE -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 AND -- Ignore stations outside the given elevations -- s.elevation BETWEEN 0 AND 2000 AND sd.id = s.station_district_id AND -- Gather all known years for that station ... -- y.station_district_id = sd.id AND -- The data before 1900 is shaky; insufficient after 2009. -- y.year BETWEEN 1980 AND 2000 AND -- Filtered by all known months ... -- m.year_ref_id = y.id AND m.month = 12 AND -- Whittled down by category ... -- m.category_id = '001' AND -- Into the valid daily climate data. -- m.id = d.month_ref_id AND d.daily_flag_id <> 'M' GROUP BY y.year It appears as though PostgreSQL is looking at the DAILY table first, which is simply not the right way to go about this query as there are nearly 300 million rows. How do I force PostgreSQL to start at the CITY table? Thank you!

Read the article

How to optimize this MySQL query

- by James Simpson

This query was working fine when the database was small, but now that there are millions of rows in the database, I am realizing I should have looked at optimizing this earlier. It is looking at over 600,000 rows and is Using where; Using temporary; Using filesort (which leads to an execution time of 5-10 seconds). It is using an index on the field 'battle_type.' SELECT username, SUM( outcome ) AS wins, COUNT( * ) - SUM( outcome ) AS losses FROM tblBattleHistory WHERE battle_type = '0' && outcome < '2' GROUP BY username ORDER BY wins DESC , losses ASC , username ASC LIMIT 0 , 50

Read the article

Optimizing Oracle query

- by Omnipresent

SELECT MAX(verification_id) FROM VERIFICATION_TABLE WHERE head = 687422 AND mbr = 23102 AND RTRIM(LTRIM(lname)) = '.iq bzw' AND TO_CHAR(dob,'MM/DD/YYYY')= '08/10/2004' AND system_code = 'M'; This query is taking 153 seconds to run. there are millions of rows in VERIFICATION_TABLE. I think query is taking long because of the functions in where clause. However, I need to do ltrim rtrim on the columns and also date has to be matched in MM/DD/YYYY format. How can I optimize this query?

Read the article

performance issue in a select query from a single table

- by daedlus

Hi , I have a table as below dbo.UserLogs ------------------------------------- Id | UserId |Date | Name| P1 | Dirty ------------------------------------- There can be several records per userId[even in millions] I have clustered index on Date column and query this table very frequently in time ranges. The column 'Dirty' is non-nullable and can take either 0 or 1 only so I have no indexes on 'Dirty' I have several millions of records in this table and in one particular case in my application i need to query this table to get all UserId that have at least one record that is marked dirty. I tried this query - select distinct(UserId) from UserLogs where Dirty=1 I have 10 million records in total and this takes like 10min to run and i want this to run much faster than this. [i am able to query this table on date column in less than a minute.] Any comments/suggestion are welcome. my env 64bit,sybase15.0.3,Linux

Read the article

Simple MySQL Query taking 45 seconds (Gets a record and its "latest" child record)

- by Brian Lacy

I have a query which gets a customer and the latest transaction for that customer. Currently this query takes over 45 seconds for 1000 records. This is especially problematic because the script itself may need to be executed as frequently as once per minute! I believe using subqueries may be the answer, but I've had trouble constructing it to actually give me the results I need. SELECT customer.CustID, customer.leadid, customer.Email, customer.FirstName, customer.LastName, transaction.*, MAX(transaction.TransDate) AS LastTransDate FROM customer INNER JOIN transaction ON transaction.CustID = customer.CustID WHERE customer.Email = '".$email."' GROUP BY customer.CustID ORDER BY LastTransDate LIMIT 1000 I really need to get this figured out ASAP. Any help would be greatly appreciated!

Read the article

Require help in Writing Query

- by harigm

The following image have been uploaded to show what I am trying to do and what I wanted out of it Can any one help me write the Query to get the results what I want Please check the following SELECT * FROM KPT WHERE PROPERTY_ID IN (SELECT PROPERTY_ID FROM khata_header WHERE DIV_ID = 3 and RECORD_STATUS = 0) and CHALLAN_NO > 42646 The above is the query I have written and I have got the following result set ID CHALLAN_NO PROPERTY_ID SITE_NO TOTAL_AMOUNT ----- ------------- -------------- ------------------- --------------- 1242 42757 3103010141 296 595 1243 63743 3204190257 483 594 1244 63743 3204190257 483 594 1334 43395 3217010223 1088 576 1421 524210 3320050416 (null) (null) 1422 524210 3320050416 (null) (null) 1560 564355 3320021408 (null) (null) 1870 516292 3320040420 (null) (null) 1940 68357 3217100104 139 1153 1941 68357 3217100104 139 1153 2002 56256 3320100733 511 4430 2003 56256 3320100733 511 4430 2004 66488 3217040869 293 3094 2005 66488 3217040869 293 3094 2016 64571 3217040374 (null) (null) 2036 523122 3320020352 (null) (null) 2039 65682 3217040021 273 919 In my resultset, I am getting the PropertyId repeated, since there are multilple entries, How Can I know How many have been repeated What are those Property Id which have repeated more than 2 times. Little Back ground about the tables are PROPERTY_ID is the FK in the KPT PROPERTY_ID is the PK in KH I am writing a subquery to get the Result, so I am stuck I dont know how to get my results Please help

Read the article

I need some help optimizing my database schema

- by Steffan

Here's a layout of my data: Heading 1: Sub heading Sub heading Sub heading Sub heading Sub heading Heading 2: Sub heading Sub heading Sub heading Sub heading Sub heading Heading 3: Sub heading Sub heading Sub heading Sub heading Sub heading Heading 4: Sub heading Sub heading Sub heading Sub heading Sub heading Heading 5: Sub heading Sub heading Sub heading Sub heading Sub heading These headings need to have a 'Completion Status' boolean value which gets linked to a user Id. Currently, this is how my table looks: id | userID | field_1 | field_2 | field_3 | field_4 | etc... ----------------------------------------------------------------------- 1 | 1 | 0 | 0 | 1 | 0 | ----------------------------------------------------------------------- 2 | 2 | 1 | 0 | 1 | 1 | Each field represents one Sub Heading. Having this many columns in my table looks awfully inefficient... How can I go about optimizing this? I can't think of any way to neaten it up :/

Read the article

MySQL MyISAM table performance... painfully, painfully slow

- by Salman A

I've got a table structure that can be summarized as follows: pagegroup * pagegroupid * name has 3600 rows page * pageid * pagegroupid * data references pagegroup; has 10000 rows; can have anything between 1-700 rows per pagegroup; the data column is of type mediumtext and the column contains 100k - 200kbytes data per row userdata * userdataid * pageid * column1 * column2 * column9 references page; has about 300,000 rows; can have about 1-50 rows per page The above structure is pretty straight forwad, the problem is that that a join from userdata to page group is terribly, terribly slow even though I have indexed all columns that should be indexed. The time needed to run a query for such a join (userdata inner_join page inner_join pagegroup) exceeds 3 minutes. This is terribly slow considering the fact that I am not selecting the data column at all. Example of the query that takes too long: SELECT userdata.column1, pagegroup.name FROM userdata INNER JOIN page USING( pageid ) INNER JOIN pagegroup USING( pagegroupid ) Please help by explaining why does it take so long and what can i do to make it faster. Edit #1 Explain returns following gibberish: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE userdata ALL pageid 372420 1 SIMPLE page eq_ref PRIMARY,pagegroupid PRIMARY 4 topsecret.userdata.pageid 1 1 SIMPLE pagegroup eq_ref PRIMARY PRIMARY 4 topsecret.page.pagegroupid 1 Edit #2 SELECT u.field2, p.pageid FROM userdata u INNER JOIN page p ON u.pageid = p.pageid; /* 0.07 sec execution, 6.05 sec fecth */ id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE u ALL pageid 372420 1 SIMPLE p eq_ref PRIMARY PRIMARY 4 topsecret.u.pageid 1 Using index SELECT p.pageid, g.pagegroupid FROM page p INNER JOIN pagegroup g ON p.pagegroupid = g.pagegroupid; /* 9.37 sec execution, 60.0 sec fetch */ id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE g index PRIMARY PRIMARY 4 3646 Using index 1 SIMPLE p ref pagegroupid pagegroupid 5 topsecret.g.pagegroupid 3 Using where Moral of the story Keep medium/long text columns in a separate table if you run into performance problems such as this one.

Read the article

Date arithmetic using integer values

- by Dave Jarvis

Problem String concatenation is slowing down a query: date(extract(YEAR FROM m.taken)||'-1-1') d1, date(extract(YEAR FROM m.taken)||'-1-31') d2 This is realized in code as part of a string, which follows (where the p_ variables are integers): date(extract(YEAR FROM m.taken)||''-'||p_month1||'-'||p_day1||''') d1, date(extract(YEAR FROM m.taken)||''-'||p_month2||'-'||p_day2||''') d2 This part of the query runs in 3.2 seconds with the dates, and 1.5 seconds without, leading me to believe there is ample room for improvement. Question What is a better way to create the date (presumably without concatenation)? Many thanks!

Read the article

Enforcing a query in MySql to use a specific index

- by Hossein

Hi, I have large table. consisting of only 3 columns (id(INT),bookmarkID(INT),tagID(INT)).I have two BTREE indexes one for each bookmarkID and tagID columns.This table has about 21 Million records. I am trying to run this query: SELECT bookmarkID,COUNT(bookmarkID) AS count FROM bookmark_tag_map GROUP BY tagID,bookmarkID HAVING tagID IN (-----"tagIDList"-----) AND count >= N which takes ages to return the results.I read somewhere that if make an index in which it has tagID,bookmarkID together, i will get a much faster result. I created the index after some time. Tried the query again, but it seems that this query is not using the new index that I have made.I ran EXPLAIN and saw that it is actually true. My question now is that how I can enforce a query to use a specific index? also comments on other ways to make the query faster are welcome. Thanks

Read the article

Strange: Planner takes decision with lower cost, but (very) query long runtime

- by S38

Facts: PGSQL 8.4.2, Linux I make use of table inheritance Each Table contains 3 million rows Indexes on joining columns are set Table statistics (analyze, vacuum analyze) are up-to-date Only used table is "node" with varios partitioned sub-tables Recursive query (pg = 8.4) Now here is the explained query: WITH RECURSIVE rows AS ( SELECT * FROM ( SELECT r.id, r.set, r.parent, r.masterid FROM d_storage.node_dataset r WHERE masterid = 3533933 ) q UNION ALL SELECT * FROM ( SELECT c.id, c.set, c.parent, r.masterid FROM rows r JOIN a_storage.node c ON c.parent = r.id ) q ) SELECT r.masterid, r.id AS nodeid FROM rows r QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=2742105.92..2862119.94 rows=6000701 width=16) (actual time=0.033..172111.204 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..2742105.92 rows=6000701 width=28) (actual time=0.029..172111.183 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.025..0.027 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Hash Join (cost=0.33..262208.33 rows=600070 width=28) (actual time=40628.371..57370.361 rows=1 loops=3) Hash Cond: (c.parent = r.id) -> Append (cost=0.00..211202.04 rows=12001404 width=20) (actual time=0.011..46365.669 rows=12000004 loops=3) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.002..0.002 rows=0 loops=3) -> Seq Scan on node_dataset c (cost=0.00..55001.01 rows=3000001 width=20) (actual time=0.007..3426.593 rows=3000001 loops=3) -> Seq Scan on node_stammdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=0.008..9049.189 rows=3000001 loops=3) -> Seq Scan on node_stammdaten_adresse c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=3.455..8381.725 rows=3000001 loops=3) -> Seq Scan on node_testdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=1.810..5259.178 rows=3000001 loops=3) -> Hash (cost=0.20..0.20 rows=10 width=16) (actual time=0.010..0.010 rows=1 loops=3) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.002..0.004 rows=1 loops=3) Total runtime: 172111.371 ms (16 rows) (END) So far so bad, the planner decides to choose hash joins (good) but no indexes (bad). Now after doing the following: SET enable_hashjoins TO false; The explained query looks like that: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=15198247.00..15318261.02 rows=6000701 width=16) (actual time=0.038..49.221 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..15198247.00 rows=6000701 width=28) (actual time=0.032..49.201 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.028..0.031 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Nested Loop (cost=0.00..1507822.44 rows=600070 width=28) (actual time=10.384..16.382 rows=1 loops=3) Join Filter: (r.id = c.parent) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.001..0.003 rows=1 loops=3) -> Append (cost=0.00..113264.67 rows=3001404 width=20) (actual time=8.546..12.268 rows=1 loops=4) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.001..0.001 rows=0 loops=4) -> Bitmap Heap Scan on node_dataset c (cost=58213.87..113214.88 rows=3000001 width=20) (actual time=1.906..1.906 rows=0 loops=4) Recheck Cond: (c.parent = r.id) -> Bitmap Index Scan on node_dataset_parent (cost=0.00..57463.87 rows=3000001 width=0) (actual time=1.903..1.903 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_parent on node_stammdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=3.272..3.273 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_adresse_parent on node_stammdaten_adresse c (cost=0.00..8.60 rows=1 width=20) (actual time=4.333..4.333 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_testdaten_parent on node_testdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=2.745..2.746 rows=0 loops=4) Index Cond: (c.parent = r.id) Total runtime: 49.349 ms (21 rows) (END) - incredibly faster, because indexes were used. Notice: Cost of the second query ist somewhat higher than for the first query. So the main question is: Why does the planner make the first decision, instead of the second? Also interesing: Via SET enable_seqscan TO false; i temp. disabled seq scans. Than the planner used indexes and hash joins, and the query still was slow. So the problem seems to be the hash join. Maybe someone can help in this confusing situation? thx, R.

Read the article

Optimize date query for large child tables: GiST or GIN?

- by Dave Jarvis

Problem 72 child tables, each having a year index and a station index, are defined as follows: CREATE TABLE climate.measurement_12_013 ( -- Inherited from table climate.measurement_12_013: id bigint NOT NULL DEFAULT nextval('climate.measurement_id_seq'::regclass), -- Inherited from table climate.measurement_12_013: station_id integer NOT NULL, -- Inherited from table climate.measurement_12_013: taken date NOT NULL, -- Inherited from table climate.measurement_12_013: amount numeric(8,2) NOT NULL, -- Inherited from table climate.measurement_12_013: category_id smallint NOT NULL, -- Inherited from table climate.measurement_12_013: flag character varying(1) NOT NULL DEFAULT ' '::character varying, CONSTRAINT measurement_12_013_category_id_check CHECK (category_id = 7), CONSTRAINT measurement_12_013_taken_check CHECK (date_part('month'::text, taken)::integer = 12) ) INHERITS (climate.measurement) CREATE INDEX measurement_12_013_s_idx ON climate.measurement_12_013 USING btree (station_id); CREATE INDEX measurement_12_013_y_idx ON climate.measurement_12_013 USING btree (date_part('year'::text, taken)); (Foreign key constraints to be added later.) The following query runs abysmally slow due to a full table scan: SELECT count(1) AS measurements, avg(m.amount) AS amount FROM climate.measurement m WHERE m.station_id IN ( SELECT s.id FROM climate.station s, climate.city c WHERE -- For one city ... -- c.id = 5182 AND -- Where stations are within an elevation range ... -- s.elevation BETWEEN 0 AND 3000 AND 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 ) AND -- -- Begin extracting the data from the database. -- -- The data before 1900 is shaky; insufficient after 2009. -- extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND -- Whittled down by category ... -- m.category_id = 1 AND m.taken BETWEEN -- Start date. (extract( YEAR FROM m.taken )||'-01-01')::date AND -- End date. Calculated by checking to see if the end date wraps -- into the next year. If it does, then add 1 to the current year. -- (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date GROUP BY extract( YEAR FROM m.taken ) The sluggishness comes from this part of the query: m.taken BETWEEN /* Start date. */ (extract( YEAR FROM m.taken )||'-01-01')::date AND /* End date. Calculated by checking to see if the end date wraps into the next year. If it does, then add 1 to the current year. */ (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date The HashAggregate from the plan shows a cost of 10006220141.11, which is, I suspect, on the astronomically huge side. There is a full table scan on the measurement table (itself having neither data nor indexes) being performed. The table aggregates 237 million rows from its child tables. Question What is the proper way to index the dates to avoid full table scans? Options I have considered: GIN GiST Rewrite the WHERE clause Separate year_taken, month_taken, and day_taken columns to the tables What are your thoughts? Thank you!

Read the article

PostgreSQL - fetch the row which has the Max value for a column

- by Joshua Berry

I'm dealing with a Postgres table (called "lives") that contains records with columns for time_stamp, usr_id, transaction_id, and lives_remaining. I need a query that will give me the most recent lives_remaining total for each usr_id There are multiple users (distinct usr_id's) time_stamp is not a unique identifier: sometimes user events (one by row in the table) will occur with the same time_stamp. trans_id is unique only for very small time ranges: over time it repeats remaining_lives (for a given user) can both increase and decrease over time example: time_stamp|lives_remaining|usr_id|trans_id ----------------------------------------- 07:00 | 1 | 1 | 1 09:00 | 4 | 2 | 2 10:00 | 2 | 3 | 3 10:00 | 1 | 2 | 4 11:00 | 4 | 1 | 5 11:00 | 3 | 1 | 6 13:00 | 3 | 3 | 1 As I will need to access other columns of the row with the latest data for each given usr_id, I need a query that gives a result like this: time_stamp|lives_remaining|usr_id|trans_id ----------------------------------------- 11:00 | 3 | 1 | 6 10:00 | 1 | 2 | 4 13:00 | 3 | 3 | 1 As mentioned, each usr_id can gain or lose lives, and sometimes these timestamped events occur so close together that they have the same timestamp! Therefore this query won't work: SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM (SELECT usr_id, max(time_stamp) AS max_timestamp FROM lives GROUP BY usr_id ORDER BY usr_id) a JOIN lives b ON a.max_timestamp = b.time_stamp Instead, I need to use both time_stamp (first) and trans_id (second) to identify the correct row. I also then need to pass that information from the subquery to the main query that will provide the data for the other columns of the appropriate rows. This is the hacked up query that I've gotten to work: SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM (SELECT usr_id, max(time_stamp || '*' || trans_id) AS max_timestamp_transid FROM lives GROUP BY usr_id ORDER BY usr_id) a JOIN lives b ON a.max_timestamp_transid = b.time_stamp || '*' || b.trans_id ORDER BY b.usr_id Okay, so this works, but I don't like it. It requires a query within a query, a self join, and it seems to me that it could be much simpler by grabbing the row that MAX found to have the largest timestamp and trans_id. The table "lives" has tens of millions of rows to parse, so I'd like this query to be as fast and efficient as possible. I'm new to RDBM and Postgres in particular, so I know that I need to make effective use of the proper indexes. I'm a bit lost on how to optimize. I found a similar discussion here. Can I perform some type of Postgres equivalent to an Oracle analytic function? Any advice on accessing related column information used by an aggregate function (like MAX), creating indexes, and creating better queries would be much appreciated! P.S. You can use the following to create my example case: create TABLE lives (time_stamp timestamp, lives_remaining integer, usr_id integer, trans_id integer); insert into lives values ('2000-01-01 07:00', 1, 1, 1); insert into lives values ('2000-01-01 09:00', 4, 2, 2); insert into lives values ('2000-01-01 10:00', 2, 3, 3); insert into lives values ('2000-01-01 10:00', 1, 2, 4); insert into lives values ('2000-01-01 11:00', 4, 1, 5); insert into lives values ('2000-01-01 11:00', 3, 1, 6); insert into lives values ('2000-01-01 13:00', 3, 3, 1);

Read the article

How to optimise MySQL query containing a subquery?

- by aidan

Read the article

Question about Cost in Oracle Explain Plan

- by Will

When Oracle is estimating the 'Cost' for certain queries, does it actually look at the amount of data (rows) in a table? For example: If I'm doing a full table scan of employees for name='Bob', does it estimate the cost by counting the amount of existing rows, or is it always a set cost?

Read the article

How can i get rid of 'ORA-01489: result of string concatenation is too long' in this query?

- by core_pro

this query gets the dominating sets in a network. so for example given a network A<----->B B<----->C B<----->D C<----->E D<----->C D<----->E F<----->E it returns B,E B,F A,E but it doesn't work for large data because i'm using string methods in my result. i have been trying to remove the string methods and return a view or something but to no avail With t as (select 'A' as per1, 'B' as per2 from dual union all select 'B','C' from dual union all select 'B','D' from dual union all select 'C','B' from dual union all select 'C','E' from dual union all select 'D','C' from dual union all select 'D','E' from dual union all select 'E','C' from dual union all select 'E','D' from dual union all select 'F','E' from dual) ,t2 as (select distinct least(per1, per2) as per1, greatest(per1, per2) as per2 from t union select distinct greatest(per1, per2) as per1, least(per1, per2) as per1 from t) ,t3 as (select per1, per2, row_number() over (partition by per1 order by per2) as rn from t2) ,people as (select per, row_number() over (order by per) rn from (select distinct per1 as per from t union select distinct per2 from t) ) ,comb as (select sys_connect_by_path(per,',')||',' as p from people connect by rn > prior rn ) ,find as (select p, per2, count(*) over (partition by p) as cnt from ( select distinct comb.p, t3.per2 from comb, t3 where instr(comb.p, ','||t3.per1||',') > 0 or instr(comb.p, ','||t3.per2||',') > 0 ) ) ,rnk as (select p, rank() over (order by length(p)) as rnk from find where cnt = (select count(*) from people) order by rnk ) select distinct trim(',' from p) as p from rnk where rnk.rnk = 1`

Read the article

Postgre database ignoring created index ?!

- by drasto

I have an Postgre database and a table called my_table. There are 4 columns in that table (id, column1, column2, column3). The id column is primary key, there are no other constrains or indexes on columns. The table has about 200000 rows. I want to print out all rows which has value of column column2 equal(case insensitive) to 'value12'. I use this: SELECT * FROM my_table WHERE column2 = lower('value12') here is the execution plan for this statement(result of set enable_seqscan=on; EXPLAIN SELECT * FROM my_table WHERE column2 = lower('value12')): Seq Scan on my_table (cost=0.00..4676.00 rows=10000 width=55) Filter: ((column2)::text = 'value12'::text) I consider this to be to slow so I create an index on column column2 for better prerformance of searches: CREATE INDEX my_index ON my_table (lower(column2)) Now I ran the same select: SELECT * FROM my_table WHERE column2 = lower('value12') and I expect it to be much faster because it can use index. However it is not faster, it is as slow as before. So I check the execution plan and it is the same as before(see above). So it still uses sequential scen and it ignores the index! Where is the problem ?

Read the article

Mysql - help me optimize this query (improved question)

- by sandeepan-nath

About the system: - There are tutors who create classes and packs - A tags based search approach is being followed.Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searcheable). For details please check the section How tags work in this system? below. Following is the concerned query SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches, SUM(DISTINCT( t.tag LIKE "%democracy%" )) AS key_2_total_matches, COUNT(DISTINCT( od.id_od )) AS tutor_popularity, CASE WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1 AND wc.wc_type = 0 AND wc.class_date > '2010-06-01 22:00:56' AND wccp.status = 1 AND ( wccp.country_code = 'IE' OR wccp.country_code IN ( 'INT' ) ) ), 0) ) THEN 1 ELSE 0 END AS 'classes_published', CASE WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1 AND lp.published = 1 AND lpcp.status = 1 AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ( 'INT' ) ) ), 0) ) THEN 1 ELSE 0 END AS 'packs_published', td . *, u . * FROM tutor_details AS td JOIN users AS u ON u.id_user = td.id_user LEFT JOIN learning_packs_tag_relations AS lptagrels ON td.id_tutor = lptagrels.id_tutor LEFT JOIN learning_packs AS lp ON lptagrels.id_lp = lp.id_lp LEFT JOIN learning_packs_categories AS lpc ON lpc.id_lp_cat = lp.id_lp_cat LEFT JOIN learning_packs_categories AS lpcp ON lpcp.id_lp_cat = lpc.id_parent LEFT JOIN learning_pack_content AS lpct ON ( lp.id_lp = lpct.id_lp ) LEFT JOIN webclasses_tag_relations AS wtagrels ON td.id_tutor = wtagrels.id_tutor LEFT JOIN webclasses AS wc ON wtagrels.id_wc = wc.id_wc LEFT JOIN learning_packs_categories AS wcc ON wcc.id_lp_cat = wc.id_wp_cat LEFT JOIN learning_packs_categories AS wccp ON wccp.id_lp_cat = wcc.id_parent LEFT JOIN order_details AS od ON td.id_tutor = od.id_author LEFT JOIN orders AS o ON od.id_order = o.id_order LEFT JOIN tutors_tag_relations AS ttagrels ON td.id_tutor = ttagrels.id_tutor JOIN tags AS t ON ( t.id_tag = ttagrels.id_tag ) OR ( t.id_tag = lptagrels.id_tag ) OR ( t.id_tag = wtagrels.id_tag ) WHERE ( u.country = 'IE' OR u.country IN ( 'INT' ) ) AND CASE WHEN ( ( t.id_tag = lptagrels.id_tag ) AND ( lp.id_lp 0 ) ) THEN lp.id_status = 1 AND lp.published = 1 AND lpcp.status = 1 AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ( 'INT' ) ) ELSE 1 END AND CASE WHEN ( ( t.id_tag = wtagrels.id_tag ) AND ( wc.id_wc 0 ) ) THEN wc.wc_api_status = 1 AND wc.wc_type = 0 AND wc.class_date '2010-06-01 22:00:56' AND wccp.status = 1 AND ( wccp.country_code = 'IE' OR wccp.country_code IN ( 'INT' ) ) ELSE 1 END AND CASE WHEN ( od.id_od 0 ) THEN od.id_author = td.id_tutor AND o.order_status = 'paid' AND CASE WHEN ( od.id_wc 0 ) THEN od.can_attend_class = 1 ELSE 1 END ELSE 1 END GROUP BY td.id_tutor HAVING key_1_total_matches = 1 AND key_2_total_matches = 1 ORDER BY tutor_popularity DESC, u.surname ASC, u.name ASC LIMIT 0, 20 The problem The results returned by the above query are correct (AND logic working as per expectation), but the time taken by the query rises alarmingly for heavier data and for the current data I have it is like 25 seconds as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable. It is possible that some of the delay is being caused because all the possible fields have not yet been indexed. The tag field of tags table is indexed. Is there something faulty with the query? What can be the reason behind 20+ seconds of execution time? How tags work in this system? When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc. When a Tutors create packs, again tags are entered and tag relations are created with respect to pack's details like pack name, description etc. tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table. The explain query output:- Please see this screenshot - http://www.test.examvillage.com/Explain_query.jpg

Search Results

Search found 30300 results on 1212 pages for 'sql optimization'.

Page 215/1212 | < Previous Page | 211 212 213 214 215 216 217 218 219 220 221 222 | Next Page >

- by guybarrette

- by Sam Saffron

- by AaronBertrand

- by andyleonard

- by AaronBertrand

- by recluze

- by Ciaran87Bel

- by Dave Jarvis

- by James Simpson

- by Omnipresent

- by daedlus

- by Brian Lacy

- by harigm

- by Steffan

- by Salman A

- by Dave Jarvis

- by Hossein

- by S38

- by Dave Jarvis

- by Joshua Berry

- by aidan

- by Will

- by core_pro

- by drasto

- by sandeepan-nath

< Previous Page | 211 212 213 214 215 216 217 218 219 220 221 222 | Next Page >