rainbow tables - Page 70

Using Linq-To-SQL I'm getting some weird behavior doing text searches with the .Contains method. Loo

- by Nate Bross

I have a table, where I need to do a case insensitive search on a text field. If I run this query in LinqPad directly on my database, it works as expected Table.Where(tbl => tbl.Title.Contains("StringWithAnyCase")) // also, adding in the same constraints I'm using in my repository works in LinqPad // Table.Where(tbl => tbl.Title.Contains("StringWithAnyCase") && tbl.IsActive == true) In my application, I've got a repository which exposes IQueryable objects which does some initial filtering and it looks like this var dc = new MyDataContext(); public IQueryable<Table> GetAllTables() { var ret = dc.Tables.Where(t => t.IsActive == true); return ret; } In the controller (its an MVC app) I use code like this in an attempt to mimic the LinqPad query: var rpo = new RepositoryOfTable(); var tables = rpo.GetAllTables(); // for some reason, this does a CASE SENSITIVE search which is NOT what I want. tables = tables.Where(tbl => tbl.Title.Contains("StringWithAnyCase"); return View(tables); The column is defiend as an nvarchar(50) in SQL Server 2008. Any help or guidance is greatly appreciated!

Read the article

SQL Server 2008 + expensive union all

- by Tim Mahy

Hi al, we have 5 tables over which we should query with user search input throughout a stored procedure. We do a union all of the similar data inside a view. Because of this the view can not be materialized. We are not able to change these 5 tables drastically (like creating a 6th table that contains the similar data of the 5 tables and reference that new one from the 5 tables). The query is rather expensive / slow what are our other options? It's allowed to think outside the box. Unfortunately I cannot give more information like the table/view/SP definition because of customer confidentiality... greetings, Tim

Read the article

how run Access 2007 module in Vb6?

- by Mahmoud

I have created a module in access 2007 that will update linked tables, but I wanted to run this module from vb6. I have tried this code from Microsoft, but it didnt work. Sub AccessTest1() Dim A As Object Set A = CreateObject("Access.Application") A.Visible = False A.OpenCurrentDatabase (App.Path & "/DataBase/acc.accdb") A.DoCmd.RunMacro "RefreshLinks" End Sub What I am aiming to do, is to allow my program to update all linked tables to new links, in case the program has been used on other computer In case you want to take a look at the module program, here it is: Sub CreateLinkedJetTable() Dim cat As ADOX.Catalog Dim tbl As ADOX.Table Set cat = New ADOX.Catalog ' Open the catalog. cat.ActiveConnection = CurrentProject.Connection Set tbl = New ADOX.Table ' Create the new table. tbl.Name = "Companies" Set tbl.ParentCatalog = cat ' Set the properties to create the link. tbl.Properties("Jet OLEDB:Link Datasource") = CurrentProject.Path & "/db3.mdb" tbl.Properties("Jet OLEDB:Remote Table Name") = "Companies" tbl.Properties("Jet OLEDB:Create Link") = True ' To link a table with a database password set the Link Provider String ' tbl.Properties("Jet OLEDB:Link Provider String") = "MS Access;PWD=Admin;" ' Append the table to the tables collection. cat.Tables.Append tbl Set cat = Nothing End Sub Sub RefreshLinks() Dim cat As ADOX.Catalog Dim tbl As ADOX.Table Set cat = New ADOX.Catalog ' Open the catalog. cat.ActiveConnection = CurrentProject.Connection Set tbl = New ADOX.Table For Each tbl In cat.Tables ' Verify that the table is a linked table. If tbl.Type = "LINK" Then tbl.Properties("Jet OLEDB:Link Datasource") = CurrentProject.Path & "/db3.mdb" ' To refresh a linked table with a database password set the Link Provider String 'tbl.Properties("Jet OLEDB:Link Provider String") = "MS Access;PWD=Admin;" End If Next End Sub

Read the article

make VIEW of SHOW TABLE query

- by cneeds

Is there a way to make a VIEW of this SHOW TABLE query? SHOW FULL TABLES FROM `db_name` WHERE `Table_type` = "Base table" When I save this as a view (using phpMyAdmin) I get #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SHOW FULL TABLES FROM `db_name` WHERE `Table_type` = "Base table"' at line 4 when phpMyAdmin tries to execute this CREATE ALGORITHM = UNDEFINED VIEW `Tables` AS SHOW FULL TABLES FROM `db_name` WHERE `Table_type` = "Base table"

Read the article

Execution plan issue requires reset on SQL Server 2005, how to determine cause?

- by Tony Brandner

We have a web application that delivers training to thousands of corporate students running on top of SQL Server 2005. Recently, we started seeing that a single specific query in the application went from 1 second to about 30 seconds in terms of execution time. The application started throwing timeouts in that area. Our first thought was that we may have incorrect indexes, so we reviewed the tables and indexes. However, similar queries elsewhere in the application also run quickly. Reviewing the indexes showed us that they were configured as expected. We were able to narrow it down to a single query, not a stored procedure. Running this query in SQL Studio also runs quickly. We tried running the application in a different server environment. So a different web server with the same query, parameters and database. The query still ran slow. The query is a fairly large one related to determining a student's current list of training. It includes joins and left joins on a dozen tables and subqueries. A few of the tables are fairly large (hundreds of thousands of rows) and some of the other tables are small lookup tables. The query uses a grouping clause and a few where conditions. A few of the tables are quite active and the contents change often but the volume of added rows doesn't seem extreme. These symptoms led us to consider the execution plan. First off, as soon as we reset the execution plan cache with the SQL command 'DBCC FREEPROCCACHE', the problem went away. Unfortunately, the problem started to reoccur within a few days. The problem has continued to plague us for awhile now. It's usually the same query, but we did appear to see the problem occur in another single query recently. It happens enough to be a nuisance. We're having a heck of a time trying to fix it since we can't reproduce it in any other environment other than production. I have downloaded the High Availability guide from Red Gate and I read up more on execution plans. I hope to run the profiler on the live server, but I'm a bit concerned about impact. I would like to ask - what is the best way to figure out what is triggering this problem? Has anyone else seen this same issue?

Read the article

With database table design, Is there an advantage of using special naming convention for a Many-to-m

- by Jian Lin

Whenever there is a many-to-many relationship, I think there will be immediately a many-to-many table needed (also called a junction table). Is there any advantage of using special naming convention for these tables, as opposed to the other 1 to many tables? Or are there popular special naming conventions used by companies designing database tables?

Read the article

Kernel dealing with the section headers in an ELF

- by uki

I recently read that the kernel and the dynamic loader mostly deal with the program header tables in an ELF file and that assemblers, compilers and linkers deal with the section header tables. The number of program header tables and section header tables are mentioned in the ELF header in fields named e_phnum and e_shnum respectively. e_phnum is two bytes in size, so if the number of program headers is 65535, we use a scheme known as extended numbering where, e_phnum is set to 0xffff and sh_link field of the zeroth section header table holds the actual count. My doubt is : If the count of program headers exceeds 65535, does that mean the kernel and/or the dynamic loader end up having to read the section table?

Read the article

which is better, creating a view or a new table?

- by Carson

I have some demanding mysql queries that are needed to grap same datasets from several mysql tables. I am thinking of creating a table or view to gather all demanding columns from other tables, so as to increase performance. If I create that table, I may need to do extra insert / update / delete operation each time other tables updated. if I create view, I am worrying if the performance can be greatly improved. Because data from other tables are changing very frequently. Most likely, the view may need to be created first everytime before selecting it. Any ideas? e.g. how to cache? other extra measures I can do?

Read the article

In SQL, a Join is actually an Intersection? And it is also a linkage or a "Sideway Union"?

- by Jian Lin

I always thought of a Join in SQL as some kind of linkage between two tables. For example, select e.name, d.name from employees e, departments d where employees.deptID = departments.deptID In this case, it is linking two tables, to show each employee with a department name instead of a department ID. And kind of like a "linkage" or "Union" sideway". But, after learning about inner join vs outer join, it shows that a Join (Inner join) is actually an intersection. For example, when one table has the ID 1, 2, 7, 8, while another table has the ID 7 and 8 only, the way we get the intersection is: select * from t1, t2 where t1.ID = t2.ID to get the two records of "7 and 8". So it is actually an intersection. So we have the "Intersection" of 2 tables. Compare this with the "Union" operation on 2 tables. Can a Join be thought of as an "Intersection"? But what about the "linking" or "sideway union" aspect of it?

Read the article

Database design -- does it respect 3rd NF?

- by Flavius

Hi I have the following relations (tables) in a relational model Person person_id, first_name, last_name, address Student person_id, matr_nr Teacher person_id, salary Lecture lecture_id, lect_name, lect_description Attendees lecture_id, person_id, date I'm wondering about the functional dependencies of Student and Teacher. Do these tables respect the 3rd normal form? Which should be the primary keys of these tables?

Read the article

save sqlite3 file of my App

- by user1553381

I'm new with dealing with the .sqlite3 in iphone, I created a sqlite3 file in /Users/myLab/Library/Application Support/iPhone Simulator/5.1/Applications/308C4355-D8EE-4524-A7F9-638DEB68B298/Documents/file.sqlite3 and I inserted the tables into it using Terminal.app and everything works ok with my app. but when I moved this application to another device, opened by xcode and trying to run it, I discovered that my tables are not found in this .sqlite3 file in another device. how can I save my tables in .sqlite3 file??

Read the article

MySQL: LOAD DATA reclaim disk space after delete

- by Michael

I have a DB schema composed of MYISAM tables, i am interested to delete old records from time to time from some of the tables. I know that delete does not reclaim the memory space, but as i found in a description of DELETE command, inserts may reuse the space deleted In MyISAM tables, deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions. I am interested if LOAD DATA command also reuses the deleted space? UPDATE I am also interested how the index space reclaimed?

Read the article

SQL SELECT from third level table

- by Spidermain50

Ok I know little about SQL so bare with me... I'm trying to see if certain values exist in a third level table and I don't know how to go about it. Here is the scenario... I have a Accident table that holds accident information. It has 3 one-to-many child tables (Units, Occupants, NonMotorists). An each of those child tables have their own many-to-many child table (Alcohol). I need to be able to have some way of seeing if a range of values exists in a field in those Alcohol tables. Here is watered down version of what my structure for the tables looks like... --tblAccident--_ PK_AccidentNumber --tblAccidentUnit-- PK_PrimaryKey FK_AccidentNumber --tblAccidentOccupant-- PK_PrimaryKey FK_AccidentNumber --tblAccidentNonMotorist-- PK_PrimaryKey FK_AccidentNumber --tblAccidentUnitAlcohol-- PK_PrimaryKey FK_ForeignKey AlcoholValue <---- THIS IS WHAT I NEED TO SEARCH --tblAccidentOccupantAlcohol-- PK_PrimaryKey FK_ForeignKey AlcoholValue <---- THIS IS WHAT I NEED TO SEARCH --tblAccidentNonMotoristAlcohol-- PK_PrimaryKey FK_ForeignKey AlcoholValue <---- THIS IS WHAT I NEED TO SEARCH I hope this makes some sense as to what i am trying to accomplish. thank you

Read the article

Stop Entity Framework from updating edmx model with a column that isn't needed

- by TMan

I have rowguids in all my tables to help with change tracking in all my tables. I don't want/need these tables in my edmx or my entities. However, I do still need to make changes to other things sometimes so everytime i go to update model from database in the edmx it adds all the rowguids in all my tables everytime and i have to manually delete each one. Is there a way to handle this from happening? Is there a way I can maybe edit the T4 to maybe ignore that 'rowguid' column? Database first Entity framework

Read the article

Very basic database theory.

- by John R

I have a set of tables to show the relationship between organziations and supporters below. Although I have done some basic mySQL querries, I know very little about database 'design'. I plan to querry the database for: -a list of contributors to a specific organization... or, -a list of organizations that a specific suporter supports. The database tables for organiations and contributors may have other columns in the future and recieve a lesser amount of querries based on that information. A | X A | Y A | Z B | X B | Y C | X C | Z How should the tables be set up? I assume that there should be a third table, but there is still redundent information in the third table. Is there a better way of setting up the tables? +----+-------+ +-------------+----------+ +----+-------+ | id | org | | org | contr | | id | contr.| +----+-------+ +-------------+----------+ +----+-------+ | 1 | A | | 1 | 1 | | 1 | X | | 2 | B | | 1 | 2 | | 2 | Y | | 3 | C | | 1 | 3 | | 3 | Z | +----+-------+ | 2 | 1 | +----+-------+ | 2 | 2 | | 3 | 1 | | 3 | 3 | +-------------+----------+

Read the article

Is this possible in sql server 2005?

- by chandru_cp

This is my queries select ClientName,ClientMobNo from Clients select DriverName,DriverMobNo from Drivers It gives me two result tables... But i want to combine both the result tables into a single table... I tried union and union all it doesn't give me what i want.... Note: There is no relationship between the two tables...... There may be 200 clients and 100 drivers...

Read the article

Compare two table and find matching columns

- by Karthick

Hi, I have two tables table1 and table2, i need to write a select query which will list me the columns that exist in both the tables.(mysql) I need to do for different tables (2 at a time) Is this possible? I tried using INFORMATION_SCHEMA.COLUMNS but am not able to get it right.

Read the article

Database Design Question

- by Soo

Ok SO, I have a user table and want to define groups of users together. The best solution I have for this is to create three database tables as follows: UserTable user_id user_name UserGroupLink group_id member_id GroupInfo group_id group_name This method keeps the member and group information separate. This is just my way of thinking. Is there a better way to do this? Also, what is a good naming convention for tables that link two other tables?

Read the article

Getting Error 91

- by user1695788

I have a general comprehension issue with classes and objects. What I'm trying to do is pretty simple but I'm getting errors. In the code example below, sometimes the line "Call tables.MethodInCTables" runs fine and sometimes it produces error 91, object not set. IN all cases, I can "see" the method in the type ahead so I know that the code recognizes the "tables" instance and "sees" MethodInCTables. But then I get the run-time error. Sub MainSub() Dim tables as New CTables Call tables.MethodInCTables End Sub ----Class Module = CTables Sub MethodInCTables() ...do something End Sub

Read the article

T-SQL Tuesday #33: Trick Shots: Undocumented, Underdocumented, and Unknown Conspiracies!

- by Most Valuable Yak (Rob Volk)

Mike Fal (b | t) is hosting this month's T-SQL Tuesday on Trick Shots. I love this choice because I've been preoccupied with sneaky/tricky/evil SQL Server stuff for a long time and have been presenting on it for the past year. Mike's directives were "Show us a cool trick or process you developed…It doesn’t have to be useful", which most of my blogging definitely fits, and "Tell us what you learned from this trick…tell us how it gave you insight in to how SQL Server works", which is definitely a new concept. I've done a lot of reading and watching on SQL Server Internals and even attended training, but sometimes I need to go explore on my own, using my own tools and techniques. It's an itch I get every few months, and, well, it sure beats workin'. I've found some people to be intimidated by SQL Server's internals, and I'll admit there are A LOT of internals to keep track of, but there are tons of excellent resources that clearly document most of them, and show how knowing even the basics of internals can dramatically improve your database's performance. It may seem like rocket science, or even brain surgery, but you don't have to be a genius to understand it. Although being an "evil genius" can help you learn some things they haven't told you about. ;) This blog post isn't a traditional "deep dive" into internals, it's more of an approach to find out how a program works. It utilizes an extremely handy tool from an even more extremely handy suite of tools, Sysinternals. I'm not the only one who finds Sysinternals useful for SQL Server: Argenis Fernandez (b | t), Microsoft employee and former T-SQL Tuesday host, has an excellent presentation on how to troubleshoot SQL Server using Sysinternals, and I highly recommend it. Argenis didn't cover the Strings.exe utility, but I'll be using it to "hack" the SQL Server executable (DLL and EXE) files. Please note that I'm not promoting software piracy or applying these techniques to attack SQL Server via internal knowledge. This is strictly educational and doesn't reveal any proprietary Microsoft information. And since Argenis works for Microsoft and demonstrated Sysinternals with SQL Server, I'll just let him take the blame for it. :P (The truth is I've used Strings.exe on SQL Server before I ever met Argenis.) Once you download and install Strings.exe you can run it from the command line. For our purposes we'll want to run this in the Binn folder of your SQL Server instance (I'm referencing SQL Server 2012 RTM): cd "C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn" C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.dll > sqldll.txt C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.exe > sqlexe.txt I've limited myself to DLLs and EXEs that have "sql" in their names. There are quite a few more but I haven't examined them in any detail. (Homework assignment for you!) If you run this yourself you'll get 2 text files, one with all the extracted strings from every SQL DLL file, and the other with the SQL EXE strings. You can open these in Notepad, but you're better off using Notepad++, EditPad, Emacs, Vim or another more powerful text editor, as these will be several megabytes in size. And when you do open it…you'll find…a TON of gibberish. (If you think that's bad, just try opening the raw DLL or EXE file in Notepad. And by the way, don't do this in production, or even on a running instance of SQL Server.) Even if you don't clean up the file, you can still use your editor's search function to find a keyword like "SELECT" or some other item you expect to be there. As dumb as this sounds, I sometimes spend my lunch break just scanning the raw text for anything interesting. I'm boring like that. Sometimes though, having these files available can lead to some incredible learning experiences. For me the most recent time was after reading Joe Sack's post on non-parallel plan reasons. He mentions a new SQL Server 2012 execution plan element called NonParallelPlanReason, and demonstrates a query that generates "MaxDOPSetToOne". Joe (formerly on the Microsoft SQL Server product team, so he knows this stuff) mentioned that this new element was not currently documented and tried a few more examples to see what other reasons could be generated. Since I'd already run Strings.exe on the SQL Server DLLs and EXE files, it was easy to run grep/find/findstr for MaxDOPSetToOne on those extracts. Once I found which files it belonged to (sqlmin.dll) I opened the text to see if the other reasons were listed. As you can see in my comment on Joe's blog, there were about 20 additional non-parallel reasons. And while it's not "documentation" of this underdocumented feature, the names are pretty self-explanatory about what can prevent parallel processing. I especially like the ones about cursors – more ammo! - and am curious about the PDW compilation and Cloud DB replication reasons. One reason completely stumped me: NoParallelHekatonPlan. What the heck is a hekaton? Google and Wikipedia were vague, and the top results were not in English. I found one reference to Greek, stating "hekaton" can be translated as "hundredfold"; with a little more Wikipedia-ing this leads to hecto, the prefix for "one hundred" as a unit of measure. I'm not sure why Microsoft chose hekaton for such a plan name, but having already learned some Greek I figured I might as well dig some more in the DLL text for hekaton. Here's what I found: hekaton_slow_param_passing Occurs when a Hekaton procedure call dispatch goes to slow parameter passing code path The reason why Hekaton parameter passing code took the slow code path hekaton_slow_param_pass_reason sp_deploy_hekaton_database sp_undeploy_hekaton_database sp_drop_hekaton_database sp_checkpoint_hekaton_database sp_restore_hekaton_database e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\hkproc.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matgen.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matquery.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\sqlmeta.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\resultset.cpp Interesting! The first 4 entries (in red) mention parameters and "slow code". Could this be the foundation of the mythical DBCC RUNFASTER command? Have I been passing my parameters the slow way all this time? And what about those sp_xxxx_hekaton_database procedures (in blue)? Could THEY be the secret to a faster SQL Server? Could they promise a "hundredfold" improvement in performance? Are these special, super-undocumented DIB (databases in black)? I decided to look in the SQL Server system views for any objects with hekaton in the name, or references to them, in hopes of discovering some new code that would answer all my questions: SELECT name FROM sys.all_objects WHERE name LIKE '%hekaton%' SELECT name FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%hekaton%' Which revealed: name ------------------------ (0 row(s) affected) name ------------------------ sp_createstats sp_recompile sp_updatestats (3 row(s) affected) Hmm. Well that didn't find much. Looks like these procedures are seriously undocumented, unknown, perhaps forbidden knowledge. Maybe a part of some unspeakable evil? (No, I'm not paranoid, I just like mysteries and thought that punching this up with that kind of thing might keep you reading. I know I'd fall asleep without it.) OK, so let's check out those 3 procedures and see what they reveal when I search for "Hekaton": sp_createstats: -- filter out local temp tables, Hekaton tables, and tables for which current user has no permissions -- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables OK, that's interesting, let's go looking down a little further: ((@table_type<>'U') or (0 = OBJECTPROPERTY(@table_id, 'TableIsInMemory'))) and -- Hekaton table Wellllll, that tells us a few new things: There's such a thing as Hekaton tables (UPDATE: I'm not the only one to have found them!) They are not standard user tables and probably not in memory UPDATE: I misinterpreted this because I didn't read all the code when I wrote this blog post. The OBJECTPROPERTY function has an undocumented TableIsInMemory option Let's check out sp_recompile: -- (3) Must not be a Hekaton procedure. And once again go a little further: if (ObjectProperty(@objid, 'IsExecuted') <> 0 AND ObjectProperty(@objid, 'IsInlineFunction') = 0 AND ObjectProperty(@objid, 'IsView') = 0 AND -- Hekaton procedure cannot be recompiled -- Make them go through schema version bumping branch, which will fail ObjectProperty(@objid, 'ExecIsCompiledProc') = 0) And now we learn that hekaton procedures also exist, they can't be recompiled, there's a "schema version bumping branch" somewhere, and OBJECTPROPERTY has another undocumented option, ExecIsCompiledProc. (If you experiment with this you'll find this option returns null, I think it only works when called from a system object.) This is neat! Sadly sp_updatestats doesn't reveal anything new, the comments about hekaton are the same as sp_createstats. But we've ALSO discovered undocumented features for the OBJECTPROPERTY function, which we can now search for: SELECT name, object_definition(OBJECT_ID) FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%OBJECTPROPERTY(%' I'll leave that to you as more homework. I should add that searching the system procedures was recommended long ago by the late, great Ken Henderson, in his Guru's Guide books, as a great way to find undocumented features. That seems to be really good advice! Now if you're a programmer/hacker, you've probably been drooling over the last 5 entries for hekaton (in green), because these are the names of source code files for SQL Server! Does this mean we can access the source code for SQL Server? As The Oracle suggested to Neo, can we return to The Source??? Actually, no. Well, maybe a little bit. While you won't get the actual source code from the compiled DLL and EXE files, you'll get references to source files, debugging symbols, variables and module names, error messages, and even the startup flags for SQL Server. And if you search for "DBCC" or "CHECKDB" you'll find a really nice section listing all the DBCC commands, including the undocumented ones. Granted those are pretty easy to find online, but you may be surprised what those web sites DIDN'T tell you! (And neither will I, go look for yourself!) And as we saw earlier, you'll also find execution plan elements, query processing rules, and who knows what else. It's also instructive to see how Microsoft organizes their source directories, how various components (storage engine, query processor, Full Text, AlwaysOn/HADR) are split into smaller modules. There are over 2000 source file references, go do some exploring! So what did we learn? We can pull strings out of executable files, search them for known items, browse them for unknown items, and use the results to examine internal code to learn even more things about SQL Server. We've even learned how to use command-line utilities! We are now 1337 h4X0rz! (Not really. I hate that leetspeak crap.) Although, I must confess I might've gone too far with the "conspiracy" part of this post. I apologize for that, it's just my overactive imagination. There's really no hidden agenda or conspiracy regarding SQL Server internals. It's not The Matrix. It's not like you'd find anything like that in there: Attach Matrix Database DM_MATRIX_COMM_PIPELINES MATRIXXACTPARTICIPANTS dm_matrix_agents Alright, enough of this paranoid ranting! Microsoft are not really evil! It's not like they're The Borg from Star Trek: ALTER FEDERATION DROP ALTER FEDERATION SPLIT DROP FEDERATION #tsql2sday

Read the article

MySQL Memory usage

- by Rob Stevenson-Leggett

Our MySQL server seems to be using a lot of memory. I've tried looking for slow queries and queries with no index and have halved the peak CPU usage and Apache memory usage but the MySQL memory stays constantly at 2.2GB (~51% of available memory on the server). Here's the graph from Plesk. Running top in the SSH window shows the same figures. Does anyone have any ideas on why the memory usage is constant like this and not peaks and troughs with usage of the app? Here's the output of the MySQL Tuning Primer script: -- MYSQL PERFORMANCE TUNING PRIMER -- - By: Matthew Montgomery - MySQL Version 5.0.77-log x86_64 Uptime = 1 days 14 hrs 4 min 21 sec Avg. qps = 22 Total Questions = 3059456 Threads Connected = 13 Warning: Server has not been running for at least 48hrs. It may not be safe to use these recommendations To find out more information on how each of these runtime variables effects performance visit: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html Visit http://www.mysql.com/products/enterprise/advisors.html for info about MySQL's Enterprise Monitoring and Advisory Service SLOW QUERIES The slow query log is enabled. Current long_query_time = 1 sec. You have 6 out of 3059477 that take longer than 1 sec. to complete Your long_query_time seems to be fine BINARY UPDATE LOG The binary update log is NOT enabled. You will not be able to do point in time recovery See http://dev.mysql.com/doc/refman/5.0/en/point-in-time-recovery.html WORKER THREADS Current thread_cache_size = 0 Current threads_cached = 0 Current threads_per_sec = 2 Historic threads_per_sec = 0 Threads created per/sec are overrunning threads cached You should raise thread_cache_size MAX CONNECTIONS Current max_connections = 100 Current threads_connected = 14 Historic max_used_connections = 20 The number of used connections is 20% of the configured maximum. Your max_connections variable seems to be fine. INNODB STATUS Current InnoDB index space = 6 M Current InnoDB data space = 18 M Current InnoDB buffer pool free = 0 % Current innodb_buffer_pool_size = 8 M Depending on how much space your innodb indexes take up it may be safe to increase this value to up to 2 / 3 of total system memory MEMORY USAGE Max Memory Ever Allocated : 2.07 G Configured Max Per-thread Buffers : 274 M Configured Max Global Buffers : 2.01 G Configured Max Memory Limit : 2.28 G Physical Memory : 3.84 G Max memory limit seem to be within acceptable norms KEY BUFFER Current MyISAM index space = 4 M Current key_buffer_size = 7 M Key cache miss rate is 1 : 40 Key buffer free ratio = 81 % Your key_buffer_size seems to be fine QUERY CACHE Query cache is supported but not enabled Perhaps you should set the query_cache_size SORT OPERATIONS Current sort_buffer_size = 2 M Current read_rnd_buffer_size = 256 K Sort buffer seems to be fine JOINS Current join_buffer_size = 132.00 K You have had 16 queries where a join could not use an index properly You should enable "log-queries-not-using-indexes" Then look for non indexed joins in the slow query log. If you are unable to optimize your queries you may want to increase your join_buffer_size to accommodate larger joins in one pass. Note! This script will still suggest raising the join_buffer_size when ANY joins not using indexes are found. OPEN FILES LIMIT Current open_files_limit = 1024 files The open_files_limit should typically be set to at least 2x-3x that of table_cache if you have heavy MyISAM usage. Your open_files_limit value seems to be fine TABLE CACHE Current table_cache value = 64 tables You have a total of 426 tables You have 64 open tables. Current table_cache hit rate is 1% , while 100% of your table cache is in use You should probably increase your table_cache TEMP TABLES Current max_heap_table_size = 16 M Current tmp_table_size = 32 M Of 15134 temp tables, 9% were created on disk Effective in-memory tmp_table_size is limited to max_heap_table_size. Created disk tmp tables ratio seems fine TABLE SCANS Current read_buffer_size = 128 K Current table scan ratio = 2915 : 1 read_buffer_size seems to be fine TABLE LOCKING Current Lock Wait ratio = 1 : 142213 Your table locking seems to be fine The app is a facebook game with about 50-100 concurrent users. Thanks, Rob

Read the article

Basics of Join Predicate Pushdown in Oracle

- by Maria Colgan

Happy New Year to all of our readers! We hope you all had a great holiday season. We start the new year by continuing our series on Optimizer transformations. This time it is the turn of Predicate Pushdown. I would like to thank Rafi Ahmed for the content of this blog.Normally, a view cannot be joined with an index-based nested loop (i.e., index access) join, since a view, in contrast with a base table, does not have an index defined on it. A view can only be joined with other tables using three methods: hash, nested loop, and sort-merge joins. Introduction The join predicate pushdown (JPPD) transformation allows a view to be joined with index-based nested-loop join method, which may provide a more optimal alternative. In the join predicate pushdown transformation, the view remains a separate query block, but it contains the join predicate, which is pushed down from its containing query block into the view. The view thus becomes correlated and must be evaluated for each row of the outer query block. These pushed-down join predicates, once inside the view, open up new index access paths on the base tables inside the view; this allows the view to be joined with index-based nested-loop join method, thereby enabling the optimizer to select an efficient execution plan. The join predicate pushdown transformation is not always optimal. The join predicate pushed-down view becomes correlated and it must be evaluated for each outer row; if there is a large number of outer rows, the cost of evaluating the view multiple times may make the nested-loop join suboptimal, and therefore joining the view with hash or sort-merge join method may be more efficient. The decision whether to push down join predicates into a view is determined by evaluating the costs of the outer query with and without the join predicate pushdown transformation under Oracle's cost-based query transformation framework. The join predicate pushdown transformation applies to both non-mergeable views and mergeable views and to pre-defined and inline views as well as to views generated internally by the optimizer during various transformations. The following shows the types of views on which join predicate pushdown is currently supported. UNION ALL/UNION view Outer-joined view Anti-joined view Semi-joined view DISTINCT view GROUP-BY view Examples Consider query A, which has an outer-joined view V. The view cannot be merged, as it contains two tables, and the join between these two tables must be performed before the join between the view and the outer table T4. A: SELECT T4.unique1, V.unique3 FROM T_4K T4, (SELECT T10.unique3, T10.hundred, T10.ten FROM T_5K T5, T_10K T10 WHERE T5.unique3 = T10.unique3) VWHERE T4.unique3 = V.hundred(+) AND T4.ten = V.ten(+) AND T4.thousand = 5; The following shows the non-default plan for query A generated by disabling join predicate pushdown. When query A undergoes join predicate pushdown, it yields query B. Note that query B is expressed in a non-standard SQL and shows an internal representation of the query. B: SELECT T4.unique1, V.unique3 FROM T_4K T4, (SELECT T10.unique3, T10.hundred, T10.ten FROM T_5K T5, T_10K T10 WHERE T5.unique3 = T10.unique3 AND T4.unique3 = V.hundred(+) AND T4.ten = V.ten(+)) V WHERE T4.thousand = 5; The execution plan for query B is shown below. In the execution plan BX, note the keyword 'VIEW PUSHED PREDICATE' indicates that the view has undergone the join predicate pushdown transformation. The join predicates (shown here in red) have been moved into the view V; these join predicates open up index access paths thereby enabling index-based nested-loop join of the view. With join predicate pushdown, the cost of query A has come down from 62 to 32. As mentioned earlier, the join predicate pushdown transformation is cost-based, and a join predicate pushed-down plan is selected only when it reduces the overall cost. Consider another example of a query C, which contains a view with the UNION ALL set operator.C: SELECT R.unique1, V.unique3 FROM T_5K R, (SELECT T1.unique3, T2.unique1+T1.unique1 FROM T_5K T1, T_10K T2 WHERE T1.unique1 = T2.unique1 UNION ALL SELECT T1.unique3, T2.unique2 FROM G_4K T1, T_10K T2 WHERE T1.unique1 = T2.unique1) V WHERE R.unique3 = V.unique3 and R.thousand < 1; The execution plan of query C is shown below. In the above, 'VIEW UNION ALL PUSHED PREDICATE' indicates that the UNION ALL view has undergone the join predicate pushdown transformation. As can be seen, here the join predicate has been replicated and pushed inside every branch of the UNION ALL view. The join predicates (shown here in red) open up index access paths thereby enabling index-based nested loop join of the view. Consider query D as an example of join predicate pushdown into a distinct view. We have the following cardinalities of the tables involved in query D: Sales (1,016,271), Customers (50,000), and Costs (787,766). D: SELECT C.cust_last_name, C.cust_city FROM customers C, (SELECT DISTINCT S.cust_id FROM sales S, costs CT WHERE S.prod_id = CT.prod_id and CT.unit_price > 70) V WHERE C.cust_state_province = 'CA' and C.cust_id = V.cust_id; The execution plan of query D is shown below. As shown in XD, when query D undergoes join predicate pushdown transformation, the expensive DISTINCT operator is removed and the join is converted into a semi-join; this is possible, since all the SELECT list items of the view participate in an equi-join with the outer tables. Under similar conditions, when a group-by view undergoes join predicate pushdown transformation, the expensive group-by operator can also be removed. With the join predicate pushdown transformation, the elapsed time of query D came down from 63 seconds to 5 seconds. Since distinct and group-by views are mergeable views, the cost-based transformation framework also compares the cost of merging the view with that of join predicate pushdown in selecting the most optimal execution plan. Summary We have tried to illustrate the basic ideas behind join predicate pushdown on different types of views by showing example queries that are quite simple. Oracle can handle far more complex queries and other types of views not shown here in the examples. Again many thanks to Rafi Ahmed for the content of this blog post.

Read the article

SQL SERVER – Stored Procedure and Transactions

- by pinaldave

I just overheard the following statement – “I do not use Transactions in SQL as I use Stored Procedure“. I just realized that there are so many misconceptions about this subject. Transactions has nothing to do with Stored Procedures. Let me demonstrate that with a simple example. USE tempdb GO -- Create 3 Test Tables CREATE TABLE TABLE1 (ID INT); CREATE TABLE TABLE2 (ID INT); CREATE TABLE TABLE3 (ID INT); GO -- Create SP CREATE PROCEDURE TestSP AS INSERT INTO TABLE1 (ID) VALUES (1) INSERT INTO TABLE2 (ID) VALUES ('a') INSERT INTO TABLE3 (ID) VALUES (3) GO -- Execute SP -- SP will error out EXEC TestSP GO -- Check the Values in Table SELECT * FROM TABLE1; SELECT * FROM TABLE2; SELECT * FROM TABLE3; GO Now, the main point is: If Stored Procedure is transactional then, it should roll back complete transactions when it encounters any errors. Well, that does not happen in this case, which proves that Stored Procedure does not only provide just the transactional feature to a batch of T-SQL. Let’s see the result very quickly. It is very clear that there were entries in table1 which are not shown in the subsequent tables. If SP was transactional in terms of T-SQL Query Batches, there would be no entries in any of the tables. If you want to use Transactions with Stored Procedure, wrap the code around with BEGIN TRAN and COMMIT TRAN. The example is as following. CREATE PROCEDURE TestSPTran AS BEGIN TRAN INSERT INTO TABLE1 (ID) VALUES (11) INSERT INTO TABLE2 (ID) VALUES ('b') INSERT INTO TABLE3 (ID) VALUES (33) COMMIT GO -- Execute SP EXEC TestSPTran GO -- Check the Values in Tables SELECT * FROM TABLE1; SELECT * FROM TABLE2; SELECT * FROM TABLE3; GO -- Clean up DROP TABLE Table1 DROP TABLE Table2 DROP TABLE Table3 GO In this case, there will be no entries in any part of the table. What is your opinion about this blog post? Please leave your comments about it here. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Stored Procedure, SQL Tips and Tricks, T SQL, Technology

Read the article

The penultimate audit trigger framework

- by Piotr Rodak

So, it’s time to see what I came up with after some time of playing with COLUMNS_UPDATED() and bitmasks. The first part of this miniseries describes the mechanics of the encoding which columns are updated within DML operation. The task I was faced with was to prepare an audit framework that will be fairly easy to use. The audited tables were to be the ones directly modified by user applications, not the ones heavily used by batch or ETL processes. The framework consists of several tables and procedures...(read more)

Read the article

Advanced TSQL Tuning: Why Internals Knowledge Matters

- by Paul White

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. Query tuning is not complete as soon as the query returns results quickly in the development or test environments. In production, your query will compete for memory, CPU, locks, I/O and other resources on the server. Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data. In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row. The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type. A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL, CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL, CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table. The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results. This seems slow, considering that the table only has 50,000 rows. Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time. Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring. The script below monitors STATISTICS IO output and the amount of tempdb used by the test query. We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB. The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row. With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first). This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts. In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted. SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed. Sorts typically require a very large number of comparisons, so this is usually a very effective optimization. One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself. SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use. The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory. Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant. The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources. More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1). Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB. The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file. Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case. In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms. If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much. Why is the data size so much smaller? The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored. By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output. You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead. SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows. The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes. The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page. There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option. By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row. SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage. You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now. This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query). SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant. No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers. Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’. Why? Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed. SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages. This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this? Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need. 245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory. Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily? That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column. That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal. I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator. Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted. We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need. The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb. The small remaining inefficiency is in reading the id column values from the clustered primary key index. As a clustered index, it contains all the in-row data at its leaf. The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead. The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure. This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only. This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data. The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings. It isn’t about the internal differences between TEXT and MAX data types either. It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load. No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance. Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows. 5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is! If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb. Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough. Tuning for logical reads and adding covering indexes is not enough. If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on. The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008. They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator. Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

Search Results

Search found 8362 results on 335 pages for 'rainbow tables'.

Page 70/335 | < Previous Page | 66 67 68 69 70 71 72 73 74 75 76 77 | Next Page >

- by Nate Bross

- by Tim Mahy

- by Mahmoud

- by cneeds

- by Tony Brandner

- by Jian Lin

- by uki

- by Carson

- by Jian Lin

- by Flavius

- by user1553381

- by Michael

- by Spidermain50

- by TMan

- by John R

- by chandru_cp

- by Karthick

- by Soo

- by user1695788

- by Most Valuable Yak (Rob Volk)

- by Rob Stevenson-Leggett

- by Maria Colgan

- by pinaldave

- by Piotr Rodak

- by Paul White

< Previous Page | 66 67 68 69 70 71 72 73 74 75 76 77 | Next Page >