Search Results

Search found 1220 results on 49 pages for 'nathan pk'.

  • Utilizing a Queue

    - by Nathan
    I'm trying to store records of transactions all together and by category for the last 1, 7, 30 or 360 days. I've tried a couple of things, but they've brutally failed. I had an idea of using a queue with 360 values, one for each day, but I don't know enough about queues to figure out how that would work. Input will be an instance of this class: class Transaction { public string TotalEarned { get; set; } public string TotalHST { get; set; } public string TotalCost { get; set; } public string Category { get; set; } } New transactions can occur at any time during the day, and there could be as many as 15 transactions in a day. My program is using a plain text file as external storage, but how I load it depends on how I decide to store this data. What would be the best way to do this?
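
One way to read the "queue with 360 values" idea is as a rolling window of per-day buckets: new days are appended at the back and days that fall out of the window are dropped from the front. The sketch below is in Python rather than C#, and every name in it (`RollingLedger`, `add`, `earned`) is hypothetical. It also assumes the amounts are parsed into `Decimal` values instead of being kept as strings, and that transactions arrive in date order.

```python
from collections import deque
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass
class Transaction:
    total_earned: Decimal
    total_hst: Decimal
    total_cost: Decimal
    category: str


class RollingLedger:
    """Keeps one bucket of transactions per day for the last `days` days."""

    def __init__(self, days=360):
        self.days = days
        self.buckets = deque()  # (date, [Transaction, ...]), oldest first

    def add(self, day, tx):
        # Assumption: `day` never goes backwards between calls.
        if not self.buckets or self.buckets[-1][0] != day:
            self.buckets.append((day, []))
        self.buckets[-1][1].append(tx)
        # Drop buckets that have fallen out of the window.
        cutoff = day - timedelta(days=self.days - 1)
        while self.buckets and self.buckets[0][0] < cutoff:
            self.buckets.popleft()

    def earned(self, today, window, category=None):
        """Sum of total_earned over the last `window` days, optionally per category."""
        cutoff = today - timedelta(days=window - 1)
        total = Decimal("0")
        for day, txs in self.buckets:
            if cutoff <= day <= today:
                for tx in txs:
                    if category is None or tx.category == category:
                        total += tx.total_earned
        return total
```

With at most 15 transactions a day the window never holds more than a few thousand records, so summing on demand like this is likely cheap enough; keeping running totals per bucket would be a possible refinement.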

  • Query for multiple joins

    - by Shailaja
    I have 3 tables named dataset, dataelem and transformdataelem, with column names as below:

    main.Dataset
    ------------
    datasetID (PK)
    applicationID

    main.Dataelem
    -------------
    dataelemID (PK)
    datasetID (FK)
    dataelemname
    biztermID

    main.Transformdataelem
    ----------------------
    OutputdataelemID
    InputdataelemID

    My requirement is: All tables are referenced. Extract all the dataelemID rows from the dataelem table where the applicationID of the dataset table is equal to 1044 and biztermID is null. Then the resulting dataelemIDs from the above query should be matched with the OutputdataelemID of the Transformdataelem table, and we should get the respective InputdataelemIDs. Finally, with these matched InputdataelemIDs we should get the dataelemnames from the dataelem table.
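
The requirement above can be read as a chain of joins in which dataelem is used twice, once for the output elements and once for the input elements. The following is a minimal sketch using Python's built-in `sqlite3` module and invented sample rows; the original question does not say which database is used, so the dialect, the aliases (`out_de`, `in_de`) and the helper names are assumptions.

```python
import sqlite3

QUERY = """
SELECT out_de.dataelemID  AS output_id,
       t.InputdataelemID  AS input_id,
       in_de.dataelemname AS input_name
FROM dataset AS ds
JOIN dataelem AS out_de ON out_de.datasetID = ds.datasetID
JOIN transformdataelem AS t ON t.OutputdataelemID = out_de.dataelemID
JOIN dataelem AS in_de ON in_de.dataelemID = t.InputdataelemID
WHERE ds.applicationID = ?
  AND out_de.biztermID IS NULL
ORDER BY output_id, input_id
"""


def build_demo_db():
    """Creates an in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dataset (datasetID INTEGER PRIMARY KEY, applicationID INTEGER);
        CREATE TABLE dataelem (
            dataelemID INTEGER PRIMARY KEY,
            datasetID INTEGER REFERENCES dataset(datasetID),
            dataelemname TEXT,
            biztermID INTEGER
        );
        CREATE TABLE transformdataelem (OutputdataelemID INTEGER, InputdataelemID INTEGER);
        INSERT INTO dataset VALUES (1, 1044), (2, 999);
        INSERT INTO dataelem VALUES
            (10, 1, 'out_a', NULL),
            (11, 1, 'out_b', 5),
            (12, 2, 'out_c', NULL),
            (20, 1, 'in_x', 7),
            (21, 2, 'in_y', 8);
        INSERT INTO transformdataelem VALUES (10, 20), (10, 21), (11, 20), (12, 21);
    """)
    return conn


def input_names(conn, application_id):
    """Returns (output_id, input_id, input_name) rows for one application."""
    return conn.execute(QUERY, (application_id,)).fetchall()
```

Calling `input_names(build_demo_db(), 1044)` returns only the inputs feeding element 10, because element 11 has a biztermID and element 12 belongs to another application.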

  • How can I remove the head of a main function?

    - by Nathan McDavitt-Van Fleet
    I am trying to move some code from a separate binary and have it inside my main program. Unfortunately I can't mimic the initialization variables for the main function. How can I create argc and argv by hand? Can someone give me some example assignments? Since main looks like this: int main(int argc, char *argv[]) I figured I could assign them like this: int argc=1; char *argv[0]="Example"; But it doesn't work. Can anyone tell me how this might be done?

  • Most efficient way to make an activity log

    - by Nathan
    I am making a "recent activity" tab for profiles on my site, and I am also going to have a log for moderators to see everything that happens on the site. This would require making an activity log of some sort. I just don't know what would be better. I have 2 options:

    1. Make a table called "activity" and then every time someone does something, add a record to it with the type of action, user id, timestamp, etc. Problem: the table could get very long.
    2. Join all 3 tables (questions, answers, answer_comments) and then somehow show all these on the page in the order in which the action was taken. Problem: this would be extremely hard because I have no clue how I could make it say "John commented on an answer on Question Title Here" by just joining 3 tables.

    Does anyone know of a better way of making an activity log in this situation? I am using PHP and MySQL. If this is either too inefficient or too hard I will probably just forget the Recent Activity tab on profiles, but I still need an activity log for moderators. Here's some SQL that I started making for option 2, but this would not work because there is no way of detecting whether the action is a comment, question, or answer when I echo the info in a while loop:

    SELECT q.*, a.*, ac.* FROM questions q JOIN answers a ON a.questionid = q.qid JOIN answer_comments ac ON ac.answerid = a.ans_id WHERE q.user = $userid AND a.userid = $userid AND ac.userid = $userid ORDER BY q.created DESC, a.created DESC, ac.created DESC

    Thanks in advance for any help!

  • Django: Why Doesn't the Current URL Match any Patterns in urls.py

    - by austin_sherron
    I've found a few questions here related to my issue, but I haven't found anything that has helped me resolve my issue. I'm using Python 2.7.5 and Django 1.8.dev20140627143448. I have a view that's interacting with my database to delete objects, and it takes two arguments in addition to a request: def delete_data_item(request, dataclass_id, dataitem_id): form = AddDataItemForm(request.POST) data_set = get_object_or_404(DataClass, pk=dataclass_id) context = {'data_set': data_set, 'form': form} data_item = get_object_or_404(DataItem, pk=dataitem_id) data_item.delete() data_set.save() return HttpResponseRedirect(reverse('detail', args=(dataclass_id,))) The URL in myapp.urls.py looks something like this: url(r'^(?P<dataclass_id>[0-9]+)/(?P<dataitem_id>[0-9]+)/delete_data_item/$', views.delete_data_item, name='delete_data_item') and the portion of my template relevant to the view is: <a href="{% url 'delete_data_item' data_set.id data_item.id %}">DELETE</a> Whenever I click on the DELETE link, django tells me that the request URL: http://127.0.0.1:8000/myapp/5/%7B%%20url%20'delete_data_item'%20data_set.id%20data_item.id%20%%7D doesn't match any of my URL patterns. What am I missing? The URL on which the DELETE links exist is myapp/(<dataclass_id>[0-9]+)/
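
Percent-decoding the failing request URL makes the symptom easier to see. The sketch below only uses `urllib.parse.unquote` from the standard library; the helper name `decode_path` is made up for illustration.

```python
from urllib.parse import unquote

# Path portion of the request URL quoted in the question.
REQUEST_PATH = "/myapp/5/%7B%%20url%20'delete_data_item'%20data_set.id%20data_item.id%20%%7D"


def decode_path(path):
    """Undo percent-encoding so the literal characters become visible."""
    return unquote(path)


print(decode_path(REQUEST_PATH))
```

The decoded path ends in the literal text of the `{% url ... %}` tag, which suggests the tag reached the browser unrendered and was then treated as a relative link. Why it was not rendered cannot be determined from the snippet alone.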

  • <span> containing 3 overlapping images has 3x the necessary width

    - by Nathan Parrish
    Hi guys, I have a <span> element containing three overlapping images. Inspecting the element in Chrome shows this: <span id="span1"> <img id="img1" src="images/progressbar.gif" width="120" style="position: relative; z-index: 3;"> <img id="img2" src="images/progressbar.gif" style="width: 120px; height: 12px; position: relative; left: -120px; z-index: 2;"> <img id="img3" src="images/progressbar.gif" style="width: 120px; height: 12px; position: relative; left: -240px; z-index: 1;"> </span> The important point is that the second two images are given a relative position, shifting them to the left so they perfectly overlap the first. But the span itself is still 360 pixels wide (3 x 120 pixels per image). So how can I achieve this effect while keeping the span width tightly bounded around the images? Thanks!

  • Controlling processes from Python

    - by Nathan
    Hi, I want to control several subprocesses of the same type from Python (I am under Linux). I want to:

    - Start them.
    - Stop them.
    - Ask if they are still running.

    I can start a process with spawnl and get the pid. Using this pid I can stop it with kill. And I am sure there is also a way to ask if it is running with the pid. The problem is, what if the following happens:

    - I start a process and remember the pid.
    - The process ends without me noticing, and another completely different process starts and gets assigned the same pid.
    - I attempt to kill my process, but I kill a completely different one.

    What is the better way to start and control processes in Python? Thanks!
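
One commonly used alternative to tracking raw pids is the standard library's `subprocess.Popen`, which keeps a handle to the child. On POSIX systems a terminated child's pid stays reserved until its parent collects the exit status, so as long as the process is only polled and signalled through its `Popen` object, the pid-reuse scenario above should not arise. The sketch below is a minimal illustration; the helper names and the sleeping worker command are placeholders.

```python
import subprocess
import sys


def start_worker(seconds):
    """Starts a stand-in worker: a child Python process that just sleeps."""
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )


def is_running(proc):
    """poll() returns None while the child is alive, else its exit code."""
    return proc.poll() is None


def stop(proc, timeout=5.0):
    """Asks the child to terminate, escalating to kill() if it does not exit."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode
```

Several workers can be managed by keeping the returned `Popen` objects in a list or dict and calling `is_running` and `stop` on each of them.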

  • Join using combined conditions on one join table

    - by Nathan Wienert
    I have a join table joining songs to genres. The table has a 'source' column that's used to identify where the genre was found. Genres are found from blogs, artists, tags, and posts. So,

    songs | song_genre                | genres
    id    | song_id, source, genre_id | id

    What I want to build is a song SELECT query that works something like this, given I already have a genre_id:

    IF exists song_genre with source='artist' AND a song_genre with source='blog'
    OR exists song_genre with source='artist' AND a song_genre with source='post'
    OR exists song_genre with source='tag'

    I was going to do it with a bunch of joins, but I'm sure I'm not doing it very well. Using Postgres 9.1.

  • Most efficient way to maintain a 'set' in SQL Server?

    - by SEVEN YEAR LIBERAL ARTS DEGREE
    I have ~2 million rows or so of data, each row with an artificial PK, and two Id fields (so: PK, ID1, ID2). I have a unique constraint (and index) on ID1+ID2. I get two sorts of updates, both with a distinct ID1 per update:

    - 100-1000 rows of all-new data (ID1 is new)
    - 100-1000 rows of largely, but not necessarily completely, overlapping data (ID1 already exists, maybe new ID1+ID2 pairs)

    What's the most efficient way to maintain this 'set'? Here are the options as I see them:

    - Delete all the rows with ID1, insert all the new rows (yikes)
    - Query all the existing rows from the set of new data ID1+ID2, only insert the new rows
    - Insert all the new rows, ignore inserts that trigger unique constraint violations

    Any thoughts?
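
The third option (insert everything and let the unique constraint reject duplicates) can be sketched as follows. This uses SQLite through Python's `sqlite3` module, not SQL Server, so the `INSERT OR IGNORE` statement is SQLite syntax and the table and function names are invented; SQL Server would need its own equivalent, which is not shown here.

```python
import sqlite3


def make_db():
    """In-memory stand-in for the table: artificial PK plus a unique (id1, id2) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE pairs (
            pk  INTEGER PRIMARY KEY,
            id1 INTEGER NOT NULL,
            id2 INTEGER NOT NULL,
            UNIQUE (id1, id2)
        )
    """)
    return conn


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM pairs").fetchone()[0]


def merge_pairs(conn, pairs):
    """Inserts the batch, silently skipping pairs that already exist.

    Returns the number of rows that were actually new.
    """
    before = count_rows(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO pairs (id1, id2) VALUES (?, ?)", pairs
    )
    conn.commit()
    return count_rows(conn) - before
```

Whether this beats querying first depends on the engine and on how much of each batch overlaps, so it would be worth measuring on real data.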

  • Storing Values in columns alphabetic?

    - by Mdillion
    Is there any benefit to storing content alphabetically in columns? Maybe it makes lookups faster? If yes, then when I add new lookup values to my tables, do I need to rebuild the PK for the lookup values to fit in the new text? Say a table like this: City_tbl city_id: example: 1120 City_name: example: New York. If I need to add Chicago to it, do I add it at the bottom of the list with the next ID, which may be 2000, or do I insert it after the city in alphabetic order, which would mean I need to update the PK Id of all following IDs by 1? The only benefit I know about is that when I have to manually add lookup values without querying the database, I can quickly check the lookup value list for existing items with ease. But I'm not sure if it may make lookups faster or something if the system knows the text is in alphabetic order.

  • Change jQuery slider option dynamically based on window width

    - by Nathan
    I would like to change a jQuery option based on window width (on load as well as on resize). I've found solutions close to what I need, but I don't understand jQuery or javascript enough to customize them for my needs. Here's my jQuery code: <script type="text/javascript"> var tpj = jQuery; tpj.noConflict(); tpj(document).ready(function () { if (tpj.fn.cssOriginal != undefined) tpj.fn.css = tpj.fn.cssOriginal; tpj('#rev_slider_1_1').show().revolution({ delay: 5000, startwidth: 1920, startheight: 515, hideThumbs: 200, thumbWidth: 100, thumbHeight: 50, thumbAmount: 4, navigationType: "bullet", navigationArrows: "verticalcentered", navigationStyle: "navbar", touchenabled: "on", onHoverStop: "off", navOffsetHorizontal: 0, navOffsetVertical: 20, shadow: 0, fullWidth: "on" }); }); //ready </script> I want to change the startheight based on window width. If the window width is above 1280 I would like the value for the height to be 515, and if it is below 1280 I would like the height to be 615 and if the width is less than 480 make the height 715. With help from another post I am able to change the css I need using this script: $(window).on('load resize', function () { var w = $(window).width(); $("#rev_slider_1_1 #rev_slider_1_1_wrapper") .css('max-height', w > 1280 ? 515 : w > 480 ? 615 : 715); }); But I need to also change the jQuery startheight value on the fly. Can someone help? Thanks!

  • Next Identity Key LINQ + SQL Server

    - by user569347
    To represent our course tree structure in our Linq Dataclasses we have 2 columns that could potentially be the same as the PK. My problem is that if I want to Insert a new record and populate 2 other columns with the PK that was generated there is no way I can get the next identity and stop conflict with other administrators who might be doing the same insert at the same time. Case: A Leaf node has right_id and left_id = itself (prereq_id) **dbo.pre_req:** prereq_id left_id right_id op_id course_id is_head is_coreq is_enforced parent_course_id and I basically want to do this: pre_req rec = new pre_req { left_id = prereq_id, right_id = prereq_id, op_id = 3, course_id = query.course_id, is_head = true, is_coreq = false, parent_course_id = curCourse.course_id }; db.courses.InsertOnSubmit(rec); try { db.SubmitChanges(); } Any way to solve my dilemma? Thanks!

  • Performance implications of using a variable versus a magic number

    - by Nathan
    I'm often confused by this. I've always been taught to name numbers I use often using variables or constants, but if it reduces the efficiency of the program, should I still do it? Here's an example: private int CenterText(Font font, PrintPageEventArgs e, string text) { int recieptCenter = 125; int stringLength = Convert.ToInt32(e.Graphics.MeasureString(text, font)); return recieptCenter - stringLength / 2; } The above code is using named variables, but runs slower than this code: private int CenterText(Font font, PrintPageEventArgs e, string text) { return 125 - Convert.ToInt32(e.Graphics.MeasureString(text, font) / 2); } In this example, the difference in execution time is minimal, but what about in larger blocks of code?

  • jQuery: Load body of page into variable

    - by Nathan G.
    I'm using jQuery to load the result of a PHP script into a variable. The script is passed something that the user typed with a GET request. I want to take just what the script spat out into its <body> tag. Here's what I've tried: JS: function loader() { var typed = $('#i').val(); //get what user typed in $.get("script.php", {i: typed}, function(loaded) {dataloaded = loaded;}); alert($(dataloaded).find('body')) } But it just displays [object Object]. How can I get a useful value that is just the contents of the body of a loaded page? I know the PHP works, I just need the JS. The script echoes something like 1!!2 (two numbers separated by two exclamation points). Thanks!

  • simple query Delete records in a table based on count logic

    - by user1905941
    I have a table with a pk and a status column, which has the values 'Y', 'N' and 'NULL'. Query: get the count of records with the status column set to 'Y'; if this count exceeds 1% of the total count of records then don't delete, else delete the records in the table. I tried it like this: Declare v_count Number; v_count1 Number; BEGIN v_count := select count(*) from temp; v_count1 := select count(*) from temp where status = 'Y' ; v_count := v_count + ((0.1) * (v_count)) if (v_count1 > v_count) { insert into temp1 values(pk,status) } else { Delete from temp ; } END;

  • Is it possible to turn a normal date into an ISO 8601 time format?

    - by Nathan
    I am trying to turn this type of format of the date: Thursday, November 10th, 2011 at 10:37 PM Into an ISO 8601 format (with PHP). How can I do this? I've tried: date("c", $row2['time']) Obviously, that's not correct, because the timeago jQuery plugin is saying "41 years ago", and that is definitely not 41 years ago. Is it not possible to turn that kind of date into the ISO 8601 format? I've tried searching for this and I haven't found any solutions on how to turn this format into ISO 8601.
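
The conversion itself is a parse-then-format step. The sketch below shows the idea in Python rather than PHP, using only `re` and `datetime`; the function name `to_iso8601` is made up, the result carries no timezone offset, and the month, weekday and AM/PM names are assumed to be English. The "41 years ago" symptom would be consistent with the text being interpreted as timestamp 0 (1970) instead of being parsed, though the snippet does not confirm that.

```python
import re
from datetime import datetime


def to_iso8601(text):
    """Parses e.g. 'Thursday, November 10th, 2011 at 10:37 PM' into ISO 8601."""
    # Strip ordinal suffixes such as 'st', 'nd', 'rd', 'th' after the day number.
    cleaned = re.sub(r"(\d+)(?:st|nd|rd|th)\b", r"\1", text)
    parsed = datetime.strptime(cleaned, "%A, %B %d, %Y at %I:%M %p")
    return parsed.isoformat()


print(to_iso8601("Thursday, November 10th, 2011 at 10:37 PM"))
# prints 2011-11-10T22:37:00
```

If the original timestamps are also stored in numeric form, formatting those directly is usually more robust than re-parsing display text.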

  • Different characters take more/less data?

    - by Nathan
    I am working on a personal project and I'm wondering if certain characters take up more data in a text file than others. I need to choose a character to separate items in my file, but if a 0 uses fewer bytes than a ! or something, it would be best to use that. I know all characters have an ASCII value, but would a lower ASCII value mean the character can be stored in fewer bytes? This might be an incredibly stupid question, but I don't see any information on the topic online so I came here to check. Thanks!
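
How many bytes a character occupies depends on the file's encoding, not on how low its code is within the ASCII range. A small Python check makes this concrete; `encoded_size` is a hypothetical helper, and the encodings shown are only examples.

```python
def encoded_size(character, encoding="utf-8"):
    """Number of bytes the character occupies in the given encoding."""
    return len(character.encode(encoding))


for ch in ["0", "!", "|", "é", "€"]:
    print(repr(ch), encoded_size(ch, "utf-8"), encoded_size(ch, "utf-16-le"))
```

In UTF-8 every ASCII character, including `0` and `!`, takes one byte, so any ASCII separator costs the same; only characters outside ASCII take more.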

  • Python 3.1: Syntax Error for Everything! (Mac OS X)

    - by Nathan G.
    I updated to Python 3.1.3 (I've got OS X 10.6). If I type python in Terminal, I get a working 2.6.1 environment. If I type python3 in Terminal, I get a 3.1.3 environment. Everything looks fine until I do something. If I try to run print "hello", I get a syntax error. This problem is the same in IDLE. I tried deleting everything for 3.1 and then reinstalling, but it hasn't worked. Ideas? Thanks in advance!
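
In Python 3, `print` is a function rather than a statement, so `print "hello"` is rejected at compile time while `print("hello")` is accepted. The sketch below checks both spellings with the built-in `compile`; the helper name `is_valid_python3` is hypothetical, and it has to be run under a Python 3 interpreter to show the difference.

```python
def is_valid_python3(source):
    """True if the source compiles under the running (Python 3) interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


print(is_valid_python3('print "hello"'))   # Python 2 statement form
print(is_valid_python3('print("hello")'))  # Python 3 function form
```

This suggests the 3.1 installation itself may be fine and that the syntax error comes from Python 2 style code.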

  • Calling a constructor from method within the same class

    - by Nathan
    I'm new to Java and I'm learning about creating object classes. One of my homework assignments requires that I call the constructor at least once within a method of the same object class. I'm getting an error that says The method DoubleMatrix(double[][]) is undefined for the type DoubleMatrix Here's my constructor: public DoubleMatrix(double[][] tempArray) { // Declaration int flag = 0; int cnt; // Statement // check to see if doubArray isn't null and has more than 0 rows if(tempArray == null || tempArray.length < 0) { flag++; } // check to see if each row has the same length if(flag == 0) { for(cnt = 0; cnt <= tempArray.length - 1 || flag != 1; cnt++) { if(tempArray[cnt + 1].length != tempArray[0].length) { flag++; } } } else if(flag == 1) { makeDoubMatrix(1, 1);// call makeDoubMatrix method } }// end constructor 2 Here's the method where I try and call the constructor: public double[][] addMatrix(double[][] tempDoub) { // Declaration double[][] newMatrix; int rCnt, cCnt; //Statement // checking to see if both are of same dimension if(doubMatrix.length == tempDoub.length && doubMatrix[0].length == tempDoub[0].length) { newMatrix = new double[doubMatrix.length][doubMatrix[0].length]; // for loop to add matrix to a new one for(rCnt = 0; rCnt <= doubMatrix.length; rCnt++) { for(cCnt = 0; cCnt <= doubMatrix.length; cCnt++) { newMatrix[rCnt][cCnt] = doubMatrix[rCnt][cCnt] + tempDoub[rCnt][cCnt]; } } } else { newMatrix = new double[0][0]; DoubleMatrix(newMatrix) } return newMatrix; }// end addMatrix method Can someone point me in the right direction and explain why I'm getting an error?

  • Windows 7 ODBC Text Driver

    - by nute
    Some software requires me to set up an ODBC Text Driver. In the Windows 7 control panel ODBC Data Source Administrator, the only driver available is "SQL Server". How do I find/download/install a TEXT driver? Thanks, Nathan

  • LINQ to SQL and missing Many to Many EntityRefs

    - by Rick Strahl
    Ran into an odd behavior today with a many to many mapping of one of my tables in LINQ to SQL. Many to many mappings aren’t transparent in LINQ to SQL and it maps the link table the same way the SQL schema has it when creating one. In other words LINQ to SQL isn’t smart about many to many mappings and just treats it like the 3 underlying tables that make up the many to many relationship. Iain Galloway has a nice blog entry about Many to Many relationships in LINQ to SQL.

    I can live with that – it’s not really difficult to deal with this arrangement once mapped, especially when reading data back. Writing is a little more difficult as you do have to insert into two entities for new records, but nothing that can’t be handled in a small business object method with a few lines of code.

    When I created a database I’ve been using to experiment around with various different OR/Ms recently I found that for some reason LINQ to SQL was completely failing to map even to the linking table. As it turns out there’s a good reason why it fails, can you spot it below? (read on :-})

    Here is the original database layout: There’s an items table, a category table and a link table that holds only the foreign keys to the Items and Category tables for a typical M->M relationship.

    When these three tables are imported into the model they *look* correct – I do get the relationships added (after modifying the entity names to strip the prefix). The relationship looks perfectly fine, both in the designer as well as in the XML document:

        <Table Name="dbo.wws_Item_Categories" Member="ItemCategories">
          <Type Name="ItemCategory">
            <Column Name="ItemId" Type="System.Guid" DbType="uniqueidentifier NOT NULL" CanBeNull="false" />
            <Column Name="CategoryId" Type="System.Guid" DbType="uniqueidentifier NOT NULL" CanBeNull="false" />
            <Association Name="ItemCategory_Category" Member="Categories" ThisKey="CategoryId" OtherKey="Id" Type="Category" />
            <Association Name="Item_ItemCategory" Member="Item" ThisKey="ItemId" OtherKey="Id" Type="Item" IsForeignKey="true" />
          </Type>
        </Table>
        <Table Name="dbo.wws_Categories" Member="Categories">
          <Type Name="Category">
            <Column Name="Id" Type="System.Guid" DbType="UniqueIdentifier NOT NULL" IsPrimaryKey="true" IsDbGenerated="true" CanBeNull="false" />
            <Column Name="ParentId" Type="System.Guid" DbType="UniqueIdentifier" CanBeNull="true" />
            <Column Name="CategoryName" Type="System.String" DbType="NVarChar(150)" CanBeNull="true" />
            <Column Name="CategoryDescription" Type="System.String" DbType="NVarChar(MAX)" CanBeNull="true" />
            <Column Name="tstamp" AccessModifier="Internal" Type="System.Data.Linq.Binary" DbType="rowversion" CanBeNull="true" IsVersion="true" />
            <Association Name="ItemCategory_Category" Member="ItemCategory" ThisKey="Id" OtherKey="CategoryId" Type="ItemCategory" IsForeignKey="true" />
          </Type>
        </Table>

    However when looking at the code generated these navigation properties (also on Item) are completely missing:

        [global::System.Data.Linq.Mapping.TableAttribute(Name="dbo.wws_Item_Categories")]
        [global::System.Runtime.Serialization.DataContractAttribute()]
        public partial class ItemCategory : Westwind.BusinessFramework.EntityBase
        {
            private System.Guid _ItemId;
            private System.Guid _CategoryId;

            public ItemCategory()
            {
            }

            [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="_ItemId", DbType="uniqueidentifier NOT NULL")]
            [global::System.Runtime.Serialization.DataMemberAttribute(Order=1)]
            public System.Guid ItemId
            {
                get { return this._ItemId; }
                set
                {
                    if ((this._ItemId != value))
                    {
                        this._ItemId = value;
                    }
                }
            }

            [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="_CategoryId", DbType="uniqueidentifier NOT NULL")]
            [global::System.Runtime.Serialization.DataMemberAttribute(Order=2)]
            public System.Guid CategoryId
            {
                get { return this._CategoryId; }
                set
                {
                    if ((this._CategoryId != value))
                    {
                        this._CategoryId = value;
                    }
                }
            }
        }

    Notice that the Item and Category association properties which should be EntityRef properties are completely missing. They’re there in the model, but the generated code – not so much. So what’s the problem here?

    The problem – it appears – is that LINQ to SQL requires primary keys on all entities it tracks. In order to support tracking – even of the link table entity – the link table requires a primary key. Real obvious ain’t it, especially since the designer happily lets you import the table and even shows the relationship and implicitly the related properties.

    Adding an Id field as a Pk to the database and then importing results in this model layout: which properly generates the Item and Category properties into the link entity.

    It’s ironic that LINQ to SQL *requires* the PK in the middle – the Entity Framework requires that a link table have *only* the two foreign key fields in a table in order to recognize a many to many relation. EF actually handles the M->M relation directly without the intermediate link entity unlike LINQ to SQL.

    [updated from comments – 12/24/2009] Another approach is to set up both ItemId and CategoryId as a composite primary key in the database, which shows up in LINQ to SQL like this: This also works for creating the Category and Item fields in the ItemCategory entity. Ultimately this is probably the best approach as it also guarantees uniqueness of the keys and so helps in database integrity.

    It took me a while to figure out WTF was going on here – lulled by the designer to think that the properties should be there when they were not. It’s actually a well documented feature of L2S that each entity in the model requires a Pk but of course that’s easy to miss when the model viewer shows it to you and even the underlying XML model shows the Associations properly. This is one of the issues with L2S of course – you have to play by its rules and once you hit one of those rules there’s no way around them – you’re stuck with what it requires which in this case meant changing the database.

    © Rick Strahl, West Wind Technologies, 2005-2010. Posted in ADO.NET, LINQ

  • Toorcon 15 (2013)

    - by danx
    The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo

    - Introduction to Toorcon 15 (2013)
    - A Tale of One Software Bypass of MS Windows 8 Secure Boot
    - Breaching SSL, One Byte at a Time
    - Running at 99%: Surviving an Application DoS
    - Security Response in the Age of Mass Customized Attacks
    - x86 Rewriting: Defeating RoP and other Shinanighans
    - Clowntown Express: interesting bugs and running a bug bounty program
    - Active Fingerprinting of Encrypted VPNs
    - Making Attacks Go Backwards
    - Mask Your Checksums—The Gorry Details
    - Adventures with weird machines thirty years after "Reflections on Trusting Trust"

    Introduction to Toorcon 15 (2013)

    Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and that interested me enough to write about them. Be aware that I may have misrepresented the speakers' remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information.

    A Tale of One Software Bypass of MS Windows 8 Secure Boot
    Andrew Furtak and Oleksandr Bazhaniuk

    Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak

    Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference.

    MS Windows 8 Secure Boot Overview

    UEFI (Unified Extensible Firmware Interface) is the interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace the bootloader (bootx64.efi, bootmgfw.efi). Once it is replaced, malware can modify the kernel. It is trivial to replace the bootloader. Today there are many legacy bootkits; UEFI replaces most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones), but you can't do that for PCs; PCs use writable memory with signatures.

    - DXE core verifies the UEFI boot loader(s)
    - OS Loader (winload.efi, winresume.efi) verifies the OS kernel

    A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db) and a "forbidden" database (dbx). These hold X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (they can't be accessed once the OS starts, which uses Run-Time (RT) services). The root cert uses RSA-2048 public keys and PKCS#7 format signatures.

    - SecureBoot: enable/disable image signature checks
    - SetupMode: update keys, self-signed keys, and secure boot variables
    - CustomMode: allows updating keys

    Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation.

    Attacking MS Windows 8 Secure Boot

    Secure Boot does NOT protect from physical access. It can be disabled from the console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations, which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly:

    - allow only signed UEFI firmware updates
    - protect UEFI firmware from direct modification in flash memory
    - protect FW update components
    - program the SPI controller securely
    - protect secure boot policy settings in NVRAM
    - protect the runtime API
    - disable the compatibility support module, which allows unsigned legacy

    An attacker can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If the PK is not found, the FW enters setup mode with Secure Boot turned off. The TPM can also be exploited in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future.

    Recommendations:

    - allow only signed updates
    - protect UEFI fw in ROM
    - protect EFI variable store in ROM

    Breaching SSL, One Byte at a Time
    Yoel Gluck and Angelo Prado

    Angelo Prado and Yoel Gluck, Salesforce.com

    CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME sends requests with every possible character and measures the ciphertext length. It looks for the plaintext that compresses the most, and so finds the cookie one byte at a time. SSL compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thought the SSL compression problem was fixed, but it isn't. The speakers convinced CERT that it wasn't fixed, and a CVE was issued.

    BREACH, breachattack.com

    BREACH exploits the SSL response body (Accept-Encoding request header, Content-Encoding response header). It takes advantage of the fact that the response body is still compressed at the HTTP level (gzip), even when SSL compression is turned off. BREACH needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. It can use compression to guess contents one byte at a time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information.

    Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include:

    - lookahead (guess 2 or 3 characters at a time instead of 1 character); expensive
    - rollback to last known conflict
    - check compression ratio
    - can brute-force the first 3 "bootstrap" characters, if needed (expensive)
    - block ciphers hide exact plaintext length; the solution is to align the response in advance to the block size

    Mitigations:

    - length: use variable padding
    - secrets: dynamic CSRF tokens per request
    - secret: change over time
    - separate secret to input-less servlets

    Future work:

    - better understand DEFLATE/GZIP
    - HTTPS extensions

    Running at 99%: Surviving an Application DoS
    Ryan Huber

    Ryan Huber, Risk I/O

    Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited to handling a large number of connections, but one can put something in front of it. One can also use Apache alternatives, such as nginx.

    How to identify malicious hosts:

    - short, sudden web requests
    - user-agent is obvious (curl, python)
    - same URL requested repeatedly
    - no web page referer (not normal)
    - hidden links: hide a link and see if a bot gets it
    - restricted access if not your geo IP (unless the website is global)
    - missing common headers in request
    - regular timing
    - first seen IP at beginning of attack
    - count requests per host (usually a very large number)

    Use of captcha can mitigate attacks, but you'll lose a lot of genuine users.
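
A few of the host-identification signals from the DoS talk above (obvious user-agent, missing referer, very large request count per host) can be combined in a small scoring pass over request records. This is only an illustrative Python sketch, not how Bouncer or any tool from the talk works; the record layout, the agent list, and both thresholds are assumptions.

```python
from collections import Counter, defaultdict

# Substrings treated as "obviously scripted" user agents (assumed list).
SCRIPTED_AGENTS = ("curl", "python", "wget")


def suspicious_hosts(requests, max_requests=100, min_signals=2):
    """Flags IPs that trip at least `min_signals` of the simple heuristics.

    Each request is a dict with 'ip', 'user_agent' and 'referer' keys.
    Returns {ip: sorted list of triggered signal names}.
    """
    requests = list(requests)
    signals = defaultdict(set)
    counts = Counter(r["ip"] for r in requests)

    for r in requests:
        agent = r.get("user_agent", "").lower()
        if any(name in agent for name in SCRIPTED_AGENTS):
            signals[r["ip"]].add("scripted user-agent")
        if not r.get("referer"):
            signals[r["ip"]].add("no referer")

    for ip, n in counts.items():
        if n > max_requests:
            signals[ip].add("high request count")

    return {ip: sorted(s) for ip, s in signals.items() if len(s) >= min_signals}
```

Requiring two or more signals is a deliberate choice here: a single missing referer is common among genuine visitors, so on its own it would produce many false positives.
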
Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. 
Characteristics of mass malware: potent, resilient, relatively low cost. Technical characteristics: multiple OS, multiple payloads, multiple scenarios, multiple languages, obfuscation. Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection. There's plenty of evidence that exploit development has Project Manager bureaucracy. From the malware, they infer edicts to: support all versions of Reader; support all versions of Windows; support all versions of Flash; support all browsers; write large, complex, difficult-to-maintain code (8750 lines of JavaScript, for example). Exploits have "loose coupling" of multiple versions of software (Adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. It also allows exploits of more obscure software/OS/browsers and obscure versions. They gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and JavaScript.
Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks.

x86 Rewriting: Defeating RoP and other Shinanighans
Richard Wartell

The attack vector we are addressing here is as follows. First, some malware causes a buffer overflow. The malware has no program access, but it has input access and can overflow a buffer to put code onto the stack. Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack, jumping to the malware. Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump to existing code segments in the program that can be used in bad ways. "RoP" stands for Return-oriented Programming attacks.
RoP attacks use your own code: they write a return address on the stack that points to (existing) exploitable code found in the program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transforming Instruction Relocation). STIR disassembles the binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. The original code sections in the file are marked non-executable. STIR has better entropy than ASLR in location of code, which makes brute force attacks much harder. STIR runs on MS Windows (PE) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead is usually 5-10% on MS Windows and about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is that it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk.

Clowntown Express: interesting bugs and running a bug bounty program
Collin Greene, Facebook

Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there are lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs.
Some bounty submissions were on software purchased from a third party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended up getting 20+ good bugs in the first 24 hours, and good submissions continue to come in. Bug bounties bring in people with different perspectives, and are paid only for success. A bug bounty is a better use of a fixed amount of time and money than code review or static code analysis alone. The bounty program started in July 2011 and has paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else); the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest submitter was 13. Some submitters have been hired. Bug bounties also reveal bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has a release cycle or is not on the Internet.

Active Fingerprinting of Encrypted VPNs
Anna Shubina, Dartmouth Institute for Security, Technology, and Society

(I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later.) IPsec leaves fingerprints.
Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (for example, DES-CBC versus AES-CBC). One can tell a lot about VPNs just from ping roundtrips (such as what router is used). Delayed packets are not informative about a network, especially if far away from the network. More exploration is needed of how TCP works in real life with respect to timing.

Making Attacks Go Backwards
FuzzyNop, Mandiant

This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who is making these malware threats? It's not a single person or group; they have diverse skill levels. There are a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories; creation of unnamed scheduled tasks; obvious names of files and syscalls ("ClearEventLog"); uncleared event logs. Clearing the event log in itself, and the time of clearing, is a red flag and a good first clue to look for on a suspect system. Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built into a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]"), as do time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits: one can parse a jar file with idxparser.py and decompile the Java file. Java is typically used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also, malware typically needs command and control.
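The key-logger tip-off mentioned in these notes (key-name strings and timestamp format strings left behind in a file) can be checked with a simple byte scan. This is a minimal sketch only: the marker list and the timestamp pattern are assumptions for illustration, not a tool shown in the talk.

```python
import re

# Key-name markers of the kind the notes mention; the exact list is an assumption.
KEY_MARKERS = [b"[Ctrl]", b"[RightShift]", b"[Enter]", b"[Backspace]"]

# printf-style timestamp formats such as "%02d:%02d:%02d" (pattern is an assumption).
TIMESTAMP_FORMAT = re.compile(rb"%0?2d[:/-]%0?2d[:/-]%0?2d")


def keylogger_indicators(data):
    """Return the indicator names found in a blob of file bytes."""
    found = [marker.decode() for marker in KEY_MARKERS if marker in data]
    if TIMESTAMP_FORMAT.search(data):
        found.append("timestamp format string")
    return found


sample = b"\x00\x01MZ...[Ctrl]...%02d:%02d:%02d\x00"
print(keylogger_indicators(sample))
```

As the notes say, a hit is not proof that a key logger is in use, and an empty result is not proof of absence; packed or obfuscated binaries will not expose these strings.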
Application of Artificial Intelligence in Ad-Hoc Static Code Analysis
John Ashaman, Security Innovation

Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. He also tried using grep, but this fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph and then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointers. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal of finding a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in the Security Innovation blog and NRefactory on GitHub.

Mask Your Checksums—The Gorry Details
Eric (XlogicX) Davisson

Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from the checksum (that is, it leaks data). Other parts of the packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so one can get more bits of information out of using both checksums than just using one checksum.
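To see why an unmasked checksum undoes a masked address, here is a minimal sketch of the standard IPv4 header checksum and a brute-force search for one masked octet. The header field values (TTL 64, protocol 6, ID 0x1234) and the documentation-range addresses are invented for illustration; this is not Eric's tool.

```python
import socket
import struct


def ipv4_checksum(header):
    """Ones' complement of the ones' complement sum of the header's 16-bit words."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst):
    """A 20-byte IPv4 header with the checksum field zeroed (illustrative values)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40,       # version/IHL, TOS, total length
        0x1234, 0,         # identification, flags/fragment offset
        64, 6, 0,          # TTL, protocol (TCP), checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def recover_last_octet(prefix, dst, observed_checksum):
    """Try all 256 values for a masked final octet of the source address."""
    return [
        octet
        for octet in range(256)
        if ipv4_checksum(build_header(f"{prefix}.{octet}", dst)) == observed_checksum
    ]


# The sender masks the last octet of 192.0.2.57 but leaves the checksum visible.
leaked = ipv4_checksum(build_header("192.0.2.57", "198.51.100.7"))
print(recover_last_octet("192.0.2", "198.51.100.7", leaked))
```

With a single masked octet the checksum pins the value down exactly. When more of the address is masked, several candidates survive, which is where the TCP checksum and the other narrowing techniques in these notes come in.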
Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x for each masked nibble that is unknown), one can do other things to narrow the results, such as looking at packet contents for domain or geo information. With hundreds of results, one can import them in CSV format into a spreadsheet, correlate them with geo data, and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. He was able to find the exact IP address, given the geo and university of the sender. The point is that if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, so as not to create a false impression of security.

Adventures with weird machines thirty years after "Reflections on Trusting Trust"
Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present)

"Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There are invisible links in the chain-of-trust, such as "well-installed microcode bugs" or bugs in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose there are no bugs and you trust the author: can you trust the code? Hell No! There are too many factors: it's Babylonian in nature. Why not? Well, input is not well-defined or recognized (the code's assumptions about "checked" input will be violated, producing a bug or vulnerability). For example, HTML is recursive, but regex checking is not recursive. Input may be well-formed but so complex that there's no telling what it does. For example, ELF file parsing is complex and has multiple ways of parsing.
Input is seen differently by different pieces of a program or toolchain. Any input is a program: input executes on input handlers (it drives state changes and transitions); only a well-defined execution model can be trusted (regex/DFA, PDA, CFG); an input handler is either a "recognizer" for the inputs as a well-defined language (see langsec.org) or a "virtual machine" for inputs to drive into pwn-age.
ELF ABI (UNIX/Linux executable file format) case study. Problems can arise from these steps (without planting bugs): compiler; linker; loader; ld.so/rtld relocator; DWARF (debugger info); exceptions. The problem is that you can't really automatically analyze code (it's the "halting problem" and undecidable). The only solution is to freeze code and sign it. But you can't freeze everything! You can't freeze ASLR or loading, which must have tables and metadata. Any sufficiently complex input data is the same as VM byte code. For example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata". @bxsays did the same thing with Mach-O bytecode. Or DWARF exception handling data: .eh_frame + glibc == Turing machine. x86 MMU (IDT, GDT, TSS): address translation was used to create a Turing machine. The page handler reads and writes memory (on page fault). It uses a page table, which can be used as Turing machine byte code. There is an example on GitHub using this TM that will fly a glider across the screen.
Next Sergey talked about "Parser Differentials": having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by the cert requestor and again by another parser at the CA. Another example is ELF: there are several parsers in the OS tool chain, which are all different.
An ELF file can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in a paper in the first issue of the International Journal of PoC.
Conclusions: trusting computers is not only about bugs! Bugs are part of the problem, but not by far all of it; complex data formats mean bugs; there is no "chain of trust" in Babylon (that is, with parser differentials); we need to squeeze complexity out of data until data stops being "code equivalent".
Further information: See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.
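The idea of a parser differential can be shown with a toy format far simpler than ELF or a CSR. In the sketch below, both parsers and the `key=value;key=value` format are hypothetical, invented purely for illustration: a validator that keeps the first occurrence of a key and a consumer that keeps the last occurrence read the same bytes and reach different conclusions.

```python
def parse_first_wins(text):
    """Hypothetical validator-side parser: the first occurrence of a key wins."""
    result = {}
    for pair in text.split(";"):
        key, _, value = pair.partition("=")
        result.setdefault(key, value)
    return result


def parse_last_wins(text):
    """Hypothetical consumer-side parser: later occurrences overwrite earlier ones."""
    result = {}
    for pair in text.split(";"):
        key, _, value = pair.partition("=")
        result[key] = value
    return result


request = "role=guest;role=admin"
checked = parse_first_wins(request)  # what the validator approves
used = parse_last_wins(request)      # what the consumer acts on
print(checked["role"], used["role"])
```

Neither parser is buggy by its own rules; the exposure comes from the two disagreeing about one input, which is the situation the notes describe for the several ELF parsers in the OS tool chain.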

    Read the article

  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.  Query tuning is not complete as soon as the query returns results quickly in the development or test environments.  In production, your query will compete for memory, CPU, locks, I/O and other resources on the server.  Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data.  In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row.  The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type.  A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL,   CONSTRAINT [PK 
dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL,   CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table.  The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results.  This seems slow, considering that the table only has 50,000 rows.  
Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time.  Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring.  The script below monitors STATISTICS IO output and the amount of tempdb used by the test query.  We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB.  The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row.  With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. 
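The 191MB and 192MB estimates can be roughly reproduced with simple arithmetic. This is a back-of-the-envelope sketch, not the optimizer's actual costing: it assumes an 11-byte row overhead (the figure the post quotes later for the TEXT table; applying it here is an assumption), a 4-byte INTEGER id, the 3,999-byte CHAR column, and 16 bytes for the GUID added by the Compute Scalar.

```python
ROWS = 50_000
ROW_OVERHEAD = 11     # bytes; assumed, borrowed from the post's later TEXT estimate
ID_BYTES = 4          # INTEGER id column
PADDING_BYTES = 3999  # CHAR(3999) padding column
GUID_BYTES = 16       # uniqueidentifier produced by NEWID()


def estimated_mb(rows, row_bytes):
    """Total estimated data size in megabytes (1MB = 1024 * 1024 bytes)."""
    return rows * row_bytes / (1024 * 1024)


scan_mb = estimated_mb(ROWS, ROW_OVERHEAD + ID_BYTES + PADDING_BYTES)
sort_mb = estimated_mb(ROWS, ROW_OVERHEAD + ID_BYTES + PADDING_BYTES + GUID_BYTES)
print(f"scan output: {scan_mb:.1f}MB, sort input: {sort_mb:.1f}MB")
```

Under those assumptions the scan output comes to about 191.4MB and the sort input to about 192.2MB, in line with the plan's estimates.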
Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  
More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. 
MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  
By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The TEXT type is stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  When this option is enabled for a table, MAX data is stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data.
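The 79-byte row estimate explains the small sort set reported for the TEXT test. A quick check of the arithmetic follows; the 16 bytes added for the NEWID() column is an assumption about what reaches the Sort, not a figure quoted in the post.

```python
ROWS = 50_000
ROW_OVERHEAD = 11       # bytes, as quoted in the post
ID_BYTES = 4            # INTEGER id column
LOB_POINTER_BYTES = 64  # estimated size of the in-row LOB pointer, as quoted
GUID_BYTES = 16         # assumed size of the NEWID() column added before the Sort

row_bytes = ROW_OVERHEAD + ID_BYTES + LOB_POINTER_BYTES
scan_text_mb = ROWS * row_bytes / (1024 * 1024)
sort_text_mb = ROWS * (row_bytes + GUID_BYTES) / (1024 * 1024)
print(f"row: {row_bytes} bytes, scan: {scan_text_mb:.1f}MB, sort: {sort_text_mb:.1f}MB")
```

That gives 79 bytes per row, about 3.8MB leaving the scan and about 4.5MB arriving at the Sort, which is roughly consistent with the 5MB estimated sort set in the TEXT summary and some forty times smaller than the in-row case.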
The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. 
/ 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; MAXOOR Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  
245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  
The following script implements that idea for all four tables:

    SET STATISTICS IO ON ;

    WITH
        TestTable AS ( SELECT * FROM dbo.TestCHAR ),
        TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() )
    SELECT T1.id, T1.padding
    FROM TestTable AS T1
    WHERE T1.id = ANY (SELECT id FROM TopKeys)
    OPTION (MAXDOP 1) ;

    WITH
        TestTable AS ( SELECT * FROM dbo.TestMAX ),
        TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() )
    SELECT T1.id, T1.padding
    FROM TestTable AS T1
    WHERE T1.id IN (SELECT id FROM TopKeys)
    OPTION (MAXDOP 1) ;

    WITH
        TestTable AS ( SELECT * FROM dbo.TestTEXT ),
        TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() )
    SELECT T1.id, T1.padding
    FROM TestTable AS T1
    WHERE T1.id IN (SELECT id FROM TopKeys)
    OPTION (MAXDOP 1) ;

    WITH
        TestTable AS ( SELECT * FROM dbo.TestMAXOOR ),
        TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() )
    SELECT T1.id, T1.padding
    FROM TestTable AS T1
    WHERE T1.id IN (SELECT id FROM TopKeys)
    OPTION (MAXDOP 1) ;

    SET STATISTICS IO OFF ;

All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb. The small remaining inefficiency is in reading the id column values from the clustered primary key index. As a clustered index, it contains all the in-row data at its leaf. The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead. The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure. This difference is reflected in the number of logical page reads performed by the four queries:

    Table         logical reads   lob logical reads
    TestCHAR              25511                   0
    TestMAX               25511                   0
    TestTEXT                412                 597
    TestMAXOOR              413                 446

We can increase the density of the id values by creating a separate nonclustered index on the id column only. This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data.

    CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id);
    CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id);
    CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id);
    CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id);

The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data. The logical reads with the new indexes in place are:

    Table         logical reads   lob logical reads
    TestCHAR                835                   0
    TestMAX                 835                   0
    TestTEXT                686                 597
    TestMAXOOR              686                 448

With the new index, all four queries use the same query plan.

Performance Summary:
    0.3 seconds elapsed time
    6MB memory grant
    0MB tempdb usage
    1MB sort set
    835 logical reads (CHAR, MAX)
    686 logical reads (TEXT, MAXOOR)
    597 LOB logical reads (TEXT)
    448 LOB logical reads (MAXOOR)
    No sort warning

I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea.

Conclusion

This post is not about tuning queries that access columns containing big strings. It isn’t about the internal differences between TEXT and MAX data types either. It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.
No, this post is about something else:

Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance. Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows. 5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is! If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb. Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either.

The point of this post is that a basic understanding of execution plans is not enough. Tuning for logical reads and adding covering indexes is not enough. If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on. The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances.

By the way, the examples in this post were written for SQL Server 2008. They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator. Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch…

This post is dedicated to the people of Christchurch, New Zealand.

© 2011 Paul White
twitter: @SQL_Kiwi

    Read the article

  • It's called College.

    - by jeffreyabecker
    Today I saw yet another 'GUID vs int as your primary key' article. Like most of the ones I've read, this was filled with technical misrepresentations and outright fallacies. Chef's famous line that "There's a time and a place for everything, children" applies here. GUIDs have distinct advantages and disadvantages which should be considered when choosing a data type for the primary key.

    Fallacy 1: "It's easier"
    The claim is that an integer data type (tinyint, smallint, int, bigint) is a better artificial key than a GUID because it's easier to remember. I'm a firm believer that your artificial primary keys should be opaque gibberish. PKs are an implementation detail which should never be exposed to the user or relied on for business logic. If you want things to come back in an order, add an ORDER BY clause and a SortOrder field. If you want a human-usable look-up, add a business key with a unique constraint. If you want to know what order things were inserted into a table, add a timestamp.

    Fallacy 2: "Size Matters"
    For many applications, the size of the artificial primary key is going to be irrelevant. The particular article which kicked this post off stated repeatedly that joining against an int has better performance than joining against a GUID. In computer science the performance of your algorithm is always a function of the number of data points. This still holds true for databases. Unless your table is very large, the performance difference between an int and a GUID probably isn't going to be measurable, let alone noticeable. My personal experience is that performance becomes an issue when you start having billions of rows in the table. At this point, you should probably start looking to move from int to bigint, so the effective space/performance gain isn't as much as you'd think.

    GUID Advantages:
    Insert-ability / Mergeability: You can reliably insert GUIDs into tables without key collisions.
    Database Independence: Saving entities to the database often requires knowing ids.
    With identity-based ids, the id must be selected back after every insert. GUIDs can be generated application-side, allowing much faster inserts.

    GUID Disadvantages:
    Generatability: You can calculate the next id for an integer PK pretty easily in your head but will need a program to generate GUIDs. Solution: "Select top 100 newid() from sysobjects"
    Fragmentation: Most GUID generation algorithms generate pseudo-random GUIDs. This can cause inserts into the middle of your clustered index. Solutions: add a default of newsequentialid() or use GuidComb in NHibernate.
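To make the fragmentation point concrete, here is a small simulation in Python rather than anything SQL Server specific. It generates keys application-side and counts how many of them would land somewhere other than the end of an ordered index. The helper name `mid_insert_fraction` and the integer-based sequential keys are illustrative assumptions standing in for a sequential generator. Plain byte ordering is also a simplification, since SQL Server compares uniqueidentifier values using its own byte order.

```python
import bisect
import uuid


def mid_insert_fraction(keys):
    """Return the fraction of keys that, taken in arrival order, sort before
    an already-inserted key (so they would not be appended at the end)."""
    ordered = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos < len(ordered):
            mid_inserts += 1
        ordered.insert(pos, key)
    return mid_inserts / len(keys)


N = 2000
# Random (version 4) GUIDs generated in the application, no database round trip.
random_keys = [uuid.uuid4().bytes for _ in range(N)]
# Hypothetical stand-in for a sequential generator: ever-increasing 16-byte keys.
sequential_keys = [i.to_bytes(16, "big") for i in range(N)]

print(f"random GUIDs:    {mid_insert_fraction(random_keys):.1%} mid-index inserts")
print(f"sequential keys: {mid_insert_fraction(sequential_keys):.1%} mid-index inserts")
```

Nearly every random key lands in the middle of the existing order, while ever-increasing keys always append. That is the behaviour the sequential-GUID workarounds are meant to recover.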

    Read the article
