Search Results

Search found 2498 results on 100 pages for 'unary operator'.

Page 44/100 | < Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51  | Next Page >

  • short-cutting equality checking in F#?

    - by John Clements
    In F#, the equality operator (=) is generally extensional, rather than intensional. That's great! Unfortunately, it appears to me that F# does not use pointer equality to short-cut these extensional comparisons. For instance, this code: type Z = MT | NMT of Z ref // create a Z: let a = ref MT // make it point to itself: a := NMT a // check to see whether it's equal to itself: printf "a = a: %A\n" (a = a) ... gives me a big fat segmentation fault[*], despite the fact that 'a' and 'a' both evaluate to the same reference. That's not so great. Other functional languages (e.g. PLT Scheme) get this right, using pointer comparisons conservatively, to return 'true' when it can be determined using a pointer comparison. So: I'll accept the fact that F#'s equality operator doesn't use short-cutting; is there some way to perform an intensional (pointer-based) equality check? The (==) operator is not defined on my types, and I'd love it if someone could tell me that it's available somehow. Or tell me that I'm wrong in my analysis of the situation: I'd love that, too... [*] That would probably be a stack overflow on Windows; there are things about Mono that I'm not that fond of...

    Read the article

  • Another boost error

    - by user1676605
    On this code I get the enourmous error static void ParseTheCommandLine(int argc, char *argv[]) { int count; int seqNumber; namespace po = boost::program_options; std::string appName = boost::filesystem::basename(argv[0]); po::options_description desc("Generic options"); desc.add_options() ("version,v", "print version string") ("help", "produce help message") ("sequence-number", po::value<int>(&seqNumber)->default_value(0), "sequence number") ("pem-file", po::value< vector<string> >(), "pem file") ; po::positional_options_description p; p.add("pem-file", -1); po::variables_map vm; po::store(po::command_line_parser(argc, argv). options(desc).positional(p).run(), vm); po::notify(vm); if (vm.count("pem file")) { cout << "Pem files are: " << vm["pem-file"].as< vector<string> >() << "\n"; } cout << "Sequence number is " << seqNumber << "\n"; exit(1); ../../../FIXMarketDataCommandLineParameters/FIXMarketDataCommandLineParameters.hpp|98|error: no match for ‘operator<<’ in ‘std::operator<< [with _Traits = std::char_traits](((std::basic_ostream &)(& std::cout)), ((const char*)"Pem files are: ")) << ((const boost::program_options::variable_value*)vm.boost::program_options::variables_map::operator[](((const std::string&)(& std::basic_string, std::allocator (((const char*)"pem-file"), ((const std::allocator&)((const std::allocator*)(& std::allocator()))))))))-boost::program_options::variable_value::as with T = std::vector, std::allocator , std::allocator, std::allocator ’|

    Read the article

  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.  Query tuning is not complete as soon as the query returns results quickly in the development or test environments.  In production, your query will compete for memory, CPU, locks, I/O and other resources on the server.  Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data.  In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row.  The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type.  A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL,   CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL,   CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table.  The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results.  This seems slow, considering that the table only has 50,000 rows.  Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time.  Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring.  The script below monitors STATISTICS IO output and the amount of tempdb used by the test query.  We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB.  The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row.  With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb.  The small remaining inefficiency is in reading the id column values from the clustered primary key index.  As a clustered index, it contains all the in-row data at its leaf.  The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead.  The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure.  This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only.  This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data.  The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings.  It isn’t about the internal differences between TEXT and MAX data types either.  It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.  No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance.  Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows.  5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is!  If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb.  Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough.  Tuning for logical reads and adding covering indexes is not enough.  If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on.  The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008.  They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator.  Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

    Read the article

  • [bash] checking wget's return value [if]

    - by wwrob
    I'm writing a script to download a bunch of files, and I want it to inform when a particular file doesn't exist. r=`wget -q www.someurl.com` if [ $r -ne 0 ] then echo "Not there" else echo "OK" fi But it gives the following error on execution: ./file: line 2: [: -ne: unary operator expected What's wrong?

    Read the article

  • MSDeploy - possible to call setAcl on multiple destinations in one go?

    - by growse
    I'm building a nice little continuous integration environment for our development team, based on TeamCity. It's working rather nicely, as it can build a mix of .NET and PHP projects, and push them to our internal and external platforms. I'm primarily using MsDeploy to push everything to the internal platform, as that's all IIS based. However, there's a number of builds where I need to set directory permissions on the destination directory. I can use the setAcl operator just fine, but that only seems to take a single destination as an argument. Therefore, if I need to alter the permissions on 5 destination directories, I need to call MsDeploy 5 times, which seems a lot of overhead. Is there a sensible way around this? Reading the documentation, I don't think MsDeploy takes more than a single argument for the setAcl operator, but could be wrong. Is there a better way for a build server to set multiple directory permissions in one go?

    Read the article

  • Dropbox.py on RHEL 6

    - by Timothy R. Butler
    I'm trying to run a headless install of Dropbox on RHEL 6. The daemon seems to be running, but when I try to use Dropbox's associated dropbox.py tool to control the daemon, it fails to run with the following error: Traceback (most recent call last): File "./dropbox.py", line 26, in <module> import locale File "/usr/lib64/python2.6/locale.py", line 202, in <module> import re, operator ImportError: /home/dropbox/.dropbox-dist/operator.so: undefined symbol: _PyUnicodeUCS2_AsDefaultEncodedString I'm running the current RHEL build of Python 2.6: root@cedar [/home/dropbox/.dropbox-dist]# rpm -qv python python-2.6.6-29.el6_3.3.x86_64 (I'm not sure if this would be better suited to Stack Overflow since it is on the verge of being a programming issue, but since I'm trying to use a program straight from Dropbox, I placed it here.)

    Read the article

  • Mutt: apply command to all tagged messages

    - by mrucci
    From the mutt manual: Once you have tagged the desired messages, you can use the tag-prefix operator, which is the ; (semicolon) key by default. When the tag-prefix operator is used, the next operation will be applied to all tagged messages if that operation can be used in that manner. But it seems that I can only execute commands that are already bound to a specific keyboard shortcut. For example I can use ;d to delete all selected messages. What if I want to apply an "unbound" command (such as purge-message)? I have also tried using something based on :exec tag-prefix or :push tag-prefix without success.

    Read the article

  • How do I run multiple commands on one line in Powershell?

    - by David
    In cmd prompt, you can run two commands on one line like so: ipconfig /release & ipconfig /renew When I run this command in PowerShell, I get: Ampersand not allowed. The & operator is reserved for future use Does PowerShell have an operator that allows me to quickly produce the equivalent of & in cmd prompt? Any method of running two commands in one line will do. I know that I can make a script, but I'm looking for something a little more off the cuff.

    Read the article

  • web grid server pagination trigger multiple controller call when changing page

    - by Thomas Scattolin
    When I server-filter on "au" my web grid and change page, multiple call to the controller are done : the first with 0 filtering, the second with "a" filtering, the third with "au" filtering. My table load huge data so the first call is longer than others. I see the grid displaying firstly the third call result, then the second, and finally the first call (this order correspond to the response time of my controller due to filter parameter) Why are all that controller call made ? Can't just my controller be called once with my total filter "au" ? What should I do ? Here is my grid : $("#" + gridId).kendoGrid({ selectable: "row", pageable: true, filterable:true, scrollable : true, //scrollable: { // virtual: true //false // Bug : Génère un affichage multiple... //}, navigatable: true, groupable: true, sortable: { mode: "multiple", // enables multi-column sorting allowUnsort: true }, dataSource: { type: "json", serverPaging: true, serverSorting: true, serverFiltering: true, serverGrouping:false, // Ne fonctionne pas... pageSize: '@ViewBag.Pagination', transport: { read: { url: Procvalue + "/LOV", type: "POST", dataType: "json", contentType: "application/json; charset=utf-8" }, parameterMap: function (options, type) { // Mise à jour du format d'envoi des paramètres // pour qu'ils puissent être correctement interprétés côté serveur. // Construction du paramètre sort : if (options.sort != null) { var sort = options.sort; var sort2 = ""; for (i = 0; i < sort.length; i++) { sort2 = sort2 + sort[i].field + '-' + sort[i].dir + '~'; } options.sort = sort2; } if (options.group != null) { var group = options.group; var group2 = ""; for (i = 0; i < group.length; i++) { group2 = group2 + group[i].field + '-' + group[i].dir + '~'; } options.group = group2; } if (options.filter != null) { var filter = options.filter.filters; var filter2 = ""; for (i = 0; i < filter.length; i++) { // Vérification si type colonne == string. // Parcours des colonnes pour trouver celle qui a le même nom de champ. var type = ""; for (j = 0 ; j < colonnes.length ; j++) { if (colonnes[j].champ == filter[i].field) { type = colonnes[j].type; break; } } if (filter2.length == 0) { if (type == "string") { // Avec '' autour de la valeur. filter2 = filter2 + filter[i].field + '~' + filter[i].operator + "~'" + filter[i].value + "'"; } else { // Sans '' autour de la valeur. filter2 = filter2 + filter[i].field + '~' + filter[i].operator + "~" + filter[i].value; } } else { if (type == "string") { // Avec '' autour de la valeur. filter2 = filter2 + '~' + options.filter.logic + '~' + filter[i].field + '~' + filter[i].operator + "~'" + filter[i].value + "'"; }else{ filter2 = filter2 + '~' + options.filter.logic + '~' + filter[i].field + '~' + filter[i].operator + "~" + filter[i].value; } } } options.filter = filter2; } var json = JSON.stringify(options); return json; } }, schema: { data: function (data) { return eval(data.data.Data); }, total: function (data) { return eval(data.data.Total); } }, filter: { logic: "or", filters:filtre(valeur) } }, columns: getColonnes(colonnes) }); Here is my controller : [HttpPost] public ActionResult LOV([DataSourceRequest] DataSourceRequest request) { return Json(CProduitsManager.GetProduits().ToDataSourceResult(request)); }

    Read the article

  • boost::serialization of mutual pointers

    - by KneLL
    First, please take a look at these code: class Key; class Door; class Key { public: int id; Door *pDoor; Key() : id(0), pDoor(NULL) {} private: friend class boost::serialization::access; template <typename A> void serialize(A &ar, const unsigned int ver) { ar & BOOST_SERIALIZATION_NVP(id) & BOOST_SERIALIZATION_NVP(pDoor); } }; class Door { public: int id; Key *pKey; Door() : id(0), pKey(NULL) {} private: friend class boost::serialization::access; template <typename A> void serialize(A &ar, const unsigned int ver) { ar & BOOST_SERIALIZATION_NVP(id) & BOOST_SERIALIZATION_NVP(pKey); } }; BOOST_CLASS_TRACKING(Key, track_selectively); BOOST_CLASS_TRACKING(Door, track_selectively); int main() { Key k1, k_in; Door d1, d_in; k1.id = 1; d1.id = 2; k1.pDoor = &d1; d1.pKey = &k1; // Save data { wofstream f1("test.xml"); boost::archive::xml_woarchive ar1(f1); // !!!!! (1) const Key *pK = &k1; const Door *pD = &d1; ar1 << BOOST_SERIALIZATION_NVP(pK) << BOOST_SERIALIZATION_NVP(pD); } // Load data { wifstream i1("test.xml"); boost::archive::xml_wiarchive ar1(i1); // !!!!! (2) A *pK = &k_in; B *pD = &d_in; // (2.1) //ar1 >> BOOST_SERIALIZATION_NVP(k_in) >> BOOST_SERIALIZATION_NVP(d_in); // (2.2) ar1 >> BOOST_SERIALIZATION_NVP(pK) >> BOOST_SERIALIZATION_NVP(pD); } } The first (1) is a simple question - is it possible to pass objects to archive without pointers? If simply pass objects 'as is' that boost throws exception about duplicated pointers. But I'm confused of creating pointers to save objects. The second (2) is a real trouble. If comment out string after (2.1) then boost will corectly load a first Key object (and init internal Door pointer pDoor), but will not init a second Door (d_in) object. After this I have an inited *k_in* object with valid pointer to Door and empty *d_in* object. If use string (2.2) then boost will create two Key and Door objects somewhere in memory and save addresses in pointers. But I want to have two objects *k_in* and *d_in*. So, if I copy a values of memory objects to local variables then I store only addresses, for example, I can write code after (2.2): d_in.id = pD->id; d_in.pKey = pD->pKey; But in this case I store only a pointer and memory object remains in memory and I cannot delete it, because *d_in.pKey* will be unvalid. And I cannot perform a deep copy with operator=(), because if I write code like this: Key &operator==(const Key &k) { if (this != &k) { id = k.id; // call to Door::operator=() that calls *pKey = *d.pKey and so on *pDoor = *k.pDoor; } return *this; } then I will get a something like recursion of operator=()s of Key and Door. How to implement proper serialization of such pointers?

    Read the article

  • boost::function & boost::lambda again

    - by John Dibling
    Follow-up to post: http://stackoverflow.com/questions/2978096/using-width-precision-specifiers-with-boostformat I'm trying to use boost::function to create a function that uses lambdas to format a string with boost::format. Ultimately what I'm trying to achieve is using width & precision specifiers for strings with format. boost::format does not support the use of the * width & precision specifiers, as indicated in the docs: Width or precision set to asterisk (*) are used by printf to read this field from an argument. e.g. printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec); This class does not support this mechanism for now. so such precision or width fields are quietly ignored by the parsing. so I'm trying to find other ways to accomplish the same goal. Here is what I have so far, which isn't working: #include <string> #include <boost\function.hpp> #include <boost\lambda\lambda.hpp> #include <iostream> #include <boost\format.hpp> #include <iomanip> #include <boost\bind.hpp> int main() { using namespace boost::lambda; using namespace std; boost::function<std::string(int, std::string)> f = (boost::format("%s") % boost::io::group(setw(_1*2), setprecision(_2*2), _3)).str(); std::string s = (boost::format("%s") % f(15, "Hello")).str(); return 0; } This generates many compiler errors: 1>------ Build started: Project: hacks, Configuration: Debug x64 ------ 1>Compiling... 1>main.cpp 1>.\main.cpp(15) : error C2872: '_1' : ambiguous symbol 1> could be 'D:\Program Files (x86)\boost\boost_1_42\boost/lambda/core.hpp(69) : boost::lambda::placeholder1_type &boost::lambda::`anonymous-namespace'::_1' 1> or 'D:\Program Files (x86)\boost\boost_1_42\boost/bind/placeholders.hpp(43) : boost::arg<I> `anonymous-namespace'::_1' 1> with 1> [ 1> I=1 1> ] 1>.\main.cpp(15) : error C2664: 'std::setw' : cannot convert parameter 1 from 'boost::lambda::placeholder1_type' to 'std::streamsize' 1> No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called 1>.\main.cpp(15) : error C2872: '_2' : ambiguous symbol 1> could be 'D:\Program Files (x86)\boost\boost_1_42\boost/lambda/core.hpp(70) : boost::lambda::placeholder2_type &boost::lambda::`anonymous-namespace'::_2' 1> or 'D:\Program Files (x86)\boost\boost_1_42\boost/bind/placeholders.hpp(44) : boost::arg<I> `anonymous-namespace'::_2' 1> with 1> [ 1> I=2 1> ] 1>.\main.cpp(15) : error C2664: 'std::setprecision' : cannot convert parameter 1 from 'boost::lambda::placeholder2_type' to 'std::streamsize' 1> No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called 1>.\main.cpp(15) : error C2872: '_3' : ambiguous symbol 1> could be 'D:\Program Files (x86)\boost\boost_1_42\boost/lambda/core.hpp(71) : boost::lambda::placeholder3_type &boost::lambda::`anonymous-namespace'::_3' 1> or 'D:\Program Files (x86)\boost\boost_1_42\boost/bind/placeholders.hpp(45) : boost::arg<I> `anonymous-namespace'::_3' 1> with 1> [ 1> I=3 1> ] 1>.\main.cpp(15) : error C2660: 'boost::io::group' : function does not take 3 arguments 1>.\main.cpp(15) : error C2228: left of '.str' must have class/struct/union 1>Build log was saved at "file://c:\Users\john\Documents\Visual Studio 2005\Projects\hacks\x64\Debug\BuildLog.htm" 1>hacks - 7 error(s), 0 warning(s) ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ========== My fundamental understanding of boost's lambdas and functions is probably lacking. How can I get this to work?

    Read the article

  • Const references when dereferencing iterator on set, starting from Visual Studio 2010

    - by Patrick
    Starting from Visual Studio 2010, iterating over a set seems to return an iterator that dereferences the data as 'const data' instead of non-const. The following code is an example of something that does compile on Visual Studio 2005, but not on 2010 (this is an artificial example, but clearly illustrates the problem we found on our own code). In this example, I have a class that stores a position together with a temperature. I define comparison operators (not all them, just enough to illustrate the problem) that only use the position, not the temperature. The point is that for me two instances are identical if the position is identical; I don't care about the temperature. #include <set> class DataPoint { public: DataPoint (int x, int y) : m_x(x), m_y(y), m_temperature(0) {} void setTemperature(double t) {m_temperature = t;} bool operator<(const DataPoint& rhs) const { if (m_x==rhs.m_x) return m_y<rhs.m_y; else return m_x<rhs.m_x; } bool operator==(const DataPoint& rhs) const { if (m_x!=rhs.m_x) return false; if (m_y!=rhs.m_y) return false; return true; } private: int m_x; int m_y; double m_temperature; }; typedef std::set<DataPoint> DataPointCollection; void main(void) { DataPointCollection points; points.insert (DataPoint(1,1)); points.insert (DataPoint(1,1)); points.insert (DataPoint(1,2)); points.insert (DataPoint(1,3)); points.insert (DataPoint(1,1)); for (DataPointCollection::iterator it=points.begin();it!=points.end();++it) { DataPoint &point = *it; point.setTemperature(10); } } In the main routine I have a set to which I add some points. To check the correctness of the comparison operator, I add data points with the same position multiple times. When writing the contents of the set, I can clearly see there are only 3 points in the set. The for-loop loops over the set, and sets the temperature. Logically this is allowed, since the temperature is not used in the comparison operators. This code compiles correctly in Visual Studio 2005, but gives compilation errors in Visual Studio 2010 on the following line (in the for-loop): DataPoint &point = *it; The error given is that it can't assign a "const DataPoint" to a [non-const] "DataPoint &". It seems that you have no decent (= non-dirty) way of writing this code in VS2010 if you have a comparison operator that only compares parts of the data members. Possible solutions are: Adding a const-cast to the line where it gives an error Making temperature mutable and making setTemperature a const method But to me both solutions seem rather 'dirty'. It looks like the C++ standards committee overlooked this situation. Or not? What are clean solutions to solve this problem? Did some of you encounter this same problem and how did you solve it? Patrick

    Read the article

  • Critique my heap debugger

    - by FredOverflow
    I wrote the following heap debugger in order to demonstrate memory leaks, double deletes and wrong forms of deletes (i.e. trying to delete an array with delete p instead of delete[] p) to beginning programmers. I would love to get some feedback on that from strong C++ programmers because I have never done this before and I'm sure I've done some stupid mistakes. Thanks! #include <cstdlib> #include <iostream> #include <new> namespace { const int ALIGNMENT = 16; const char* const ERR = "*** ERROR: "; int counter = 0; struct heap_debugger { heap_debugger() { std::cerr << "*** heap debugger started\n"; } ~heap_debugger() { std::cerr << "*** heap debugger shutting down\n"; if (counter > 0) { std::cerr << ERR << "failed to release memory " << counter << " times\n"; } else if (counter < 0) { std::cerr << ERR << (-counter) << " double deletes detected\n"; } } } instance; void* allocate(size_t size, const char* kind_of_memory, size_t token) throw (std::bad_alloc) { void* raw = malloc(size + ALIGNMENT); if (raw == 0) throw std::bad_alloc(); *static_cast<size_t*>(raw) = token; void* payload = static_cast<char*>(raw) + ALIGNMENT; ++counter; std::cerr << "*** allocated " << kind_of_memory << " at " << payload << " (" << size << " bytes)\n"; return payload; } void release(void* payload, const char* kind_of_memory, size_t correct_token, size_t wrong_token) throw () { if (payload == 0) return; std::cerr << "*** releasing " << kind_of_memory << " at " << payload << '\n'; --counter; void* raw = static_cast<char*>(payload) - ALIGNMENT; size_t* token = static_cast<size_t*>(raw); if (*token == correct_token) { *token = 0xDEADBEEF; free(raw); } else if (*token == wrong_token) { *token = 0x177E6A7; std::cerr << ERR << "wrong form of delete\n"; } else { std::cerr << ERR << "double delete\n"; } } } void* operator new(size_t size) throw (std::bad_alloc) { return allocate(size, "non-array memory", 0x5AFE6A8D); } void* operator new[](size_t size) throw (std::bad_alloc) { return allocate(size, " array memory", 0x5AFE6A8E); } void operator delete(void* payload) throw () { release(payload, "non-array memory", 0x5AFE6A8D, 0x5AFE6A8E); } void operator delete[](void* payload) throw () { release(payload, " array memory", 0x5AFE6A8E, 0x5AFE6A8D); }

    Read the article

  • Purpose of overloading operators in C++?

    - by Geo Drawkcab
    What is the main purpose of overloading operators in C++? In the code below, << and >> are overloaded; what is the advantage to doing so? #include <iostream> #include <string> using namespace std; class book { string name,gvari; double cost; int year; public: book(){}; book(string a, string b, double c, int d) { a=name;b=gvari;c=cost;d=year; } ~book() {} double setprice(double a) { return a=cost; } friend ostream& operator <<(ostream& , book&); void printbook(){ cout<<"wignis saxeli "<<name<<endl; cout<<"wignis avtori "<<gvari<<endl; cout<<"girebuleba "<<cost<<endl; cout<<"weli "<<year<<endl; } }; ostream& operator <<(ostream& out, book& a){ out<<"wignis saxeli "<<a.name<<endl; out<<"wignis avtori "<<a.gvari<<endl; out<<"girebuleba "<<a.cost<<endl; out<<"weli "<<a.year<<endl; return out; } class library_card : public book { string nomeri; int raod; public: library_card(){}; library_card( string a, int b){a=nomeri;b=raod;} ~library_card() {}; void printcard(){ cout<<"katalogis nomeri "<<nomeri<<endl; cout<<"gacemis raodenoba "<<raod<<endl; } friend ostream& operator <<(ostream& , library_card&); }; ostream& operator <<(ostream& out, library_card& b) { out<<"katalogis nomeri "<<b.nomeri<<endl; out<<"gacemis raodenoba "<<b.raod<<endl; return out; } int main() { book A("robizon kruno","giorgi",15,1992); library_card B("910CPP",123); A.printbook(); B.printbook(); A.setprice(15); B.printbook(); system("pause"); return 0; }

    Read the article

  • SQL SERVER – Subquery or Join – Various Options – SQL Server Engine knows the Best

    - by pinaldave
    This is followup post of my earlier article SQL SERVER – Convert IN to EXISTS – Performance Talk, after reading all the comments I have received I felt that I could write more on the same subject to clear few things out. First let us run following four queries, all of them are giving exactly same resultset. USE AdventureWorks GO -- use of = SELECT * FROM HumanResources.Employee E WHERE E.EmployeeID = ( SELECT EA.EmployeeID FROM HumanResources.EmployeeAddress EA WHERE EA.EmployeeID = E.EmployeeID) GO -- use of in SELECT * FROM HumanResources.Employee E WHERE E.EmployeeID IN ( SELECT EA.EmployeeID FROM HumanResources.EmployeeAddress EA WHERE EA.EmployeeID = E.EmployeeID) GO -- use of exists SELECT * FROM HumanResources.Employee E WHERE EXISTS ( SELECT EA.EmployeeID FROM HumanResources.EmployeeAddress EA WHERE EA.EmployeeID = E.EmployeeID) GO -- Use of Join SELECT * FROM HumanResources.Employee E INNER JOIN HumanResources.EmployeeAddress EA ON E.EmployeeID = EA.EmployeeID GO Let us compare the execution plan of the queries listed above. Click on image to see larger image. It is quite clear from the execution plan that in case of IN, EXISTS and JOIN SQL Server Engines is smart enough to figure out what is the best optimal plan of Merge Join for the same query and execute the same. However, in the case of use of Equal (=) Operator, SQL Server is forced to use Nested Loop and test each result of the inner query and compare to outer query, leading to cut the performance. Please note that here I no mean suggesting that Nested Loop is bad or Merge Join is better. This can very well vary on your machine and amount of resources available on your computer. When I see Equal (=) operator used in query like above, I usually recommend to see if user can use IN or EXISTS or JOIN. As I said, this can very much vary on different system. What is your take in above query? I believe SQL Server Engines is usually pretty smart to figure out what is ideal execution plan and use it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • So…is it a Seek or a Scan?

    - by Paul White
    You’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS).  The image to the left shows the most common ones, with the three types of scan at the top, followed by four types of seek.  You might look to the SSMS tool-tip descriptions to explain the differences between them: Not hugely helpful are they?  Both mention scans and ranges (nothing about seeks) and the Index Seek description implies that it will not scan the index entirely (which isn’t necessarily true). Recall also yesterday’s post where we saw two Clustered Index Seek operations doing very different things.  The first Seek performed 63 single-row seeking operations; and the second performed a ‘Range Scan’ (more on those later in this post).  I hope you agree that those were two very different operations, and perhaps you are wondering why there aren’t different graphical plan icons for Range Scans and Seeks?  I have often wondered about that, and the first person to mention it after yesterday’s post was Erin Stellato (twitter | blog): Before we go on to make sense of all this, let’s look at another example of how SQL Server confusingly mixes the terms ‘Scan’ and ‘Seek’ in different contexts.  The diagram below shows a very simple heap table with two columns, one of which is the non-clustered Primary Key, and the other has a non-unique non-clustered index defined on it.  The right hand side of the diagram shows a simple query, it’s associated query plan, and a couple of extracts from the SSMS tool-tip and Properties windows. Notice the ‘scan direction’ entry in the Properties window snippet.  Is this a seek or a scan?  The different references to Scans and Seeks are even more pronounced in the XML plan output that the graphical plan is based on.  This fragment is what lies behind the single Index Seek icon shown above: You’ll find the same confusing references to Seeks and Scans throughout the product and its documentation. Making Sense of Seeks Let’s forget all about scans for a moment, and think purely about seeks.  Loosely speaking, a seek is the process of navigating an index B-tree to find a particular index record, most often at the leaf level.  A seek starts at the root and navigates down through the levels of the index to find the point of interest: Singleton Lookups The simplest sort of seek predicate performs this traversal to find (at most) a single record.  This is the case when we search for a single value using a unique index and an equality predicate.  It should be readily apparent that this type of search will either find one record, or none at all.  This operation is known as a singleton lookup.  Given the example table from before, the following query is an example of a singleton lookup seek: Sadly, there’s nothing in the graphical plan or XML output to show that this is a singleton lookup – you have to infer it from the fact that this is a single-value equality seek on a unique index.  The other common examples of a singleton lookup are bookmark lookups – both the RID and Key Lookup forms are singleton lookups (an RID lookup finds a single record in a heap from the unique row locator, and a Key Lookup does much the same thing on a clustered table).  If you happen to run your query with STATISTICS IO ON, you will notice that ‘Scan Count’ is always zero for a singleton lookup. Range Scans The other type of seek predicate is a ‘seek plus range scan’, which I will refer to simply as a range scan.  The seek operation makes an initial descent into the index structure to find the first leaf row that qualifies, and then performs a range scan (either backwards or forwards in the index) until it reaches the end of the scan range. The ability of a range scan to proceed in either direction comes about because index pages at the same level are connected by a doubly-linked list – each page has a pointer to the previous page (in logical key order) as well as a pointer to the following page.  The doubly-linked list is represented by the green and red dotted arrows in the index diagram presented earlier.  One subtle (but important) point is that the notion of a ‘forward’ or ‘backward’ scan applies to the logical key order defined when the index was built.  In the present case, the non-clustered primary key index was created as follows: CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col ASC) ) ; Notice that the primary key index specifies an ascending sort order for the single key column.  This means that a forward scan of the index will retrieve keys in ascending order, while a backward scan would retrieve keys in descending key order.  If the index had been created instead on key_col DESC, a forward scan would retrieve keys in descending order, and a backward scan would return keys in ascending order. A range scan seek predicate may have a Start condition, an End condition, or both.  Where one is missing, the scan starts (or ends) at one extreme end of the index, depending on the scan direction.  Some examples might help clarify that: the following diagram shows four queries, each of which performs a single seek against a column holding every integer from 1 to 100 inclusive.  The results from each query are shown in the blue columns, and relevant attributes from the Properties window appear on the right: Query 1 specifies that all key_col values less than 5 should be returned in ascending order.  The query plan achieves this by seeking to the start of the index leaf (there is no explicit starting value) and scanning forward until the End condition (key_col < 5) is no longer satisfied (SQL Server knows it can stop looking as soon as it finds a key_col value that isn’t less than 5 because all later index entries are guaranteed to sort higher). Query 2 asks for key_col values greater than 95, in descending order.  SQL Server returns these results by seeking to the end of the index, and scanning backwards (in descending key order) until it comes across a row that isn’t greater than 95.  Sharp-eyed readers may notice that the end-of-scan condition is shown as a Start range value.  This is a bug in the XML show plan which bubbles up to the Properties window – when a backward scan is performed, the roles of the Start and End values are reversed, but the plan does not reflect that.  Oh well. Query 3 looks for key_col values that are greater than or equal to 10, and less than 15, in ascending order.  This time, SQL Server seeks to the first index record that matches the Start condition (key_col >= 10) and then scans forward through the leaf pages until the End condition (key_col < 15) is no longer met. Query 4 performs much the same sort of operation as Query 3, but requests the output in descending order.  Again, we have to mentally reverse the Start and End conditions because of the bug, but otherwise the process is the same as always: SQL Server finds the highest-sorting record that meets the condition ‘key_col < 25’ and scans backward until ‘key_col >= 20’ is no longer true. One final point to note: seek operations always have the Ordered: True attribute.  This means that the operator always produces rows in a sorted order, either ascending or descending depending on how the index was defined, and whether the scan part of the operation is forward or backward.  You cannot rely on this sort order in your queries of course (you must always specify an ORDER BY clause if order is important) but SQL Server can make use of the sort order internally.  In the four queries above, the query optimizer was able to avoid an explicit Sort operator to honour the ORDER BY clause, for example. Multiple Seek Predicates As we saw yesterday, a single index seek plan operator can contain one or more seek predicates.  These seek predicates can either be all singleton seeks or all range scans – SQL Server does not mix them.  For example, you might expect the following query to contain two seek predicates, a singleton seek to find the single record in the unique index where key_col = 10, and a range scan to find the key_col values between 15 and 20: SELECT key_col FROM dbo.Example WHERE key_col = 10 OR key_col BETWEEN 15 AND 20 ORDER BY key_col ASC ; In fact, SQL Server transforms the singleton seek (key_col = 10) to the equivalent range scan, Start:[key_col >= 10], End:[key_col <= 10].  This allows both range scans to be evaluated by a single seek operator.  To be clear, this query results in two range scans: one from 10 to 10, and one from 15 to 20. Final Thoughts That’s it for today – tomorrow we’ll look at monitoring singleton lookups and range scans, and I’ll show you a seek on a heap table. Yes, a seek.  On a heap.  Not an index! If you would like to run the queries in this post for yourself, there’s a script below.  Thanks for reading! IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; -- ================ -- Singleton lookup -- ================ ; -- Single value equality seek in a unique index -- Scan count = 0 when STATISTIS IO is ON -- Check the XML SHOWPLAN SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 32 ; -- =========== -- Range Scans -- =========== ; -- Query 1 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col <= 5 ORDER BY E.key_col ASC ; -- Query 2 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col > 95 ORDER BY E.key_col DESC ; -- Query 3 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 10 AND E.key_col < 15 ORDER BY E.key_col ASC ; -- Query 4 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 20 AND E.key_col < 25 ORDER BY E.key_col DESC ; -- Final query (singleton + range = 2 range scans) SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 10 OR E.key_col BETWEEN 15 AND 20 ORDER BY E.key_col ASC ; -- === TIDY UP === DROP TABLE dbo.Example; © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • C#: String Concatenation vs Format vs StringBuilder

    - by James Michael Hare
    I was looking through my groups’ C# coding standards the other day and there were a couple of legacy items in there that caught my eye.  They had been passed down from committee to committee so many times that no one even thought to second guess and try them for a long time.  It’s yet another example of how micro-optimizations can often get the best of us and cause us to write code that is not as maintainable as it could be for the sake of squeezing an extra ounce of performance out of our software. So the two standards in question were these, in paraphrase: Prefer StringBuilder or string.Format() to string concatenation. Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals(). Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in. The first test was a pretty standard one.  When concattenating strings, what is the best choice: StringBuilder, string concattenation, or string.Format()? So before we being I read in a number of iterations from the console and a length of each string to generate.  Then I generate that many random strings of the given length and an array to hold the results.  Why am I so keen to keep the results?  Because I want to be able to snapshot the memory and don’t want garbage collection to collect the strings, hence the array to keep hold of them.  I also didn’t want the random strings to be part of the allocation, so I pre-allocate them and the array up front before the snapshot.  So in the code snippets below: num – Number of iterations. strings – Array of randomly generated strings. results – Array to hold the results of the concatenation tests. timer – A System.Diagnostics.Stopwatch() instance to time code execution. start – Beginning memory size. stop – Ending memory size. after – Memory size after final GC. So first, let’s look at the concatenation loop: 1: // build num strings using concattenation. 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = "This is test #" + i + " with a result of " + strings[i]; 5: } Pretty standard, right?  Next for string.Format(): 1: // build strings using string.Format() 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]); 5: }   Finally, StringBuilder: 1: // build strings using StringBuilder 2: for (int i = 0; i < num; i++) 3: { 4: var builder = new StringBuilder(); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: } So I take each of these loops, and time them by using a block like this: 1: // get the total amount of memory used, true tells it to run GC first. 2: start = System.GC.GetTotalMemory(true); 3:  4: // restart the timer 5: timer.Reset(); 6: timer.Start(); 7:  8: // *** code to time and measure goes here. *** 9:  10: // get the current amount of memory, stop the timer, then get memory after GC. 11: stop = System.GC.GetTotalMemory(false); 12: timer.Stop(); 13: other = System.GC.GetTotalMemory(true); So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations: 1: Operator + - Time: 547, Memory: 56104540/55595960 - 500000 2: string.Format() - Time: 749, Memory: 57295812/55595960 - 500000 3: StringBuilder - Time: 608, Memory: 55312888/55595960 – 500000   Egad!  string.Format brings up the rear and + triumphs, well, at least in terms of speed.  The concat burns more memory than StringBuilder but less than string.Format().  This shows two main things: StringBuilder is not always the panacea many think it is. The difference between any of the three is miniscule! The second point is extremely important!  You will often here people who will grasp at results and say, “look, operator + is 10% faster than StringBuilder so always use StringBuilder.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have preallocated my StringBuffer like so:   1: for (int i = 0; i < num; i++) 2: { 3: // pre-declare StringBuilder to have 100 char buffer. 4: var builder = new StringBuilder(100); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: }   Now let’s look at the times: 1: Operator + - Time: 551, Memory: 56104412/55595960 - 500000 2: string.Format() - Time: 753, Memory: 57296484/55595960 - 500000 3: StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000   Whoa!  All of the sudden StringBuilder is back on top again!  But notice, it takes more memory now.  This makes perfect sense if you examine the IL behind the scenes.  Whenever you do a string concat (+) in your code, it examines the lengths of the arguments and creates a StringBuilder behind the scenes of the appropriate size for you. But even IF we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take into account both readability and performance.  After all, consider all these timings are over 500,000 iterations.   That’s at best  0.0004 ms difference per call which is neglidgable at best.  The key is to pick the best tool for the job.  What do I mean?  Consider these awesome words of wisdom: Concatenate (+) is best at concatenating.  StringBuilder is best when you need to building. Format is best at formatting. Totally Earth-shattering, right!  But if you consider it carefully, it actually has a lot of beauty in it’s simplicity.  Remember, there is no magic bullet.  If one of these always beat the others we’d only have one and not three choices. The fact is, the concattenation operator (+) has been optimized for speed and looks the cleanest for joining together a known set of strings in the simplest manner possible. StringBuilder, on the other hand, excels when you need to build a string of inderterminant length.  Use it in those times when you are looping till you hit a stop condition and building a result and it won’t steer you wrong. String.Format seems to be the looser from the stats, but consider which of these is more readable.  Yes, ignore the fact that you could do this with ToString() on a DateTime.  1: // build a date via concatenation 2: var date1 = (month < 10 ? string.Empty : "0") + month + '/' 3: + (day < 10 ? string.Empty : "0") + '/' + year; 4:  5: // build a date via string builder 6: var builder = new StringBuilder(10); 7: if (month < 10) builder.Append('0'); 8: builder.Append(month); 9: builder.Append('/'); 10: if (day < 10) builder.Append('0'); 11: builder.Append(day); 12: builder.Append('/'); 13: builder.Append(year); 14: var date2 = builder.ToString(); 15:  16: // build a date via string.Format 17: var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year); 18:  So the strength in string.Format is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format does. So my lesson is, don’t look for the silver bullet!  Choose the best tool.  Micro-optimization almost always bites you in the end because you’re sacrificing readability for performance, which is almost exactly the wrong choice 90% of the time. I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them: For Beginners: Do not optimize. For Experts: Do not optimize yet. It’s so true.  Most of the time on today’s modern hardware, a micro-second optimization at the sake of readability will net you nothing because it won’t be your bottleneck.  Code for readability, choose the best tool for the job which will usually be the most readable and maintainable as well.  Then, and only then, if you need that extra performance boost after profiling your code and exhausting all other options… then you can start to think about optimizing.

    Read the article

  • Overriding GetHashCode in a mutable struct - What NOT to do?

    - by Kyle Baran
    I am using the XNA Framework to make a learning project. It has a Point struct which exposes an X and Y value; for the purpose of optimization, it breaks the rules for proper struct design, since its a mutable struct. As Marc Gravell, John Skeet, and Eric Lippert point out in their respective posts about GetHashCode() (which Point overrides), this is a rather bad thing, since if an object's values change while its contained in a hashmap (ie, LINQ queries), it can become "lost". However, I am making my own Point3D struct, following the design of Point as a guideline. Thus, it too is a mutable struct which overrides GetHashCode(). The only difference is that mine exposes and int for X, Y, and Z values, but is fundamentally the same. The signatures are below: public struct Point3D : IEquatable<Point3D> { public int X; public int Y; public int Z; public static bool operator !=(Point3D a, Point3D b) { } public static bool operator ==(Point3D a, Point3D b) { } public Point3D Zero { get; } public override int GetHashCode() { } public override bool Equals(object obj) { } public bool Equals(Point3D other) { } public override string ToString() { } } I have tried to break my struct in the way they describe, namely by storing it in a List<Point3D>, as well as changing the value via a method using ref, but I did not encounter they behavior they warn about (maybe a pointer might allow me to break it?). Am I being too cautious in my approach, or should I be okay to use it as is?

    Read the article

  • Is this kind of design - a class for Operations On Object - correct?

    - by Mithir
    In our system we have many complex operations which involve many validations and DB activities. One of the main Business functionality could have been designed better. In short, there were no separation of layers, and the code would only work from the scenario in which it was first designed at, and now there were more scenarios (like requests from an API or from other devices) So I had to redesign. I found myself moving all the DB code to objects which acts like Business to DB objects, and I've put all the business logic in an Operator kind of a class, which I've implemented like this: First, I created an object which will hold all the information needed for the operation let's call it InformationObject. Then I created an OperatorObject which will take the InformationObject as a parameter and act on it. The OperatorObject should activate different objects and validate or check for existence or any scenario in which the business logic is compromised and then make the operation according to the information on the InformationObject. So my question is - Is this kind of implementation correct? PS, this Operator only works on a single Business-wise Operation.

    Read the article

  • The best way to have a pointer to several methods - critique requested

    - by user827992
    I'm starting with a short introduction of what i know from the C language: a pointer is a type that stores an adress or a NULL the * operator reads the left value of the variable on its right and use this value as address and reads the value of the variable at that address the & operator generate a pointer to the variable on its right so i was thinking that in C++ the pointers can work this way too, but i was wrong, to generate a pointer to a static method i have to do this: #include <iostream> class Foo{ public: static void dummy(void){ std::cout << "I'm dummy" << std::endl; }; }; int main(){ void (*p)(); p = Foo::dummy; // step 1 p(); p = &(Foo::dummy); // step 2 p(); p = Foo; // step 3 p->dummy(); return(0); } now i have several questions: why step 1 works why step 2 works too, looks like a "pointer to pointer" for p to me, very different from step 1 why step 3 is the only one that doesn't work and is the only one that makes some sort of sense to me, honestly how can i write an array of pointers or a pointer to pointers structure to store methods ( static or non-static from real objects ) what is the best syntax and coding style for generating a pointer to a method?

    Read the article

  • Using prefix incremented loops in C#

    - by KChaloux
    Back when I started programming in college, a friend encouraged me to use the prefix incrementation operator ++i instead of the postfix i++, citing that there was a slight chance of better performance with no real chance of a downside. I realize this is true in C++, and it's become a general habit that I continue to do. I'm led to believe that it makes little to no difference when used in a loop in C#, regardless of data type. Apparently the ++ operator can't be overridden. Nevertheless, I like the appearance more, and don't see a direct downside to it. It did astonish a coworker just a moment ago though, he made the (fairly logical) assumption that my loop would terminate early as a result. He's a self-taught programmer, and apparently never came across the C++ convention. That made me question whether or not the equivalent behavior of pre- and post-fix increment and decrement operators in loops is well known enough. Is it acceptable for me to continue using ++i in looping constructs because of style preference, even though it has no real performance benefit? Or is it likely to cause confusion amongst other programmers? Note: This is assuming the ++i convention is used consistently throughout all code.

    Read the article

  • Design Pattern for Skipping Steps in a Wizard

    - by Eric J.
    I'm designing a flexible Wizard system that presents a number of screens to complete a task. Some screens may need to be skipped based on answers to prompts on one or more previous screens. The conditions to skip a given screen need to be editable by a non-technical user via a UI. Multiple conditions need only be combined with and. I have an initial design in mind, but it feels inelegant. I wonder if there's a better way to approach this class of problem. Initial Design UI where The first column allows the user to select a question from a previous screen. The second column allows the user to select an operator applicable to the type of question asked. The third column allows the user to enter one or more values depending on the selected operator. Object Model public enum Operations { ... } public class Condition { int QuestionId { get; set; } Operations Operation { get; set; } List<object> Parameters { get; private set; } } List<Condition> pageSkipConditions; Controller Logic bool allConditionsTrue = pageSkipConditions.Count > 0; foreach (Condition c in pageSkipConditions) { allConditionsTrue &= Evaluate(previousAnswers, c); } // ... private bool Evaluate(List<Answers> previousAnswers, Condition c) { switch (c.Operation) { case Operations.StartsWith: // logic for this operation // etc. } }

    Read the article

< Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51  | Next Page >