Search Results

Search found 4815 results on 193 pages for 'parameterized queries'.


  • Changes to the LINQ-to-StreamInsight Dialect

    - by Roman Schindlauer
    In previous versions of StreamInsight (1.0 through 2.0), CepStream<> represents temporal streams of many varieties:

      - Streams with ‘open’ inputs (e.g., those defined and composed over CepStream<T>.Create(string streamName))
      - Streams with ‘partially bound’ inputs (e.g., those defined and composed over CepStream<T>.Create(Type adapterFactory, …))
      - Streams with fully bound inputs (e.g., those defined and composed over To*Stream – sequences or DQC)
          - The stream may be embedded (where Server.Create is used)
          - The stream may be remote (where Server.Connect is used)

    When adding support for new programming primitives in StreamInsight 2.1, we faced a choice: add a fourth variety (use CepStream<> to represent streams that are bound to the new programming model constructs), or introduce a separate type that represents temporal streams in the new user model. We opted for the latter. Introducing a new type reduces the number of (confusing) runtime failures caused by using CepStream<> instances in the wrong context.

    The new types are:

      - IStreamable<>, which logically represents a temporal stream.
      - IQStreamable<> : IStreamable<>, which represents a queryable temporal stream. Its relationship to IStreamable<> is analogous to the relationship of IQueryable<> to IEnumerable<>. The developer can compose temporal queries over remote stream sources using this type.

    The syntax of temporal queries composed over IQStreamable<> is mostly consistent with the syntax of our existing CepStream<>-based LINQ provider. However, we have taken the opportunity to refine certain aspects of the language surface. Differences are outlined below. Because 2.1 introduces new types to represent temporal queries, the changes outlined in this post do not impact existing StreamInsight applications using the existing types!

    SelectMany

    StreamInsight does not support the SelectMany operator in its usual form (which is analogous to SQL’s “CROSS APPLY” operator):

        static IEnumerable<R> SelectMany<T, R>(this IEnumerable<T> source, Func<T, IEnumerable<R>> collectionSelector)

    It instead uses SelectMany as a convenient syntactic representation of an inner join. The parameter to the selector function is thus unavailable. Because the parameter isn’t supported, its type in StreamInsight 1.0 – 2.0 wasn’t carefully scrutinized. Unfortunately, the type chosen for the parameter is nonsensical to LINQ programmers:

        static CepStream<R> SelectMany<T, R>(this CepStream<T> source, Expression<Func<CepStream<T>, CepStream<R>>> streamSelector)

    Using Unit as the type for the parameter accurately reflects StreamInsight’s capabilities:

        static IQStreamable<R> SelectMany<T, R>(this IQStreamable<T> source, Expression<Func<Unit, IQStreamable<R>>> streamSelector)

    For queries that succeed – that is, queries that do not reference the stream selector parameter – there is no difference between the code written for the two overloads:

        from x in xs
        from y in ys
        select f(x, y)

    Top-K

    The Take operator used in StreamInsight causes confusion for LINQ programmers because it is applied to the (unbounded) stream rather than the (bounded) window, suggesting that the query as a whole will return k rows:

        (from win in xs.SnapshotWindow()
         from x in win
         orderby x.A
         select x.B).Take(k)

    The use of SelectMany is also unfortunate in this context because it implies the availability of the window parameter within the remainder of the comprehension.
    The following compiles but fails at runtime:

        (from win in xs.SnapshotWindow()
         from x in win
         orderby x.A
         select win).Take(k)

    The Take operator in 2.1 is applied to the window rather than the stream:

    Before:
        (from win in xs.SnapshotWindow()
         from x in win
         orderby x.A
         select x.B).Take(k)
    After:
        from win in xs.SnapshotWindow()
        from b in
            (from x in win
            orderby x.A
            select x.B).Take(k)
        select b

    Multicast

    We are introducing an explicit multicast operator in order to preserve expression identity, which is important given the semantics about moving code to and from StreamInsight. This also better matches existing LINQ dialects, such as Reactive. This pattern enables expressing multicasting in two ways:

    Implicit:
        var ys = from x in xs
                 where x.A > 1
                 select x;
        var zs = from y1 in ys
                 from y2 in ys.ShiftEventTime(_ => TimeSpan.FromSeconds(1))
                 select y1 + y2;
    Explicit:
        var ys = from x in xs
                 where x.A > 1
                 select x;
        var zs = ys.Multicast(ys1 =>
            from y1 in ys1
            from y2 in ys1.ShiftEventTime(_ => TimeSpan.FromSeconds(1))
            select y1 + y2);

    Notice that the product translates an expression using implicit multicast into an expression using the explicit multicast operator. The user does not see this translation.

    Default window policies

    Only default window policies are supported in the new surface. Other policies can be simulated by using AlterEventLifetime.
    Before:
        xs.SnapshotWindow(
            WindowInputPolicy.ClipToWindow,
            SnapshotWindowInputPolicy.Clip)
    After:
        xs.SnapshotWindow()

    Before:
        xs.TumblingWindow(
            TimeSpan.FromSeconds(1),
            HoppingWindowOutputPolicy.PointAlignToWindowEnd)
    After:
        xs.TumblingWindow(
            TimeSpan.FromSeconds(1))

    Before:
        xs.TumblingWindow(
            TimeSpan.FromSeconds(1),
            HoppingWindowOutputPolicy.ClipToWindowEnd)
    After:
        Not supported

    …

    LeftAntiJoin

    Representation of LASJ as a correlated sub-query in the LINQ surface is problematic as the StreamInsight engine does not support correlated sub-queries (see discussion of SelectMany). The current syntax requires the introduction of an otherwise unsupported ‘IsEmpty()’ operator. As a result, the pattern is not discoverable and implies capabilities not present in the server. The direct representation of LASJ is used instead:

    Before:
        from x in xs
        where
            (from y in ys
            where x.A > y.B
            select y).IsEmpty()
        select x
    After:
        xs.LeftAntiJoin(ys, (x, y) => x.A > y.B)

    Before:
        from x in xs
        where
            (from y in ys
            where x.A == y.B
            select y).IsEmpty()
        select x
    After:
        xs.LeftAntiJoin(ys, x => x.A, y => y.B)

    ApplyWithUnion

    The ApplyWithUnion methods have been deprecated since their signatures are redundant given the standard SelectMany overloads:

    Before:
        xs.GroupBy(x => x.A).ApplyWithUnion(gs => from win in gs.SnapshotWindow() select win.Count())
    After:
        xs.GroupBy(x => x.A).SelectMany(
            gs =>
            from win in gs.SnapshotWindow()
            select win.Count())

    Before:
        xs.GroupBy(x => x.A).ApplyWithUnion(gs => from win in gs.SnapshotWindow() select win.Count(), r => new { r.Key, Count = r.Payload })
    After:
        from x in xs
        group x by x.A into gs
        from win in gs.SnapshotWindow()
        select new { gs.Key, Count = win.Count() }

    Alternate UDO syntax

    The representation of UDOs in the StreamInsight LINQ dialect confuses cardinalities.
    Based on the semantics of user-defined operators in StreamInsight, one would expect to construct queries in the following form:

        from win in xs.SnapshotWindow()
        from y in MyUdo(win)
        select y

    Instead, the UDO proxy method is referenced within a projection, and the (many) results returned by the user code are automatically flattened into a stream:

        from win in xs.SnapshotWindow()
        select MyUdo(win)

    The “many-or-one” confusion is exemplified by the following query, which compiles but fails at runtime:

        from win in xs.SnapshotWindow()
        select MyUdo(win) + win.Count()

    The above query must fail because the UDO is in fact returning many values per window while the count aggregate is returning one.

    Original syntax:
        from win in xs.SnapshotWindow()
        select win.UdoProxy(1)
    New alternate syntax:
        from win in xs.SnapshotWindow()
        from y in win.UserDefinedOperator(() => new Udo(1))
        select y
      -or-
        from win in xs.SnapshotWindow()
        from y in win.UdoMacro(1)
        select y

    Notice that this formulation also sidesteps the dynamic type pitfalls of the existing “proxy method” approach to UDOs, in which the type of the UDO implementation (TInput, TOutput) and the type of its constructor arguments (TConfig) need to align in a precise and non-obvious way with the argument and return types for the corresponding proxy method.

    UDSO syntax

    UDSO currently leverages the DataContractSerializer to clone initial state for logical instances of the user operator. Initial state will instead be described by an expression in the new LINQ surface.

    Before:
        xs.Scan(new Udso())
    After:
        xs.Scan(() => new Udso())

    Name changes

      - ShiftEventTime => AlterEventStartTime: The alter event lifetime overload taking a new start time value has been renamed.
      - CountByStartTimeWindow => CountWindow
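
    To make the Top-K change concrete, here is a minimal sketch in plain Python (it is not the StreamInsight API). It models each window as a list of events and applies "order by A, project B, take k" inside every window, which is the behaviour the 2.1 form describes. The names `Event` and `take_per_window` and the sample data are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    a: int  # sort key, standing in for x.A in the queries above
    b: str  # projected payload, standing in for x.B


def take_per_window(windows: list[list[Event]], k: int) -> list[list[str]]:
    """Order each window by A, project B, and keep the first k results per window."""
    return [[e.b for e in sorted(win, key=lambda e: e.a)[:k]] for win in windows]


# Two hypothetical snapshot windows over the same stream.
windows = [
    [Event(3, "c"), Event(1, "a"), Event(2, "b")],
    [Event(9, "z"), Event(7, "x")],
]
top2 = take_per_window(windows, 2)
```

    Each window yields at most k rows, so the query as a whole can return many more than k rows over time. That is the property the older syntax, with Take wrapped around the whole comprehension, tended to hide.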

    Read the article

  • MySQL Memory usage

    - by Rob Stevenson-Leggett
    Our MySQL server seems to be using a lot of memory. I've tried looking for slow queries and queries with no index and have halved the peak CPU usage and Apache memory usage but the MySQL memory stays constantly at 2.2GB (~51% of available memory on the server). Here's the graph from Plesk. Running top in the SSH window shows the same figures. Does anyone have any ideas on why the memory usage is constant like this and not peaks and troughs with usage of the app?

    Here's the output of the MySQL Tuning Primer script:

        -- MYSQL PERFORMANCE TUNING PRIMER --
        - By: Matthew Montgomery -

        MySQL Version 5.0.77-log x86_64
        Uptime = 1 days 14 hrs 4 min 21 sec
        Avg. qps = 22
        Total Questions = 3059456
        Threads Connected = 13
        Warning: Server has not been running for at least 48hrs.
        It may not be safe to use these recommendations
        To find out more information on how each of these runtime variables effects performance visit:
        http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html
        Visit http://www.mysql.com/products/enterprise/advisors.html for info about MySQL's Enterprise Monitoring and Advisory Service

        SLOW QUERIES
        The slow query log is enabled.
        Current long_query_time = 1 sec.
        You have 6 out of 3059477 that take longer than 1 sec. to complete
        Your long_query_time seems to be fine

        BINARY UPDATE LOG
        The binary update log is NOT enabled.
        You will not be able to do point in time recovery
        See http://dev.mysql.com/doc/refman/5.0/en/point-in-time-recovery.html

        WORKER THREADS
        Current thread_cache_size = 0
        Current threads_cached = 0
        Current threads_per_sec = 2
        Historic threads_per_sec = 0
        Threads created per/sec are overrunning threads cached
        You should raise thread_cache_size

        MAX CONNECTIONS
        Current max_connections = 100
        Current threads_connected = 14
        Historic max_used_connections = 20
        The number of used connections is 20% of the configured maximum.
        Your max_connections variable seems to be fine.

        INNODB STATUS
        Current InnoDB index space = 6 M
        Current InnoDB data space = 18 M
        Current InnoDB buffer pool free = 0 %
        Current innodb_buffer_pool_size = 8 M
        Depending on how much space your innodb indexes take up it may be safe to increase this value to up to 2 / 3 of total system memory

        MEMORY USAGE
        Max Memory Ever Allocated : 2.07 G
        Configured Max Per-thread Buffers : 274 M
        Configured Max Global Buffers : 2.01 G
        Configured Max Memory Limit : 2.28 G
        Physical Memory : 3.84 G
        Max memory limit seem to be within acceptable norms

        KEY BUFFER
        Current MyISAM index space = 4 M
        Current key_buffer_size = 7 M
        Key cache miss rate is 1 : 40
        Key buffer free ratio = 81 %
        Your key_buffer_size seems to be fine

        QUERY CACHE
        Query cache is supported but not enabled
        Perhaps you should set the query_cache_size

        SORT OPERATIONS
        Current sort_buffer_size = 2 M
        Current read_rnd_buffer_size = 256 K
        Sort buffer seems to be fine

        JOINS
        Current join_buffer_size = 132.00 K
        You have had 16 queries where a join could not use an index properly
        You should enable "log-queries-not-using-indexes"
        Then look for non indexed joins in the slow query log.
        If you are unable to optimize your queries you may want to increase your join_buffer_size to accommodate larger joins in one pass.
        Note! This script will still suggest raising the join_buffer_size when ANY joins not using indexes are found.

        OPEN FILES LIMIT
        Current open_files_limit = 1024 files
        The open_files_limit should typically be set to at least 2x-3x that of table_cache if you have heavy MyISAM usage.
        Your open_files_limit value seems to be fine

        TABLE CACHE
        Current table_cache value = 64 tables
        You have a total of 426 tables
        You have 64 open tables.
        Current table_cache hit rate is 1% , while 100% of your table cache is in use
        You should probably increase your table_cache

        TEMP TABLES
        Current max_heap_table_size = 16 M
        Current tmp_table_size = 32 M
        Of 15134 temp tables, 9% were created on disk
        Effective in-memory tmp_table_size is limited to max_heap_table_size.
        Created disk tmp tables ratio seems fine

        TABLE SCANS
        Current read_buffer_size = 128 K
        Current table scan ratio = 2915 : 1
        read_buffer_size seems to be fine

        TABLE LOCKING
        Current Lock Wait ratio = 1 : 142213
        Your table locking seems to be fine

    The app is a Facebook game with about 50-100 concurrent users.

    Thanks, Rob
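
    The MEMORY USAGE figures in the primer output can be cross-checked with a little arithmetic. The sketch below is only a reading of the reported numbers, not the primer script's own code: it assumes the 274 M "Configured Max Per-thread Buffers" value is already the total across all allowed connections, because that is the interpretation under which the figures add up. The function name is hypothetical.

```python
MB_PER_GB = 1024


def max_memory_limit_gb(global_buffers_gb: float, per_thread_total_mb: float) -> float:
    """Global buffers plus the total of all per-thread buffers, in GB."""
    return global_buffers_gb + per_thread_total_mb / MB_PER_GB


# Values copied from the primer output above.
limit_gb = max_memory_limit_gb(global_buffers_gb=2.01, per_thread_total_mb=274)

# Fraction of the reported 3.84 G of physical memory that the limit represents.
share_of_ram = limit_gb / 3.84
```

    Under that assumption the sum matches the reported 2.28 G limit, which is a little under 60% of physical memory. Almost all of it comes from the 2.01 G of configured global buffers rather than from per-connection buffers.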

    Read the article

  • need assistance with my.cnf - 1500% CPU usage

    - by Alan Long
    I'm running into a few issues with our new database server. It is an HP G8 with 2 Intel Xeon E5-2650 processors and 32GB of RAM. This server is dedicated as a MySQL server (5.1.69) for our intranet portal. I have been having issues with this server staying alive: I notice high CPU usage during certain times of day (8% ~ 1500%+) and see very low memory usage (7 ~ 15%) based on using the 'top' command. When the CPU usage passes 1000%, that is when the app usually dies. I'm trying to see what I'm doing wrong with the config file; hopefully one of the experts can chime in and let me know what they think.

    See below for my.cnf file:

        [mysqld]
        default-storage-engine=InnoDB
        datadir=/var/lib/mysql
        socket=/var/lib/mysql/mysql.sock
        #user=mysql
        large-pages
        # Disabling symbolic-links is recommended to prevent assorted security risks
        symbolic-links=0
        max_connections=275
        tmp_table_size=1G
        key_buffer_size=384M
        key_buffer=384M
        thread_cache_size=1024
        long_query_time=5
        low_priority_updates=1
        max_heap_table_size=1G
        myisam_sort_buffer_size=8M
        concurrent_insert=2
        table_cache=1024
        sort_buffer_size=8M
        read_buffer_size=5M
        read_rnd_buffer_size=6M
        join_buffer_size=16M
        table_definition_cache=6k
        open_files_limit=8k
        slow_query_log
        #skip-name-resolve

        # Innodb Settings
        innodb_buffer_pool_size=18G
        innodb_thread_concurrency=0
        innodb_log_file_size=1G
        innodb_log_buffer_size=16M
        innodb_flush_log_at_trx_commit=2
        innodb_lock_wait_timeout=50
        innodb_file_per_table
        #innodb_buffer_pool_instances=4
        #eliminating double buffering
        innodb_flush_method = O_DIRECT
        flush_time=86400
        innodb_additional_mem_pool_size=40M
        #innodb_io_capacity = 5000
        #innodb_read_io_threads = 64
        #innodb_write_io_threads = 64

        # increase until threads_created doesnt grow anymore
        thread_cache=1024
        query_cache_type=1
        query_cache_limit=4M
        query_cache_size=256M

        # Try number of CPU's*2 for thread_concurrency
        thread_concurrency = 0
        wait_timeout = 1800
        connect_timeout = 10
        interactive_timeout = 60

        [mysqldump]
        max_allowed_packet=32M

        [mysqld_safe]
        log-error=/var/log/mysqld.log
        pid-file=/var/run/mysqld/mysqld.pid
        log-slow-queries=/var/log/mysql/slow-queries.log
        long_query_time = 1
        log-queries-not-using-indexes

    We connect to one database with 75 tables; the largest table has 1,150,000 entries and the second largest has 128,036 entries. I have also verified that our PHP queries are optimized as best as possible.

    Reference - MySQLtuner:

        >> MySQLTuner 1.2.0 - Major Hayden <[email protected]>
        >> Bug reports, feature requests, and downloads at http://mysqltuner.com/
        >> Run with '--help' for additional options and output filtering

        -------- General Statistics --------------------------------------------------
        [--] Skipped version check for MySQLTuner script
        [OK] Currently running supported MySQL version 5.1.69-log
        [OK] Operating on 64-bit architecture

        -------- Storage Engine Statistics -------------------------------------------
        [--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
        [--] Data in InnoDB tables: 420M (Tables: 75)
        [!!] Total fragmented tables: 75

        -------- Security Recommendations -------------------------------------------
        [!!] User '[email protected]' has no password set.

        -------- Performance Metrics -------------------------------------------------
        [--] Up for: 1h 14m 50s (8M q [1K qps], 705 conn, TX: 6B, RX: 892M)
        [--] Reads / Writes: 68% / 32%
        [--] Total buffers: 19.7G global + 35.2M per thread (275 max threads)
        [!!] Maximum possible memory usage: 29.1G (93% of installed RAM)
        [OK] Slow queries: 0% (472/8M)
        [OK] Highest usage of available connections: 66% (183/275)
        [OK] Key buffer size / total MyISAM indexes: 384.0M/91.0K
        [OK] Key buffer hit rate: 100.0% (173 cached / 0 reads)
        [OK] Query cache efficiency: 96.2% (7M cached / 7M selects)
        [!!] Query cache prunes per day: 553614
        [OK] Sorts requiring temporary tables: 0% (3 temp sorts / 1K sorts)
        [!!] Temporary tables created on disk: 49% (3K on disk / 7K total)
        [OK] Thread cache hit rate: 74% (183 created / 705 connections)
        [OK] Table cache hit rate: 97% (231 open / 238 opened)
        [OK] Open file limit used: 0% (17/8K)
        [OK] Table locks acquired immediately: 100% (432K immediate / 432K locks)
        [OK] InnoDB data size / buffer pool: 420.9M/18.0G

        -------- Recommendations -----------------------------------------------------
        General recommendations:
            Run OPTIMIZE TABLE to defragment tables for better performance
            MySQL started within last 24 hours - recommendations may be inaccurate
            Reduce your overall MySQL memory footprint for system stability
            Increasing the query_cache size over 128M may reduce performance
            Temporary table size is already large - reduce result set size
            Reduce your SELECT DISTINCT queries without LIMIT clauses
        Variables to adjust:
            *** MySQL's maximum memory usage is dangerously high ***
            *** Add RAM before increasing MySQL buffer variables ***
            query_cache_size (> 256M) [see warning above]

    Thanks in advance for your help!
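
    The "Maximum possible memory usage" warning can be approximately reproduced from the my.cnf values in the question. The sketch below is an illustration under stated assumptions, not MySQLTuner's code. Which variables count as global and which as per-thread is inferred from the fact that these sums match the reported 19.7G and 35.2M. The 256 KB thread_stack value is also an assumption, since it does not appear in the posted config.

```python
MB = 1
GB = 1024 * MB

# Global buffers, taken from the posted my.cnf (all values in MB).
global_buffers_mb = {
    "innodb_buffer_pool_size": 18 * GB,
    "innodb_additional_mem_pool_size": 40 * MB,
    "innodb_log_buffer_size": 16 * MB,
    "key_buffer_size": 384 * MB,
    "query_cache_size": 256 * MB,
    "tmp_table_size": 1 * GB,
}

# Buffers that may be allocated once per connection (values in MB).
per_thread_buffers_mb = {
    "sort_buffer_size": 8,
    "read_buffer_size": 5,
    "read_rnd_buffer_size": 6,
    "join_buffer_size": 16,
    "thread_stack": 0.25,  # assumption: not present in the posted my.cnf
}

max_connections = 275

global_total_mb = sum(global_buffers_mb.values())
per_thread_total_mb = sum(per_thread_buffers_mb.values())

# Worst case: every allowed connection allocates all of its buffers at once.
max_possible_gb = (global_total_mb + per_thread_total_mb * max_connections) / GB
```

    Roughly a third of the worst-case figure comes from the per-thread buffers multiplied by 275 connections. That is one reason large sort, read, and join buffer sizes are risky when max_connections is high.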

    Read the article

  • Accessing and Updating Data in ASP.NET: Filtering Data Using a CheckBoxList

    Filtering Database Data with Parameters, an earlier installment in this article series, showed how to filter the data returned by ASP.NET's data source controls. In a nutshell, the data source controls can include parameterized queries whose parameter values are defined via parameter controls. For example, the SqlDataSource can include a parameterized SelectCommand, such as: SELECT * FROM Books WHERE Price > @Price. Here, @Price is a parameter; the value for a parameter can be defined declaratively using a parameter control. ASP.NET offers a variety of parameter controls, including ones that use hard-coded values, ones that retrieve values from the querystring, and ones that retrieve values from session, among others.

    Perhaps the most useful parameter control is the ControlParameter, which retrieves its value from a Web control on the page. Using the ControlParameter we can filter the data returned by the data source control based on the end user's input. While the ControlParameter works well with most types of Web controls, it does not work as expected with the CheckBoxList control. The ControlParameter is designed to retrieve a single property value from the specified Web control, but the CheckBoxList control does not have a property that returns all of the values of its selected items in a form that the ControlParameter can use. Moreover, if you are using the selected CheckBoxList items to query a database you'll quickly find that SQL does not offer out-of-the-box functionality for filtering results based on a user-supplied list of filter criteria.

    The good news is that with a little bit of effort it is possible to filter data based on the end user's selections in a CheckBoxList control. This article starts with a look at how to get SQL to filter data based on a user-supplied, comma-delimited list of values. Next, it shows how to programmatically construct a comma-delimited list that represents the selected CheckBoxList values and pass that list into the SQL query. Finally, we'll explore creating a custom parameter control to handle this logic declaratively. Read on to learn more!
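
    The underlying problem, filtering on a user-supplied list while keeping the query parameterized, can be sketched outside ASP.NET. The example below uses Python's standard sqlite3 module and is not the article's technique. It splits the comma-delimited list in application code and generates one placeholder per value, so user input never becomes part of the SQL text. The Genre column, the sample rows, and the function name are hypothetical.

```python
import sqlite3


def books_in_genres(conn: sqlite3.Connection, csv_genres: str) -> list[str]:
    """Return titles whose Genre is in a comma-delimited list, using bound parameters."""
    genres = [g.strip() for g in csv_genres.split(",") if g.strip()]
    if not genres:
        return []  # nothing selected, so there is nothing to filter on
    # Only the placeholder marks are formatted into the SQL; values are bound separately.
    placeholders = ", ".join("?" for _ in genres)
    sql = f"SELECT Title FROM Books WHERE Genre IN ({placeholders}) ORDER BY Title"
    return [row[0] for row in conn.execute(sql, genres)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Books (Title TEXT, Genre TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Books VALUES (?, ?, ?)",
    [("Dune", "Fiction", 9.99), ("Atlas", "Travel", 24.50), ("SQL 101", "Technology", 39.00)],
)
selected = books_in_genres(conn, "Fiction, Travel")
```

    A value such as `x') OR 1=1 --` is simply compared against the Genre column as a literal string, which is the main reason to prefer bound parameters over concatenating the list into the query.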

    Read the article

  • SQL Server Interview Questions

    - by Rodney Vinyard
    User-Defined Functions

    Scalar User-Defined Function

    A Scalar user-defined function returns one of the scalar data types. Text, ntext, image and timestamp data types are not supported. These are the type of user-defined functions that most developers are used to in other programming languages.

    Table-Valued User-Defined Function

    An Inline Table-Valued user-defined function returns a table data type and is an exceptional alternative to a view, as the user-defined function can pass parameters into a T-SQL select command and in essence provide us with a parameterized, non-updateable view of the underlying tables.

    Multi-Statement Table-Valued User-Defined Function

    A Multi-Statement Table-Valued user-defined function returns a table and is also an exceptional alternative to a view, as the function can support multiple T-SQL statements to build the final result where the view is limited to a single SELECT statement. Also, the ability to pass parameters into a T-SQL select command or a group of them gives us the capability to in essence create a parameterized, non-updateable view of the data in the underlying tables. Within the create function command you must define the table structure that is being returned. After creating this type of user-defined function, I can use it in the FROM clause of a T-SQL command, unlike a stored procedure, which can also return record sets.

    Read the article

  • How to dynamically modify NHibernate load queries at runtime? EventListeners? Interceptors?

    - by snicker
    I need to modify the query used to load many-to-one references in my model. Specifically, I need to be able to further filter this data. Unfortunately, NH will not allow me to filter many-to-one relationships using the built-in filtering system (?). I could just be doing something incorrect. Is there a hook where I can manually and dynamically modify the query used to load the data? Or an alternative to filters that will allow me to specify parameters?

    Background: I am working with a database that is using a form of revision control, with each entity having a natural ID PK, an EntityId, a RevisionValidTo and a RevisionValidFrom field. There may be many rows using the same EntityId, which is the reference for other tables to join on, but the revision ranges are mutually exclusive. Thus, the relationship is many-to-one if and only if the filter is applied. However, NH offers no way to specify a filter on many-to-one references (it does for collections...)
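
    The revision scheme described in the background can be illustrated without NHibernate. The sketch below uses Python's standard sqlite3 module to show why the reference resolves to exactly one row only when a point-in-time filter is applied. The Customer table, its data, the half-open interval convention (valid from inclusive, valid to exclusive), and the function name are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    " Id INTEGER PRIMARY KEY, EntityId INTEGER, Name TEXT,"
    " RevisionValidFrom TEXT, RevisionValidTo TEXT)"
)
conn.executemany(
    "INSERT INTO Customer (EntityId, Name, RevisionValidFrom, RevisionValidTo) VALUES (?, ?, ?, ?)",
    [
        (7, "Acme", "2009-01-01", "2010-01-01"),
        (7, "Acme Ltd", "2010-01-01", "2011-01-01"),
        (8, "Globex", "2009-06-01", "2011-01-01"),
    ],
)


def load_revision(conn: sqlite3.Connection, entity_id: int, as_of: str) -> list[str]:
    """Names of the revisions of entity_id that are valid at the given ISO date."""
    sql = (
        "SELECT Name FROM Customer"
        " WHERE EntityId = ? AND RevisionValidFrom <= ? AND ? < RevisionValidTo"
    )
    return [row[0] for row in conn.execute(sql, (entity_id, as_of, as_of))]


unfiltered = conn.execute("SELECT COUNT(*) FROM Customer WHERE EntityId = 7").fetchone()[0]
current = load_revision(conn, 7, "2010-06-01")
```

    Without the date predicate, EntityId 7 matches two rows. With it, exactly one row matches because the revision ranges do not overlap. This is the extra parameterized condition the question wants injected into the many-to-one load query.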

    Read the article

  • LINQ to SQL Profiler

    In this article we will be taking a look at the new LINQ to SQL Profiler from HibernatingRhinos. This tool gives you a view into the goings on of LINQ to SQL. Not only does it allow you to see the SQL that is generated by your LINQ queries but it also shows you information about your connections, queries, as well as alerting you to all sorts of information that you might otherwise not know about.

    Read the article

  • SSAS DMVs: useful links

    - by Davide Mauri
    From time to time I need to extract metadata from the Analysis Services DMVs in order to quickly get an overview of the entire situation and/or drill down to the detail level. As a memo, here are the links I use most when I need documentation on the SSAS Objects Data DMVs:

      - SSAS: Using DMV Queries to get Cube Metadata
        http://bennyaustin.wordpress.com/2011/03/01/ssas-dmv-queries-cube-metadata/
      - SSAS DMV (Dynamic Management View)
        http://dwbi1.wordpress.com/2010/01/01/ssas-dmv-dynamic-management-view/
      - Use Dynamic Management Views (DMVs) to Monitor Analysis Services
        http://msdn.microsoft.com/en-us/library/hh230820.aspx

    Read the article

  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. Query tuning is not complete as soon as the query returns results quickly in the development or test environments. In production, your query will compete for memory, CPU, locks, I/O and other resources on the server. Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL.

    As always, we’ll need some example data. In fact, we are going to use three tables today, all structured the same way. Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row. The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type. A script to create a database with the three tables and load the sample data follows:

        USE master;
        GO
        IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest;
        GO
        CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN;
        GO
        ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB );
        GO
        ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB );
        GO
        ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ;
        ALTER DATABASE SortTest SET AUTO_CLOSE OFF ;
        ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ;
        ALTER DATABASE SortTest SET AUTO_SHRINK OFF ;
        ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ;
        ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ;
        ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ;
        ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ;
        ALTER DATABASE SortTest SET MULTI_USER ;
        ALTER DATABASE SortTest SET RECOVERY SIMPLE ;
        USE SortTest;
        GO
        CREATE TABLE dbo.TestCHAR
        (
            id INTEGER IDENTITY (1,1) NOT NULL,
            padding CHAR(3999) NOT NULL,
            CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id),
        ) ;
        CREATE TABLE dbo.TestMAX
        (
            id INTEGER IDENTITY (1,1) NOT NULL,
            padding VARCHAR(MAX) NOT NULL,
            CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id),
        ) ;
        CREATE TABLE dbo.TestTEXT
        (
            id INTEGER IDENTITY (1,1) NOT NULL,
            padding TEXT NOT NULL,
            CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id),
        ) ;
        -- =============
        -- Load TestCHAR (about 3s)
        -- =============
        INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding )
        SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999)
        FROM
        (
            SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1
            FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3
            ORDER BY n ASC
        ) AS Data
        ORDER BY Data.n ASC ;
        -- ============
        -- Load TestMAX (about 3s)
        -- ============
        INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding )
        SELECT CONVERT(VARCHAR(MAX), padding)
        FROM dbo.TestCHAR
        ORDER BY id ;
        -- =============
        -- Load TestTEXT (about 5s)
        -- =============
        INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding )
        SELECT CONVERT(TEXT, padding)
        FROM dbo.TestCHAR
        ORDER BY id ;
        -- ==========
        -- Space used
        -- ==========
        --
        EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR';
        EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX';
        EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT';
        ;
        CHECKPOINT ;

    That takes around 15 seconds to run, and shows the space allocated to each table in its output.

    To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table. The basic shape of the test query is the same for each of the three test tables:

        SELECT TOP (150) T.id, T.padding
        FROM dbo.Test AS T
        ORDER BY NEWID()
        OPTION (MAXDOP 1) ;

    Test 1 – CHAR(3999)

    Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results. This seems slow, considering that the table only has 50,000 rows.
    Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time. Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially.

    Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring. The script below monitors STATISTICS IO output and the amount of tempdb used by the test query. We will also run a Profiler trace to capture any warnings generated during query execution.

        DECLARE @read BIGINT, @write BIGINT ;
        SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written)
        FROM tempdb.sys.database_files AS DBF
        JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id
        WHERE DBF.type_desc = 'ROWS' ;
        SET STATISTICS IO ON ;
        SELECT TOP (150) TC.id, TC.padding
        FROM dbo.TestCHAR AS TC
        ORDER BY NEWID()
        OPTION (MAXDOP 1) ;
        SET STATISTICS IO OFF ;
        SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024.,
               tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024.,
               internal_use_MB =
               (
                   SELECT internal_objects_alloc_page_count / 128.0
                   FROM sys.dm_db_task_space_usage
                   WHERE session_id = @@SPID
               )
        FROM tempdb.sys.database_files AS DBF
        JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id
        WHERE DBF.type_desc = 'ROWS' ;

    Let’s take a closer look at the statistics and query plan generated from this. Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB. The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row. With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB.
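
    The two size estimates quoted above can be roughly reproduced by hand. The sketch below is back-of-the-envelope arithmetic, not the optimizer's own formula: it assumes 4 bytes for the INTEGER id, 3,999 bytes for the CHAR(3999) padding, 16 bytes for the GUID, and it ignores any per-row overhead that SQL Server may include in its estimates.

```python
ROWS = 50_000
ID_BYTES = 4          # INTEGER id column
PADDING_BYTES = 3999  # CHAR(3999) padding column
GUID_BYTES = 16       # uniqueidentifier added by the Compute Scalar
MIB = 1024 * 1024

# Data leaving the Clustered Index Scan: id + padding for every row.
scan_mb = ROWS * (ID_BYTES + PADDING_BYTES) / MIB

# Data arriving at the Sort: the same rows plus one GUID each.
sort_input_mb = ROWS * (ID_BYTES + PADDING_BYTES + GUID_BYTES) / MIB
```

    Under these assumptions the scan output comes to just under 191MB and the sort input to just under 192MB, which lines up with the estimates described in the text. The extra GUID column adds less than 1MB in total.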
Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  
More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. 
MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  
By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data. 
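For illustration, the 'text in row' option is set through the same sp_tableoption procedure. The sketch below is not part of the original test rig; it uses the dbo.TestTEXT table purely as an example target, and is best tried on a scratch copy so the later test results are not disturbed.

```sql
-- Illustrative sketch only; not part of the original test scripts.
-- Allow TEXT values of up to 256 bytes to be stored in-row
-- (the limit can be 24 through 7000 bytes; 'ON' uses the default of 256).
EXECUTE sys.sp_tableoption
    @TableNamePattern = N'dbo.TestTEXT',
    @OptionName = 'text in row',
    @OptionValue = '256';

-- Inspect both storage options for the test tables.
SELECT
    T.name,
    T.text_in_row_limit,            -- 0 means 'text in row' is off
    T.large_value_types_out_of_row  -- 1 means MAX data is stored off-row
FROM sys.tables AS T
WHERE T.[schema_id] = SCHEMA_ID(N'dbo')
  AND T.name LIKE N'Test%';
```

With 3999-character values and a 256-byte limit, the padding data would still be too large to move in-row, so this setting mainly matters for short TEXT values.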
The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. 
/ 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; MAXOOR Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  
245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  
The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb.  The small remaining inefficiency is in reading the id column values from the clustered primary key index.  As a clustered index, it contains all the in-row data at its leaf.  The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead.  The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure.  This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. 
logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only.  This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data.  The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings.  It isn’t about the internal differences between TEXT and MAX data types either.  It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.  
No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance.  Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows.  5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is!  If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb.  Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough.  Tuning for logical reads and adding covering indexes is not enough.  If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on.  The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008.  They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator.  Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

    Read the article

  • MAXDOP in SQL Azure

    - by Herve Roggero
    In my search for a better understanding of the scalability options of SQL Azure, I stumbled on an interesting aspect: query hints in SQL Azure. More specifically, the MAXDOP hint. A few years ago I did a lot of analysis on this query hint (see article on SQL Server Central:  http://www.sqlservercentral.com/articles/Configuring/managingmaxdegreeofparallelism/1029/).  Here is a quick synopsis of MAXDOP: it is a query hint you use when issuing a SQL statement, and it gives you control over how many processors SQL Server will use to execute the query. For complex queries with lots of I/O requirements, more CPUs can mean faster parallel searches. However, the impact can be drastic on other running threads/processes. If your query takes all available processors at 100% for 5 minutes... guess what... nothing else works. The bottom line is that more is not always better. The use of MAXDOP is more art than science... and a whole lot of testing; it depends on two things: the underlying hardware architecture and the application design. So there isn't a magic number that will work for everyone... except 1... :) Let me explain. The rules of engagement are different. SQL Azure is about sharing. Yep... you are forced to be nice to your neighbors.  To achieve this goal SQL Azure sets the MAXDOP to 1 by default, and ignores the use of the MAXDOP hint altogether. That means that all your queries will use one and only one processor.  It really isn't such a bad thing, however. Keep in mind that in some of the largest SQL Server implementations MAXDOP is usually also set to 1. It is a well-known configuration setting for large-scale implementations. 
The reason is precisely to prevent rogue statements (like a SELECT * FROM HISTORY, or a report that should have been running on a different server in the first place) from bringing down your systems, and to avoid the overhead generated by executing too many parallel queries, which could cause internal memory management nightmares for the host operating system. In summary, forcing the MAXDOP to 1 in SQL Azure makes sense; it ensures that your database will continue to function normally even if one of the other tenants on the same server is running massive queries that would otherwise bring you down. Last but not least, when you test your database code for performance on-premises, make sure to set the DOP to 1 on your SQL Server databases to simulate SQL Azure conditions.
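
    To simulate those conditions on an on-premises SQL Server, one option is sketched below. It shows the per-query hint and the instance-wide setting; dbo.History is a hypothetical table name, and the instance-wide change affects every query on the server, so it is probably best kept to a test instance.

```sql
-- Per-query: restrict a single statement to serial execution.
-- dbo.History is a hypothetical table used only for illustration.
SELECT COUNT_BIG(*)
FROM dbo.History
OPTION (MAXDOP 1);

-- Instance-wide (on-premises SQL Server): cap every query at DOP 1.
EXECUTE sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXECUTE sys.sp_configure 'max degree of parallelism', 1;
RECONFIGURE;
```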

    Read the article

  • Fetching Data from Multiple Tables using Joins

    Applying normalization to relational databases tends to promote better accuracy of queries, but it also leads to queries that take a little more work to develop, as the data may be spread amongst several tables. In today's article, we'll learn how to fetch data from multiple tables by using joins.

    Read the article

  • SSAS DMVs: useful links

    - by Davide Mauri
    From time to time I need to extract metadata from the Analysis Services DMVs in order to quickly get an overview of the entire situation and/or drill down to the detail level. As a memo, here are the links I use most when I need documentation on SSAS Objects Data DMVs: SSAS: Using DMV Queries to get Cube Metadata http://bennyaustin.wordpress.com/2011/03/01/ssas-dmv-queries-cube-metadata/ SSAS DMV (Dynamic Management View) http://dwbi1.wordpress.com/2010/01/01/ssas-dmv-dynamic-management-view/ Use Dynamic Management Views (DMVs) to Monitor Analysis Services http://msdn.microsoft.com/en-us/library/hh230820.aspx

    Read the article

  • Is inline SQL still classed as bad practice now that we have Micro ORMs?

    - by Grofit
    This is a bit of an open-ended question, but I wanted some opinions. I grew up in a world where inline SQL scripts were the norm; then we were all made very aware of SQL injection issues, and of how fragile the SQL was when doing string manipulation all over the place. Then came the dawn of the ORM, where you explained the query to the ORM and let it generate its own SQL, which in a lot of cases was not optimal but was safe and easy. Another good thing about ORMs or database abstraction layers was that the SQL was generated with its database engine in mind, so I could use Hibernate/NHibernate with MSSQL or MySQL and my code never changed; it was just a configuration detail. Now fast forward to the current day, where Micro ORMs seem to be winning over more developers, and I am wondering why we have seemingly taken a U-turn on the whole inline SQL subject. I must admit I do like the idea of no ORM config files and being able to write my query in a more optimal manner, but it feels like I am opening myself back up to old vulnerabilities such as SQL injection. I am also tying myself to one database engine, so if I want my software to support multiple database engines I would need to do some more string hackery, which then starts to make code unreadable and more fragile. (Before someone mentions it: I know you can use parameter-based arguments with most Micro ORMs, which offers protection from SQL injection in most cases.) So what are people's opinions on this sort of thing? I am using Dapper as my Micro ORM and NHibernate as my regular ORM in this scenario; however, most in each field are quite similar. What I term inline SQL is SQL strings within source code. 
There used to be design debates over SQL strings in source code detracting from the fundamental intent of the logic, which is why statically typed LINQ-style queries became so popular: it's still just one language, whereas with, let's say, C# and SQL in one page you now have two languages intermingled in your raw source code. Just to clarify, SQL injection is just one of the known issues with using SQL strings. I already mentioned that you can stop this from happening with parameter-based queries; however, I highlight other issues with having SQL queries ingrained in your source code, such as the lack of DB vendor abstraction and the loss of any compile-time error checking on string-based queries. These are all issues which we managed to sidestep with the dawn of ORMs and their higher-level querying functionality, such as HQL or LINQ (not all of the issues, but most of them). So I am less focused on the individual highlighted issues and more on the bigger picture: is it now becoming more acceptable to have SQL strings directly in your source code again, as most Micro ORMs use this mechanism? Here is a similar question which has a few different viewpoints, although it is more about inline SQL without the Micro ORM context: http://stackoverflow.com/questions/5303746/is-inline-sql-hard-coding
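
    To make the parameter-based point concrete at the database level, here is a minimal T-SQL sketch contrasting string concatenation with a parameterized call through sp_executesql. The dbo.Users table and its columns are hypothetical, and this illustrates the general idea rather than what Dapper or any particular Micro ORM sends over the wire.

```sql
-- dbo.Users and its columns are hypothetical names used only for illustration.
DECLARE @name nvarchar(50) = N'O''Brien';

-- Fragile: the value becomes part of the SQL text, so an embedded quote
-- (or a deliberately crafted value) changes the statement itself.
DECLARE @unsafe nvarchar(max) =
    N'SELECT id, name FROM dbo.Users WHERE name = N''' + @name + N''';';
-- EXECUTE (@unsafe);  -- left commented out on purpose

-- Parameterized: the statement text is fixed and the value travels separately.
EXECUTE sys.sp_executesql
    N'SELECT id, name FROM dbo.Users WHERE name = @name;',
    N'@name nvarchar(50)',
    @name = @name;
```

    The parameterized form addresses injection and quoting problems, but not the other concerns raised above, such as vendor abstraction or compile-time checking.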

    Read the article

  • Designing Mobile SMS text advertising system

    - by Ramraj Edagutti
    Currently, I am working on a product where we have an SMS text advertising system. Using it, we set up advertising campaigns for clients, and these campaigns are later sent to the end users. This is very similar to Google AdWords, but targeted at mobile users via SMS. Just to give an overview of the system: Each campaign is mapped to an advertiser. A campaign has a start date and an end date. A campaign has filter condition(s) or a query to select the target user base from our database (to whom we send campaigns). The target user base can be fixed, e.g. send the campaign to 10000 users. The target user base can also be dynamic, based on a query condition, e.g. send the campaign to users who are active and from a particular state, district, town etc. (this way the user base keeps changing on a daily basis). A campaign can have multiple campaign messages. Each campaign message has a start date and an end date. Each campaign message can have multiple message texts for different locales, e.g. English, Hindi, Telugu etc. After creating an advertisement campaign, we run a nightly job to provision the target user base for that particular campaign in a separate table; another daily job runs in the morning, checks the provisioned table for campaigns and targeted users, and sends the campaign to users via SMS. The problem is that the current UI for creating advertising campaigns is designed in a very technical manner; a normal user, business owner or client cannot use it to create a campaign. Below are the reasons why the UI is so technical: The filter condition(s) or query input field takes user ids, mobile numbers or SQL queries. Almost every time, we use big SQL queries, so we end up storing SQL queries in a database for a campaign, and later use the stored query to fetch the targeted user base. For scheduling these campaigns, we have an input field on the UI which takes Quartz cron expression(s) (e.g. send campaign on "0 0 9 1-10 MAR ? 2012"), again very technical in nature. A normal user or business owner cannot use the UI to create campaigns for the reasons mentioned above. Currently, we (the developers) help clients to set up/create campaigns ourselves.
    We are trying to redesign the UI to make it more user friendly, so that any user can go to the UI and create an advertisement campaign by himself. I am thinking of redesigning the current UI along the lines of the Google AdWords interface, especially for selecting target users based on geography such as country, state, city etc. I also need to select users based on user subscription(s), which might make the system even more complex. For campaign scheduling, I am thinking of using weekdays with hours. For example, I will show Monday to Sunday on the UI, and the user can select the from hours, to hours etc. Any better ideas or suggestions on how to design the UI in a very user-friendly manner, and what design should be followed in the server-side code (we write backend code in Java/JPA/Spring/Quartz)? I am also looking for ideas or design patterns on how to build SQL queries (using JPA/Hibernate) programmatically on the server side, based on various conditions such as country, state, town, village, and user subscriptions.

    Read the article

  • Proper Method name for XML builder

    - by Wesley
    I think this is the right stack for this. I have a helper class which builds CAML queries (SharePoint XML for getting list items from SQL) There is one method that is flexibly used to build the queries that get all related votes and comments for a social item. I don't want to call it BuildVoteorCommentXML or something long winded like that. Is there a good naming convention for getting all Join/Foreign Key objects from a core object?

    Read the article

  • Google Analytics show zero for "Search Engine Optimizations" graph

    - by Saeed Neamati
    In the new Google Analytics design, there is an area related to the queries and impressions for your site. You can get there by following Traffic Sources > Search Engine Optimization > Queries. However, it now shows zero for the "Site Usage" graph in the top section, while other areas of Google Analytics definitely show that the site has visitors and has been used. No matter how much I search, I can't find the source of the problem. Does anyone know where the problem might be?

    Read the article

  • Is MongoDB a good choice or not for my application?

    - by shubham
    I have a reporting application which stores reports in XML format as received from the source (the XML schema is not defined; it can be any format), and those reports contain some keys and values. For example, jobid and setid are keys for one type of report, and userid and groupId for another type. The keys that can be referenced from a document are determined by the namespaces used in the XML doc, and are stored on the basis of those namespaces. E.g. if a tag in an XML fragment uses namespace="myspace1", then I have keys A and B for myspace1 stored in another table. The application fetches those keys from that table for this namespace, looks for their values in the XML doc and stores them in another table along with a pointer to the XML document (the id of a record storing the complete XML document in a cell). Use cases: When the user queries for a key and value, I return the document or set of documents having those key/value pairs. When the user queries for a certain key and provides the name of a (pre-stored) XSLT, I fetch the set of documents fulfilling that criterion and convert the XML to HTML with the specified XSLT. When the user asks for a particular fragment of a doc, I can fetch a subset of a particular document as well. When the user queries for the top x values of a certain key, I return the set of documents having the top x values of that key. I am using a DB2 database for its support of XML along with relational capabilities. That makes it easier for me to run XPath expressions, fetch the values of keys, and aggregate a set of documents fulfilling a criterion, all on the database side. Problems: DB2 stores XML docs of up to 2GB in size. Retrieval is very slow. If something involves many documents, it takes significant time for things to show up in the browser, and the user has to wait. Can MongoDB help in this case, as it is document oriented? 
    Can I do XML-related XPath queries and document transformations on the DB side? Or is it OK to use both in such a case?

    Read the article

  • Is there any free host which supports php and mySQL in utf-8? [closed]

    - by Maria Konnou
    Possible Duplicate: How to find web hosting that meets my requirements? Is there any free host which supports php and mySQL queries in utf-8? I've already tried to use x10hosting and 000webhosting, but they don't support utf8 mysql queries (got mojibake). The default encoding of mysql in both sites is latin-1, and you're not able to change that. Is there any other free host that fully supports utf-8?

    Read the article

  • Profiling Database Activity in the Entity Framework

    It’s important to profile your database queries to see what happens in response to Entity Framework queries and other data access activities, says Julie Lerman, who gives you the details on several profiling options to improve your coding.

    Read the article

  • Improving 2D Range Query Performance in SQL Server

    When using the BETWEEN operator on multiple columns, you are likely using a 2D range query. Such queries perform very poorly in SQL Server. This article examines rewriting these queries for improved performance.

    Read the article

  • Read Committed Snapshot Isolation– Two Considerations

    - by GavinPayneUK
    The Read Committed Snapshot database option in SQL Server, known perhaps more accurately as Read Committed Snapshot Isolation or RCSI, can be enabled to help prevent readers from blocking writers and writers from blocking readers. However, enabling it can cause two issues with the tempdb database which are often overlooked. One can slow down queries; the other can cause queries to fail. Overview of RCSI Enabling the option changes the behaviour of the default SQL Server isolation level, read...(read more)

    Read the article

  • Reporting what's not there

    It's easy to write queries that show the data in a database that matches a criterion. However, if no data in the database matches the criterion, reporting becomes more difficult. This article examines two different scenarios where it's necessary to create data in order to be able to report zero values in queries.
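
    A minimal sketch of the general idea, not the article's own code: the rows that "aren't there" are generated first (here, a list of report days) and then left-joined to the real data, so days with no rows report zero. The `sales` table, its columns, the `daily_totals` helper, and the dates are all hypothetical, and SQLite via Python stands in for whatever database the article uses.

```python
import sqlite3

def daily_totals(conn, days):
    """Return (day, total) for every requested day, with 0 for days that have no rows."""
    # Create the "missing" data: one row per day we want in the report.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS report_days (day TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM report_days")
    conn.executemany("INSERT INTO report_days (day) VALUES (?)", [(d,) for d in days])
    # LEFT JOIN keeps every generated day; COALESCE turns the NULL sum into 0.
    return conn.execute(
        """
        SELECT d.day, COALESCE(SUM(s.amount), 0) AS total
        FROM report_days AS d
        LEFT JOIN sales AS s ON s.day = d.day
        GROUP BY d.day
        ORDER BY d.day
        """
    ).fetchall()

# Hypothetical sample data: nothing was sold on 2024-01-02.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 5), ("2024-01-03", 7), ("2024-01-03", 1)],
)
result = daily_totals(conn, ["2024-01-01", "2024-01-02", "2024-01-03"])
print(result)
```

    An inner join here would silently drop the day without sales, which is why the generated rows sit on the left side of the join.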

    Read the article

  • Should I have a separate method for Update(), Insert(), etc., or have a generic Query() that would be able to handle all of these?

    - by Prayos
    I'm currently trying to write a class library for a connection to a database. Looking over it, there are several different types of queries: Select From, Update, Insert, etc. My question is, what is the best practice for writing these queries in a C# application? Should I have a separate method for each of them (i.e. Update(), Insert()), or have a generic Query() that would be able to handle all of these? Thanks for any and all help!
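
    As a rough illustration of the two designs being contrasted (a sketch only, not an answer from the thread): the example below combines them, with one generic parameterized `query()` and thin named helpers that delegate to it. Python with sqlite3 stands in for the asker's C# library, and the `Database` class, its method names, and the `users` table are hypothetical.

```python
import sqlite3

class Database:
    """Hypothetical wrapper: a generic query() plus thin, named helpers built on it."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def query(self, sql, params=()):
        # Generic entry point: values always travel as parameters,
        # never through string formatting.
        with self.conn:  # commits on success, rolls back on error
            return self.conn.execute(sql, params).fetchall()

    def insert_user(self, name):
        # Specific helper: callers never see the SQL text.
        self.query("INSERT INTO users (name) VALUES (?)", (name,))

    def rename_user(self, old, new):
        self.query("UPDATE users SET name = ? WHERE name = ?", (new, old))

db = Database()
db.query("CREATE TABLE users (name TEXT)")
db.insert_user("ann")
db.rename_user("ann", "bob")
names = db.query("SELECT name FROM users")
print(names)
```

    Under this assumption the named helpers keep call sites readable, while the single generic method keeps parameter handling in one place.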

    Read the article
