Search Results

Search found 25852 results on 1035 pages for 'linq query syntax'.

Page 144/1035 | < Previous Page | 140 141 142 143 144 145 146 147 148 149 150 151  | Next Page >

  • SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

    - by Pinal Dave
    [Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, the understanding is absolutely correct, but the implementation of the same is not as easy as it seems. Similarly most of the people think lookup component is just component which does look up for additional information and does not pay much attention to it. Due to the same reason they do not pay attention to the same and eventually get very bad performance. Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to write a good lookup component with Cache Mode. In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion.  The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values.  Properly configured, it is reliable and reasonably fast. Among the many settings available on the lookup component, one of the most critical is the cache mode.  This selection will determine whether and how the distinct lookup values are cached during package execution.  It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results. Full Cache The full cache mode setting is the default cache mode selection in the SSIS lookup transformation.  Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location.  As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. The most commonly used cache mode is the full cache setting, and for good reason.  The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well.  Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution. There are a few potential gotchas to be aware of when using full cache mode.  First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data.  If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data.  Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation.  This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may be case sensitive, depending on collation).  There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function in the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation.  Again, neither of these present a reason to avoid full cache mode, but should be used to determine whether full cache mode should be used in a given situation. Full cache mode is ideally useful when one or all of the following conditions exist: The size of the reference data set is small to moderately sized The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data Partial Cache When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow.  Initially, each distinct value will be retrieved individually from the specified source, and then cached.  To be clear, this is a row-by-row lookup for each distinct key value(s). This is a less frequently used cache setting because it addresses a narrower set of scenarios.  Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set.  If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)).  Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode. Using partial cache mode is ideally suited for the conditions below: The size of the data in the pipeline (more specifically, the number of distinct key column) is relatively small The size of the lookup data is too large to effectively store in cache The lookup source is well indexed to allow for fast retrieval of row-by-row values No Cache As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS.  As a result, every single row in the pipeline data set will require a query against the lookup source.  Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused.  In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful. As such, it’s critical to know your data before choosing this option.  Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline. I would recommend considering the no cache mode only when all of the below conditions are true: The reference data set is too large to reasonably be loaded into SSIS memory The pipeline data set is small and is not expected to grow There are expected to be very few or no duplicates of the key values(s) in the pipeline data set (i.e., there would be no benefit from caching these values) Conclusion The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow.  Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution.  Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages. If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS

    Read the article

  • Should experienced programmers know database queries?

    - by Shamim Hafiz
    There are so many programmers out there who are also an expert at Query writing and Database design. Should this be a core requirement to be an expert programmer or software engineer? Though there are lots of similarities in the way queries and codes are developed, my personal opinion is, Queries seem to have a different Structure than Code and it can be tough to Master both simultaneously due to the different approaches.

    Read the article

  • Temporary Tables in Stored Procedures

    - by Paul White
    Ask anyone what the primary advantage of temporary tables over table variables is, and the chances are they will say that temporary tables support statistics and table variables do not. This is true, of course; even the indexes that enforce PRIMARY KEY and UNIQUE constraints on table variables do not have populated statistics associated with them, and it is not possible to manually create statistics or non-constraint indexes on table variables. Intuitively, then, any query that has alternative execution...(read more)

    Read the article

  • SQL – Contest to Get The Date – Win USD 50 Amazon Gift Cards and Cool Gift

    - by Pinal Dave
    If you are a regular reader of this blog – you will find no issue at all in resolving this puzzle. This contest is based on my experience with NuoDB. If you are not familiar with NuoDB, here are few pointers for you. Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB Quick Start with Admin Sections of NuoDB – Manage NuoDB Database Quick Start with Explorer Sections of NuoDB – Query NuoDB Database In today’s contest you have to answer following questions: Q 1: Precision of NOW() What is the precision of the NuoDB’s NOW() function, which returns current date time? Hint: Run following script on NuoDB Console Explorer section: SELECT NOW() AS CurrentTime FROM dual; Here is the image. I have masked the area where the time precision is displayed. Q 2: Executing Date and Time Script When I execute following script - SELECT 'today' AS Today, 'tomorrow' AS Tomorrow, 'yesterday' AS Yesterday FROM dual; I will get the following result:   NOW – What will be the answer when we execute following script? and WHY? SELECT CAST('today' AS DATE) AS Today, CAST('tomorrow' AS DATE) AS Tomorrow, CAST('yesterday'AS DATE) AS Yesterday FROM dual; HINT: Install NuoDB (it takes 90 seconds). Prizes: 2 Amazon Gifts 2 Limited Edition Hoodies (US resident only)   Rules: Please leave an answer in the comments section below. You must answer both the questions together in a single comment. US resident who wants to qualify to win NuoDB apparel please mention your country in the comment. You can resubmit your answer multiple times, the latest entry will be considered valid. Last day to participate in the puzzle is June 24, 2013. All valid answer will be kept hidden till June 24, 2013. The winner will be announced on June 25, 2013. Two Winners will get USD 25 worth Amazon Gift Card. (Total Value = 25 x 2 = 50 USD) The winner will be selected using a random algorithm from all the valid answers. Anybody with a valid email address can take part in the contest. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • Oracle Index Skip Scan

    - by jchang
    There is a feature, called index skip scan that has been in Oracle since version 9i. When I across this, it seemed like a very clever trick, but not a critical capability. More recently, I have been advocating DW on SSD in approrpiate situations, and I am thinking this is now a valuable feature in keeping the number of nonclustered indexes to a minimum. Briefly, suppose we have an index with key columns: Col1 , Col2 , in that order. Obviously, a query with a search argument (SARG) on Col1 can use...(read more)

    Read the article

  • "t-sql" tool for csv files?

    - by adolf garlic
    Is there a tool that allows to query files based on the following criteria, and change them? creation date filenames (e.g. filetypeA_YYYYMMDD_data2.blah) file content For the example, the workflow could be the following one: Get all files created between date x and date y, where the filename contains z and the file itself contains value x under column zzz. Update it to be a different value/another column.

    Read the article

  • How to batch retrieve documents with mongoDB?

    - by edude05
    Hello everyone, I have an application that queries data from a mongoDB using the mongoDB C# driver something like this: public void main() { foreach (int i in listOfKey) { list.add(getObjectfromDB(i); } } public myObject getObjFromDb(int primaryKey) { document query = new document(); query["primKey"] = primaryKey; document result= mongo["myDatabase"]["myCollection"].findOne(query); return parseObject(result); } On my local (development) machine to get 100 object this way takes less than a second. However, I recently moved the database to a server on the internet, and this query takes about 30 seconds to execute for the same number of object. Furthermore, looking at the mongoDB log, it seems to open about 8-10 connections to the DB to perform this query. So what I'd like to do is have the query the database for an array of primaryKeys and get them all back at once, then do the parsing in a loop afterwards, using one connection if possible. How could I optimize my query to do so? Thanks, --Michael

    Read the article

  • Drawing up a session value within SQL query? PHP and MySQL

    - by Derek
    Hi, on one of my web pages I want my manager user to view all activities assigned to them (personally). In order to do this, I need something like this: $sql = "SELECT * FROM activities WHERE manager = $_SESSION['SESS_FULLNAME']"; Now obviously this syntax is all wrong, but because I am new to this stuff, is there a way I can call up the full name from the user's session within a query? This is so that when I call up the database values to be displayed within the web page, only the activities for the manager who is logged in is displayed. For example, the activities table has a manager column of a full name entry. Any help is much appreciated. Thanks.

    Read the article

  • How To Implement The Query Side Of CQS in DDD?

    - by Laz
    I have implemented the command side of DDD using the domain model and repositories, but how do I implement the query side? Do I create an entirely new domain model for the UI, and where is this kept in the project structure...in the domain layer, the UI layer, etc? Also, what do I use as my querying mechanism, do I create new repositories specifically for the UI domain objects, something other than repositories, or something else?

    Read the article

  • Optimize date query for large child tables: GiST or GIN?

    - by Dave Jarvis
    Problem 72 child tables, each having a year index and a station index, are defined as follows: CREATE TABLE climate.measurement_12_013 ( -- Inherited from table climate.measurement_12_013: id bigint NOT NULL DEFAULT nextval('climate.measurement_id_seq'::regclass), -- Inherited from table climate.measurement_12_013: station_id integer NOT NULL, -- Inherited from table climate.measurement_12_013: taken date NOT NULL, -- Inherited from table climate.measurement_12_013: amount numeric(8,2) NOT NULL, -- Inherited from table climate.measurement_12_013: category_id smallint NOT NULL, -- Inherited from table climate.measurement_12_013: flag character varying(1) NOT NULL DEFAULT ' '::character varying, CONSTRAINT measurement_12_013_category_id_check CHECK (category_id = 7), CONSTRAINT measurement_12_013_taken_check CHECK (date_part('month'::text, taken)::integer = 12) ) INHERITS (climate.measurement) CREATE INDEX measurement_12_013_s_idx ON climate.measurement_12_013 USING btree (station_id); CREATE INDEX measurement_12_013_y_idx ON climate.measurement_12_013 USING btree (date_part('year'::text, taken)); (Foreign key constraints to be added later.) The following query runs abysmally slow due to a full table scan: SELECT count(1) AS measurements, avg(m.amount) AS amount FROM climate.measurement m WHERE m.station_id IN ( SELECT s.id FROM climate.station s, climate.city c WHERE -- For one city ... -- c.id = 5182 AND -- Where stations are within an elevation range ... -- s.elevation BETWEEN 0 AND 3000 AND 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 ) AND -- -- Begin extracting the data from the database. -- -- The data before 1900 is shaky; insufficient after 2009. -- extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND -- Whittled down by category ... -- m.category_id = 1 AND m.taken BETWEEN -- Start date. (extract( YEAR FROM m.taken )||'-01-01')::date AND -- End date. Calculated by checking to see if the end date wraps -- into the next year. If it does, then add 1 to the current year. -- (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date GROUP BY extract( YEAR FROM m.taken ) The sluggishness comes from this part of the query: m.taken BETWEEN /* Start date. */ (extract( YEAR FROM m.taken )||'-01-01')::date AND /* End date. Calculated by checking to see if the end date wraps into the next year. If it does, then add 1 to the current year. */ (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date The HashAggregate from the plan shows a cost of 10006220141.11, which is, I suspect, on the astronomically huge side. There is a full table scan on the measurement table (itself having neither data nor indexes) being performed. The table aggregates 237 million rows from its child tables. Question What is the proper way to index the dates to avoid full table scans? Options I have considered: GIN GiST Rewrite the WHERE clause Separate year_taken, month_taken, and day_taken columns to the tables What are your thoughts? Thank you!

    Read the article

  • How to construct this query? (Ordering by COUNT() and joining with users table)

    - by Andrew
    users table: id-user-other columns scores table: id-user_id-score-other columns They're are more than one rows for each user, but there's only two scores you can have. (0 or 1, == win or loss). So I want to output all the users ordered by the number of wins, and all the users ordered by the numbers of losses. I know how to do this by looping through each user, but I was wondering how to do it with one query. Any help is appreciated!

    Read the article

  • How to handle when SSRS does not automatically update fields based on database query?

    - by badpanda
    So I am trying to change the number of fields in my dataset in SSRS and the refresh button is not picking up the added field from the SQL server. The query is definitely returning the correct data, as I have double checked in the server engine itself. Also, I have tried manually adding the field using the SSRS menu, but as soon as I execute it disappears. Any suggestions or similar experiences?

    Read the article

  • MySQL: How can fetch SUM() of all fields in one Query?

    - by takpar
    Hi, I just want somthing like this: select SUM(*) from `mytable` group by `year` any suggestion? (I am using Zend Framework; if you have a suggestion using ZF rather than pure query would be great!) Update: I have a mass of columns in table and i do not want to write their name down one by one. No Idea??

    Read the article

  • How can an improvement to the query cache be tracked?

    - by Bill Paetzke
    I am parameterizing my web app's ad hoc sql. As a result, I expect the query plan cache to reduce in size and have a higher hit ratio. Perhaps even other important metrics will be improved. Could I use perfmon to track this? If so, what counters should I use? If not perfmon, how could I report on the impact of this change?

    Read the article

< Previous Page | 140 141 142 143 144 145 146 147 148 149 150 151  | Next Page >