Search Results

Search found 7116 results on 285 pages for 'nested queries'.


Understanding configuration for parallel calling in web app (IIS + MS SQL)

- by mmcteam.com.ua

We have an ASP.NET MVC application running on IIS 7.5 with SQL Server 2008 R2. We have to load a lot of aggregate counters on each page. We decided to use AJAX: JavaScript makes one call for each counter or group of counters, and each call returns a JSON result. This solved the problem of the user waiting for the page to load: the page loads fast, and the user waits for the counters while already seeing the rest of the page content. We assumed that calls made from JavaScript would run our queries asynchronously, but we noticed that they do not. All our JavaScript calls fire immediately, but the actions they invoke are queued. If we use the Async Controller feature instead, all counters are calculated simultaneously, but the user has to wait for the slowest counter before the page loads. The question: we want to understand what happens when we use AJAX and call two or more actions simultaneously, and how we can configure this. (Each action also makes some queries to SQL Server.)

Username correct, password incorrect?

- by jonnnnnnnnnie

In a login system, how can you tell whether the user has entered the password incorrectly? Do you perform two SQL queries, one to find the username and then one to find the username and matching (salted and hashed) password? I'm asking because, if the user entered the password incorrectly, I want to update my failed_login_attempts column. Wouldn't performing two queries increase overhead? If you ran a query like this, how would you tell whether the password was wrong or the username doesn't exist?

    SELECT * FROM author WHERE username = '$username' AND password = '$password' LIMIT 1

(NB: I'm keeping it simple; the real one will use a hash and salt and will sanitize input.) Something like this:

    $user = perform_Query(); // get username and password?
    if ($user['username'] == $username && $user['password'] == $password) {
        return $user;
    } elseif ($user['username'] == $username && $user['password'] !== $password) {
        // here the password doesn't match
        // update failed_login_attempts += 1
    }
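One way to get both answers from a single lookup is to select the row by username only and compare the password hash in application code. The sketch below illustrates that idea in Python with an in-memory SQLite database rather than the PHP/MySQL setup in the question. The salt and password_hash columns, the PBKDF2 parameters, and the helper names are assumptions made for illustration, not part of the original post.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 parameters are illustrative; tune them for a real system.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical layout: the question only names `author`, `username`,
    # `password` and `failed_login_attempts`.
    conn.execute(
        "CREATE TABLE author ("
        " username TEXT PRIMARY KEY,"
        " salt BLOB NOT NULL,"
        " password_hash BLOB NOT NULL,"
        " failed_login_attempts INTEGER NOT NULL DEFAULT 0)"
    )


def add_user(conn: sqlite3.Connection, username: str, password: str) -> None:
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO author (username, salt, password_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )


def login(conn: sqlite3.Connection, username: str, password: str) -> str:
    # One SELECT by username tells us whether the account exists.
    row = conn.execute(
        "SELECT salt, password_hash FROM author WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return "unknown_user"
    salt, stored_hash = row
    # Constant-time comparison of the stored and the recomputed hash.
    if hmac.compare_digest(stored_hash, hash_password(password, salt)):
        return "ok"
    # The UPDATE only runs on the failure path.
    conn.execute(
        "UPDATE author SET failed_login_attempts = failed_login_attempts + 1"
        " WHERE username = ?",
        (username,),
    )
    return "wrong_password"
```

With this shape, a successful login costs one query, and the second statement is issued only when the counter actually needs to change.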

Dump Hibernate activity to sql script file

- by zeven

Hi, I'm trying to log Hibernate activity (only DML operations) to an SQL script file. My goal is to be able to reconstruct the database from a given starting point to the current state by executing the generated script. I can get the SQL queries from the log4j logs, but they contain more information than the raw SQL, so I would need to parse them and extract only the useful statements. So I'm looking for a programmatic way, maybe by listening to the persist/merge/delete operations and accessing the Hibernate-generated SQL statements. I don't like to reinvent the wheel, so if anybody knows a way of doing this I would appreciate it very much. Thanks in advance

Mysql query problem

- by Lost_in_code

Below is a sample table:

fruits
+-------+---------+
| id    | type    |
+-------+---------+
| 1     | apple   |
| 2     | orange  |
| 3     | banana  |
| 4     | apple   |
| 5     | apple   |
| 6     | apple   |
| 7     | orange  |
| 8     | apple   |
| 9     | apple   |
| 10    | banana  |
+-------+---------+

Following are the two queries of interest:

SELECT * FROM fruits WHERE type='apple' LIMIT 2;
SELECT COUNT(*) AS total FROM fruits WHERE type='apple'; // output 6

I want to combine these two queries so that the result looks like this:

+-------+---------+---------+
| id    | type    | total   |
+-------+---------+---------+
| 1     | apple   | 6       |
| 4     | apple   | 6       |
+-------+---------+---------+

The output has to be limited to 2 records, but it should also contain the total number of records of the type apple. How can this be done with 1 query?
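One possible approach is to join the limited row set against a one-row derived table that holds the count. The sketch below runs that query against an in-memory SQLite copy of the sample table. SQLite stands in for MySQL here, and the ORDER BY is an addition so that the two returned rows are deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (id INTEGER PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO fruits (id, type) VALUES (?, ?)",
    [
        (1, "apple"), (2, "orange"), (3, "banana"), (4, "apple"),
        (5, "apple"), (6, "apple"), (7, "orange"), (8, "apple"),
        (9, "apple"), (10, "banana"),
    ],
)

# The derived table `t` has exactly one row, so the cross join simply
# attaches the total to every matching fruit row before LIMIT applies.
QUERY = """
SELECT f.id, f.type, t.total
FROM fruits AS f
CROSS JOIN (SELECT COUNT(*) AS total FROM fruits WHERE type = ?) AS t
WHERE f.type = ?
ORDER BY f.id
LIMIT 2
"""

rows = conn.execute(QUERY, ("apple", "apple")).fetchall()
print(rows)  # [(1, 'apple', 6), (4, 'apple', 6)]
```

The count is computed once in the derived table, not once per returned row.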

How to organize database access code in Android project?

- by Mladen Jablanovic

I have created a ContentProvider for my main SQLite table, pretty much following the NotePad example from the SDK (although I am not sure whether I will ever expose my data to other apps). However, I need to create lots of other, non-trivial queries on that table and on other tables and views. A good example would be queries that extract statistics from the base data: averages, totals, etc. So what's the best place for this code in an Android project? How should it be related and connected to the Uri-based data access exposed by a Provider? Any good examples out there?

What do C# Table Adapters actually return?

- by Raven Dreamer

I'm working on an application that manipulates SQL tables in a Windows Forms application. Up until now, I've only been using the pre-generated Fill queries, and self-made update and delete queries (which return nothing). I am interested in storing a single value from a single column (an 'nchar(15)' name), and though I have easily written the SQL code to return that value to the application, I have no idea what it will be returned as. SELECT [Contact First Name] FROM ContactSkillSet WHERE [Contact ID] = @CurrentID Can the result be stored directly as a string? Do I need to cast it? Invoke a ToString method?

Django query - join on the same table

- by dana

I have a mini blog app with a 'timeline'. There I want to display all the posts from all the friends of a user, plus the posts of that user himself. For that, I have to make some kind of 'join' between the results of two queries (queries on the same table), so that the final result is the combination of the posts of the user who owns the account and the posts of all his friends. My query looks like this: blog = New.objects.filter(created_by = following,created_by = request.user) With that ',' I wanted to make a 'join' (I found something like this in a doc), but this method is not correct and I'm getting an error. How else could this 'join' be done? Thanks!
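What the timeline needs is an OR over authors, not a join: posts whose author is either the user or one of the user's friends. Repeating the created_by keyword in a single filter() call is rejected by Python itself as a repeated keyword argument. In Django this kind of condition is usually written with Q objects from django.db.models combined with | or with a created_by__in lookup. The sketch below shows the underlying SQL idea with Python's sqlite3 module so that it runs without a Django project. The posts table and its columns are hypothetical.

```python
import sqlite3


def timeline(conn: sqlite3.Connection, user_id: int, friend_ids: list[int]):
    """Return posts written by the user or by any of the user's friends."""
    author_ids = [user_id, *friend_ids]
    # One "?" placeholder per author id keeps the query parameterized.
    placeholders = ", ".join("?" for _ in author_ids)
    sql = (
        "SELECT id, created_by, body FROM posts "
        f"WHERE created_by IN ({placeholders}) "
        "ORDER BY id DESC"
    )
    return conn.execute(sql, author_ids).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, created_by INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (id, created_by, body) VALUES (?, ?, ?)",
    [
        (1, 1, "my own post"),
        (2, 2, "post by a friend"),
        (3, 4, "post by a stranger"),
        (4, 3, "post by another friend"),
    ],
)

posts = timeline(conn, user_id=1, friend_ids=[2, 3])
```

Because the result is a single query, ordering and paging can be applied to the combined timeline directly.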

wordpress query custom fields and category

- by InnateDev

I have a query that creates a view and then another that queries the view. The results are extremely slow. Here is the code:

    create or replace view $view_table_name as
    select * from wp_2_postmeta
    where post_id IN (
        select ID FROM wp_2_posts wposts
        LEFT JOIN wp_2_term_relationships ON (wposts.ID = wp_2_term_relationships.object_id)
        LEFT JOIN wp_2_term_taxonomy ON (wp_2_term_relationships.term_taxonomy_id = wp_2_term_taxonomy.term_taxonomy_id)
        WHERE wp_2_term_taxonomy.taxonomy = 'category'
        AND wp_2_term_taxonomy.parent = $cat || wp_2_term_taxonomy.term_id = $cat
        AND wposts.post_status = 'publish'
        AND wposts.post_type = 'post')

The $ variables have been put in as placeholders for this example, which queries the view for the results:

    select distinct(ID) from $view_table_name wposts
    LEFT JOIN wp_2_postmeta wpostmeta ON wposts.ID = wpostmeta.post_id
    WHERE post_status = 'publish'
    AND ID NOT IN (SELECT post_id FROM wp_2_postmeta WHERE meta_key = '$var' && meta_value = '$value1')
    AND ID NOT IN (SELECT post_id FROM wp_2_postmeta WHERE meta_key = '$var' && meta_value = '$value2')
    AND ID NOT IN (SELECT post_id FROM wp_2_postmeta WHERE meta_key = '$var' && meta_value = '$value3')
    AND wpostmeta.meta_key = 'pd_form'
    ORDER BY CASE wpostmeta.meta_value
        WHEN '$value5' THEN 1
        WHEN '$value6' THEN 2
        WHEN '$value7' THEN 3
        WHEN '$value8' THEN 4
        WHEN '$value9' THEN 5
        WHEN '$value10' THEN 6
        WHEN '$value11' THEN 7
        WHEN '$value11' THEN 8
    END

java virtual machine - how does it allocate resources?

- by Will

I am testing the performance of a data streaming system that supports continuous queries. This is how it works: - There is a polling service which sends data to my system. - As data passes into the system, each query is evaluated against a window of the stream at the current time. - The window slides as data passes in. My problem is this: when I add more queries to the system, I would expect the throughput to decrease because the system can't cope with the data rate. However, I actually observe an increase in throughput. I can't understand why this is the case, and I am guessing that it has something to do with the way the JVM allocates CPU, memory, etc. Can anyone shed any light on my problem?

Criteria: search for two different entity classes...

- by RoCMe

Hi! I have a "super entity" SuperEntity and three entities ChildEntity1, ..., ChildEntity3 which extend the superclass. It's easy to search for all entities in the database, i.e. we could use session.createCriteria(SuperEntity.class); It's no problem to search for one specific entity type either: just replace SuperEntity with any of the children to look for entities of that type. But I have a problem when allowing 'multiple choice' for the types. For example, it could be necessary to search all entities of type 1 and 2, but not of type 3. A first idea was to create two independent queries and join the results in a final list, but that would destroy the paging, which uses the offset and limit functionality of the database... Is there a possibility in Criteria to join two different queries into one single result list? Kind regards, RoCMe

Make SQL query more efficient

- by Webnet

I currently have this query, which runs two nearly identical subqueries that pull different data. When I make the values comma-separated, it throws an SQL error saying the subquery can return only one value. Is there anything else I can do to avoid running multiple subqueries?

    SELECT product_id,
      ( SELECT COUNT(listing_id)
        FROM ebay_archive_product_listing_assoc
        WHERE product_id = product_master.product_id) as listing_count,
      sku,
      type_id,
      ( SELECT AVG(ebay_archive_listing.current_price), AVG(ebay_archive_listing.buy_it_now_price)
        FROM ebay_archive_listing
        WHERE id IN ( SELECT listing_id FROM ebay_archive_product_listing_assoc WHERE product_id = product_master.product_id )
        AND ebay_archive_listing.start_time >= '.$startTimestamp.'
        AND ebay_archive_listing.start_time <= '.$endTimestamp.'
        AND ebay_archive_listing.current_price > 0 ) as average_bid_price,
      ( SELECT
        FROM ebay_archive_listing
        WHERE id IN ( SELECT listing_id FROM ebay_archive_product_listing_assoc WHERE product_id = product_master.product_id )
        AND ebay_archive_listing.start_time >= '.$startTimestamp.'
        AND ebay_archive_listing.start_time <= '.$endTimestamp.'
        AND ebay_archive_listing.buy_it_now_price > 0 ) as average_buyout_price
    FROM product_master

I'm aware of the syntax error... I'm selecting 2 separate averages and am wondering if I can do it in any simpler way.
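One way to avoid the repeated subqueries is to join the tables once and use conditional aggregation: AVG ignores NULLs, so a CASE expression that yields NULL for non-positive prices gives each average its own filter. The sketch below demonstrates this with Python's sqlite3 module on a tiny made-up data set. The table and column names follow the question, but the schema details (for example that ebay_archive_listing.id is unique) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_master (product_id INTEGER PRIMARY KEY, sku TEXT, type_id INTEGER);
CREATE TABLE ebay_archive_product_listing_assoc (product_id INTEGER, listing_id INTEGER);
CREATE TABLE ebay_archive_listing (
    id INTEGER PRIMARY KEY, start_time INTEGER,
    current_price REAL, buy_it_now_price REAL);

INSERT INTO product_master VALUES (1, 'SKU-1', 7), (2, 'SKU-2', 7);
INSERT INTO ebay_archive_product_listing_assoc VALUES (1, 10), (1, 11), (1, 12);
INSERT INTO ebay_archive_listing VALUES
    (10, 150, 10.0,  0.0),    -- in range, no buy-it-now price
    (11, 160, 20.0, 50.0),    -- in range, both prices set
    (12, 300, 100.0, 100.0);  -- outside the time range
""")

# The time filter lives in the ON clause so that products and assoc rows
# without an in-range listing are kept (their listing columns become NULL).
QUERY = """
SELECT pm.product_id,
       COUNT(a.listing_id) AS listing_count,
       pm.sku,
       pm.type_id,
       AVG(CASE WHEN l.current_price > 0 THEN l.current_price END) AS average_bid_price,
       AVG(CASE WHEN l.buy_it_now_price > 0 THEN l.buy_it_now_price END) AS average_buyout_price
FROM product_master AS pm
LEFT JOIN ebay_archive_product_listing_assoc AS a
       ON a.product_id = pm.product_id
LEFT JOIN ebay_archive_listing AS l
       ON l.id = a.listing_id
      AND l.start_time >= ? AND l.start_time <= ?
GROUP BY pm.product_id, pm.sku, pm.type_id
ORDER BY pm.product_id
"""

report = conn.execute(QUERY, (100, 200)).fetchall()
```

Each table is read once per product group, and both averages come out of the same pass over the joined rows.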

do any of these nosql db's have GUI explorers?

- by mrblah

Do any of these NoSQL-type databases have GUI explorers where you can run queries, view the "tables" and their attributes, etc.?

MySQL reserves too much RAM

- by Buddy

I have a cheap VPS with 128 MB RAM and 256 MB burst. MySQL starts and reserves about 110 MB, but uses no more than 20 MB of it. My VPS control panel shows that I use 127 MB (I am also running nginx and sphinx). I know that it shows reserved RAM, but when I go over 128 MB, my VPS reboots automatically every 4 hours. So I want to force MySQL to reserve less RAM. How can I do that? I did some tweaks in my.cnf, but they did not help much.

top output:

  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
    1 root     15   0  2156  668  572 S  0.0  0.3  0:00.03 init
11311 root     15   0 11212  356  228 S  0.0  0.1  0:00.00 vzctl
11312 root     18   0  3712 1484 1248 S  0.0  0.6  0:00.01 bash
11347 root     18   0  2284  916  732 R  0.0  0.3  0:00.00 top
13978 root     17  -4  2248  552  344 S  0.0  0.2  0:00.00 udevd
14262 root     15   0  1812  564  472 S  0.0  0.2  0:00.03 syslogd
14293 sphinx   15   0 11816 1172  672 S  0.0  0.4  0:00.07 searchd
14305 root     25   0  7192 1036  636 S  0.0  0.4  0:00.00 sshd
14321 root     25   0  2832  836  668 S  0.0  0.3  0:00.00 xinetd
15389 root     18   0  3708 1300 1132 S  0.0  0.5  0:00.00 mysqld_safe
15441 mysql    15   0  113m  16m 4440 S  0.0  6.4  0:00.15 mysqld
15489 root     21   0 13056 1456  340 S  0.0  0.6  0:00.00 nginx
15490 nginx    18   0 13328 2388  992 S  0.0  0.9  0:00.06 nginx
15507 nginx    25   0 19520 5888 4244 S  0.0  2.2  0:00.00 php-cgi
15508 nginx    18   0 19636 4876 2748 S  0.0  1.9  0:00.12 php-cgi
15509 nginx    15   0 19668 4872 2716 S  0.0  1.9  0:00.11 php-cgi
15518 root     18   0  4492 1116  568 S  0.0  0.4  0:00.01 crond

MySQL tuner:

 >> MySQLTuner 1.0.1 - Major Hayden <[email protected]>
 >> Bug reports, feature requests, and downloads at http://mysqltuner.com/
 >> Run with '--help' for additional options and output filtering
Please enter your MySQL administrative login: root
Please enter your MySQL administrative password:

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.0.77
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in InnoDB tables: 1M (Tables: 1)
[OK] Total fragmented tables: 0

-------- Performance Metrics -------------------------------------------------
[--] Up for: 38m 43s (37 q [0.016 qps], 20 conn, TX: 4M, RX: 3K)
[--] Reads / Writes: 100% / 0%
[--] Total buffers: 28.1M global + 832.0K per thread (100 max threads)
[OK] Maximum possible memory usage: 109.4M (42% of installed RAM)
[OK] Slow queries: 0% (0/37)
[OK] Highest usage of available connections: 1% (1/100)
[OK] Key buffer size / total MyISAM indexes: 128.0K/64.0K
[OK] Query cache efficiency: 42.1% (8 cached / 19 selects)
[OK] Query cache prunes per day: 0
[!!] Temporary tables created on disk: 27% (3 on disk / 11 total)
[!!] Thread cache is disabled
[OK] Table cache hit rate: 57% (8 open / 14 opened)
[OK] Open file limit used: 1% (12/1K)
[OK] Table locks acquired immediately: 100% (22 immediate / 22 locks)
[!!] Connections aborted: 10%
[OK] InnoDB data size / buffer pool: 1.5M/8.0M

-------- Recommendations -----------------------------------------------------
General recommendations:
    MySQL started within last 24 hours - recommendations may be inaccurate
    Enable the slow query log to troubleshoot bad queries
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Set thread_cache_size to 4 as a starting value
    Your applications are not closing MySQL connections properly
Variables to adjust:
    tmp_table_size (> 32M)
    max_heap_table_size (> 16M)
    thread_cache_size (start at 4)

I think that if I do what MySQLTuner says, MySQL will use more RAM.
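The "Maximum possible memory usage" figure in the report appears to be the global buffers plus the per-thread buffers multiplied by the maximum number of threads. The small Python sketch below redoes that arithmetic with the numbers from the output above. The helper name is made up, and the second call is a hypothetical scenario showing how the ceiling would move if the connection limit (the max_connections variable) were lowered to 20; it is not a measured result.

```python
def max_mysql_memory_mb(global_buffers_mb: float,
                        per_thread_kb: float,
                        max_threads: int) -> float:
    """Worst-case estimate: global buffers + per-thread buffers * threads."""
    return global_buffers_mb + (per_thread_kb / 1024.0) * max_threads


# Figures from the report: 28.1M global + 832.0K per thread, 100 max threads.
reported_ceiling = max_mysql_memory_mb(28.1, 832.0, 100)

# Hypothetical: the same buffers with the connection limit lowered to 20.
lowered_ceiling = max_mysql_memory_mb(28.1, 832.0, 20)

print(f"{reported_ceiling:.2f} MB with 100 threads")
print(f"{lowered_ceiling:.2f} MB with 20 threads")
```

With only 1 of 100 connections ever in use according to the report, most of the worst-case figure comes from the per-thread term.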

Using a "white list" for extracting terms for Text Mining, Part 2

- by [email protected]

In my last post, we laid the groundwork for extracting specific tokens from a white list using a CTXRULE index. In this post, we will populate a table with the extracted tokens and produce a case table suitable for clustering with Oracle Data Mining. Our corpus of documents will be stored in a database table that is defined as

    create table documents(id NUMBER, text VARCHAR2(4000));

However, any suitable Oracle Text-accepted data type can be used for the text. We then create a table to contain the extracted tokens. The id column contains the unique identifier (or case id) of the document. The token column contains the extracted token. Note that a given document may have many tokens, so there will be one row per token for a given document.

    create table extracted_tokens (id NUMBER, token VARCHAR2(4000));

The next step is to iterate over the documents, extract the matching tokens using the index, and insert them into our token table. We use the MATCHES function to match the query_string from my_thesaurus_rules against the text.

    DECLARE
      cursor c2 is select id, text from documents;
    BEGIN
      for r_c2 in c2 loop
        insert into extracted_tokens
          select r_c2.id id, main_term token
          from my_thesaurus_rules
          where matches(query_string, r_c2.text)>0;
      end loop;
    END;

Now that we have the tokens, we can compute the term frequency - inverse document frequency (TF-IDF) for each token of each document.
    create table extracted_tokens_tfidf as
      with num_docs as (select count(distinct id) doc_cnt from extracted_tokens),
           tf as (select a.id, a.token, a.token_cnt/b.num_tokens token_freq
                  from (select id, token, count(*) token_cnt from extracted_tokens group by id, token) a,
                       (select id, count(*) num_tokens from extracted_tokens group by id) b
                  where a.id=b.id),
           doc_freq as (select token, count(*) overall_token_cnt from extracted_tokens group by token)
      select tf.id, tf.token, token_freq * ln(doc_cnt/df.overall_token_cnt) tf_idf
      from num_docs, tf, doc_freq df
      where df.token=tf.token;

From the WITH clause, the num_docs query simply counts the number of documents in the corpus. The tf query computes the term (token) frequency by counting the number of times each token appears in a document and dividing that by the number of tokens found in the document. The doc_freq query counts the number of times the token appears overall in the corpus. In the SELECT clause, we compute the tf_idf. Next, we create the nested table required to produce one record per case, where a case corresponds to an individual document. Here, we COLLECT all the tokens for a given document into the nested column extracted_tokens_tfidf_1.

    CREATE TABLE extracted_tokens_tfidf_nt
      NESTED TABLE extracted_tokens_tfidf_1 STORE AS extracted_tokens_tfidf_tab
      AS select id,
                cast(collect(DM_NESTED_NUMERICAL(token,tf_idf)) as DM_NESTED_NUMERICALS) extracted_tokens_tfidf_1
         from extracted_tokens_tfidf
         group by id;

To build the clustering model, we create a settings table and then insert the various settings. Most notable are the number of clusters (20), the use of cosine distance (which is better for text), turning off auto data preparation since the values are ready for mining, the number of iterations (20) to get a better model, and the split criterion of size for clusters that are roughly balanced in the number of cases assigned.
    CREATE TABLE km_settings (setting_name VARCHAR2(30), setting_value VARCHAR2(30));

    BEGIN
      INSERT INTO km_settings (setting_name, setting_value) VALUES (dbms_data_mining.clus_num_clusters, 20);
      INSERT INTO km_settings (setting_name, setting_value) VALUES (dbms_data_mining.kmns_distance, dbms_data_mining.kmns_cosine);
      INSERT INTO km_settings (setting_name, setting_value) VALUES (dbms_data_mining.prep_auto,dbms_data_mining.prep_auto_off);
      INSERT INTO km_settings (setting_name, setting_value) VALUES (dbms_data_mining.kmns_iterations,20);
      INSERT INTO km_settings (setting_name, setting_value) VALUES (dbms_data_mining.kmns_split_criterion,dbms_data_mining.kmns_size);
      COMMIT;
    END;

With this in place, we can now build the clustering model.

    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'TEXT_CLUSTERING_MODEL',
        mining_function     => dbms_data_mining.clustering,
        data_table_name     => 'extracted_tokens_tfidf_nt',
        case_id_column_name => 'id',
        settings_table_name => 'km_settings');
    END;

To generate cluster names from this model, check out my earlier post on that topic.
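To make the TF-IDF step of this post easier to follow, here is a small Python sketch that mirrors the three subqueries of the WITH clause (num_docs, tf, doc_freq) on an in-memory list of (id, token) rows. It is an illustration of the same arithmetic under the assumption that the rows look like the extracted_tokens table; it does not use Oracle Text or Oracle Data Mining, and the function name and sample rows are made up.

```python
import math
from collections import Counter


def tf_idf(extracted_tokens):
    """Compute {(doc_id, token): tf_idf} from (doc_id, token) rows."""
    # num_docs: number of distinct documents that have any token.
    doc_cnt = len({doc_id for doc_id, _ in extracted_tokens})
    # tf numerator: occurrences of each token within each document.
    token_cnt = Counter(extracted_tokens)
    # tf denominator: number of token rows per document.
    num_tokens = Counter(doc_id for doc_id, _ in extracted_tokens)
    # doc_freq: occurrences of each token over the whole corpus.
    overall_token_cnt = Counter(token for _, token in extracted_tokens)

    return {
        (doc_id, token): (cnt / num_tokens[doc_id])
        * math.log(doc_cnt / overall_token_cnt[token])
        for (doc_id, token), cnt in token_cnt.items()
    }


# Hypothetical sample rows: three documents, three distinct tokens.
rows = [(1, "oracle"), (1, "mining"), (2, "oracle"), (3, "cluster")]
scores = tf_idf(rows)
```

A token that occurs in few rows, such as "cluster" here, gets a larger logarithm term than one that occurs in many, which is the weighting the clustering step relies on.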

VLOOKUP in Excel, part 2: Using VLOOKUP without a database

- by Mark Virtue

In a recent article, we introduced the Excel function called VLOOKUP and explained how it could be used to retrieve information from a database into a cell in a local worksheet. In that article we mentioned that there were two uses for VLOOKUP, and only one of them dealt with querying databases. In this article, the second and final in the VLOOKUP series, we examine this other, lesser known use for the VLOOKUP function. If you haven’t already done so, please read the first VLOOKUP article – this article will assume that many of the concepts explained in that article are already known to the reader. When working with databases, VLOOKUP is passed a “unique identifier” that serves to identify which data record we wish to find in the database (e.g. a product code or customer ID). This unique identifier must exist in the database, otherwise VLOOKUP returns us an error. In this article, we will examine a way of using VLOOKUP where the identifier doesn’t need to exist in the database at all. It’s almost as if VLOOKUP can adopt a “near enough is good enough” approach to returning the data we’re looking for. In certain circumstances, this is exactly what we need. We will illustrate this article with a real-world example – that of calculating the commissions that are generated on a set of sales figures. We will start with a very simple scenario, and then progressively make it more complex, until the only rational solution to the problem is to use VLOOKUP. The initial scenario in our fictitious company works like this: If a salesperson creates more than $30,000 worth of sales in a given year, the commission they earn on those sales is 30%. Otherwise their commission is only 20%. 
So far this is a pretty simple worksheet: To use this worksheet, the salesperson enters their sales figures in cell B1, and the formula in cell B2 calculates the correct commission rate they are entitled to receive, which is used in cell B3 to calculate the total commission that the salesperson is owed (which is a simple multiplication of B1 and B2). The cell B2 contains the only interesting part of this worksheet – the formula for deciding which commission rate to use: the one below the threshold of $30,000, or the one above the threshold. This formula makes use of the Excel function called IF. For those readers that are not familiar with IF, it works like this: IF(condition,value if true,value if false) Where the condition is an expression that evaluates to either true or false. In the example above, the condition is the expression B1<B5, which can be read as “Is B1 less than B5?”, or, put another way, “Are the total sales less than the threshold”. If the answer to this question is “yes” (true), then we use the value if true parameter of the function, namely B6 in this case – the commission rate if the sales total was below the threshold. If the answer to the question is “no” (false), then we use the value if false parameter of the function, namely B7 in this case – the commission rate if the sales total was above the threshold. As you can see, using a sales total of $20,000 gives us a commission rate of 20% in cell B2. If we enter a value of $40,000, we get a different commission rate: So our spreadsheet is working. Let’s make it more complex. Let’s introduce a second threshold: If the salesperson earns more than $40,000, then their commission rate increases to 40%: Easy enough to understand in the real world, but in cell B2 our formula is getting more complex. If you look closely at the formula, you’ll see that the third parameter of the original IF function (the value if false) is now an entire IF function in its own right. 
This is called a nested function (a function within a function). It’s perfectly valid in Excel (it even works!), but it’s harder to read and understand. We’re not going to go into the nuts and bolts of how and why this works, nor will we examine the nuances of nested functions. This is a tutorial on VLOOKUP, not on Excel in general. Anyway, it gets worse! What about when we decide that if they earn more than $50,000 then they’re entitled to 50% commission, and if they earn more than $60,000 then they’re entitled to 60% commission? Now the formula in cell B2, while correct, has become virtually unreadable. No-one should have to write formulae where the functions are nested four levels deep! Surely there must be a simpler way? There certainly is. VLOOKUP to the rescue! Let’s redesign the worksheet a bit. We’ll keep all the same figures, but organize it in a new way, a more tabular way: Take a moment and verify for yourself that the new Rate Table works exactly the same as the series of thresholds above. Conceptually, what we’re about to do is use VLOOKUP to look up the salesperson’s sales total (from B1) in the rate table and return to us the corresponding commission rate. Note that the salesperson may have indeed created sales that are not one of the five values in the rate table ($0, $30,000, $40,000, $50,000 or $60,000). They may have created sales of $34,988. It’s important to note that $34,988 does not appear in the rate table. Let’s see if VLOOKUP can solve our problem anyway… We select cell B2 (the location we want to put our formula), and then insert the VLOOKUP function from the Formulas tab: The Function Arguments box for VLOOKUP appears. We fill in the arguments (parameters) one by one, starting with the Lookup_value, which is, in this case, the sales total from cell B1. We place the cursor in the Lookup_value field and then click once on cell B1: Next we need to specify to VLOOKUP what table to lookup this data in. 
In this example, it’s the rate table, of course. We place the cursor in the Table_array field, and then highlight the entire rate table – excluding the headings: Next we must specify which column in the table contains the information we want our formula to return to us. In this case we want the commission rate, which is found in the second column in the table, so we therefore enter a 2 into the Col_index_num field: Finally we enter a value in the Range_lookup field. Important: It is the use of this field that differentiates the two ways of using VLOOKUP. To use VLOOKUP with a database, this final parameter, Range_lookup, must always be set to FALSE, but with this other use of VLOOKUP, we must either leave it blank or enter a value of TRUE. When using VLOOKUP, it is vital that you make the correct choice for this final parameter. To be explicit, we will enter a value of true in the Range_lookup field. It would also be fine to leave it blank, as this is the default value: We have completed all the parameters. We now click the OK button, and Excel builds our VLOOKUP formula for us: If we experiment with a few different sales total amounts, we can satisfy ourselves that the formula is working. Conclusion In the “database” version of VLOOKUP, where the Range_lookup parameter is FALSE, the value passed in the first parameter (Lookup_value) must be present in the database. In other words, we’re looking for an exact match. But in this other use of VLOOKUP, we are not necessarily looking for an exact match. In this case, “near enough is good enough”. But what do we mean by “near enough”? Let’s use an example: When searching for a commission rate on a sales total of $34,988, our VLOOKUP formula will return us a value of 30%, which is the correct answer. Why did it choose the row in the table containing 30% ? What, in fact, does “near enough” mean in this case? 
Let’s be precise: When Range_lookup is set to TRUE (or omitted), VLOOKUP will look in column 1 and match the highest value that is not greater than the Lookup_value parameter. It’s also important to note that for this system to work, the table must be sorted in ascending order on column 1! If you would like to practice with VLOOKUP, the sample file illustrated in this article can be downloaded from here.
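The matching rule described above (take the highest value in column 1 that is not greater than the lookup value, in a table sorted ascending) can be sketched outside Excel as a binary search. The Python below is an illustration of that rule, not Excel's actual implementation. The function name is made up, and the rate table is reconstructed from the thresholds and commission rates given in the article.

```python
from bisect import bisect_right

# Rate table from the article: (sales threshold, commission rate),
# sorted in ascending order on the first column.
RATE_TABLE = [
    (0, 0.20),
    (30_000, 0.30),
    (40_000, 0.40),
    (50_000, 0.50),
    (60_000, 0.60),
]


def vlookup_approx(lookup_value, table, col_index):
    """Approximate-match lookup, like VLOOKUP with Range_lookup = TRUE.

    Finds the last row whose first column is <= lookup_value and returns
    the value in the 1-based column `col_index` of that row.
    """
    keys = [row[0] for row in table]
    pos = bisect_right(keys, lookup_value)  # rows before pos have key <= value
    if pos == 0:
        # Excel reports #N/A when the value is below the smallest key.
        raise LookupError("lookup value is smaller than every key in column 1")
    return table[pos - 1][col_index - 1]


print(vlookup_approx(34_988, RATE_TABLE, 2))  # 0.3
print(vlookup_approx(60_000, RATE_TABLE, 2))  # 0.6
```

The binary search is also why the sort order matters: on an unsorted first column, the "last row not greater than the value" is no longer well defined.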

Operator of the week - Assert

- by Fabiano Amorim

Well my friends, I was wondering how to help you understand execution plans in a practical way, so I think I'll talk about the Showplan Operators. Showplan Operators are used by the Query Optimizer (QO) to build the query plan in order to perform a specified operation. A query plan consists of many physical operators. The Query Optimizer uses a simple language that represents each physical operation by an operator, and each operator is represented in the graphical execution plan by an icon. I'll try to talk about one operator every week but, to avoid writing about these operators for years, I'll cover only the more common ones. The first is the Assert.

The Assert is used to verify a certain condition: it validates a constraint on every row to ensure that the condition is met. If, for example, our DDL includes a check constraint which specifies only two valid values for a column, the Assert will, for every row, validate the value passed to the column to ensure that the input is consistent with the check constraint.

Assert and Check Constraints

Let's see where SQL Server uses that information in practice. Take the following T-SQL:

    IF OBJECT_ID('Tab1') IS NOT NULL
      DROP TABLE Tab1
    GO
    CREATE TABLE Tab1(ID Integer, Gender CHAR(1))
    GO
    ALTER TABLE TAB1 ADD CONSTRAINT ck_Gender_M_F CHECK(Gender IN('M','F'))
    GO
    INSERT INTO Tab1(ID, Gender) VALUES(1,'X')
    GO

For the command above, SQL Server generates an execution plan that uses the Assert operator to check that the inserted value doesn't violate the check constraint. In this specific case, the Assert applies the rule 'if the value is different from "F" and different from "M", then return 0; otherwise return NULL'. The Assert operator is programmed to raise an error if the returned value is not NULL; in other words, when the inserted value is not "M" or "F".
Assert checking Foreign Keys

Now let's take a look at an example where the Assert is used to validate a foreign key constraint. Suppose we have this query:

    ALTER TABLE Tab1 ADD ID_Genders INT
    GO
    IF OBJECT_ID('Tab2') IS NOT NULL
      DROP TABLE Tab2
    GO
    CREATE TABLE Tab2(ID Integer PRIMARY KEY, Gender CHAR(1))
    GO
    INSERT INTO Tab2(ID, Gender) VALUES(1, 'F')
    INSERT INTO Tab2(ID, Gender) VALUES(2, 'M')
    INSERT INTO Tab2(ID, Gender) VALUES(3, 'N')
    GO
    ALTER TABLE Tab1 ADD CONSTRAINT fk_Tab2 FOREIGN KEY (ID_Genders) REFERENCES Tab2(ID)
    GO
    INSERT INTO Tab1(ID, ID_Genders, Gender) VALUES(1, 4, 'X')

Let's look at the text execution plan to see what these Assert operators are doing. To see the text execution plan, execute SET SHOWPLAN_TEXT ON before running the INSERT command.

    |--Assert(WHERE:(CASE WHEN NOT [Pass1008] AND [Expr1007] IS NULL THEN (0) ELSE NULL END))
         |--Nested Loops(Left Semi Join, PASSTHRU:([Tab1].[ID_Genders] IS NULL), OUTER REFERENCES:([Tab1].[ID_Genders]), DEFINE:([Expr1007] = [PROBE VALUE]))
              |--Assert(WHERE:(CASE WHEN [Tab1].[Gender]<>'F' AND [Tab1].[Gender]<>'M' THEN (0) ELSE NULL END))
              |    |--Clustered Index Insert(OBJECT:([Tab1].[PK]), SET:([Tab1].[ID] = RaiseIfNullInsert([@1]),[Tab1].[ID_Genders] = [@2],[Tab1].[Gender] = [Expr1003]), DEFINE:([Expr1003]=CONVERT_IMPLICIT(char(1),[@3],0)))
              |--Clustered Index Seek(OBJECT:([Tab2].[PK]), SEEK:([Tab2].[ID]=[Tab1].[ID_Genders]) ORDERED FORWARD)

Here we can see the Assert operator twice. The first (reading the text plan from the bottom up, or the graphical plan from right to left) validates the check constraint. The same concept shown above applies: if the expression returns NULL, the query keeps running, but if it returns "0", an exception is raised. The second Assert validates the result of the join between Tab1 and Tab2. It is interesting to see the "[Expr1007] IS NULL".
To understand that, you need to know what this Expr1007 is. Look at the Probe Value in the text plan and you will see that it is the result of the join. If the value passed to the INSERT for the column ID_Genders exists in the table Tab2, then that probe will return the join value; otherwise it will return NULL. So the Assert is checking the result of the search in Tab2; if the value that is passed to the INSERT is not found, then the Assert will raise an exception. If the value passed to the column ID_Genders is NULL, then SQL Server must not raise an exception; in that case the PASSTHRU predicate makes the CASE expression return NULL and the query keeps running. If you run the INSERT above, SQL Server will raise an exception because of the "X" value, but if you change the "X" to "F" and run it again, it will raise an exception because of the value "4". If you change the value "4" to NULL, 1, 2 or 3, the insert will be executed without any error.

Assert checking a SubQuery

The Assert operator is also used to check a subquery. As we know, a scalar subquery can't validly return more than one value. Sometimes, however, a mistake happens, and a subquery attempts to return more than one value. Here the Assert comes into play by validating the condition that a scalar subquery returns just one value.
Take the following query:

INSERT INTO Tab1(ID_TipoSexo, Sexo) VALUES((SELECT ID_TipoSexo FROM Tab1), 'F')

|--Assert(WHERE:(CASE WHEN NOT [Pass1016] AND [Expr1015] IS NULL THEN (0) ELSE NULL END))
|--Nested Loops(Left Semi Join, PASSTHRU:([tempdb].[dbo].[Tab1].[ID_TipoSexo] IS NULL), OUTER REFERENCES:([tempdb].[dbo].[Tab1].[ID_TipoSexo]), DEFINE:([Expr1015] = [PROBE VALUE]))
|--Assert(WHERE:([Expr1017]))
| |--Compute Scalar(DEFINE:([Expr1017]=CASE WHEN [tempdb].[dbo].[Tab1].[Sexo]<>'F' AND [tempdb].[dbo].[Tab1].[Sexo]<>'M' THEN (0) ELSE NULL END))
| |--Clustered Index Insert(OBJECT:([tempdb].[dbo].[Tab1].[PK__Tab1__3214EC277097A3C8]), SET:([tempdb].[dbo].[Tab1].[ID_TipoSexo] = [Expr1008],[tempdb].[dbo].[Tab1].[Sexo] = [Expr1009],[tempdb].[dbo].[Tab1].[ID] = [Expr1003]))
| |--Top(TOP EXPRESSION:((1)))
| |--Compute Scalar(DEFINE:([Expr1008]=[Expr1014], [Expr1009]='F'))
| |--Nested Loops(Left Outer Join)
| |--Compute Scalar(DEFINE:([Expr1003]=getidentity((1856985942),(2),NULL)))
| | |--Constant Scan
| |--Assert(WHERE:(CASE WHEN [Expr1013]>(1) THEN (0) ELSE NULL END))
| |--Stream Aggregate(DEFINE:([Expr1013]=Count(*), [Expr1014]=ANY([tempdb].[dbo].[Tab1].[ID_TipoSexo])))
| |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[Tab1].[PK__Tab1__3214EC277097A3C8]))
|--Clustered Index Seek(OBJECT:([tempdb].[dbo].[Tab2].[PK__Tab2__3214EC27755C58E5]), SEEK:([tempdb].[dbo].[Tab2].[ID]=[tempdb].[dbo].[Tab1].[ID_TipoSexo]) ORDERED FORWARD)

You can see from this text showplan that SQL Server has generated a Stream Aggregate to count how many rows the subquery will return. This value is then passed to the Assert, which does its job by checking its validity: a count greater than 1 makes the Assert return "0" and raise an exception. It is very interesting to see that the Query Optimizer is smart enough to avoid using Assert operators when they are not necessary. For instance:

INSERT INTO Tab1(ID_TipoSexo, Sexo) VALUES((SELECT ID_TipoSexo FROM Tab1 WHERE ID = 1), 'F')
INSERT INTO Tab1(ID_TipoSexo, Sexo) VALUES((SELECT TOP 1 ID_TipoSexo FROM Tab1), 'F')

For both these INSERTs, the Query Optimizer knows that only one row will ever be returned, so there is no need to use the Assert.

Well, that's all folks, I'll see you next week with more "Operators".

Cheers, Fabiano
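The constraint behaviour described above (an invalid Gender is rejected, an unknown ID_Genders is rejected, a NULL ID_Genders is accepted) can be tried out without a SQL Server instance. The following is only a rough sketch using Python's built-in sqlite3 module: SQLite has no Assert operator and its execution plans look nothing like the ones above, so this illustrates the constraint semantics only. The helper names build_db and try_insert are made up for this example, and the CHECK constraint is an assumption reconstructed from the predicate shown in the plan.

```python
import sqlite3


def build_db():
    """Create in-memory Tab1/Tab2 tables that mirror the article's example."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Tab2 (ID INTEGER PRIMARY KEY, Gender CHAR(1))")
    conn.executemany(
        "INSERT INTO Tab2 (ID, Gender) VALUES (?, ?)",
        [(1, "F"), (2, "M"), (3, "N")],
    )
    # Assumed constraint: only 'F' and 'M' are valid, as the plan's CASE suggests.
    conn.execute(
        "CREATE TABLE Tab1 ("
        " ID INTEGER PRIMARY KEY,"
        " ID_Genders INTEGER REFERENCES Tab2(ID),"
        " Gender CHAR(1) CHECK (Gender IN ('F', 'M')))"
    )
    conn.commit()
    return conn


def try_insert(conn, row_id, id_genders, gender):
    """Attempt one INSERT and report whether a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Tab1 (ID, ID_Genders, Gender) VALUES (?, ?, ?)",
                (row_id, id_genders, gender),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return "rejected: " + str(exc)


if __name__ == "__main__":
    db = build_db()
    print(try_insert(db, 1, 4, "X"))     # rejected by the CHECK constraint
    print(try_insert(db, 1, 4, "F"))     # rejected by the foreign key
    print(try_insert(db, 1, None, "F"))  # NULL foreign key value is accepted
```

As in the article, changing the foreign key value to NULL, 1, 2 or 3 lets the insert through.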

Read the article
Unable to import Maven project into IntelliJ IDEA

- by del

I'm having problems importing any Maven projects into IntelliJ IDEA. I create an empty Maven project like this: $ mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false Then I try to open the project in IDEA (File Open Project, then choose the pom.xml). A progress box saying "Reading pom.xml" displays for a few minutes, and then just disappears without opening the project. Looking in the IDEA log, I see some connection timeout exceptions like this: 2012-10-03 11:55:55,483 [ 16981] INFO - ution.rmi.RemoteProcessSupport - Port/ID: 18011/Maven2ServerImpl9407569f 2012-10-03 11:56:58,898 [ 80396] WARN - ution.rmi.RemoteProcessSupport - The cook failed to start due to java.net.ConnectException: Connection timed out 2012-10-03 11:57:55,483 [ 136981] WARN - ution.rmi.RemoteProcessSupport - java.rmi.NotBoundException: _DEAD_HAND_ 2012-10-03 11:57:55,484 [ 136982] WARN - ution.rmi.RemoteProcessSupport - at sun.rmi.registry.RegistryImpl.lookup(RegistryImpl.java:106) 2012-10-03 11:57:55,484 [ 136982] WARN - ution.rmi.RemoteProcessSupport - at com.intellij.execution.rmi.RemoteServer.start(RemoteServer.java:73) 2012-10-03 11:57:55,484 [ 136982] WARN - ution.rmi.RemoteProcessSupport - at org.jetbrains.idea.maven.server.RemoteMavenServer.main(RemoteMavenServer.java:22) 2012-10-03 11:58:01,749 [ 143247] ERROR - com.intellij.ide.IdeEventQueue - Error during dispatching of java.awt.event.MouseEvent[MOUSE_RELEASED,(65,116),absolute(64,140),button=1,modifiers=Button1,clickCount=1] on frame0 java.lang.RuntimeException: Cannot reconnect. 
at org.jetbrains.idea.maven.server.RemoteObjectWrapper.perform(RemoteObjectWrapper.java:82) at org.jetbrains.idea.maven.server.MavenServerManager.applyProfiles(MavenServerManager.java:311) at org.jetbrains.idea.maven.project.MavenProjectReader.applyProfiles(MavenProjectReader.java:369) at org.jetbrains.idea.maven.project.MavenProjectReader.doReadProjectModel(MavenProjectReader.java:98) at org.jetbrains.idea.maven.project.MavenProjectReader.readProject(MavenProjectReader.java:52) at org.jetbrains.idea.maven.project.MavenProject.read(MavenProject.java:405) at org.jetbrains.idea.maven.project.MavenProjectsTree.doUpdate(MavenProjectsTree.java:534) at org.jetbrains.idea.maven.project.MavenProjectsTree.doAdd(MavenProjectsTree.java:481) at org.jetbrains.idea.maven.project.MavenProjectsTree.update(MavenProjectsTree.java:442) at org.jetbrains.idea.maven.project.MavenProjectsTree.updateAll(MavenProjectsTree.java:413) at org.jetbrains.idea.maven.wizards.MavenProjectBuilder.readMavenProjectTree(MavenProjectBuilder.java:198) at org.jetbrains.idea.maven.wizards.MavenProjectBuilder.access$800(MavenProjectBuilder.java:44) at org.jetbrains.idea.maven.wizards.MavenProjectBuilder$3.run(MavenProjectBuilder.java:179) at org.jetbrains.idea.maven.utils.MavenUtil$8.run(MavenUtil.java:388) at com.intellij.openapi.progress.impl.ProgressManagerImpl$TaskRunnable.run(ProgressManagerImpl.java:469) at com.intellij.openapi.progress.impl.ProgressManagerImpl$6.run(ProgressManagerImpl.java:288) at com.intellij.openapi.progress.impl.ProgressManagerImpl$2.run(ProgressManagerImpl.java:178) at com.intellij.openapi.progress.impl.ProgressManagerImpl.executeProcessUnderProgress(ProgressManagerImpl.java:218) at com.intellij.openapi.progress.impl.ProgressManagerImpl.runProcess(ProgressManagerImpl.java:169) at com.intellij.openapi.application.impl.ApplicationImpl$8$1.run(ApplicationImpl.java:641) at com.intellij.openapi.application.impl.ApplicationImpl$6.run(ApplicationImpl.java:434) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) at com.intellij.openapi.application.impl.ApplicationImpl$1$1.run(ApplicationImpl.java:145) Caused by: java.rmi.RemoteException: Cannot start maven service; nested exception is: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: java.net.ConnectException: Connection timed out at org.jetbrains.idea.maven.server.MavenServerManager.create(MavenServerManager.java:120) at org.jetbrains.idea.maven.server.MavenServerManager.create(MavenServerManager.java:71) at org.jetbrains.idea.maven.server.RemoteObjectWrapper.getOrCreateWrappee(RemoteObjectWrapper.java:41) at org.jetbrains.idea.maven.server.MavenServerManager$8.execute(MavenServerManager.java:314) at org.jetbrains.idea.maven.server.MavenServerManager$8.execute(MavenServerManager.java:311) at org.jetbrains.idea.maven.server.RemoteObjectWrapper.perform(RemoteObjectWrapper.java:76) ... 
27 more Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: java.net.ConnectException: Connection timed out at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601) at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198) at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184) at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:322) at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) at com.intellij.execution.rmi.RemoteProcessSupport$2.compute(RemoteProcessSupport.java:215) at com.intellij.execution.rmi.RemoteUtil.executeWithClassLoader(RemoteUtil.java:122) at com.intellij.execution.rmi.RemoteProcessSupport.acquire(RemoteProcessSupport.java:212) at com.intellij.execution.rmi.RemoteProcessSupport.acquire(RemoteProcessSupport.java:133) at org.jetbrains.idea.maven.server.MavenServerManager.create(MavenServerManager.java:117) ... 32 more Caused by: java.net.ConnectException: Connection timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:529) at java.net.Socket.connect(Socket.java:478) at java.net.Socket.(Socket.java:375) at java.net.Socket.(Socket.java:189) at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22) at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128) at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595) ... 41 more I'm using the latest versions of IDEA (11.1.3) and Maven (3.0.4). Any ideas what I am doing wrong?

Read the article
Self-updating collection concurrency issues

- by DEHAAS

I am trying to build a self-updating collection. Each item in the collection has a position (x,y). When the position is changed, an event is fired, and the collection will relocate the item. Internally the collection is using a “jagged dictionary”. The outer dictionary uses the x-coordinate as a key, while the nested dictionary uses the y-coordinate as a key. The nested dictionary then has a list of items as its value. The collection also maintains a dictionary that maps each item to its stored location, that is, the position as recorded in the nested dictionaries. I am having some trouble making the collection thread safe, which I really need.

Source code for the collection:

public class PositionCollection<TItem, TCoordinate> : ICollection<TItem>
    where TItem : IPositionable<TCoordinate>
    where TCoordinate : struct, IConvertible
{
    private readonly object itemsLock = new object();
    private readonly Dictionary<TCoordinate, Dictionary<TCoordinate, List<TItem>>> items;
    private readonly Dictionary<TItem, Vector<TCoordinate>> storedPositionLookup;

    public PositionCollection()
    {
        this.items = new Dictionary<TCoordinate, Dictionary<TCoordinate, List<TItem>>>();
        this.storedPositionLookup = new Dictionary<TItem, Vector<TCoordinate>>();
    }

    public void Add(TItem item)
    {
        if (item.Position == null)
        {
            throw new ArgumentException("Item must have a valid position.");
        }

        lock (this.itemsLock)
        {
            if (!this.items.ContainsKey(item.Position.X))
            {
                this.items.Add(item.Position.X, new Dictionary<TCoordinate, List<TItem>>());
            }

            Dictionary<TCoordinate, List<TItem>> xRow = this.items[item.Position.X];
            if (!xRow.ContainsKey(item.Position.Y))
            {
                xRow.Add(item.Position.Y, new List<TItem>());
            }

            xRow[item.Position.Y].Add(item);

            if (this.storedPositionLookup.ContainsKey(item))
            {
                this.storedPositionLookup[item] = new Vector<TCoordinate>(item.Position);
            }
            else
            {
                this.storedPositionLookup.Add(item, new Vector<TCoordinate>(item.Position)); // Store a copy of the original position
            }

            item.Position.PropertyChanged += (object sender, PropertyChangedEventArgs eventArgs) => this.UpdatePosition(item, eventArgs.PropertyName);
        }
    }

    private void UpdatePosition(TItem item, string propertyName)
    {
        lock (this.itemsLock)
        {
            Vector<TCoordinate> storedPosition = this.storedPositionLookup[item];
            this.RemoveAt(storedPosition, item);
            this.storedPositionLookup.Remove(item);
        }
    }
}

I have written a simple unit test to check for concurrency issues:

[TestMethod]
public void TestThreadedPositionChange()
{
    PositionCollection<Crate, int> collection = new PositionCollection<Crate, int>();
    Crate crate = new Crate(new Vector<int>(5, 5));
    collection.Add(crate);
    Parallel.For(0, 100, new Action<int>((i) => crate.Position.X += 1));
    Crate same = collection[105, 5].First();
    Assert.AreEqual(crate, same);
}

The actual stored position varies every time I run the test. I appreciate any feedback you may have.
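One possible contributor to the varying result (an assumption, since the Vector type is not shown in the question) is that crate.Position.X += 1 is a read-modify-write that runs outside the collection's lock, so parallel increments can overwrite one another before the change event ever fires. The sketch below illustrates the general idea of guarding that step with a lock. It is written in Python rather than C#, and the Position class and its method names are hypothetical stand-ins, not the question's types.

```python
import threading


class Position:
    """Hypothetical stand-in for the question's Vector position type."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._lock = threading.Lock()

    def increment_x(self, delta=1):
        # The read, the addition and the write all happen while holding the
        # lock, so concurrent callers cannot interleave and lose an update.
        with self._lock:
            self.x += delta


def run_parallel_increments(position, count):
    """Start `count` threads that each increment x once, then wait for all."""
    threads = [threading.Thread(target=position.increment_x) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return position.x


if __name__ == "__main__":
    print(run_parallel_increments(Position(5, 5), 100))  # 105
```

With the increment serialized, the final x value no longer depends on thread timing. Whether the collection then relocates the item correctly still depends on its own update logic.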

Read the article
Using linked servers, OPENROWSET and OPENQUERY

- by BuckWoody

SQL Server has a few mechanisms to reach out to another server (even another server type) and query data from within a Transact-SQL statement. Among them are a set of stored credentials and information (called a Linked Server), a statement that uses a linked server called OPENQUERY, another called OPENROWSET, and one called OPENDATASOURCE. This post isn’t about those particular functions or statements; hit the links for more if you’re new to those topics. I’m actually more concerned about where I see these used than the particular method. In many cases, a Linked Server isn’t another Relational Database Management System (RDBMS) like Oracle or DB2 (which is possible with a linked server), but another SQL Server. My concern is that linked servers are the new Data Transformation Services (DTS) from SQL Server 2000: something that was designed for one purpose but which is being morphed into something much more. In the case of DTS, most of us turned that feature into a full-fledged job system. What was designed as a simple data import and export system has been pressed into service doing logic, routing and timing. And of course we all know how painful it was to move off of a complex DTS system onto SQL Server Integration Services. In the case of linked servers, what should be a method of running a simple query or two on another server, where you have an occasional connection or need a quick import of a small data set, is morphing into a full federation strategy. In some cases I’ve seen a complex web of linked servers, and when credentials, names or anything else changes there are huge problems. Now don’t get me wrong: linked servers and other forms of distributing queries are a fantastic set of tools that we have to move data around. I’m just saying that when you start having lots of workarounds and when things get really complicated, you might want to step back a little and ask if there’s a better way. Are you able to tolerate some latency?
Perhaps you’re able to use Service Broker. Would you like to be platform-independent on the data source? Perhaps a middle-tier might make more sense, abstracting the queries there and sending them to the proper server. Designed properly, I’ve seen these systems scale further and be more resilient than loading up on linked servers.

Read the article
Google Search Parameter Question

- by Brian

I've been trying to determine different parameters used by Google in their search queries. In particular, the usg parameter is what is giving me troubles. Here is an example value given for it, which is from an actual Google query: usg=0_zDqudnCN52ATGjAl3tignXNtBo4%3D Does anyone know what it could be for / recognize it? I've done a bit of digging, but haven't found any confirmation as to what it could be. Here is the link that I took a look at: http://www.webmasterworld.com/google/3892573.htm
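When comparing parameter values like this one, it can help to look at the percent-decoded form first. The snippet below is a small sketch using Python's standard urllib.parse module; the helper name decode_query_params is made up for illustration. It only decodes the value and says nothing about what usg actually means.

```python
from urllib.parse import parse_qs


def decode_query_params(query):
    """Split a raw query string and percent-decode each value.

    Only the first value is kept when a parameter is repeated.
    """
    return {name: values[0] for name, values in parse_qs(query).items()}


if __name__ == "__main__":
    params = decode_query_params("usg=0_zDqudnCN52ATGjAl3tignXNtBo4%3D")
    print(params["usg"])  # 0_zDqudnCN52ATGjAl3tignXNtBo4=
```

The %3D at the end decodes to a trailing "=", which is how padding looks in base64-style encodings. That is only an observation about the format, not a confirmed explanation of the parameter.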

Read the article
ADNOC talks about 50x increase in performance

- by KLaker

If you are still wondering about how Exadata can revolutionise your business then I would recommend watching this great video which was recorded at this year's OpenWorld. First, a little background. The Abu Dhabi National Oil Company for Distribution (ADNOC) is an integrated energy company that was founded in 1973. ADNOC Distribution markets and distributes petroleum products and services within the United Arab Emirates and internationally. As one of the largest and most innovative government-owned petroleum companies in the Arab Gulf, ADNOC Distribution is renowned and respected for the exceptional quality and reliability of its products and services. Its five corporate divisions include more than 200 filling stations (a number that is growing at 8% annually), more than 150 convenience stores, 10 vehicle inspection stations, as well as wholesale and retail sales of bulk fuel, gas, oil, diesel, and lubricants. ADNOC selected Oracle Exadata Database Machine after extensive research because it provided them with a single platform that can run mixed workloads in a single unified machine: "We chose Oracle Exadata Database Machine because it offered a fully integrated and highly engineered system that was ready to deploy. With our infrastructure running all the same technology, we can operate any type of Oracle Database without restrictions and be prepared for business growth," said Ali Abdul Aziz Al-Ali, IT division manager, ADNOC Distribution. "...we could consolidate our transaction processing and business intelligence onto one platform. Competing solutions are just not capable of doing that." - Awad Ahmed Ali El-Sidiq, Senior Database Administrator, ADNOC Distribution In this new video Awad Ahmen Ali El Sidddig, Senior DBA at ADNOC, talks about the impact that Exadata has had on his team and the whole business.
ADNOC is using our engineered systems to drive and manage all their workloads: from transaction systems to payment systems to the data warehouse and BI environment. A true Disk-to-Dashboard revolution using Engineered Systems. This engineered approach is delivering a 50x improvement in performance, with some queries running 100x faster! The IT team has even revolutionised some of their data warehouse related processes with the help of Exadata, and jobs that were taking over 4 hours now run in a few minutes. To watch the video, go to our Oracle YouTube page: http://www.youtube.com/watch?v=zcRpxc6u5Ic Now that queries are running 100x faster and jobs are completing in minutes not hours, what is next for the IT team at ADNOC? Like many of our customers, ADNOC is now looking to take advantage of big data to help them better align their business operations with customer behaviour and customer insights. To help deliver this next level of insight the IT team is looking at the new features in Oracle Database 12c, such as the new in-memory feature, to deliver even more performance gains. The great news is that Awad Ahmen Ali El Sidddig was awarded DBA of the Year - EMEA within our Data Warehouse Global Leaders programme, and you can see the badge for this award pop up at the start of the video. Well done to everyone at ADNOC and thanks for spending the time with us at OOW to create this great video.

Read the article
T-SQL Equivalents for Microsoft Access VBA Functions

If you need to migrate your Access application to SQL Server, don't count on The SQL Server Upsize Wizard in Microsoft Access to automatically convert your VBA functions. If you want to push the complex query processing done by your Access queries to the back end, you'll have to rewrite them in T-SQL.

Read the article
An XEvent a Day (13 of 31) – The system_health Session

- by Jonathan Kehayias

Today’s post was originally planned for this coming weekend, but seems I’ve caught whatever bug my kids had over the weekend so I am changing up today’s blog post with one that is easier to cover and shorter. If you’ve been running some of the queries from the posts in this series, you have no doubt come across an Event Session running on your server with the name of system_health. In today’s post I’ll go over this session and provide links to references related to it. When Extended Events...(read more)

Read the article
Basic Spatial Data with SQL Server and Entity Framework 5.0

- by Rick Strahl

In my most recent project we needed to do a bit of geo-spatial referencing. While spatial features have been in SQL Server for a while, using those features inside of .NET applications hasn't been as straightforward as it could be, because .NET natively doesn't support spatial types. There are workarounds for this with a few custom projects like SharpMap, or a hack using the SQL Server specific Geo types found in the Microsoft.SqlTypes assembly that ships with SQL Server. While these approaches work for manipulating spatial data from .NET code, they didn't work with database access if you're using Entity Framework. Other ORM vendors have been rolling their own versions of spatial integration. In Entity Framework 5.0 running on .NET 4.5 the Microsoft ORM finally adds support for spatial types as well. In this post I'll describe basic geography features that deal with single location and distance calculations, which is probably the most common usage scenario.

SQL Server Transact-SQL Syntax for Spatial Data

Before we look at how things work with Entity Framework, let's take a look at how SQL Server allows you to use spatial data to get an understanding of the underlying semantics. The following SQL examples should work with SQL 2008 and forward. Let's start by creating a test table that includes a Geography field and also a pair of Long/Lat fields that demonstrate how you can work with the geography functions even if you don't have geography/geometry fields in the database.
Here's the CREATE command:

CREATE TABLE [dbo].[Geo](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Location] [geography] NULL,
    [Long] [float] NOT NULL,
    [Lat] [float] NOT NULL
)

Now using plain SQL you can insert data into the table using the geography::STGeomFromText SQL CLR function:

insert into Geo( Location, long, lat ) values ( geography::STGeomFromText('POINT(-121.527200 45.712113)', 4326), -121.527200, 45.712113 )
insert into Geo( Location, long, lat ) values ( geography::STGeomFromText('POINT(-121.517265 45.714240)', 4326), -121.517265, 45.714240 )
insert into Geo( Location, long, lat ) values ( geography::STGeomFromText('POINT(-121.511536 45.714825)', 4326), -121.511536, 45.714825 )

The STGeomFromText function accepts a string that describes a geometric item (a point here, but it can also be a line or path or polygon and many others). You also need to provide an SRID (Spatial Reference System Identifier), which is an integer value that determines the rules for how geography/geometry values are calculated and returned. For mapping/distance functionality you typically want to use 4326, as this is the format used by most mapping software and geo-location libraries like Google and Bing. The spatial data in the Location field is stored in a binary format. Once the location data is in the database you can query the data and do simple distance computations very easily. For example, it is easy to calculate the distance from each of the values in the database to another spatial point. Distance calculations compare two points in space using a direct line calculation. For our example I'll compare a new point to all the points in the database.
Using the Location field the SQL looks like this:

-- create a source point
DECLARE @s geography
SET @s = geography::STGeomFromText('POINT(-121.527200 45.712113)', 4326);

--- return the ids
select ID, Location as Geo, Location.ToString() as Point, @s.STDistance(Location) as distance
from Geo
order by distance

The code defines a new point which is the base point to compare each of the values to. You can also compare values from the database directly, but typically you'll want to match a location to another location and determine the difference, for which you can use the STDistance function. The STDistance function returns the straight line distance between the passed in point and the point in the database field. The result for SRID 4326 is always in meters. In the output, the first value is the same point as the source, so the distance is 0. The other two points are here in town in Hood River a little ways away: 808 and 1256 meters respectively. Notice also that you can order the result by the resulting distance, which effectively gives you results that are ordered radially out from closer to further away. This is great for searches of points of interest near a central location (YOU typically!).

These geolocation functions are also available to you if you don't use the Geography/Geometry types, but plain float values. It's a little more work, as each point has to be created in the query using the string syntax, but the following code doesn't use a geography field and produces the same result as the previous query.

--- using float fields
select ID,
    geography::STGeomFromText('POINT(' + STR(long, 15, 7) + ' ' + STR(lat, 15, 7) + ')', 4326),
    geography::STGeomFromText('POINT(' + STR(long, 15, 7) + ' ' + STR(lat, 15, 7) + ')', 4326).ToString(),
    @s.STDistance( geography::STGeomFromText('POINT(' + STR(long, 15, 7) + ' ' + STR(lat, 15, 7) + ')', 4326)) as distance
from geo
order by distance

Spatial Data in the Entity Framework

Prior to Entity Framework 5.0 on .NET 4.5, consuming the data above required using stored procedures or raw SQL commands to access the spatial data. In Entity Framework 5 however, Microsoft introduced the new DbGeometry and DbGeography types. These location types provide a bunch of functionality for manipulating spatial points using geometry functions, which in turn can be used to do common spatial queries like I described in the SQL syntax above. The DbGeography/DbGeometry types are immutable, meaning that you can't write to them once they've been created. They are a bit odd in that you need to use factory methods in order to instantiate them: they have no public constructor and you can't assign to properties like Latitude and Longitude.

Creating a Model with Spatial Data

Let's start by creating a simple Entity Framework model that includes a Location property of type DbGeography:

public class GeoLocationContext : DbContext
{
    public DbSet<GeoLocation> Locations { get; set; }
}

public class GeoLocation
{
    public int Id { get; set; }
    public DbGeography Location { get; set; }
    public string Address { get; set; }
}

That's all there is to it. When you run this now against SQL Server, you get a Geography field for the Location property, which looks the same as the Location field in the SQL examples earlier.

Adding Spatial Data to the Database

Next let's add some data to the table that includes some latitude and longitude data. An easy way to find lat/long locations is to use Google Maps to pinpoint your location, then right click and click on What's Here. Click on the green marker to get the GPS coordinates.
To add the actual geolocation data create an instance of the GeoLocation type and use the DbGeography.PointFromText() factory method to create a new point to assign to the Location property:

[TestMethod]
public void AddLocationsToDataBase()
{
    var context = new GeoLocationContext();

    // remove all
    context.Locations.ToList().ForEach( loc => context.Locations.Remove(loc));
    context.SaveChanges();

    var location = new GeoLocation()
    {
        // Create a point using native DbGeography Factory method
        Location = DbGeography.PointFromText(
            string.Format("POINT({0} {1})", -121.527200, 45.712113),
            4326),
        Address = "301 15th Street, Hood River"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        Location = CreatePoint(45.714240, -121.517265),
        Address = "The Hatchery, Bingen"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        // Create a point using a helper function (lat/long)
        Location = CreatePoint(45.708457, -121.514432),
        Address = "Kaze Sushi, Hood River"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        Location = CreatePoint(45.722780, -120.209227),
        Address = "Arlington, OR"
    };
    context.Locations.Add(location);

    context.SaveChanges();
}

As promised, a DbGeography object has to be created with one of the static factory methods provided on the type, as the Location.Longitude and Location.Latitude properties are read only. Here I'm using PointFromText(), which uses a "Well Known Text" format to specify spatial data. In the first example I'm specifying to create a Point from a longitude and latitude value, using an SRID of 4326 (just like earlier in the SQL examples). You'll probably want to create a helper method to make the creation of Points easier, to avoid that string format and instead just pass in a couple of double values.
Here's my helper called CreatePoint that's used for all but the first point creation in the sample above:

public static DbGeography CreatePoint(double latitude, double longitude)
{
    var text = string.Format(CultureInfo.InvariantCulture.NumberFormat,
                             "POINT({0} {1})", longitude, latitude);
    // 4326 is most common coordinate system used by GPS/Maps
    return DbGeography.PointFromText(text, 4326);
}

Using the helper the syntax becomes a bit cleaner, requiring only a latitude and longitude respectively. Note that my method intentionally swaps the parameters around (the WKT point format expects longitude first), because Latitude and Longitude is the common order I've seen with mapping libraries (especially Google Mapping/Geolocation APIs with their LatLng type). When the context's changes are saved, the data is written into the database using the SQL Geography type, which looks the same as in the earlier SQL examples.

Querying

Once you have some location data in the database it's now super easy to query the data and find out the distance between locations. A common query is to ask for a number of locations that are near a fixed point, typically your current location, and order them by distance.
Using LINQ to Entities a query like this is easy to construct:

[TestMethod]
public void QueryLocationsTest()
{
    var sourcePoint = CreatePoint(45.712113, -121.527200);
    var context = new GeoLocationContext();

    // find any locations within 5 kilometers ordered by distance
    var matches = context.Locations
        .Where(loc => loc.Location.Distance(sourcePoint) < 5000)
        .OrderBy(loc => loc.Location.Distance(sourcePoint))
        .Select(loc => new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) });

    Assert.IsTrue(matches.Count() > 0);

    foreach (var location in matches)
    {
        Console.WriteLine("{0} ({1:n0} meters)", location.Address, location.Distance);
    }
}

This example produces:

301 15th Street, Hood River (0 meters)
The Hatchery, Bingen (809 meters)
Kaze Sushi, Hood River (1,074 meters)

The first point in the database is the same as the source point I'm comparing against, so the distance is 0. The other two are within the 5 kilometer radius, while the Arlington location, which is 65 miles or so out, is not returned. The result is ordered by distance from closest to furthest away. In the code, I first create a source point that is the basis for comparison. The LINQ query then selects all locations that are within 5km of the source point using the Location.Distance() function, which takes a source point as a parameter. You can either use a pre-defined value as I'm doing here, or compare against another database DbGeography property (say when you have two points in the same database for things like routes). What's nice about this query syntax is that it's very clean and easy to read and understand. You can calculate the distance and also easily order by the distance to provide a result that shows locations from closest to furthest away, which is a common scenario for any application that places a user in the context of several locations.

Meters vs. Miles

As with the SQL Server functions, the Distance() method returns data in meters, so if you need to work with miles or feet you need to do some conversion. Here are a couple of helpers that might be useful (can be found in GeoUtils.cs of the sample project):

/// <summary>
/// Convert meters to miles
/// </summary>
/// <param name="meters"></param>
/// <returns></returns>
public static double MetersToMiles(double? meters)
{
    if (meters == null)
        return 0F;
    return meters.Value * 0.000621371192;
}

/// <summary>
/// Convert miles to meters
/// </summary>
/// <param name="miles"></param>
/// <returns></returns>
public static double MilesToMeters(double? miles)
{
    if (miles == null)
        return 0;
    return miles.Value * 1609.344;
}

Using these two helpers you can query on miles like this:

[TestMethod]
public void QueryLocationsMilesTest()
{
    var sourcePoint = CreatePoint(45.712113, -121.527200);
    var context = new GeoLocationContext();

    // find any locations within 5 miles ordered by distance
    var fiveMiles = GeoUtils.MilesToMeters(5);
    var matches = context.Locations
        .Where(loc => loc.Location.Distance(sourcePoint) <= fiveMiles)
        .OrderBy(loc => loc.Location.Distance(sourcePoint))
        .Select(loc => new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) });

    Assert.IsTrue(matches.Count() > 0);

    foreach (var location in matches)
    {
        Console.WriteLine("{0} ({1:n1} miles)", location.Address, GeoUtils.MetersToMiles(location.Distance));
    }
}

which produces:

301 15th Street, Hood River (0.0 miles)
The Hatchery, Bingen (0.5 miles)
Kaze Sushi, Hood River (0.7 miles)

Nice 'n simple.

.NET 4.5 Only

Note that DbGeography and DbGeometry are exclusive to Entity Framework 5.0 (not 4.4, which ships in the same NuGet package or installer) and require .NET 4.5. That's because the new DbGeometry and DbGeography (and related) types are defined in the 4.5 version of System.Data.Entity, which is a CLR assembly and is only updated by major versions of .NET.
Why the decision was made to add these types to System.Data.Entity rather than to the frequently updated EntityFramework assembly, which might have made this work in .NET 4.0, is beyond me, especially given that there are no native .NET Framework spatial types to begin with.

I also find it odd that there is no native CLR spatial type. The DbGeography and DbGeometry types are specific to Entity Framework and live in its assembly. They will also work for general purpose, non-database spatial data manipulation, but then you are forced into having a dependency on System.Data.Entity, which seems a bit silly. There's also a System.Spatial assembly that's apparently part of WCF Data Services, which in turn doesn't work with Entity Framework. Another example of multiple teams at Microsoft not communicating and implementing the same functionality (differently) in several different places. Perplexed as I may be, for EF specific code the Entity Framework specific types are easy to use and work well.

Working with pre-.NET 4.5 Entity Framework and Spatial Data

If you can't go to .NET 4.5 just yet, you can still use spatial features in Entity Framework, but it's a lot more work as you can't use the DbContext directly to manipulate the location data. You can still run raw SQL statements to write data into the database and retrieve results, using the same TSQL syntax I showed earlier, with Context.Database.ExecuteSqlCommand().

Here's code that you can use to add location data into the database:

    [TestMethod]
    public void RawSqlEfAddTest()
    {
        string sqlFormat =
        @"insert into GeoLocations( Location, Address)
          values ( geography::STGeomFromText('POINT({0} {1})', 4326), @p0 )";

        var sql = string.Format(sqlFormat, -121.527200, 45.712113);
        Console.WriteLine(sql);

        var context = new GeoLocationContext();
        Assert.IsTrue(context.Database.ExecuteSqlCommand(sql, "301 N. 15th Street") > 0);
    }

Here I'm using the STGeomFromText() function to add the location data.
Note that I'm using string.Format here, which usually would be a bad practice but is needed for this form of the statement. I was unable to use ExecuteSqlCommand() and its parameter syntax because the longitude and latitude values are embedded inside a string literal, and parameter placeholders inside a quoted SQL string are not substituted. The following does not work:

    string sqlFormat =
    @"insert into GeoLocations( Location, Address)
      values ( geography::STGeomFromText('POINT(@p0 @p1)', 4326), @p2 )";

    context.Database.ExecuteSqlCommand(sqlFormat, -121.527200, 45.712113, "301 N. 15th Street");

Explicitly assigning the point value with string.Format works, however.

There are a number of ways to query location data. You can't get the location data directly, but you can retrieve the point string (which can then be parsed to get Latitude and Longitude) and you can return calculated values like distance. Here's an example of how to retrieve some geo data into a result set using EF's SqlQuery() method:

    [TestMethod]
    public void RawSqlEfQueryTest()
    {
        var sqlFormat = @"
        DECLARE @s geography
        SET @s = geography::STGeomFromText('POINT({0} {1})', 4326);

        SELECT Address,
               Location.ToString() as GeoString,
               @s.STDistance(Location) as Distance
        FROM GeoLocations
        ORDER BY Distance";

        var sql = string.Format(sqlFormat, -121.527200, 45.712113);

        var context = new GeoLocationContext();
        var locations = context.Database.SqlQuery<ResultData>(sql);

        Assert.IsTrue(locations.Count() > 0);

        foreach (var location in locations)
        {
            Console.WriteLine(location.Address + " " + location.GeoString + " " + location.Distance);
        }
    }

    public class ResultData
    {
        public string GeoString { get; set; }
        public double Distance { get; set; }
        public string Address { get; set; }
    }

Hopefully you don't have to resort to this approach as it's fairly limited. Using the new DbGeography/DbGeometry types makes this sort of thing so much easier.
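One possible way around the embedded-placeholder problem, offered as an untested sketch rather than something taken from the sample project: build the whole point text in .NET and pass that string as a single parameter, since STGeomFromText() takes the point text as an ordinary string argument. The GeoSqlSketch class and ToPointWkt helper are hypothetical names. Formatting with the invariant culture keeps the decimal separator a period regardless of the machine's regional settings. The ExecuteSqlCommand() call appears only as a comment because it needs the GeoLocationContext and a database from the sample project.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, not part of the sample project.
public static class GeoSqlSketch
{
    // Builds the WKT text for a point. WKT puts longitude first, then latitude.
    // InvariantCulture guarantees "." as the decimal separator.
    public static string ToPointWkt(double latitude, double longitude)
    {
        return string.Format(CultureInfo.InvariantCulture,
                             "POINT({0} {1})", longitude, latitude);
    }

    public static void Main()
    {
        // The point text is now an ordinary parameter (@p0), not part of the SQL string.
        const string sql =
            @"insert into GeoLocations (Location, Address)
              values (geography::STGeomFromText(@p0, 4326), @p1)";

        string wkt = ToPointWkt(45.712113, -121.527200);

        Console.WriteLine(sql);
        Console.WriteLine(wkt);

        // Assumed usage with the article's context class (not executed here):
        // context.Database.ExecuteSqlCommand(sql, wkt, "301 N. 15th Street");
    }
}
```

If this works in your environment, it avoids concatenating values into the SQL text at all; if it doesn't, the string.Format approach shown above remains the known-good fallback.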
When I had to use code like this before, I typically ended up retrieving only the primary keys and then running another query with just those keys to retrieve the actual underlying DbContext entities. This was very inefficient and tedious, but it did work.

Summary

For the current project I'm working on, we actually made the switch to .NET 4.5 purely for the spatial features in EF 5.0. This app relies heavily on spatial queries, and it was worth taking a chance with pre-release code to get this ease of integration, as opposed to falling back to stored procedures or raw SQL string queries to run spatial-specific queries. Using native Entity Framework code makes life a lot easier than the alternatives. It might be a late addition to Entity Framework, but it sure makes location calculations and storage easy. Where do you want to go today? ;-)

Resources

Download Sample Project

© Rick Strahl, West Wind Technologies, 2005-2012
Posted in ADO.NET, Sql Server, .NET

Read the article
Is Berkeley DB a NoSQL solution?

- by Gregory Burd

Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API; the calls across these APIs generally mimic the Berkeley DB C API, which makes perfect sense because Berkeley DB is written in C.

The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX, written by AT&T's Ken Thompson in 1979. DBM was a simple key/value, hashtable-based storage library. In the early 1990s, as BSD UNIX was transitioning from version 4.3 to 4.4 and replacing commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage.

At that time databases almost always lived on a single node; even the most sophisticated databases only had simple two-node fail-over solutions. If you had a lot of data to store, you would choose between one of the few commercial RDBMS solutions and writing your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM", mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage.

Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheap, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law; processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated in both the business and the personal markets.
In addition to the new hardware, entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes caused a massive explosion of data and a need to analyze and understand that data. Taken together, this resulted in an entirely different landscape for database storage; new solutions were needed.

A number of novel solutions stepped up, and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities. They were designed to address massive amounts of data and millions of requests per second, and to scale out across multiple systems.

The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record; it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple key/value look-ups or simple range queries, with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations.

The success of Dynamo, and its design, inspired the next generation of non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was "reliability at massive scale", so the first focal point was distributed database algorithms.
Underneath Dynamo there is a local transactional database: either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, chose Berkeley DB Java Edition for its node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only, log-based on-disk storage. This is a similar type of storage to Berkeley DB's, except that it is based on a hash table which must reside entirely in memory rather than a btree which can live in memory or on disk.

Berkeley DB evolved too: we added high availability (HA) and a replication manager that makes it easy to set up replica groups. Berkeley DB's replication doesn't partition the data; every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first (a master), then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget, allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a Paxos algorithm) to mastership if the master should fail.

This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group, and some allow writes as well as reads at any node; Berkeley DB HA does not.
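The single-master replication model described above can be illustrated with a toy in-memory sketch. None of the types below are Berkeley DB APIs; ToyNode, ToyReplicaGroup and FailOver are hypothetical names. Delivery to replicas is synchronous here, which corresponds to waiting until all nodes are consistent (fire and forget is not modeled), and the "election" simply promotes the most up-to-date replica as a stand-in for a real Paxos round.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: none of these types are Berkeley DB APIs.
public sealed class ToyNode
{
    private readonly Dictionary<string, string> _data = new Dictionary<string, string>();

    public ToyNode(string name) { Name = name; }

    public string Name { get; private set; }
    public long LastApplied { get; private set; }

    // Apply one log record. Every node keeps the whole data set (no partitioning).
    public void Apply(long sequence, string key, string value)
    {
        _data[key] = value;
        LastApplied = sequence;
    }

    // Reads can be served by any node; returns null when the key is missing.
    public string Read(string key)
    {
        string value;
        return _data.TryGetValue(key, out value) ? value : null;
    }
}

public sealed class ToyReplicaGroup
{
    private readonly List<ToyNode> _replicas;
    private long _sequence;

    public ToyReplicaGroup(ToyNode master, IEnumerable<ToyNode> replicas)
    {
        Master = master;
        _replicas = replicas.ToList();
    }

    public ToyNode Master { get; private set; }

    // Commit on the master first, then deliver the change to each replica as a log record.
    public void Write(string key, string value)
    {
        _sequence++;
        Master.Apply(_sequence, key, value);
        foreach (var replica in _replicas)
        {
            replica.Apply(_sequence, key, value);
        }
    }

    // Stand-in for an election (not Paxos): promote the most up-to-date replica.
    public void FailOver()
    {
        var next = _replicas.OrderByDescending(r => r.LastApplied).First();
        _replicas.Remove(next);
        Master = next;   // the failed master is simply dropped from the group
    }
}

public static class ToyReplicationDemo
{
    public static void Main()
    {
        var group = new ToyReplicaGroup(new ToyNode("a"),
                                        new[] { new ToyNode("b"), new ToyNode("c") });
        group.Write("user:1", "alice");
        group.FailOver();
        Console.WriteLine(group.Master.Name + " " + group.Master.Read("user:1"));
    }
}
```

Because every node holds a full copy, any surviving replica can take over without data movement; the price, as the text notes, is that writes funnel through one node and the data set is not sharded.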
So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases, when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM: the distributed, sometimes-consistent, partitioned, scale-out storage layers that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really.

Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to appreciate the sophisticated features found in Berkeley DB, and even to mimic them in some cases.

Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could extend our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing; we could jump into this NoSQL market within a single product development cycle.

Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

Read the article

