Search Results

Search found 12909 results on 517 pages for 'clustered index'.


  • How to handle very frequent updates to a Lucene index

    - by fsm
    I am trying to prototype an indexing/search application that uses very volatile data sources (forums, social networks, etc.). Here are the performance requirements: (1) Very fast turnaround: any new data (such as a new message on a forum) should be available in the search results very soon (in less than a minute). (2) I need to discard old documents on a fairly regular basis to ensure that the search results are not dated. (3) Last but not least, the search application needs to be responsive (latency on the order of 100 milliseconds, and it should support at least 10 qps). All of the requirements I currently have can be met without using Lucene (and that would let me satisfy all of 1, 2 and 3), but I am anticipating other requirements in the future (such as search relevance) which Lucene makes easier to implement. However, since Lucene is designed for use cases far more complex than the one I'm currently working on, I'm having a hard time satisfying my performance requirements. Here are some questions: a. I read that the optimize() method in the IndexWriter class is expensive and should not be used by applications that do frequent updates. What are the alternatives? b. In order to do incremental updates, I need to keep committing new data and also keep refreshing the index reader to make sure it has the new data available. These are going to affect 1 and 3 above. Should I try duplicate indices? What are some common approaches to solving this problem? c. I know that Lucene provides a delete method, which lets you delete all documents that match a certain query. In my case, I need to delete all documents older than a certain age. One option is to add a date field to every document and use that to delete documents later. Is it possible to do range queries on document ids (I can create my own id field, since I think the one created by Lucene keeps changing) to delete documents? Is it any faster than comparing dates represented as strings? I know these are very open questions, so I am not looking for a detailed answer; I will try to treat all of your answers as suggestions and use them to inform my design. Thanks! Please let me know if you need any other information.

    Read the article

  • Lucene - querying with long strings

    - by Mikos
    I have an index with a field "Affiliation". Some example values are: "Stanford University School of Medicine, Palo Alto, CA USA", "Institute of Neurobiology, School of Medicine, Stanford University, Palo Alto, CA", "School of Medicine, Harvard University, Boston MA", "Brigham & Women's, Harvard University School of Medicine, Boston, MA", "Harvard University, Cambridge MA" and so on (the bottom line being that the affiliations are written in multiple ways with no apparent consistency). When I query the index on the affiliation field using, say, "School of Medicine, Stanford University, Palo Alto, CA" (with QueryParser) to find all Stanford-related documents, I get a lot of false positives, presumably because of the presence of "School of Medicine" etc. (Note: I cannot use a phrase query because of the variability in the way the affiliation is constructed.) I have tried the following: (1) Using a SpanNearQuery built by splitting the search phrase on whitespace (here I get no results!). (2) Boosting (using ^) by splitting on the commas and boosting the last parts, such as "Palo Alto CA", with a much higher boost than the initial phrases. Here I still get lots of false positives. Any suggestions on how to approach this? If SpanNearQuery is the way to go, any ideas on why I get 0 results?

    Read the article

  • Boost.MultiIndex: How to make an effective set intersection?

    - by Arman
    Hello, assume that we have data1 and data2. How can I intersect them with std::set_intersection()? struct pID { int ID; unsigned int IDf; /* position in the file */ pID(int id, const unsigned int idf) : ID(id), IDf(idf) {} bool operator<(const pID& p) const { return ID < p.ID; } }; struct ID{}; struct IDf{}; typedef multi_index_container< pID, indexed_by< ordered_unique< tag<IDf>, BOOST_MULTI_INDEX_MEMBER(pID,unsigned int,IDf)>, ordered_non_unique< tag<ID>, BOOST_MULTI_INDEX_MEMBER(pID,int,ID)> > > pID_set; pID_set data1, data2; Load(data1); Load(data2); pID_set::index<ID>::type& L1_ID_index = data1.get<ID>(); pID_set::index<ID>::type& L2_ID_index = data2.get<ID>(); // How do I use set_intersection? Kind regards, Arman.

    Read the article

  • Broken flash movie player! allowFullScreen does not work with anything other than a wmode value of "

    - by lhnz
    I have a flash player on a page which plays videos. I also have modal popups which need to be able to display over the top of the flash player when they are opened, etc... I can't change either of these requirements since they are part of the spec I have been given. Flash seems to ignore z-indexes I set on it with css, and the modal popups will therefore only appear above the video player if I set the video player's wmode to opaque or transparent. However, if I do this then the full screen functionality stops working correctly: when I un-fullscreen the video it stays zoomed in. In short If you open a popup on an item page or another page containing flash the popup should be displayed above this. Flash ignores z-index values. You can stop flash ignoring z-index values by setting wmode to opaque or transparent rather than the default: window. This stops full screen from working correctly. Has anybody else faced this issue before? What can I do to fix it? I was thinking of recreating the video player with wmode=opaque whenever I opened a modal popup and then switching it back to wmode=window when the modal popup is closed, since this would mean that the popup should display above it (as wmode=opaque) and the fullscreen should work correct (as wmode=window). However, this is not ideal at all: as well as being a hack it would also mean that the video would stop playing if somebody clicked a button which opened a popup. Cheers!

    Read the article

  • DIVS over flash movies in Internet Explorer

    - by drew
    The age old question... why the hell doesn't a div positioned over a flash object stay on top with z-index. I have found the answer in the past, but it's been so long, I can't seem to get it. My flash movie is in a div floating left: <div id="flash"> <object width="614" height="289"> <param name="movie" value="images/75.swf"> <param name="wmode" value="transparent"> <embed src="images/75.swf" width="614" height="289" wmode"transparent"> </embed> </object> </div> My css for the div that needs to be on top is: .menu ul li:hover ul li a:hover { background:#5a3f2d; color:#FFF; z-index: 9999; I cannot get it to show above the flash movie in ie6 or ie8. I know this is old school but I'm frustrated! Does my nav div need to have an absolute position? Is that why it doesn't work? Example is here. Hover over the first link on the right: "CUSTOMER SERVICE" Thanks all :)

    Read the article

  • SQL Server - stored procedure suddenly became slow

    - by Barguast
    I have written a stored procedure that, yesterday, typically completed in under a second. Today, it takes about 18 seconds. I ran into the problem yesterday as well, and it seemed to be solved by DROPing and re-CREATEing the stored procedure. Today, that trick doesn't appear to be working. :( Interestingly, if I copy the body of the stored procedure and execute it as a straightforward query, it completes quickly. It seems to be the fact that it's a stored procedure that's slowing it down...! Does anyone know what the problem might be? I've searched for answers, but they often recommend running it through Query Analyser, which I don't have - I'm using SQL Server 2008 Express for now. The stored procedure is as follows: ALTER PROCEDURE [dbo].[spGetPOIs] @lat1 float, @lon1 float, @lat2 float, @lon2 float, @minLOD tinyint, @maxLOD tinyint, @exact bit AS BEGIN /* Create the query rectangle as a polygon */ DECLARE @bounds geography; SET @bounds = dbo.fnGetRectangleGeographyFromLatLons(@lat1, @lon1, @lat2, @lon2); /* Perform the selection */ IF (@exact = 0) BEGIN SELECT [ID], [Name], [Type], [Data], [MinLOD], [MaxLOD], [Location].[Lat] AS [Latitude], [Location].[Long] AS [Longitude], [SourceID] FROM [POIs] WHERE NOT ((@maxLOD < [MinLOD]) OR (@minLOD > [MaxLOD])) AND (@bounds.Filter([Location]) = 1) END ELSE BEGIN SELECT [ID], [Name], [Type], [Data], [MinLOD], [MaxLOD], [Location].[Lat] AS [Latitude], [Location].[Long] AS [Longitude], [SourceID] FROM [POIs] WHERE NOT ((@maxLOD < [MinLOD]) OR (@minLOD > [MaxLOD])) AND (@bounds.STIntersects([Location]) = 1) END END The 'POIs' table has an index on MinLOD, MaxLOD, and a spatial index on Location.

    Read the article

  • Beginner Question: For extract a large subset of a table from MySQL, how does Indexing, order of tab

    - by chongman
    Sorry if this is too simple, but thanks in advance for helping. This is for MySQL but might be relevant for other RDBMSs. tblA has 5 columns: colA, colB, colC, mydata, A_id. It has about 10^9 records, with 10^3 distinct values for colA, colB, colC. tblB has 3 columns: colA, colB, B_id. It has about 10^4 records. I want all the records from tblA (except the A_id) that have a match in tblB. In other words, I want to use tblB to describe the subset that I want to extract and then extract those records from tblA. Namely: SELECT a.colA, a.colB, a.colC, a.mydata FROM tblA AS a INNER JOIN tblB AS b ON a.colA=b.colA AND a.colB=b.colB; It's taking a really long time (more than an hour) on a newish computer (4GB, Core2Quad, Ubuntu), and I just want to check my understanding of the following optimization steps. ** Suppose this is the only query I will ever run on these tables, so ignore the need to run other queries. Now my questions: 1) What indexes should I create to optimize this query? I think I just need a multiple-column index on (colA, colB) for both tables. I don't think I need separate indexes for colA and colB. Another Stack Overflow article (that I can't find) mentioned that adding a new index is slower when there are existing indexes, so that might be a reason to use the multiple-column index. 2) Is INNER JOIN correct? I just want results where a match is found. 3) Is it faster if I join tblA to tblB, or the other way around (tblB to tblA)? This previous answer says that the optimizer should take care of that. 4) Does the order of the conditions after ON matter? This previous answer says that the optimizer also takes care of the execution order.
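    A minimal sketch of the composite-index idea from question 1, using Python's built-in sqlite3 as a stand-in for MySQL (the two optimizers differ, so treat the printed plan as illustrative only). The table and column names follow the question; the index name idx_tblA_colA_colB and the tiny synthetic data set are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names follow the question,
# the index name is a hypothetical choice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (A_id INTEGER PRIMARY KEY, colA INT, colB INT, colC INT, mydata TEXT);
    CREATE TABLE tblB (B_id INTEGER PRIMARY KEY, colA INT, colB INT);
    -- One composite index that covers both join columns of the big table.
    CREATE INDEX idx_tblA_colA_colB ON tblA (colA, colB);
""")

# Tiny synthetic data set: 10 x 10 x 3 rows in tblA, two (colA, colB) pairs in tblB.
conn.executemany(
    "INSERT INTO tblA (colA, colB, colC, mydata) VALUES (?, ?, ?, ?)",
    [(a, b, c, f"row-{a}-{b}-{c}") for a in range(10) for b in range(10) for c in range(3)],
)
conn.executemany("INSERT INTO tblB (colA, colB) VALUES (?, ?)", [(1, 2), (7, 7)])

JOIN_SQL = """
    SELECT a.colA, a.colB, a.colC, a.mydata
    FROM tblA AS a
    INNER JOIN tblB AS b ON a.colA = b.colA AND a.colB = b.colB
"""

rows = conn.execute(JOIN_SQL).fetchall()
# Human-readable plan text; the exact wording depends on the SQLite version.
plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + JOIN_SQL)]

print(len(rows))
for step in plan:
    print(step)
```

    The point of the sketch is only the shape of the index: both join columns in a single index on the large table, so each row of the small table can be matched with one index lookup. In MySQL, running EXPLAIN on the real query is the way to check whether the equivalent index is actually chosen.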

    Read the article

  • Indexing table with duplicates MySQL/MSSQL with millions of records

    - by Tesnep
    I need help with indexing in MySQL. I have a table in MySQL with the following columns: ID Store_ID Feature_ID Order_ID Viewed_Date Deal_ID IsTrial. The ID is auto-generated. Store_ID goes from 1 - 8. Feature_ID goes from 1 to, let's say, 100. Viewed_Date is the date and time at which the row was inserted. IsTrial is either 0 or 1. You can ignore Order_ID and Deal_ID in this discussion. There are millions of rows in the table, and we have a reporting backend that needs to show the number of views in a certain period, or overall, where trial is 0, for a particular store id and a particular feature. The query takes the form of: select count(viewed_date) from theTable where viewed_date between '2009-12-01' and '2010-12-31' and store_id = '2' and feature_id = '12' and Istrial = 0 In MSSQL you can have a filtered index to use for Istrial. Is there anything similar to this in MySQL? Also, Store_ID and Feature_ID have a lot of duplicate data. I created an index using Store_ID and Feature_ID. Although this seems to have decreased the search time, I need a bigger improvement than this. Right now I have more than 4 million rows. To answer a query like the one above, it looks at 3.5 million rows in order to give me the count of 500k rows. PS. I forgot to add the viewed_date filter to the query. Now I have done this.
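    One common approach when a filtered index is not available is a composite index that lists the equality columns first and the range column last, so the whole WHERE clause can be answered from one index range. The sketch below illustrates this with Python's sqlite3 standing in for MySQL; the index name idx_views_report and the sample rows are assumptions, and whether this ordering helps on the real table should be confirmed with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theTable (
        ID INTEGER PRIMARY KEY,
        Store_ID INT, Feature_ID INT, Viewed_Date TEXT, IsTrial INT
    );
    -- Equality columns first, the range column (Viewed_Date) last.
    CREATE INDEX idx_views_report
        ON theTable (Store_ID, Feature_ID, IsTrial, Viewed_Date);
""")

sample_rows = [
    (2, 12, "2009-12-15 10:00:00", 0),  # matches
    (2, 12, "2010-06-01 09:30:00", 0),  # matches
    (2, 12, "2010-06-01 09:30:00", 1),  # trial view, excluded
    (2, 12, "2011-01-05 08:00:00", 0),  # outside the date range
    (3, 12, "2010-06-01 09:30:00", 0),  # different store
    (2, 13, "2010-06-01 09:30:00", 0),  # different feature
]
conn.executemany(
    "INSERT INTO theTable (Store_ID, Feature_ID, Viewed_Date, IsTrial) VALUES (?, ?, ?, ?)",
    sample_rows,
)

COUNT_SQL = """
    SELECT COUNT(Viewed_Date) FROM theTable
    WHERE Store_ID = ? AND Feature_ID = ? AND IsTrial = 0
      AND Viewed_Date BETWEEN ? AND ?
"""
view_count = conn.execute(COUNT_SQL, (2, 12, "2009-12-01", "2010-12-31")).fetchone()[0]
print(view_count)
```

    Because every column the query touches is in the index, an engine can in principle count matching entries without visiting the table rows at all, which matters when 500k rows qualify.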

    Read the article

  • 2-column table with two foreign keys. Performance/design question.

    - by Emanuel
    Hello everyone! I recently ran into a quite complex problem, and after looking around a lot I couldn't find a solution to it. I've found answers to my questions many times before on stackoverflow.com, so I decided to post here. I'm making a user/group management system for a web-based project, and I'm storing all related data in a PostgreSQL database. This system relies on three tables: USERS, GROUPS and GROUP_USERS. The first two tables simply define all the users and all the groups on the site, and the last table, GROUP_USERS, stores the groups every user is part of. It only has two columns: USER_ID and GROUP_ID. Since every user can be a member of several groups, I decided to make a separate table for this purpose, rather than storing a comma-separated column in the USERS table. Now, both columns are foreign keys, and I want to make them both primary keys as well, since each combination of USER_ID and GROUP_ID has to be unique, and if I give them the constraint UNIQUE, pgAdmin tells me that each table should have at least one primary key. But now I am stuck with what seems to be a lot of indexes and relations on a very small table containing only numbers. In the end, I want this table to be as fast as possible, even if it contains tens of thousands of rows. Size on disk shouldn't be a problem since it's all just numbers anyway, but it feels quite stupid to have a full-sized index referring to a smaller table. Should I stick with my current solution, store comma-separated values in a column in the USERS table, or is there any other solution I should be aware of? PS. I don't want to use an array column, even though they are supported by PostgreSQL. I want to be as generic as possible so I can switch databases later on, if necessary. EDIT: In other words, will using a compound primary key and two foreign keys in one table with only two columns have a negative impact on performance, rather than the opposite, due to the size of the generated index? Thank you!
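    A small sketch of the junction-table design described above, assuming a compound primary key on (user_id, group_id) plus one optional index for lookups in the other direction. It uses Python's sqlite3 rather than PostgreSQL so that it runs anywhere; the table name site_groups and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE users       (user_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE site_groups (group_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE group_users (
        user_id  INTEGER NOT NULL REFERENCES users (user_id),
        group_id INTEGER NOT NULL REFERENCES site_groups (group_id),
        PRIMARY KEY (user_id, group_id)   -- one index enforces uniqueness
    );
    -- Optional second index for "who is in group X?" lookups.
    CREATE INDEX idx_group_users_group ON group_users (group_id);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO site_groups VALUES (?, ?)", [(10, "admins"), (20, "editors")])
conn.executemany("INSERT INTO group_users VALUES (?, ?)", [(1, 10), (1, 20), (2, 20)])


def add_membership(user_id, group_id):
    """Return True if the row was inserted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO group_users VALUES (?, ?)", (user_id, group_id))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = add_membership(1, 10)      # already a member
unknown_group_ok = add_membership(2, 99)  # no such group
groups_of_alice = [g for (g,) in conn.execute(
    "SELECT group_id FROM group_users WHERE user_id = ? ORDER BY group_id", (1,))]
print(duplicate_ok, unknown_group_ok, groups_of_alice)
```

    The compound key does double duty: it rejects duplicate memberships and serves lookups by user, which is something a comma-separated column cannot offer. The design also stays portable, since it relies only on standard SQL constraints.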

    Read the article

  • Indexing table with duplicates MySQL/SQL Server with millions of records

    - by Tesnep
    I need help with indexing in MySQL. I have a table in MySQL with the following columns: ID Store_ID Feature_ID Order_ID Viewed_Date Deal_ID IsTrial. The ID is auto-generated. Store_ID goes from 1 - 8. Feature_ID goes from 1 to, let's say, 100. Viewed_Date is the date and time at which the row was inserted. IsTrial is either 0 or 1. You can ignore Order_ID and Deal_ID in this discussion. There are millions of rows in the table, and we have a reporting backend that needs to show the number of views in a certain period, or overall, where trial is 0, for a particular store id and a particular feature. The query takes the form of: select count(viewed_date) from theTable where viewed_date between '2009-12-01' and '2010-12-31' and store_id = '2' and feature_id = '12' and Istrial = 0 In SQL Server you can have a filtered index to use for Istrial. Is there anything similar to this in MySQL? Also, Store_ID and Feature_ID have a lot of duplicate data. I created an index using Store_ID and Feature_ID. Although this seems to have decreased the search time, I need a bigger improvement than this. Right now I have more than 4 million rows. To answer a query like the one above, it looks at 3.5 million rows in order to give me the count of 500k rows. PS. I forgot to add the viewed_date filter to the query. Now I have done this.
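    For readers unfamiliar with the filtered-index concept mentioned here, the sketch below shows the same idea in SQLite, where it is called a partial index: only rows satisfying the index's WHERE clause get an entry. This is purely an illustration of the concept run through Python's sqlite3; it is not MySQL syntax, and the index name idx_non_trial_views and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theTable (
        ID INTEGER PRIMARY KEY,
        Store_ID INT, Feature_ID INT, Viewed_Date TEXT, IsTrial INT
    );
    -- Partial index: only rows with IsTrial = 0 get an index entry.
    CREATE INDEX idx_non_trial_views
        ON theTable (Store_ID, Feature_ID, Viewed_Date)
        WHERE IsTrial = 0;
""")
conn.executemany(
    "INSERT INTO theTable (Store_ID, Feature_ID, Viewed_Date, IsTrial) VALUES (?, ?, ?, ?)",
    [(2, 12, "2010-03-01", 0), (2, 12, "2010-03-02", 1), (2, 12, "2010-03-03", 0)],
)

# The query repeats the index predicate (IsTrial = 0) so the index is eligible.
non_trial_views = conn.execute(
    "SELECT COUNT(*) FROM theTable "
    "WHERE Store_ID = 2 AND Feature_ID = 12 AND IsTrial = 0 "
    "AND Viewed_Date BETWEEN '2009-12-01' AND '2010-12-31'"
).fetchone()[0]

# The stored definition shows the filter is part of the index itself.
index_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'idx_non_trial_views'"
).fetchone()[0]
print(non_trial_views)
print(index_sql)
```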

    Read the article

  • Rails Nested Attributes Doesn't Insert ID Correctly

    - by MunkiPhD
    I'm attempting to edit a model's nested attributes, much as outlined here, replicated below: <%= form_for @person do |person_form| %> <%= person_form.text_field :name %> <% for address in @person.addresses %> <%= person_form.fields_for address, :index => address do |address_form|%> <%= address_form.text_field :city %> <% end %> <% end %> <% end %> In my code, I have the following: <%= form_for(@meal) do |f| %> <!-- some other stuff that's irrelevant... --> <% for subitem in @meal.meal_line_items %> <%= f.fields_for subitem, :index => subitem do |line_item_form| %> <%= line_item_form.label :servings %><br/> <%= line_item_form.text_field :servings %><br/> <%= line_item_form.label :food_id %><br/> <%= line_item_form.text_field :food_id %><br/> <% end %> <% end %> <%= f.submit %> <% end %> This works great, except that when I look at the HTML, it's creating inputs that look like the following, failing to insert the correct id and instead placing the memory representation(?) of the model: <input type="text" value="2" size="30" name="meal[meal_line_item][#<MealLineItem:0x00000005c5d618>][servings]" id="meal_meal_line_item_#<MealLineItem:0x00000005c5d618>_servings">

    Read the article

  • Windows 7 search doesn’t find text strings

    - by Hugh Tash
    Using Windows 7 search, I’m not able to find any text string that does not start at the beginning of a word, either in file names or in file content. Let’s say I’m searching for documents containing the word “content”. I’m able to find those documents when searching for “content”, “conte” or “con” (as long as the string includes the beginning of the word). But if I search for “ontent”, “tent” or any other combination that doesn’t include the beginning of the word, Windows search won't find it. I've tried other indexing/searching software such as Copernic Desktop Search and Google Desktop Search. Those programs also weren’t able to find part of a word starting from the middle of the word: for instance, they find “conte” but don’t find “onte”. On the other hand, when I use non-indexing content search software such as Agent Ransack or FileSeek, I get the same results when searching for “conte” or “onte”. Why do all pre-indexing content search applications (Windows search, Google Desktop, Copernic Desktop Search) fail to search for a string inside words? Why do non-indexing applications find text strings wherever they are: at the beginning, middle or end of the word? I’ve tried wildcards and other constructions with no luck: *onte, onte, “onte”, content:onte, content:~onte. None of these searches finds the word “content”. How can I make Windows search find strings from any part of words? Could you try these searches and see if they work for you? Or is this normal behavior? Thank you. Update: Using wildcards before or after "onte" doesn't find any results. content:~=onte doesn't find any results either.
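    The behaviour described here is what one would expect from a word-based index. The toy model below is an assumption about how such indexers work in general, not a description of Windows Search internals: an index of sorted whole words can answer prefix queries with a binary search, but a fragment from the middle of a word gives it nowhere to start, whereas a non-indexing tool simply scans all the text.

```python
from bisect import bisect_left

# Hypothetical file contents used only for this illustration.
documents = {
    "notes.txt": "table of content for the report",
    "todo.txt": "contact the vendor about the contract",
}

# Build a toy inverted index: word -> files, plus a sorted list of distinct words.
postings = {}
for filename, text in documents.items():
    for word in text.lower().split():
        postings.setdefault(word, set()).add(filename)
terms = sorted(postings)


def prefix_lookup(prefix):
    """Indexed search: binary-search the sorted terms, then walk forward."""
    matches = []
    for term in terms[bisect_left(terms, prefix):]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


def substring_scan(fragment):
    """Non-indexed search: look at every document's full text."""
    return sorted(name for name, text in documents.items() if fragment in text.lower())


print(prefix_lookup("conte"))   # the index has a term starting with "conte"
print(prefix_lookup("onte"))    # no indexed term starts with "onte"
print(substring_scan("onte"))   # a full scan still finds it inside "content"
```

    Supporting mid-word matches in an index is possible (for example by indexing every suffix or n-gram of each word), but it makes the index much larger, which is presumably why desktop indexers do not do it by default.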

    Read the article

  • How is SU indexed so fast on Google?

    - by ekaj
    I just did a quick Google for a question that was 20 minutes old, to look for an answer, and it was already on Google Search - how is this possible? I glanced over this article which seems to suggest that SU has added RSS feeds (which SU has, but when I opened the feed the article says last posted 6 minutes ago, but when Googled it is 11 hours old) - which leads me to think (Based on that article, I don't know much about search indexing but I am reading at the moment) that most of this indexing is done thanks to the sitemap - is there anything else I am unaware of that helps SU questions get on Google so fast?

    Read the article

  • Nginx Rewrite to Previous Directory

    - by ThinkBohemian
    I am trying to move my blog from blog.example.com to example.com/blog to do this I would rather not move anything on disk, so instead i changed my nginx configuration file to the following: location /blog { if (!-e $request_filename) { rewrite ^.*$ /index.php last; } root /home/demo/public_html/blog.example.com/current/public/; index index.php index.html index.html; passenger_enabled off; index index.html index.htm index.php; try_files $uri $uri/ @blog; } This works great but when i visit example.com/blog nginx looks for: /home/demo/public_html/blog.example.com/current/public/blog/index.php instead of /home/demo/public_html/blog.example.com/current/public/index.php Is there a way to put in a rewrite rule so that I can have the server automatically take out the /blog/ directory? something like ? location /blog { rewrite \\blog\D \; }

    Read the article

  • How to search for a string everywhere (C: and D:) using Findstr?

    - by amiregelz
    I have a text (.txt) file located somewhere on my PC that contains a bunch of data, including the following string: Secret Username: ********* Secret Password: ********* How can I find this file from command-line, using Findstr? I don't know if it's on C: drive or D: drive. I tried various Findstr queries, such as: findstr /s /m /n /i Secret Username C: findstr /s /m /n /i Secret Username D: findstr /s /m /n /i /c:"Secret Username" findstr /s /m /n /r /i .*Secret Username.* but couldn't find the file.

    Read the article

  • Indexing text file content with command line query

    - by Drew Carlton
    I take daily notes in a plaintext file labeled with date in the YYYYMMDD format. These files are no more than 100 lines long, and are written in a blog style format. I'd like to be able search these files as if they were blog posts indexed by google, with some phrase query returning the most relevant/recent date filenames, with a snippet containing the relevant part. Ideally it would be something like this: #searchindex "laptop no sound" returns: 20100909.txt: ... laptop sound isn't working... 20100101.txt ... sound is too loud... debating what laptop to buy... and so on and so forth. I'm working on a linux platform (Debian with GNOME). I've looked at beagle and tracker, but they just seem complete overkill for what I want.
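    For files this small, a full index may not be needed at all. The following is a rough sketch, not a polished tool: it ranks notes by how many distinct query words they contain, breaks ties by the newest YYYYMMDD filename, and returns the best-matching line as a snippet. The function names load_notes and search_notes and the sample notes are invented for the example.

```python
import re
from pathlib import Path

WORD = re.compile(r"[a-z0-9']+")


def load_notes(directory):
    """Read every YYYYMMDD.txt file in a directory into {filename: text}."""
    return {p.name: p.read_text(encoding="utf-8")
            for p in Path(directory).glob("*.txt") if re.fullmatch(r"\d{8}", p.stem)}


def search_notes(notes, query, limit=10):
    """Rank notes by number of distinct query words found, then by newest date."""
    wanted = set(WORD.findall(query.lower()))
    hits = []
    for filename, text in notes.items():
        lines = text.splitlines()
        line_words = [set(WORD.findall(line.lower())) for line in lines]
        found = wanted & set().union(*line_words)
        if not found:
            continue
        # Snippet: the first line that contains the most query words.
        best = max(range(len(lines)), key=lambda i: len(wanted & line_words[i]))
        hits.append((len(found), filename, lines[best].strip()))
    # Most matched words first; ties go to the newest file (YYYYMMDD sorts as text).
    hits.sort(key=lambda h: (h[0], h[1]), reverse=True)
    return [(name, snippet) for _, name, snippet in hits[:limit]]


sample_notes = {
    "20100909.txt": "Morning standup.\nThe laptop sound isn't working after the update.",
    "20100101.txt": "The sound is too loud in the office.\nStill debating what laptop to buy.",
    "20100505.txt": "Lunch with the team.",
}
results = search_notes(sample_notes, "laptop no sound")
for name, snippet in results:
    print(f"{name}: ... {snippet} ...")
```

    In real use, `search_notes(load_notes("/path/to/notes"), "laptop no sound")` would read the files from disk on every query, which should be fast enough for a few thousand files of under 100 lines each.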

    Read the article

  • Search /usr/local/texlive directory with Spotlight

    - by Teake Nutma
    Having TeXLive installed on my Mac, I frequently need to consult documentation for some of the packages. It seems silly to Google this when I have the PDFs all on my HDD in /usr/local/texlive/2011/texmf-dist/doc , so I want to be able to use Spotlight to search for them. However, I can't get Spotlight to cooperate. I tried mdimport /usr/local/texlive/2011/texmf-dist/doc which then does some work, but afterwards doesn't display any results in Spotlight. I've also added the folder in Alfred's search scope to no avail. Any ideas?

    Read the article

  • What are the popular file indexing engines on Linux?

    - by netvope
    It would be nice if you can share your experience on the pros and cons of each of them. Personally I only know Google Desktop and Beagle, and I haven't really used them. I mainly store my files on Windows (and use its integrated indexed search) but I'm trying to see if I can switch over to Linux. Also, can any one of the search indexer run without X? Does any of them provide an API for search queries?

    Read the article

  • Windows 7 search does not return results from indexed folders

    - by Dilbert
    I am experiencing this issue over and over again and I just cannot seem to find the answer. It doesn't make sense, but search simply does not return results from folders that certainly have these files inside. It's weird that this technology exists for more than 5 years now (it could be added to Windows XP as an addon), and they still haven't got it right. My folder contains 10 image files with .png extensions. Two scenarios: Scenario 1: I exclude the folder using Indexing options. Search works. Scenario 2: I turn on indexing for this folder. Search does not work. Of course, Agent Ransack returns results every time. When I check Advanced options for the Indexing options inside control panel, .png files are checked in the File Types tab, using the "File Properties filter". What's the deal with this? [Edit] To clarify, this doesn't happen with all folders, but does with more than one. For the "problematic" folders, even *.* doesn't return a single result. I found some advice to clear the archive and readonly attributes for all files (doesn't make sense, but hey), but it didn't work. Indexing status in Control panel is: Indexing complete. 100,000 items indexed. Folder is included in the list. File types list contains the .png extension (although it doesn't work with any filter, not even *.*).

    Read the article

  • How to rewrite index.php (and other valid default files) to the document root using mod_rewrite?

    - by TMG
    Hello, I would like to redirect index.php, as well as any other valid default file (e.g. index.html, index.asp, etc.) to the document root (which contains index.php) with something like this: RewriteRule ^index\.(php|htm|html|asp|cfm|shtml|shtm)/?$ / [NC,L] However, this is of course giving me an infinite redirect loop. What's the right way to do this? If possible, I'd like to have this work in both the development and production environment, so I don't want to specify an explicit url like http://www.mysite.com/ as the target. Thanks!

    Read the article

  • How to Prevent Spotlight from Indexing Non-Apps in /Applications Directory?

    - by Ross Charette
    The only thing I need indexed in the /Applications folder are the .app files. Is there any way to setup a filter to have mds or Spotlight ignore everything in /Applications except .apps? Otherwise, would it be possible to setup a rule for Alfred to omit any non-.app records from /Applications? I still want documents indexed and returned, just not from that specific directory. OS X 10.6.8 if you're wondering.

    Read the article

  • Hello Operator, My Switch Is Bored

    - by Paul White
    This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley. The topic this month is Plan Operators. I haven’t taken part in T-SQL Tuesday before, but I do like to write about execution plans, so this seemed like a good time to start. This post is in two parts. The first part is primarily an excuse to use a pretty bad play on words in the title of this blog post (if you’re too young to know what a telephone operator or a switchboard is, I hate you). The second part of the post looks at an invisible query plan operator (so to speak). 1. My Switch Is Bored Allow me to present the rare and interesting execution plan operator, Switch: Books Online has this to say about Switch: Following that description, I had a go at producing a Fast Forward Cursor plan that used the TOP operator, but had no luck. That may be due to my lack of skill with cursors, I’m not too sure. The only application of Switch in SQL Server 2012 that I am familiar with requires a local partitioned view: CREATE TABLE dbo.T1 (c1 int NOT NULL CHECK (c1 BETWEEN 00 AND 24)); CREATE TABLE dbo.T2 (c1 int NOT NULL CHECK (c1 BETWEEN 25 AND 49)); CREATE TABLE dbo.T3 (c1 int NOT NULL CHECK (c1 BETWEEN 50 AND 74)); CREATE TABLE dbo.T4 (c1 int NOT NULL CHECK (c1 BETWEEN 75 AND 99)); GO CREATE VIEW V1 AS SELECT c1 FROM dbo.T1 UNION ALL SELECT c1 FROM dbo.T2 UNION ALL SELECT c1 FROM dbo.T3 UNION ALL SELECT c1 FROM dbo.T4; Not only that, but it needs an updatable local partitioned view. We’ll need some primary keys to meet that requirement: ALTER TABLE dbo.T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (c1);   ALTER TABLE dbo.T2 ADD CONSTRAINT PK_T2 PRIMARY KEY (c1);   ALTER TABLE dbo.T3 ADD CONSTRAINT PK_T3 PRIMARY KEY (c1);   ALTER TABLE dbo.T4 ADD CONSTRAINT PK_T4 PRIMARY KEY (c1); We also need an INSERT statement that references the view. 
Even more specifically, to see a Switch operator, we need to perform a single-row insert (multi-row inserts use a different plan shape): INSERT dbo.V1 (c1) VALUES (1); And now…the execution plan: The Constant Scan manufactures a single row with no columns. The Compute Scalar works out which partition of the view the new value should go in. The Assert checks that the computed partition number is not null (if it is, an error is returned). The Nested Loops Join executes exactly once, with the partition id as an outer reference (correlated parameter). The Switch operator checks the value of the parameter and executes the corresponding input only. If the partition id is 0, the uppermost Clustered Index Insert is executed, adding a row to table T1. If the partition id is 1, the next lower Clustered Index Insert is executed, adding a row to table T2…and so on. In case you were wondering, here’s a query and execution plan for a multi-row insert to the view: INSERT dbo.V1 (c1) VALUES (1), (2); Yuck! An Eager Table Spool and four Filters! I prefer the Switch plan. My guess is that almost all the old strategies that used a Switch operator have been replaced over time, using things like a regular Concatenation Union All combined with Start-Up Filters on its inputs. Other new (relative to the Switch operator) features like table partitioning have specific execution plan support that doesn’t need the Switch operator either. This feels like a bit of a shame, but perhaps it is just nostalgia on my part, it’s hard to know. Please do let me know if you encounter a query that can still use the Switch operator in 2012 – it must be very bored if this is the only possible modern usage! 2. Invisible Plan Operators The second part of this post uses an example based on a question Dave Ballantyne asked using the SQL Sentry Plan Explorer plan upload facility. 
If you haven’t tried that yet, make sure you’re on the latest version of the (free) Plan Explorer software, and then click the Post to SQLPerformance.com button. That will create a site question with the query plan attached (which can be anonymized if the plan contains sensitive information). Aaron Bertrand and I keep a close eye on questions there, so if you have ever wanted to ask a query plan question of either of us, that’s a good way to do it. The problem The issue I want to talk about revolves around a query issued against a calendar table. The script below creates a simplified version and adds 100 years of per-day information to it: USE tempdb; GO CREATE TABLE dbo.Calendar ( dt date NOT NULL, isWeekday bit NOT NULL, theYear smallint NOT NULL,   CONSTRAINT PK__dbo_Calendar_dt PRIMARY KEY CLUSTERED (dt) ); GO -- Monday is the first day of the week for me SET DATEFIRST 1;   -- Add 100 years of data INSERT dbo.Calendar WITH (TABLOCKX) (dt, isWeekday, theYear) SELECT CA.dt, isWeekday = CASE WHEN DATEPART(WEEKDAY, CA.dt) IN (6, 7) THEN 0 ELSE 1 END, theYear = YEAR(CA.dt) FROM Sandpit.dbo.Numbers AS N CROSS APPLY ( VALUES (DATEADD(DAY, N.n - 1, CONVERT(date, '01 Jan 2000', 113))) ) AS CA (dt) WHERE N.n BETWEEN 1 AND 36525; The following query counts the number of weekend days in 2013: SELECT Days = COUNT_BIG(*) FROM dbo.Calendar AS C WHERE theYear = 2013 AND isWeekday = 0; It returns the correct result (104) using the following execution plan: The query optimizer has managed to estimate the number of rows returned from the table exactly, based purely on the default statistics created separately on the two columns referenced in the query’s WHERE clause. (Well, almost exactly, the unrounded estimate is 104.289 rows.) There is already an invisible operator in this query plan – a Filter operator used to apply the WHERE clause predicates. 
We can see it by re-running the query with the enormously useful (but undocumented) trace flag 9130 enabled: Now we can see the full picture. The whole table is scanned, returning all 36,525 rows, before the Filter narrows that down to just the 104 we want. Without the trace flag, the Filter is incorporated in the Clustered Index Scan as a residual predicate. It is a little bit more efficient than using a separate operator, but residual predicates are still something you will want to avoid where possible. The estimates are still spot on though: Anyway, looking to improve the performance of this query, Dave added the following filtered index to the Calendar table: CREATE NONCLUSTERED INDEX Weekends ON dbo.Calendar(theYear) WHERE isWeekday = 0; The original query now produces a much more efficient plan: Unfortunately, the estimated number of rows produced by the seek is now wrong (365 instead of 104): What’s going on? The estimate was spot on before we added the index! Explanation You might want to grab a coffee for this bit. Using another trace flag or two (8606 and 8612) we can see that the cardinality estimates were exactly right initially: The highlighted information shows the initial cardinality estimates for the base table (36,525 rows), the result of applying the two relational selects in our WHERE clause (104 rows), and after performing the COUNT_BIG(*) group by aggregate (1 row). All of these are correct, but that was before cost-based optimization got involved :) Cost-based optimization When cost-based optimization starts up, the logical tree above is copied into a structure (the ‘memo’) that has one group per logical operation (roughly speaking). The logical read of the base table (LogOp_Get) ends up in group 7; the two predicates (LogOp_Select) end up in group 8 (with the details of the selections in subgroups 0-6). 
These two groups still have the correct cardinalities, as the trace flag 8608 output (initial memo contents) shows:
During cost-based optimization, a rule called SelToIdxStrategy runs on group 8. Its job is to match logical selections to indexable expressions (SARGs). It successfully matches the selections (theYear = 2013, isWeekday = 0) to the filtered index, and writes a new alternative into the memo structure. The new alternative is entered into group 8 as option 1 (option 0 was the original LogOp_Select):
The new alternative is to do nothing (PhyOp_NOP = no operation), but to instead follow the new logical instructions listed below the NOP. The LogOp_GetIdx (full read of an index) goes into group 21, and the LogOp_SelectIdx (selection on an index) is placed in group 22, operating on the result of group 21. The definition of the comparison ‘theYear = 2013’ (ScaOp_Comp downwards) was already present in the memo starting at group 2, so no new memo groups are created for that.
New Cardinality Estimates
The new memo groups require two new cardinality estimates to be derived. First, LogOp_GetIdx (full read of the index) gets a predicted cardinality of 10,436. This number comes from the filtered index statistics:
    DBCC SHOW_STATISTICS (Calendar, Weekends) WITH STAT_HEADER;
The second new cardinality derivation is for the LogOp_SelectIdx applying the predicate (theYear = 2013). To get a number for this, the cardinality estimator uses statistics for the column ‘theYear’, producing an estimate of 365 rows (there are 365 days in 2013!):
    DBCC SHOW_STATISTICS (Calendar, theYear) WITH HISTOGRAM;
This is where the mistake happens.
Cardinality estimation should have used the filtered index statistics here, to get an estimate of 104 rows:
    DBCC SHOW_STATISTICS (Calendar, Weekends) WITH HISTOGRAM;
Unfortunately, the logic has lost sight of the link between the read of the filtered index (LogOp_GetIdx) in group 21, and the selection on that index (LogOp_SelectIdx) that it is deriving a cardinality estimate for, in group 22. The correct cardinality estimate (104 rows) is still present in the memo, attached to group 8, but that group now has a PhyOp_NOP implementation.
Skipping over the rest of cost-based optimization (in a belated attempt at brevity) we can see the optimizer’s final output using trace flag 8607:
This output shows the (incorrect, but understandable) 365 row estimate for the index range operation, and the correct 104 estimate still attached to its PhyOp_NOP. This tree still has to go through a few post-optimizer rewrites and ‘copy out’ from the memo structure into a tree suitable for the execution engine. One step in this process removes PhyOp_NOP, discarding its 104-row cardinality estimate as it does so.
To finish this section on a more positive note, consider what happens if we add an OVER clause to the query aggregate. This isn’t intended to be a ‘fix’ of any sort, I just want to show you that the 104 estimate can survive and be used if later cardinality estimation needs it:
    SELECT Days = COUNT_BIG(*) OVER ()
    FROM dbo.Calendar AS C
    WHERE theYear = 2013 AND isWeekday = 0;
The estimated execution plan is:
Note the 365 estimate at the Index Seek, but the 104 lives again at the Segment! We can imagine the lost predicate ‘isWeekday = 0’ as sitting between the seek and the segment in an invisible Filter operator that drops the estimate from 365 to 104.
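The row counts quoted in this walkthrough (36,525 / 10,436 / 365 / 104) can be cross-checked outside SQL Server. The sketch below is a minimal Python illustration, not part of the original analysis: it rebuilds the same 100-year calendar in memory and counts rows directly. The last step assumes that the 104.289 figure comes from multiplying the two single-column selectivities as if they were independent; the article only says the estimate was derived from separate per-column statistics, so treat that formula as an assumption that happens to reproduce the number.

```python
from datetime import date, timedelta

# Rebuild the calendar from the script: 36,525 consecutive days starting 1 Jan 2000.
start = date(2000, 1, 1)
calendar = [start + timedelta(days=n) for n in range(36525)]

def is_weekend(d):
    # With Monday as day 1 (what SET DATEFIRST 1 arranges), Saturday and Sunday are 6 and 7.
    return d.isoweekday() in (6, 7)

total_rows = len(calendar)                                   # base table cardinality
weekend_rows = sum(1 for d in calendar if is_weekend(d))     # rows in the filtered index
rows_2013 = sum(1 for d in calendar if d.year == 2013)       # rows matching theYear = 2013
weekend_rows_2013 = sum(
    1 for d in calendar if d.year == 2013 and is_weekend(d)  # rows matching both predicates
)

# Assumption (not stated in the article): combine the two per-column
# selectivities as if the predicates were independent.
independence_estimate = (
    total_rows * (rows_2013 / total_rows) * (weekend_rows / total_rows)
)

print(total_rows, weekend_rows, rows_2013, weekend_rows_2013)
print(round(independence_estimate, 3))
```

Under that assumption the script prints `36525 10436 365 104` followed by `104.289`, so the 365-row figure is what is left when only the theYear selectivity is applied and the isWeekday = 0 selectivity is dropped.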
Even though the NOP group is removed after optimization (so we don’t see it in the execution plan) bear in mind that all cost-based choices were made with the 104-row memo group present, so although things look a bit odd, it shouldn’t affect the optimizer’s plan selection.
I should also mention that we can work around the estimation issue by including the index’s filtering columns in the index key:
    CREATE NONCLUSTERED INDEX Weekends
    ON dbo.Calendar(theYear, isWeekday)
    WHERE isWeekday = 0
    WITH (DROP_EXISTING = ON);
There are some downsides to doing this, including that changes to the isWeekday column may now require Halloween Protection, but that is unlikely to be a big problem for a static calendar table ;)
With the updated index in place, the original query produces an execution plan with the correct cardinality estimation showing at the Index Seek:
That’s all for today, remember to let me know about any Switch plans you come across on a modern instance of SQL Server!
Finally, here are some other posts of mine that cover other plan operators: Segment and Sequence Project Common Subexpression Spools Why Plan Operators Run Backwards Row Goals and the Top Operator Hash Match Flow Distinct Top N Sort Index Spools and Page Splits Singleton and Range Seeks Bitmaps Hash Join Performance Compute Scalar
© 2013 Paul White – All Rights Reserved
Twitter: @SQL_Kiwi

    Read the article

  • configuration issue with respect to .htaccess file on ubuntu

    - by Registered User
    I am building an application called tshirtshop. I have the following configuration in /etc/apache2/sites-enabled/tshirtshop:
        <VirtualHost *:80>
            ServerAdmin webmaster@localhost
            DocumentRoot /var/www/tshirtshop
            <Directory /var/www/tshirtshop>
                Options Indexes FollowSymLinks
                AllowOverride All
                Order allow,deny
                allow from all
            </Directory>
            ErrorLog ${APACHE_LOG_DIR}/error.log
            # Possible values include: debug, info, notice, warn, error, crit,
            # alert, emerg.
            LogLevel warn
            CustomLog ${APACHE_LOG_DIR}/access.log combined
        </VirtualHost>
    and the following in the .htaccess file at /var/www/tshirtshop/.htaccess:
        <IfModule mod_rewrite.c>
            # Enable mod_rewrite
            RewriteEngine On
            # Specify the folder in which the application resides.
            # Use / if the application is in the root.
            RewriteBase /tshirtshop
            #RewriteBase /
            # Rewrite to correct domain to avoid canonicalization problems
            # RewriteCond %{HTTP_HOST} !^www\.example\.com
            # RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
            # Rewrite URLs ending in /index.php or /index.html to /
            RewriteCond %{THE_REQUEST} ^GET\ .*/index\.(php|html?)\ HTTP
            RewriteRule ^(.*)index\.(php|html?)$ $1 [R=301,L]
            # Rewrite category pages
            RewriteRule ^.*-d([0-9]+)/.*-c([0-9]+)/page-([0-9]+)/?$ index.php?DepartmentId=$1&CategoryId=$2&Page=$3 [L]
            RewriteRule ^.*-d([0-9]+)/.*-c([0-9]+)/?$ index.php?DepartmentId=$1&CategoryId=$2 [L]
            # Rewrite department pages
            RewriteRule ^.*-d([0-9]+)/page-([0-9]+)/?$ index.php?DepartmentId=$1&Page=$2 [L]
            RewriteRule ^.*-d([0-9]+)/?$ index.php?DepartmentId=$1 [L]
            # Rewrite subpages of the home page
            RewriteRule ^page-([0-9]+)/?$ index.php?Page=$1 [L]
            # Rewrite product details pages
            RewriteRule ^.*-p([0-9]+)/?$ index.php?ProductId=$1 [L]
        </IfModule>
    The site runs on localhost, but it behaves as if no .htaccess rules were specified: if I view a page as http://localhost/tshirtshop/nature-d2 I get a 404 error, but if I view the same page as http://localhost/tshirtshop/index.php?DepartmentId=2 then I can view it.
    Output of sudo apache2ctl -M:
        Loaded Modules: core_module (static) log_config_module (static) logio_module (static) mpm_prefork_module (static) http_module (static) so_module (static) alias_module (shared) auth_basic_module (shared) authn_file_module (shared) authz_default_module (shared) authz_groupfile_module (shared) authz_host_module (shared) authz_user_module (shared) autoindex_module (shared) cgi_module (shared) deflate_module (shared) dir_module (shared) env_module (shared) mime_module (shared) negotiation_module (shared) php5_module (shared) reqtimeout_module (shared) rewrite_module (shared) setenvif_module (shared) status_module (shared)
        Syntax OK
    Can anyone point out the mistake in the above configuration, or is there anything else I need to check?
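
    To make the intended behaviour of the rules concrete, here is a rough Python sketch of what the six plain RewriteRule lines in the question should do once they are applied. It is only an emulation with Python's re module, not Apache itself: it assumes the patterns are matched against the path relative to the .htaccess directory (so nature-d2, without the /tshirtshop/ prefix), it treats [L] as "first matching rule wins", and it leaves out the index.php redirect rule because that one depends on a RewriteCond. The path animals-c5 is a made-up example; only nature-d2 comes from the question.

```python
import re

# (pattern, substitution) pairs copied from the .htaccess in the question, in file order.
# Apache's $1 back-references are written as \1 for Python's re module.
RULES = [
    (r"^.*-d([0-9]+)/.*-c([0-9]+)/page-([0-9]+)/?$",
     r"index.php?DepartmentId=\1&CategoryId=\2&Page=\3"),
    (r"^.*-d([0-9]+)/.*-c([0-9]+)/?$",
     r"index.php?DepartmentId=\1&CategoryId=\2"),
    (r"^.*-d([0-9]+)/page-([0-9]+)/?$",
     r"index.php?DepartmentId=\1&Page=\2"),
    (r"^.*-d([0-9]+)/?$", r"index.php?DepartmentId=\1"),
    (r"^page-([0-9]+)/?$", r"index.php?Page=\1"),
    (r"^.*-p([0-9]+)/?$", r"index.php?ProductId=\1"),
]

def rewrite(path):
    """Return the rewritten target for a directory-relative path, or None if no rule matches."""
    for pattern, target in RULES:
        match = re.match(pattern, path)
        if match:
            # Emulates [L]: stop at the first rule that matches.
            return match.expand(target)
    return None

print(rewrite("nature-d2"))
print(rewrite("nature-d2/animals-c5/page-3"))
```

    If the rules were being applied, nature-d2 would map to index.php?DepartmentId=2, the URL that already works when requested directly, so the patterns themselves match the URL from the question.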

    Read the article

  • mod_rewrite and SEO friendliness

    - by John Doe
    My website has an atypical structure and I'm not sure if this could create problems in the long run, especially for SEO positioning purposes. I have a single, large PHP script, and I use the Apache module mod_rewrite in the .htaccess file to create friendly URLs, for example:
        RewriteRule ^$ /index.php?section=Main
        RewriteRule ^createArticle$ /index.php?section=Main&view=CreateArticle
        RewriteRule ^configuration$ /index.php?section=Configuration
        RewriteRule ^article/([0-9]{1,10})$ /index.php?section=Article&view=Default&id=$1
        RewriteRule ^deleteArticle/([0-9]{1,10})$ /index.php?section=Article&view=Delete&id=$1
        RewriteRule ^reportArticle/([0-9]{1,10})$ /index.php?section=Article&view=Report&id=$1
        RewriteRule ^logIn$ /index.php?section=Authentication
        ...
    So, www.example.com/index.php?section=Article&view=Default&id=105 would become www.example.com/article/105. The only real physical file is index.php, in which the parameters of the queried URL are processed and the corresponding result is output. My question is, do crawling robots (e.g. Googlebot) recognize these links? Do they index the HTML output by index.php with the specified parameters as if it were an actual HTML file? Also, would this become a problem when creating a Sitemap?
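
    As an illustration of the mapping described in the question, the Python sketch below emulates three of the listed rules with the re module. It is not Apache, and the function name resolve is made up for the example. It shows the friendly path being translated to the internal index.php target, and how the {1,10} bound makes an 11-digit id fall through with no match.

```python
import re

# Three of the rules from the question; $1 is written as \1 for Python's re module.
ROUTES = [
    (r"^$", r"/index.php?section=Main"),
    (r"^createArticle$", r"/index.php?section=Main&view=CreateArticle"),
    (r"^article/([0-9]{1,10})$", r"/index.php?section=Article&view=Default&id=\1"),
]

def resolve(path):
    """Map a friendly path (no leading slash) to its internal target, or None."""
    for pattern, target in ROUTES:
        match = re.match(pattern, path)
        if match:
            return match.expand(target)
    return None

print(resolve("article/105"))          # the example from the question
print(resolve("article/12345678901"))  # 11 digits exceed {1,10}, so no rule matches
```

    The rewrite is internal to the server: the client requests /article/105 and the translation to index.php happens behind it.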

    Read the article
