Search Results

Search found 1124 results on 45 pages for 'indexing'.

Page 43/45 | < Previous Page | 39 40 41 42 43 44 45  | Next Page >

  • SQL Server - Rebuilding Indexes

    - by Renso
    Goal: Rebuild indexes in SQL server. This can be done one at a time or with the example script below to rebuild all index for a specified table or for all tables in a given database. Why? The data in indexes gets fragmented over time. That means that as the index grows, the newly added rows to the index are physically stored in other sections of the allocated database storage space. Kind of like when you load your Christmas shopping into the trunk of your car and it is full you continue to load some on the back seat, in the same way some storage buffer is created for your index but once that runs out the data is then stored in other storage space and your data in your index is no longer stored in contiguous physical pages. To access the index the database manager has to "string together" disparate fragments to create the full-index and create one contiguous set of pages for that index. Defragmentation fixes that. What does the fragmentation affect?Depending of course on how large the table is and how fragmented the data is, can cause SQL Server to perform unnecessary data reads, slowing down SQL Server’s performance.Which index to rebuild?As a rule consider that when reorganize a table's clustered index, all other non-clustered indexes on that same table will automatically be rebuilt. A table can only have one clustered index.How to rebuild all the index for one table:The DBCC DBREINDEX command will not automatically rebuild all of the indexes on a given table in a databaseHow to rebuild all indexes for all tables in a given database:USE [myDB]    -- enter your database name hereDECLARE @tableName varchar(255)DECLARE TableCursor CURSOR FORSELECT table_name FROM information_schema.tablesWHERE table_type = 'base table'OPEN TableCursorFETCH NEXT FROM TableCursor INTO @tableNameWHILE @@FETCH_STATUS = 0BEGINDBCC DBREINDEX(@tableName,' ',90)     --a fill factor of 90%FETCH NEXT FROM TableCursor INTO @tableNameENDCLOSE TableCursorDEALLOCATE TableCursorWhat does this script do?Reindexes all indexes in all tables of the given database. Each index is filled with a fill factor of 90%. While the command DBCC DBREINDEX runs and rebuilds the indexes, that the table becomes unavailable for use by your users temporarily until the rebuild has completed, so don't do this during production  hours as it will create a shared lock on the tables, although it will allow for read-only uncommitted data reads; i.e.e SELECT.What is the fill factor?Is the percentage of space on each index page for storing data when the index is created or rebuilt. It replaces the fill factor when the index was created, becoming the new default for the index and for any other nonclustered indexes rebuilt because a clustered index is rebuilt. When fillfactor is 0, DBCC DBREINDEX uses the fill factor value last specified for the index. This value is stored in the sys.indexes catalog view. If fillfactor is specified, table_name and index_name must be specified. If fillfactor is not specified, the default fill factor, 100, is used.How do I determine the level of fragmentation?Run the DBCC SHOWCONTIG command. However this requires you to specify the ID of both the table and index being. To make it a lot easier by only requiring you to specify the table name and/or index you can run this script:DECLARE@ID int,@IndexID int,@IndexName varchar(128)--Specify the table and index namesSELECT @IndexName = ‘index_name’    --name of the indexSET @ID = OBJECT_ID(‘table_name’)  -- name of the tableSELECT @IndexID = IndIDFROM sysindexesWHERE id = @ID AND name = @IndexName--Show the level of fragmentationDBCC SHOWCONTIG (@id, @IndexID)Here is an example:DBCC SHOWCONTIG scanning 'Tickets' table...Table: 'Tickets' (1829581556); index ID: 1, database ID: 13TABLE level scan performed.- Pages Scanned................................: 915- Extents Scanned..............................: 119- Extent Switches..............................: 281- Avg. Pages per Extent........................: 7.7- Scan Density [Best Count:Actual Count].......: 40.78% [115:282]- Logical Scan Fragmentation ..................: 16.28%- Extent Scan Fragmentation ...................: 99.16%- Avg. Bytes Free per Page.....................: 2457.0- Avg. Page Density (full).....................: 69.64%DBCC execution completed. If DBCC printed error messages, contact your system administrator.What's important here?The Scan Density; Ideally it should be 100%. As time goes by it drops as fragmentation occurs. When the level drops below 75%, you should consider re-indexing.Here are the results of the same table and clustered index after running the script:DBCC SHOWCONTIG scanning 'Tickets' table...Table: 'Tickets' (1829581556); index ID: 1, database ID: 13TABLE level scan performed.- Pages Scanned................................: 692- Extents Scanned..............................: 87- Extent Switches..............................: 86- Avg. Pages per Extent........................: 8.0- Scan Density [Best Count:Actual Count].......: 100.00% [87:87]- Logical Scan Fragmentation ..................: 0.00%- Extent Scan Fragmentation ...................: 22.99%- Avg. Bytes Free per Page.....................: 639.8- Avg. Page Density (full).....................: 92.10%DBCC execution completed. If DBCC printed error messages, contact your system administrator.What's different?The Scan Density has increased from 40.78% to 100%; no fragmentation on the clustered index. Note that since we rebuilt the clustered index, all other index were also rebuilt.

    Read the article

  • Wordpress Theme

    - by HotPizzaBox
    I'm trying to create a basic wordpress theme. As far as I know the basic files I need are the style.css, header.php, index.php, footer.php, functions.php. Then it should show a blank site with some meta tags in the header. These are my files: functions.php <?php // load the language files load_theme_textdomain('brianroyfoundation', get_template_directory() . '/languages'); // add menu support add_theme_support('menus'); register_nav_menus(array('primary_navigation' => __('Primary Navigation', 'BrianRoyFoundation'))); // create widget areas: sidebar $sidebars = array('Sidebar'); foreach ($sidebars as $sidebar) { register_sidebar(array('name'=> $sidebar, 'before_widget' => '<div class="widget %2$s">', 'after_widget' => '</div>', 'before_title' => '<h6><strong>', 'after_title' => '</strong></h6>' )); } // Add Foundation 'active' class for the current menu item function active_nav_class($classes, $item) { if($item->current == 1) { $classes[] = 'active'; } return $classes; } add_filter( 'nav_menu_css_class', 'active_nav_class', 10, 2); ?> header.php <!DOCTYPE html> <html <?php language_attributes(); ?>> <head> <meta charset="<?php bloginfo('charset'); ?>" /> <meta name="description" content="<?php bloginfo('description'); ?>"> <meta name="google-site-verification" content=""> <meta name="author" content="Your Name Here"> <!-- No indexing if Search page is displayed --> <?php if(is_search()){ echo '<meta name="robots" content="noindex, nofollow" />' } ?> <title><?php wp_title('|', true, 'right'); bloginfo('name'); ?></title> <link rel="stylesheet" type="text/css" href="<?php bloginfo('stylesheet_url'); ?>" /> <?php wp_head(); ?> </head> <body> <div id="page"> <div id="page-header"> <div id="page-title"> <a href="<?php bloginfo('url'); ?>" title="<?php bloginfo('name'); ?>"><?php bloginfo('name'); ?></a> </div> <div id="page-navigation"> <?php wp_nav_menu( array( 'theme_location' => 'primary_navigation', 'container' =>false, 'menu_class' => '' ); ?> </div> </div> <div id="page-content"> index.php <?php get_header(); ?> <div class="page-blog"> <?php get_template_part('loop', 'index'); ?> </div> <div class="page-sidebar"> <?php get_sidebar(); ?> </div> <?php get_footer(); ?> footer.php </div> <div id="page-footer"> &copy; 2008 - <?php echo date('Y'); ?> All rights reserved. </div> </div> <?php wp_footer(); ?> </body> </html> I activated the theme in wordpress. But it just shows nothing. Not even if I view the page source. Can anyone help?

    Read the article

  • Redirect Google crawler to different robots.txt via .htaccess

    - by user3474818
    I have googled for the answer all day and still couldn't find an answer. I have a virtual subdomain www.static.example.com which is a mirror site of www.example.com. It means I have just one root folder for subdomain and domain aswell. I want to redirect crawlers to different robots.txt file - robots_static.txt when they see .static in url in which I will forbid indexing via /disallow command. I want to do this because I have duplicated content in Google search results. Subdomain is showing the exact same content as the main domain. Does anyone know how could I achieve that crawlers sees robots_static.txt instead of robots.txt? What I have managed to find so far is this: RewriteCond %{HTTP_HOST} ^www.static.*$ [NC] RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*robots\.txt.*\ HTTP/ [NC] RewriteRule ^robots\.txt /robots_static.txt [NC,L] but when I check in webmaster tools, it still sees robots.txt as my robots file instead of robots_static.txt, so it crawls and index everything twice. What did I do wrong? Thanks EDIT: This is my .htaccess file ## # @package Joomla # @copyright Copyright (C) 2005 - 2013 Open Source Matters. All rights reserved. # @license GNU General Public License version 2 or later; see LICENSE.txt ## ## # READ THIS COMPLETELY IF YOU CHOOSE TO USE THIS FILE! # # The line just below this section: 'Options +FollowSymLinks' may cause problems # with some server configurations. It is required for use of mod_rewrite, but may already # be set by your server administrator in a way that dissallows changing it in # your .htaccess file. If using it causes your server to error out, comment it out (add # to # beginning of line), reload your site in your browser and test your sef url's. If they work, # it has been set by your server administrator and you do not need it set here. ## ## Can be commented out if causes errors, see notes above. Options +FollowSymLinks ## Mod_rewrite in use. RewriteEngine On RewriteEngine On RewriteCond %{HTTP_HOST} !^www\. RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L] RewriteCond %{HTTP_HOST} ^www.static.*$ [NC] RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*robots\.txt.*\ HTTP/ [NC] RewriteRule ^robots\.txt /robots_static.txt [NC,L] ## Begin - Rewrite rules to block out some common exploits. # If you experience problems on your site block out the operations listed below # This attempts to block the most common type of exploit `attempts` to Joomla! # # Block out any script trying to base64_encode data within the URL. RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR] # Block out any script that includes a <script> tag in URL. RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR] # Block out any script trying to set a PHP GLOBALS variable via URL. RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR] # Block out any script trying to modify a _REQUEST variable via URL. RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2}) # Return 403 Forbidden header and show the content of the root homepage RewriteRule .* index.php [F] # ## End - Rewrite rules to block out some common exploits. ## Begin - Custom redirects # # If you need to redirect some pages, or set a canonical non-www to # www redirect (or vice versa), place that code here. Ensure those # redirects use the correct RewriteRule syntax and the [R=301,L] flags. # ## End - Custom redirects ## # Uncomment following line if your webserver's URL # is not directly related to physical file paths. # Update Your Joomla! Directory (just / for root). ## # RewriteBase / RewriteCond %{THE_REQUEST} ^GET.*index\.php [NC] RewriteCond %{THE_REQUEST} !/system/.* RewriteRule (.*?)index\.php/*(.*) /$1$2 [R=301,L] RewriteCond %{THE_REQUEST} ^GET ## Begin - Joomla! core SEF Section. # RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}] # # If the requested path and file is not /index.php and the request # has not already been internally rewritten to the index.php script RewriteCond %{REQUEST_URI} !^/index\.php # and the request is for something within the component folder, # or for the site root, or for an extensionless URL, or the # requested URL ends with one of the listed extensions RewriteCond %{REQUEST_URI} /component/|(/[^.]*|\.(php|html?|feed|pdf|vcf|raw))$ [NC] # and the requested path and file doesn't directly match a physical file RewriteCond %{REQUEST_FILENAME} !-f # and the requested path and file doesn't directly match a physical folder RewriteCond %{REQUEST_FILENAME} !-d # internally rewrite the request to the index.php script RewriteRule .* index.php [L] # ## End - Joomla! core SEF Section. <FilesMatch "\.(ico|pdf|flv|jpg|ttf|jpg|jpeg|png|gif|js|css|swf)$"> Header set Expires "Wed, 15 Apr 2020 20:00:00 GMT" Header set Cache-Control "public" </FilesMatch> <ifModule mod_headers.c> Header set Connection keep-alive </ifModule> ########## Begin - Remove Etags # FileETag none # ########## End - Remove Etags

    Read the article

  • XML: Process large data

    - by Atmocreations
    Hello What XML-parser do you recommend for the following purpose: The XML-file (formatted, containing whitespaces) is around 800 MB. It mostly contains three types of tag (let's call them n, w and r). They have an attribute called id which i'd have to search for, as fast as possible. Removing attributes I don't need could save around 30%, maybe a bit more. First part for optimizing the second part: Is there any good tool (command line linux and windows if possible) to easily remove unused attributes in certain tags? I know that XSLT could be used. Or are there any easy alternatives? Also, I could split it into three files, one for each tag to gain speed for later parsing... Speed is not too important for this preparation of the data, of course it would be nice when it took rather minutes than hours. Second part: Once I have the data prepared, be it shortened or not, I should be able to search for the ID-attribute I was mentioning, this being time-critical. Estimations using wc -l tell me that there are around 3M N-tags and around 418K W-tags. The latter ones can contain up to approximately 20 subtags each. W-Tags also contain some, but they would be stripped away. "All I have to do" is navigating between tags containing certain id-attributes. Some tags have references to other id's, therefore giving me a tree, maybe even a graph. The original data is big (as mentioned), but the resultset shouldn't be too big as I only have to pick out certain elements. Now the question: What XML parsing library should I use for this kind of processing? I would use Java 6 in a first instance, with having in mind to be porting it to BlackBerry. Might it be useful to just create a flat file indexing the id's and pointing to an offset in the file? Is it even necessary to do the optimizations mentioned in the upper part? Or are there parser known to be quite as fast with the original data? Little note: To test, I took the id being on the very last line on the file and searching for the id using grep. This took around a minute on a Core 2 Duo. What happens if the file grows even bigger, let's say 5 GB? I appreciate any notice or recommendation. Thank you all very much in advance and regards

    Read the article

  • MySql multiple selects batching in .net

    - by Amith George
    I have a situation in my application. For each x-axis point in my chart, I am plotting 5 y-axis values. To calculate each of these 5 values, I need to make 4 different queries. Ie, for each x-axis point I need to fire 20 sql queries. Now, I need to plot 40 such points in the my chart. Its resulting in a pathetic performance where it takes close to a minute to get all the data back from the database. Each of 4 different queries consists of a join between 2 tables. One has only 6 rows. The other close to 10,000. Each of the 4 queries has different WHERE clauses, so they are different queries. For each point in the x-axis, only the values for the where clauses change. I have tried combining each of the 4 queries into one big string. Basically batch the four selects. These are again batched for each y-axis value. So, for each x-axis point, I am now firing one big command that consists of 20 different select statements. Technically, I should be experiencing a big performance boost, right? Instead of hitting the db 40x5x4 = 800 times, I am now hitting it just 40 times. But instead of taking 60 seconds, it taking 50-55 seconds... not much of a help. I am using MySql 5.1, and the 6.1 version of its .Net connector. What can I do to improve the performance? Edit: One of the 4 queries is as follows: SELECT SUM(TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1))* T2.col1 / (3600 *1000)) AS TotalTime FROM Table T1 JOIN Table T2 ON T1.col3 = T2.col3 WHERE T1.col4 = 'i' AND T1.col1 >= '2009-12-25 00:00:00' AND T1.col2 <= '2009-12-26 00:00:00'; The other 3 queries are similar, only the where clause changes slightly. This set of 4 queries is fired 5 times. The first 3 times against the join of table T1 and T2, passing in different values for col4. And the next two times against the join of table T3 and T2 passing in different values for col4. These 5 values are the y-axis values for a particular x-axis point. The data returned by all these queries is the same format. so, we tried doing a UNION ALL on all these queries. No substantial difference. One strange thing, however, after indexing the foreign key on the table T1 [while it contained over a lakh records], the queries were using the index, but they had become slower. At times, the queries would take double the time to return the data.

    Read the article

  • How to optimize Core Data query for full text search

    - by dk
    Can I optimize a Core Data query when searching for matching words in a text? (This question also pertains to the wisdom of custom SQL versus Core Data on an iPhone.) I'm working on a new (iPhone) app that is a handheld reference tool for a scientific database. The main interface is a standard searchable table view and I want as-you-type response as the user types new words. Words matches must be prefixes of words in the text. The text is composed of 100,000s of words. In my prototype I coded SQL directly. I created a separate "words" table containing every word in the text fields of the main entity. I indexed words and performed searches along the lines of SELECT id, * FROM textTable JOIN (SELECT DISTINCT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) ON id=textTableId LIMIT 50 This runs very fast. Using an IN would probably work just as well, i.e. SELECT * FROM textTable WHERE id IN (SELECT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) LIMIT 50 The LIMIT is crucial and allows me to display results quickly. I notify the user that there are too many to display if the limit is reached. This is kludgy. I've spent the last several days pondering the advantages of moving to Core Data, but I worry about the lack of control in the schema, indexing, and querying for an important query. Theoretically an NSPredicate of textField MATCHES '.*\bfoo.*' would just work, but I'm sure it will be slow. This sort of text search seems so common that I wonder what is the usual attack? Would you create a words entity as I did above and use a predicate of "word BEGINSWITH 'foo'"? Will that work as fast as my prototype? Will Core Data automatically create the right indexes? I can't find any explicit means of advising the persistent store about indexes. I see some nice advantages of Core Data in my iPhone app. The faulting and other memory considerations allow for efficient database retrievals for tableview queries without setting arbitrary limits. The object graph management allows me to easily traverse entities without writing lots of SQL. Migration features will be nice in the future. On the other hand, in a limited resource environment (iPhone) I worry that an automatically generated database will be bloated with metadata, unnecessary inverse relationships, inefficient attribute datatypes, etc. Should I dive in or proceed with caution?

    Read the article

  • Lucene.NET search index approach

    - by Tim Peel
    Hi, I am trying to put together a test case for using Lucene.NET on one of our websites. I'd like to do the following: Index in a single unique id. Index across a comma delimitered string of terms or tags. For example. Item 1: Id = 1 Tags = Something,Separated-Term I will then be structuring the search so I can look for documents against tag i.e. tags:something OR tags:separate-term I need to maintain the exact term value in order to search against it. I have something running, and the search query is being parsed as expected, but I am not seeing any results. Here's some code. My parser (_luceneAnalyzer is passed into my indexing service): var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", _luceneAnalyzer); parser.SetDefaultOperator(QueryParser.Operator.AND); return parser; My Lucene.NET document creation: var doc = new Document(); var id = new Field( "Id", NumericUtils.IntToPrefixCoded(indexObject.id), Field.Store.YES, Field.Index.NOT_ANALYZED, Field.TermVector.NO); var tags = new Field( "Tags", string.Join(",", indexObject.Tags.ToArray()), Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.YES); doc.Add(id); doc.Add(tags); return doc; My search: var parser = BuildQueryParser(); var query = parser.Parse(searchQuery); var searcher = Searcher; TopDocs hits = searcher.Search(query, null, max); IList<SearchResult> result = new List<SearchResult>(); float scoreNorm = 1.0f / hits.GetMaxScore(); for (int i = 0; i < hits.scoreDocs.Length; i++) { float score = hits.scoreDocs[i].score * scoreNorm; result.Add(CreateSearchResult(searcher.Doc(hits.scoreDocs[i].doc), score)); } return result; I have two documents in my index, one with the tag "Something" and one with the tags "Something" and "Separated-Term". It's important for the - to remain in the terms as I want an exact match on the full value. When I search with "tags:Something" I do not get any results. Question What Analyzer should I be using to achieve the search index I am after? Are there any pointers for putting together a search such as this? Why is my current search not returning any results? Many thanks

    Read the article

  • Wildcard searching and highlighting with Solr 1.4

    - by andy
    Hey guys, I've got a pretty much vanilla install of SOLR 1.4 apart from a few small config and schema changes. <requestHandler name="standard" class="solr.SearchHandler" default="true"> <!-- default values for query parameters --> <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <str name="qf"> text </str> <str name="spellcheck.dictionary">default</str> <str name="spellcheck.onlyMorePopular">false</str> <str name="spellcheck.extendedResults">false</str> <str name="spellcheck.count">1</str> </lst> </requestHandler> The main field type I'm using for Indexing is this: <fieldType name="textNoHTML" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <charFilter class="solr.HTMLStripCharFilterFactory" /> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/> </analyzer> </fieldType> now, when I perform a search using "q=search+term&hl=on" I get highlighting, and nice accurate scores. BUT, for wildcard, I'm assuming you need to use "q.alt"? Is that true? If so my query looks like this: "q.alt=search*&hl=on" When I use the above query, highlighting doesn't work, and all the scores are "1.0". What am I doing wrong? is what I want possible without bypassing some of the really cool SOLR optimizations. cheers!

    Read the article

  • NoSQL DB for .Net document-based database (ECM)

    - by Dane
    I'm halfway through coding a basic multi-tenant SaaS ECM solution. Each client has it's own instance of the database / datastore, but the .Net app is single instance. The documents are pretty much read only (i.e. an image archive of tiffs or PDFs) I've used MSSQL so far, but then started thinking this might be viable in a NoSQL DB (e.g. MongoDB, CouchDB). The basic premise is that it stores documents, each with their own particular indexes. Each tenant can have multiple document types. e.g. One tenant might have an invoice type, which has Customer ID, Invoice Number and Invoice Date. Another tenant might have an application form, which has Member Number, Application Number, Member Name, and Application Date. So far I've used the old method which Sharepoint (used?) to use, and created a document table which has int_field_1, int_field_2, date_field_1, date_field_2, etc. Then, I've got a "mapping" table which stores the customer specific index name, and the database field that will map to. I've avoided the key-value pair model in the DB due to volume of documents. This way, we can support multiple document types in the one table, and get reasonably high performance out of it, and allow for custom document type searches (i.e. user selects a document type, then they're presented with a list of search fields). However, a NoSQL DB might make this a lot simpler, as I don't need to worry about denormalizing the document. However, I've just got concerns about the rest of the data around a document. We store an "action history" against the document. This tracks views, whether someone emails the document from within the system, and other "future" functionality (e.g. faxing). We have control over the document load process, so we can manipulate the data however it needs to be to get it in the document store (e.g. assign unique IDs). Users will not be adding in their own documents, so we shouldn't need to worry about ACID compliance, as the documents are relatively static. So, my questions I guess : Is a NoSQL DB a good fit Is MongoDB the best for Asp.Net (I saw Raven and Velocity, but they're still kinda beta) Can I store a key for each document, and then store the action history in a MSSQL DB with this key? I don't need to do joins, it would be if a person clicks "View History" against a document. How would performance compare between the two (NoSQL DB vs denormalized "document" table) Volumes would be up to 200,000 new documents per month for a single tenant. My current scaling plan with the SQL DB involves moving the SQL DB into a cluster when certain thresholds are reached, and then reviewing partitioning and indexing structures.

    Read the article

  • Lucene: Wildcards are missing from index

    - by Eleasar
    Hi - i am building a search index that contains special names - containing ! and ? and & and + and ... I have to tread the following searches different: me & you me + you But whatever i do (did try with queryparser escaping before indexing, escaped it manually, tried different indexers...) - if i check the search index with Luke they do not show up (question marks and @-symbols and the like show up) The logic behind is that i am doing partial searches for a live suggestion (and the fields are not that large) so i split it up into "m" and "me" and "+" and "y" and "yo" and "you" and then index it (that way it is way faster than a wildcard query search (and the index size is not a big problem). So what i would need is to also have this special wildcard characters be inserted into the index. This is my code: using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text; using Lucene.Net.Analysis; using Lucene.Net.Util; namespace AnalyzerSpike { public class CustomAnalyzer : Analyzer { public override TokenStream TokenStream(string fieldName, TextReader reader) { return new ASCIIFoldingFilter(new LowerCaseFilter(new CustomCharTokenizer(reader))); } } public class CustomCharTokenizer : CharTokenizer { public CustomCharTokenizer(TextReader input) : base(input) { } public CustomCharTokenizer(AttributeSource source, TextReader input) : base(source, input) { } public CustomCharTokenizer(AttributeFactory factory, TextReader input) : base(factory, input) { } protected override bool IsTokenChar(char c) { return c != ' '; } } } The code to create the index: private void InitIndex(string path, Analyzer analyzer) { var writer = new IndexWriter(path, analyzer, true); //some multiline textbox that contains one item per line: var all = new List<string>(txtAllAvailable.Text.Replace("\r","").Split('\n')); foreach (var item in all) { writer.AddDocument(GetDocument(item)); } writer.Optimize(); writer.Close(); } private static Document GetDocument(string name) { var doc = new Document(); doc.Add(new Field( "name", DeNormalizeName(name), Field.Store.YES, Field.Index.ANALYZED)); doc.Add(new Field( "raw_name", name, Field.Store.YES, Field.Index.NOT_ANALYZED)); return doc; } (Code is with Lucene.net in version 1.9.x (EDIT: sorry - was 2.9.x) but is compatible with Lucene from Java) Thx

    Read the article

  • How effective are technical test(s) and is it necessary?

    - by The Elite Gentleman
    Hi everyone, I recently took a Java technical test (for a company who wanted are looking for senior java developers) and, funny enough, I only realised the technical terms of what I've been doing all along after I've written the test. I'm not too IT jargon when it comes to development but I can pretty much code and create solutions unaware that I'm using design pattern (or the specifics of that design pattern) or technology. I learned things such as JMS, Frameworks, etc. while programming at home and having to google stuff online to problems I have. Others e.g. IoC, Surrogates in Databases, etc., I have used extensively without knowing that it had a name for it. Do you think that these technical test are effective and why? What interesting questions did you find that boggled your brains out while the clock kept ticking? Seeing that IT is vastly evolving at a rapid rate, do we have to constantly be updated with new terms that comes out? Some questions I was asked : What object oriented principle is violated by this architectural mechanism for dot notation? Is indexing tables effective for range query or point query search? What is ThreadLocal and what is it used for? Method overloading vs Method overriding. What is the difference between the 2? What is dynamic binding? Now, imagine my poor head trying to understand these jargons (considering I use it almost everyday) PS The question was not a programming question, where you have a problem and write code to solve it. Rather, a thinking type question and you write answers (against the clock). Update I clearly didn't come out clearly as I should have. There are those that are technically "book smart" but with very little hands-on experience and vice versa. So, the question (in connection to what I've asked) is that are these technical test seeking "book smart" people or people with lots of hands-on experience (some who are not that well clued up with too much book-smart jargons). How effective is it then, for companies to look for developers if most of the questions are too terminology-centric? (if that's the correct term, :))

    Read the article

  • using '.each' method: how do I get the indexes of multiple ordered lists to each begin at [0]?

    - by shecky
    I've got multiple divs, each with an ordered list (various lengths). I'm using jquery to add a class to each list item according to its index (for the purpose of columnizing portions of each list). What I have so far ... <script type="text/javascript"> /* Objective: columnize list items from a single ul or ol in a pre-determined number of columns 1. get the index of each list item 2. assign column class according to li's index */ $(document).ready(function() { $('ol li').each(function(index){ // assign class according to li's index ... index = li number -1: 1-6 = 0-5; 7-12 = 6-11, etc. if ( index <= 5 ) { $(this).addClass('column-1'); } if ( index > 5 && index < 12 ) { $(this).addClass('column-2'); } if ( index > 11 ) { $(this).addClass('column-3'); } // add another class to the first list item in each column $('ol li').filter(function(index) { return index != 0 && index % 6 == 0; }).addClass('reset'); }); // closes li .each func }); // closes doc.ready.func </script> ... succeeds if there's only one list; when there are additional lists, the last column class ('column-3') is added to all remaining list items on the page. In other words, the script is presently indexing continuously through all subsequent lists/list items, rather than being re-set to [0] for each ordered list. Can someone please show me the proper method/syntax to correct/amend this, so that the script addresses/indexes each ordered list anew? many thanks in advance. shecky p.s. the markup is pretty straight-up: <div class="tertiary"> <h1>header</h1> <ol> <li><a href="#" title="a link">a link</a></li> <li><a href="#" title="a link">a link</a></li> <li><a href="#" title="a link">a link</a></li> </ol> </div><!-- END div class="tertiary" -->

    Read the article

  • IEnumerable<T> ToArray usage, is it a copy or a pointer?

    - by Daniel
    I am parsing an arbitrary length byte array that is going to be passed around to a few different layers of parsing. Each parser creates a Header and a Packet payload just like any ordinary encapsulation. And my problem lies in how the encapsulation holds its packet byte array payload. Say i have a 100 byte array, and it has 3 levels of encapsulation. 3 packet objects will be created and i want to set the payload of these packets to the corresponding position in the byte array of the packet. For example lets say the payload size is 20 for all levels, then imagine it has a public byte[] Payload on each object. However the problem is that this byte[] Payload is a copy of the original 100 bytes. So i'm going to end up with 160 bytes in memory instead of 100. If it were in c++ i could just easily use a pointer however i'm writing this in c#. So i created the following class: public class PayloadSegment<T> : IEnumerable<T> { public readonly T[] Array; public readonly int Offset; public readonly int Count; public PayloadSegment(T[] array, int offset, int count) { this.Array = array; this.Offset = offset; this.Count = count; } public T this[int index] { get { if (index < 0 || index >= this.Count) throw new IndexOutOfRangeException(); else return Array[Offset + index]; } set { if (index < 0 || index >= this.Count) throw new IndexOutOfRangeException(); else Array[Offset + index] = value; } } public IEnumerator<T> GetEnumerator() { for (int i = Offset; i < Offset + Count; i++) yield return Array[i]; } System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { IEnumerator<T> enumerator = this.GetEnumerator(); while (enumerator.MoveNext()) { yield return enumerator.Current; } } } This way i can simply reference a position inside the original byte array but use positional indexing. However if i do something like: PayloadSegment<byte> something = new PayloadSegment<byte>(someArray, 5, 10); byte[] somethingArray = something.ToArray(); Will the somethingArray be a copy of the bytes, or a reference to the original PayloadSegment which in turn is a reference to the original byte array? Sorry it was hard to word this lol _<

    Read the article

  • How to define and work with an array of bits in C?

    - by Eddy
    I want to create a very large array on which I write '0's and '1's. I'm trying to simulate a physical process called random sequential adsorption, where units of length 2, dimers, are deposited onto an n-dimensional lattice at a random location, without overlapping each other. The process stops when there is no more room left on the lattice for depositing more dimers (lattice is jammed). Initially I start with a lattice of zeroes, and the dimers are represented by a pair of '1's. As each dimer is deposited, the site on the left of the dimer is blocked, due to the fact that the dimers cannot overlap. So I simulate this process by depositing a triple of '1's on the lattice. I need to repeat the entire simulation a large number of times and then work out the average coverage %. I've already done this using an array of chars for 1D and 2D lattices. At the moment I'm trying to make the code as efficient as possible, before working on the 3D problem and more complicated generalisations. This is basically what the code looks like in 1D, simplified: int main() { /* Define lattice */ array = (char*)malloc(N * sizeof(char)); total_c = 0; /* Carry out RSA multiple times */ for (i = 0; i < 1000; i++) rand_seq_ads(); /* Calculate average coverage efficiency at jamming */ printf("coverage efficiency = %lf", total_c/1000); return 0; } void rand_seq_ads() { /* Initialise array, initial conditions */ memset(a, 0, N * sizeof(char)); available_sites = N; count = 0; /* While the lattice still has enough room... */ while(available_sites != 0) { /* Generate random site location */ x = rand(); /* Deposit dimer (if site is available) */ if(array[x] == 0) { array[x] = 1; array[x+1] = 1; count += 1; available_sites += -2; } /* Mark site left of dimer as unavailable (if its empty) */ if(array[x-1] == 0) { array[x-1] = 1; available_sites += -1; } } /* Calculate coverage %, and add to total */ c = count/N total_c += c; } For the actual project I'm doing, it involves not just dimers but trimers, quadrimers, and all sorts of shapes and sizes (for 2D and 3D). I was hoping that I would be able to work with individual bits instead of bytes, but I've been reading around and as far as I can tell you can only change 1 byte at a time, so either I need to do some complicated indexing or there is a simpler way to do it? Thanks for your answers

    Read the article

  • How can I improve my select query for storing large versioned data sets?

    - by Jason Francis
    At work, we build large multi-page web applications, consisting mostly of radio and check boxes. The primary purpose of each application is to gather data, but as users return to a page they have previously visited, we report back to them their previous responses. Worst-case scenario, we might have up to 900 distinct variables and around 1.5 million users. For several reasons, it makes sense to use an insert-only approach to storing the data (as opposed to update-in-place) so that we can capture historical data about repeated interactions with variables. The net result is that we might have several responses per user per variable. Our table to collect the responses looks something like this: CREATE TABLE [dbo].[results]( [id] [bigint] IDENTITY(1,1) NOT NULL, [userid] [int] NULL, [variable] [varchar](8) NULL, [value] [tinyint] NULL, [submitted] [smalldatetime] NULL) Where id serves as the primary key. Virtually every request results in a series of insert statements (one per variable submitted), and then we run a select to produce previous responses for the next page (something like this): SELECT t.id, t.variable, t.value FROM results t WITH (NOLOCK) WHERE t.userid = '2111846' AND (t.variable='internat' OR t.variable='veteran' OR t.variable='athlete') AND t.id IN (SELECT MAX(id) AS id FROM results WITH (NOLOCK) WHERE userid = '2111846' AND (t.variable='internat' OR t.variable='veteran' OR t.variable='athlete') GROUP BY variable) Which, in this case, would return the most recent responses for the variables "internat", "veteran", and "athlete" for user 2111846. We have followed the advice of the database tuning tools in indexing the tables, and against our data, this is the best-performing version of the select query that we have been able to come up with. Even so, there seems to be significant performance degradation as the table approaches 1 million records (and we might have about 150x that). We have a fairly-elegant solution in place for sharding the data across multiple tables which has been working quite well, but I am open for any advice about how I might construct a better version of the select query. We use this structure frequently for storing lots of independent data points, and we like the benefits it provides. So the question is, how can I improve the performance of the select query? I assume the nested select statement is a bad idea, but I have yet to find an alternative that performs as well. Thanks in advance. NB: Since we emphasize creating over reading in this case, and since we never update in place, there doesn't seem to be any penalty (and some advantage) for using the NOLOCK directive in this case.

    Read the article

  • How to compare DateTime Objects while looping through a list?

    - by Taniq
    I'm trying to loop through a list (csv) containing two fields; a name and a date. There are various duplicated names and various dates in the list. I'm trying to deduce for each name in the list, where there are multiple instances of the same name, which corresponding date is the latest. I realise, from looking at another answer, that I need to use the DateTime.Compare method which is fine, but my problem is working out which date is later. Once I know this I need to produce a file with unique names and the latest date relating to it. This is my first question which makes me a newbie. EDIT: Initially I thought it would be 'ok' to set the LatestDate object to a date that wouldn't show up in my file, therefore making any later dates in the file the LatestDate. Here's my coding so far: using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.IO; namespace flybe_overwriter { class Program { static DateTime currentDate; static DateTime latestDate = new DateTime(1000,1,1); static HashSet<string> uniqueNames = new HashSet<string>(); static string indexpath = @"e:\flybe test\indexing.csv"; static string[] indexlist = File.ReadAllLines(indexpath); static StreamWriter outputfile = new StreamWriter(@"e:\flybe test\match.csv"); static void Main(string[] args) { foreach (string entry in indexlist) { uniqueNames.Add(entry.Split(',')[0]); } HashSet<string>.Enumerator fenum = new HashSet<string>.Enumerator(); fenum = uniqueNames.GetEnumerator(); while (fenum.MoveNext()) { string currentName = fenum.Current; foreach (string line in indexlist) { currentDate = new DateTime(Convert.ToInt32(line.Split(',')[1].Substring(4, 4)), Convert.ToInt32(line.Split(',')[1].Substring(2, 2)), Convert.ToInt32(line.Split(',')[1].Substring(0, 2))); if (currentName == line.Split(',')[0]) { if(DateTime.Compare(latestDate.Date, currentDate.Date) < 1) { // Console.WriteLine(currentName + " " + latestDate.ToShortDateString() + " is earlier than " + currentDate.ToShortDateString()); } else if (DateTime.Compare(latestDate.Date, currentDate.Date) > 1) { // Console.WriteLine(currentName + " " + latestDate.ToShortDateString() + " is later than " + currentDate.ToShortDateString()); } else if (DateTime.Compare(latestDate.Date, currentDate.Date) == 0) { // Console.WriteLine(currentName + " " + latestDate.ToShortDateString() + " is the same as " + currentDate.ToShortDateString()); } } } } } } } Any help appreciated. Thanks.

    Read the article

  • jquery mouseent/leave to make div appears

    - by Blake
    I am looking to hover over my list item and have an effect similar to something like facebook chat is my best example..I am able to get the first div to appear but I believe this may be a selector issue because I cant get the rest working properly html <ul id="menu_seo" class="menu"> <li id="menu-seo"><span class="arrowout1"></span>SEO</li> <li id="menu-siteaudits"><span class="arrowout2"></span>Site Audits </li> <li id="menu-linkbuilding"><span class="arrowout3"></span>Link-Building</li> <li id="menu-localseo"><span class="arrowout4"></span>Local SEO</li> </ul> <div id="main_content"> <div id="menu-seo-desc"> <p>SEO management begins with a full website diagnosis of current web strategy Adjustments are made to improve your site’s ability to rank higher on search engines and draw more traffic </p> </div> <div id="menu-seo-desc2"> <p>Usability & site architecture review, Search Engine accessibility and indexing, Keyword research & targeting and Conversion rate optimization </p> </div> </div> css #menu-seo-desc { height:125px; width:210px; background-color:red; border-color:#CCC #E8E8E8 #E8E8E8 #CCC; border-style:solid; border-width:1.5px; border-radius:5px; box-shadow: 1px 0 2px 0px #888; -moz-box-shadow: 1px 0 2px 0px #888; -webkit-box-shadow: 1px 0 2px 1px #888; position:absolute; top:220px; left:350px; display:none; } js <script> $(document).ready(function(){ <script> $(document).ready(function(){ $('#menu_seo').on('#menu-seo', { 'mouseenter': function() { $('#menu-seo-desc').fadeIn(600); $('#menu-seo-desc2').fadeIn(600); }, 'mouseleave': function() { $('#menu-seo-desc').fadeOut(300); $('#menu-seo-desc2').fadeOut(300); } }); }); </script> });

    Read the article

  • vim-powerline colors are out of whack in urxvt

    - by komidore64
    I have attached two images showing what my vim-powerline looks like. As you can see, something has happened to the colors and I cannot figure out how to fix it. I'm running Fedora 17 on a clean install with i3 (default config) and urxvt. Here is my bashrc: # .bashrc if [[ "$(uname)" != "Darwin" ]]; then # non mac os x # source global bashrc if [[ -f "/etc/bashrc" ]]; then . /etc/bashrc fi export TERM='xterm-256color' # probably shouldn't do this fi # bash prompt with colors # [ <user>@<hostname> <working directory> {current git branch (if you're in a repo)} ] # ==> PS1="\[\e[1;33m\][ \u\[\e[1;37m\]@\[\e[1;32m\]\h\[\e[1;33m\] \W\$(git branch 2> /dev/null | grep -e '\* ' | sed 's/^..\(.*\)/ {\[\e[1;36m\]\1\[\e[1;33m\]}/') ]\[\e[0m\]\n==> " # execute only in Mac OS X if [[ "$(uname)" == 'Darwin' ]]; then # if OS X has a $HOME/bin folder, then add it to PATH if [[ -d "$HOME/bin" ]]; then export PATH="$PATH:$HOME/bin" fi alias ls='ls -G' # ls with colors fi alias ll='ls -lah' # long listing of all files with human readable file sizes alias tree='tree -C' # turns on coloring for tree command alias mkdir='mkdir -p' # create parent directories as needed alias vim='vim -p' # if more than one file, open files in tabs export EDITOR='vim' # super-secret work stuff if [[ -f "$HOME/.workbashrc" ]]; then . $HOME/.workbashrc fi # Add RVM to PATH for scripting if [[ -d "$HOME/.rvm/bin" ]]; then # if installed PATH=$PATH:$HOME/.rvm/bin fi and my Xdefaults: ! URxvt config ! colors! URxvt.background: #101010 URxvt.foreground: #ededed URxvt.cursorColor: #666666 URxvt.color0: #2E3436 URxvt.color8: #555753 URxvt.color1: #993C3C URxvt.color9: #BF4141 URxvt.color2: #3C993C URxvt.color10: #41BF41 URxvt.color3: #99993C URxvt.color11: #BFBF41 URxvt.color4: #3C6199 URxvt.color12: #4174FB URxvt.color5: #993C99 URxvt.color13: #BF41BF URxvt.color6: #3C9999 URxvt.color14: #41BFBF URxvt.color7: #D3D7CF URxvt.color15: #E3E3E3 ! options URxvt*loginShell: true URxvt*font: xft:DejaVu Sans Mono for Powerline:antialias=true:size=12 URxvt*saveLines: 8192 URxvt*scrollstyle: plain URxvt*scrollBar_right: true URxvt*scrollTtyOutput: true URxvt*scrollTtyKeypress: true URxvt*urlLauncher: google-chrome and finally my vimrc set nocompatible set dir=~/.vim/ " set one place for vim swap files " vundler for vim plugins ---- filetype off set rtp+=~/.vim/bundle/vundle call vundle#rc() Bundle 'gmarik/vundle' Bundle 'tpope/vim-surround' Bundle 'greyblake/vim-preview' Bundle 'Lokaltog/vim-powerline' Bundle 'tpope/vim-endwise' Bundle 'kien/ctrlp.vim' " ---------------------------- syntax enable filetype plugin indent on " Powerline ------------------ set noshowmode set laststatus=2 let g:Powerline_symbols = 'fancy' " show fancy symbols (requires patched font) set encoding=utf-8 " ---------------------------- " ctrlp ---------------------- let g:ctrlp_open_multiple_files = 'tj' " open multiple files in additional tabs let g:ctrlp_show_hidden = 1 " include dotfiles and dotdirs in ctrlp indexing let g:ctrlp_prompt_mappings = { \ 'AcceptSelection("e")': ['<c-t>'], \ 'AcceptSelection("t")': ['<cr>', '<2-LeftMouse>'], \ } " remap <cr> to open file in a new tab " ---------------------------- set showcmd set tabpagemax=100 set hlsearch set incsearch set nowrapscan set ignorecase set smartcase set ruler set tabstop=4 set shiftwidth=4 set expandtab set wildmode=list:longest autocmd BufWritePre * :%s/\s\+$//e "remove trailing whitespace " :REV to "revert" file to state of the most recent save command REV earlier 1f " disable netrw -------------- let g:loaded_netrw = 1 let g:loaded_netrwPlugin = 1 " ---------------------------- Any guidance as to fixing the statusline would be fantastic. I've found a github issue outlining almost the exact same problem, but the solution was never posted. Thank you.

    Read the article

  • When searching in Outlook, encountered this error "Instant Search encountered a problem while trying

    - by Imagineer
    Sometime when searching for certain keyword, I get this error "Instant Search encountered a problem while trying to display search results. Modifying your query may resolve this problem." I have enabled Outlook logging to determine what is the error as suggested by someone in other forum. but I haven't had a clue how to decipher it. 2010.05.11 09:38:10 <<<< Logging Started (level is LTF_TRACE) >>>> 2010.05.11 09:38:10 HELPER::Initialize called 2010.05.11 09:38:10 Initializing: Finding a Transport 2010.05.11 09:38:10 MAPI XP Call: XPProviderInit in EMSMDB.DLL, hr = 0x00000000 2010.05.11 09:38:10 MAPI XP Call: TransportLogon, hr = 0x8004011d 2010.05.11 09:38:10 MAPI XP Call: Shutdown, hr = 0x00000000 2010.05.11 09:38:10 MAPI XP Call: XPProviderInit in EMSMDB.DLL, hr = 0x00000000 2010.05.11 09:38:10 MAPI Status: (-- -- ---/--- -- ---) 2010.05.11 09:38:10 MAPI XP Call: TransportLogon, hr = 0x00000000 2010.05.11 09:38:10 Initializing: Found a transport, Error code = 0x00000000 2010.05.11 09:38:10 MAPI XP Call: AddressTypes, hr = 0x00000000, cAddrs = 4, cUids = 1 2010.05.11 09:38:10 MAPI XP Call: RegisterOptions, hr = 0x00000000, cOptions = 2 2010.05.11 09:38:10 MAPI Status: (IN -- ---/OUT -- ---) 2010.05.11 09:38:10 MAPI XP Call: TransportNotify(BEGIN_IN|BEGIN_OUT), hr = 0x00000000 2010.05.11 09:38:10 HELPER::Initialize done, Error code = 0x00000000 2010.05.11 09:38:10 HELPER::GetCapabilities called, Error code = 0x00000000 2010.05.11 09:38:10 Microsoft Exchange: Synch operation started (flags = 00000031) 2010.05.11 09:38:10 Microsoft Exchange: StartImport(flags = 00000000, max msg = ffffffff): full items 2010.05.11 09:38:10 Microsoft Exchange: UploadItems: 0 messages to send 2010.05.11 09:38:11 Starting the Spooling Cycle 2010.05.11 09:38:11 MAPI Status: (IN fl ---/OUT -- ---) 2010.05.11 09:38:11 MAPI XP Call: FlushQueues, hr = 0x00000000, ulFlushFlags = 0x0000001c 2010.05.11 09:38:11 MAPI XP Call: Poll, hr = 0x00000000, cPollCount = 855 2010.05.11 09:38:11 Progress: Receiving message (message 1 out of 856, size unknown) 2010.05.11 09:38:11 Downloading one message 2010.05.11 09:38:11 Transport tightly coupled with store, download is NOOP 2010.05.11 09:38:11 Downloading done, Error code = 0x8004010f 2010.05.11 09:38:11 MAPI Status: (IN -- ---/OUT -- ---) 2010.05.11 09:38:11 FINISHED MAPI TASK 2010.05.11 09:38:11 Microsoft Exchange: ReportStatus: RSF_COMPLETED, hr = 0x00000000 2010.05.11 09:38:11 Finishing the Spooling Cycle, Error code = 0x00000000 2010.05.11 09:38:11 EXECUTING EndSession MAPI TASK 2010.05.11 09:38:11 Starting the Simplified Transfer Cycle 2010.05.11 09:38:11 MAPI XP Call: Poll, hr = 0x00000000, iMsgsReceived = 0, cPollCount = 855 2010.05.11 09:38:11 Progress: Receiving message (message 1 out of 856, size unknown) 2010.05.11 09:38:11 Downloading one message 2010.05.11 09:38:11 MAPI Status: (IN -- act/OUT -- ---) 2010.05.11 09:38:11 MAPI Status: (IN -- ---/OUT -- ---) 2010.05.11 09:38:11 Downloading done, Error code = 0x8004010f 2010.05.11 09:38:11 Finishing the Spooling Cycle, Error code = 0x00000000 2010.05.11 09:38:11 FINISHED MAPI TASK 2010.05.11 09:38:11 Microsoft Exchange: ReportStatus: RSF_COMPLETED, hr = 0x00000000 2010.05.11 09:38:11 Microsoft Exchange: Synch operation completed 2010.05.11 10:08:15 Microsoft Exchange: Synch operation started (flags = 00000031) 2010.05.11 10:08:15 Microsoft Exchange: StartImport(flags = 00000000, max msg = ffffffff): full items 2010.05.11 10:08:15 Microsoft Exchange: UploadItems: 0 messages to send 2010.05.11 10:08:16 Starting the Spooling Cycle 2010.05.11 10:08:16 MAPI Status: (IN fl ---/OUT -- ---) 2010.05.11 10:08:16 MAPI XP Call: FlushQueues, hr = 0x00000000, ulFlushFlags = 0x0000001c 2010.05.11 10:08:16 MAPI XP Call: Poll, hr = 0x00000000, cPollCount = 858 2010.05.11 10:08:16 Progress: Receiving message (message 1 out of 859, size unknown) 2010.05.11 10:08:16 Downloading one message 2010.05.11 10:08:16 Transport tightly coupled with store, download is NOOP 2010.05.11 10:08:16 Downloading done, Error code = 0x8004010f 2010.05.11 10:08:16 MAPI Status: (IN -- ---/OUT -- ---) 2010.05.11 10:08:16 FINISHED MAPI TASK 2010.05.11 10:08:16 Microsoft Exchange: ReportStatus: RSF_COMPLETED, hr = 0x00000000 2010.05.11 10:08:16 Finishing the Spooling Cycle, Error code = 0x00000000 2010.05.11 10:08:16 EXECUTING EndSession MAPI TASK 2010.05.11 10:08:16 Starting the Simplified Transfer Cycle 2010.05.11 10:08:16 MAPI XP Call: Poll, hr = 0x00000000, iMsgsReceived = 0, cPollCount = 858 2010.05.11 10:08:16 Progress: Receiving message (message 1 out of 859, size unknown) 2010.05.11 10:08:16 Downloading one message 2010.05.11 10:08:16 MAPI Status: (IN -- act/OUT -- ---) 2010.05.11 10:08:16 MAPI Status: (IN -- ---/OUT -- ---) 2010.05.11 10:08:16 Downloading done, Error code = 0x8004010f 2010.05.11 10:08:16 Finishing the Spooling Cycle, Error code = 0x00000000 2010.05.11 10:08:16 FINISHED MAPI TASK 2010.05.11 10:08:16 Microsoft Exchange: ReportStatus: RSF_COMPLETED, hr = 0x00000000 2010.05.11 10:08:16 Microsoft Exchange: Synch operation completed 2010.05.11 10:09:48 HELPER::Uninitialize called 2010.05.11 10:09:48 MAPI Status: (-- -- ---/--- -- ---) 2010.05.11 10:09:48 MAPI XP Call: TransportNotify(END_IN|END_OUT), hr = 0x00000000 2010.05.11 10:09:48 MAPI XP Call: TransportLogoff in EMSMDB.DLL, hr = 0x00000000 2010.05.11 10:09:48 MAPI XP Call: Shutdown, hr = 0x00000000 2010.05.11 10:09:48 Resource manager terminated I'm running Outlook 2007 SP1 in Citrix environment and should be running in Cache Mode. In my Outlook Tools-Options-Search Option, there is nothing under indexing. Any help is greatly appreciated! Thank you.

    Read the article

  • SQL SERVER – Developer Training Resources and Summary Roundup

    - by pinaldave
    It is always pleasure for any author when other renowned authors in the industry write about you. Earlier I wrote a five part blog series on Developer Training and I have received a phenomenal response to the series. I have received plenty of comments, questions and feedback. I thought it would be nice to sum up the whole series as well answer a few of the questions received. Quick Recap Developer Training - Importance and Significance - Part 1 In this part we discussed the importance of training in the real world. The most important and valuable resource any company is its employee. Employees who have been well-trained will be better at their jobs and produce a better product.  An employee who is well trained obviously knows more about their job and all the technical aspects. I have a very high opinion about training employees and it is the most important task. Developer Training – Employee Morals and Ethics – Part 2 In this part we discussed the most crucial components of training. Often employees are expecting the company to pay for their training and the company expresses no interest in training the employee. Quite often training expenses are the real issue for both the employee and employer. There are companies that pay for 100% of the expenses and there are employees who opt for training on their own expense during their personal time. Training is often looked at as vacation by employee and employers and we need to change this mind-set. One of the ways is to report back the learning to your manager and implement newly learned knowledge in day-to-day work. Developer Training – Difficult Questions and Alternative Perspective - Part 3 This part was the most difficult to write as I tried to address a few difficult questions and answers. Training is such a sensitive issue that many developers when not receiving chance for training think about leaving the organization. The manager often feels pressure to accommodate every single employee for training even though his training budget is limited. It is indeed the responsibility of the developer to get maximum advantage from the training. Training immediately helps organizations but stays as a part of an employee’s knowledge forever. Developer Training – Various Options for Developer Training – Part 4 In this part I tried to explore a few methods and options for training. The generic feedback I received on this blog post was short and I should have explored each of the subject of the training in details. I believe there are two big buckets of training 1) Instructor Lead Training and 2) Self Lead Training. The common element between both the methods is “learning material”. Learning material can be of any format – videos, books, paper notes or just a plain black board. Instructor-led training is a very effective mode but not possible every single time. During the course of the developer’s career, one has to learn lots of new technology and it is almost impossible to have a quality trainer available on that subject at that time. Books are most effective and proven methods, however, it always helps if someone explains the concepts of the book with a demonstration. In recent times I have started to believe in online trainings which leads to a hybrid experience. Online trainings take the best part of the books and the best part of the instructor-led training and gives effective training in a matter of hours. Developer Training – A Conclusive Summary- Part 5 In this part, I shared what I was continuously thinking about developer training. There is no better teacher than oneself. There is no better motivation than a personal desire to learn new technology. Honestly there is nothing more personal learning. That “change is the only constant” and “adapt & overcome” are the essential lessons of life. One cannot stop the learning and resist the change. In the IT industry “ego of knowing all” and the “resistance to change” are the most challenging issues. Once someone overcomes them, life is much easier. I believe that proper and appropriate high quality training can help to address the burning issues. Opinion of Friends I invited a few of my friends to express their opinion about developer training and here are their opinions. I am listing them here in the order of the blog post publishing date. Nakul Vachhrajani - Developer Trainings-Importance, Benefits, Tips and follow-up Nakul’s sums of many of the concepts which are complementary to my blog posts. Nakul addresses the burning question of developer training with different angles. I am personally very impressed by his following statement - “Being skilled does not mean having just a stack of certifications, but it also means having an understanding about the internals of the products that you are working on – and using that knowledge to improve the efficiency & productivity at the workplace in turn resulting in better products, better consulting abilities and a happier self.” Nakul also suggests the online training options of Pluralsight. Vinod Kumar - Training–a necessity or bonus Vinod Kumar comes up with excellent follow up on developer training. Vinod is known for his inspirational writing about SQL Server. Vinod starts with a story of a student who is extremely eager to learn the wisdom of life from a monk but the monk does not accept him as a disciple for a long time. The conversation between student and monk is indeed an essence of all learning. We all want to learn quickly and be successful but the most important thing in life is to have the right attitude towards learning and more so towards life. The blog post end with a very important thought about how to avoid the famous excuse – “I don’t have enough time.” Ritesh Shah - Training – useful or useless? Ritesh brings up very important concept related to training. Ritesh in his meticulous style explains why training is an important and lifelong process. Training must not stop at any age but should continue forever. The moment training stops, progress stops along with. Paras Doshi - Professional Development Resource Paras is known for his to–the-point writing, and has summarized the five part series very precisely. He read the five part series and created a digest summary of the blog post. If you are in a rush and have no time to read my five series – I suggest you read his blog post. Training Resources I am often asked what the best resources for learning new technology are. This is the most difficult question EVER. There are plenty of good training resources available. When it is about training our needs are different, our preference of learning is different and we all have an opinion. Additionally, we all are located in different geographic locations worldwide and there is no way one solution will fit all. However, let me list a few of the training resources which I have built so far and you can consume them if you find it relevant to your need. SQL Server Books SQL Server Interview Questions and Answers SQL Wait Stats SQL Programming Joes 2 Pros SQL Server Video Tutorials SQL Server Questions and Answers SQL Server Performance: Indexing Basics SQL Server Performance: Introduction to Query Tuning SQL in Sixty Seconds Series of Sixty Seconds Learning Video on YouTube Trust me worldwide web is very big and there are plenty of high quality learning materials available worldwide – trainer-led as well online. I suggest you explore various options and make the best choice for yourself. Remember, training is your personal journey and it should never stop. Are you ready? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Developer Training, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • ASP.NET MVC 3: Implicit and Explicit code nuggets with Razor

    - by ScottGu
    This is another in a series of posts I’m doing that cover some of the new ASP.NET MVC 3 features: New @model keyword in Razor (Oct 19th) Layouts with Razor (Oct 22nd) Server-Side Comments with Razor (Nov 12th) Razor’s @: and <text> syntax (Dec 15th) Implicit and Explicit code nuggets with Razor (today) In today’s post I’m going to discuss how Razor enables you to both implicitly and explicitly define code nuggets within your view templates, and walkthrough some code examples of each of them.  Fluid Coding with Razor ASP.NET MVC 3 ships with a new view-engine option called “Razor” (in addition to the existing .aspx view engine).  You can learn more about Razor, why we are introducing it, and the syntax it supports from my Introducing Razor blog post. Razor minimizes the number of characters and keystrokes required when writing a view template, and enables a fast, fluid coding workflow. Unlike most template syntaxes, you do not need to interrupt your coding to explicitly denote the start and end of server blocks within your HTML. The Razor parser is smart enough to infer this from your code. This enables a compact and expressive syntax which is clean, fast and fun to type. For example, the Razor snippet below can be used to iterate a collection of products and output a <ul> list of product names that link to their corresponding product pages: When run, the above code generates output like below: Notice above how we were able to embed two code nuggets within the content of the foreach loop.  One of them outputs the name of the Product, and the other embeds the ProductID within a hyperlink.  Notice that we didn’t have to explicitly wrap these code-nuggets - Razor was instead smart enough to implicitly identify where the code began and ended in both of these situations.  How Razor Enables Implicit Code Nuggets Razor does not define its own language.  Instead, the code you write within Razor code nuggets is standard C# or VB.  This allows you to re-use your existing language skills, and avoid having to learn a customized language grammar. The Razor parser has smarts built into it so that whenever possible you do not need to explicitly mark the end of C#/VB code nuggets you write.  This makes coding more fluid and productive, and enables a nice, clean, concise template syntax.  Below are a few scenarios that Razor supports where you can avoid having to explicitly mark the beginning/end of a code nugget, and instead have Razor implicitly identify the code nugget scope for you: Property Access Razor allows you to output a variable value, or a sub-property on a variable that is referenced via “dot” notation: You can also use “dot” notation to access sub-properties multiple levels deep: Array/Collection Indexing: Razor allows you to index into collections or arrays: Calling Methods: Razor also allows you to invoke methods: Notice how for all of the scenarios above how we did not have to explicitly end the code nugget.  Razor was able to implicitly identify the end of the code block for us. Razor’s Parsing Algorithm for Code Nuggets The below algorithm captures the core parsing logic we use to support “@” expressions within Razor, and to enable the implicit code nugget scenarios above: Parse an identifier - As soon as we see a character that isn't valid in a C# or VB identifier, we stop and move to step 2 Check for brackets - If we see "(" or "[", go to step 2.1., otherwise, go to step 3  Parse until the matching ")" or "]" (we track nested "()" and "[]" pairs and ignore "()[]" we see in strings or comments) Go back to step 2 Check for a "." - If we see one, go to step 3.1, otherwise, DO NOT ACCEPT THE "." as code, and go to step 4 If the character AFTER the "." is a valid identifier, accept the "." and go back to step 1, otherwise, go to step 4 Done! Differentiating between code and content Step 3.1 is a particularly interesting part of the above algorithm, and enables Razor to differentiate between scenarios where an identifier is being used as part of the code statement, and when it should instead be treated as static content: Notice how in the snippet above we have ? and ! characters at the end of our code nuggets.  These are both legal C# identifiers – but Razor is able to implicitly identify that they should be treated as static string content as opposed to being part of the code expression because there is whitespace after them.  This is pretty cool and saves us keystrokes. Explicit Code Nuggets in Razor Razor is smart enough to implicitly identify a lot of code nugget scenarios.  But there are still times when you want/need to be more explicit in how you scope the code nugget expression.  The @(expression) syntax allows you to do this: You can write any C#/VB code statement you want within the @() syntax.  Razor will treat the wrapping () characters as the explicit scope of the code nugget statement.  Below are a few scenarios where we could use the explicit code nugget feature: Perform Arithmetic Calculation/Modification: You can perform arithmetic calculations within an explicit code nugget: Appending Text to a Code Expression Result: You can use the explicit expression syntax to append static text at the end of a code nugget without having to worry about it being incorrectly parsed as code: Above we have embedded a code nugget within an <img> element’s src attribute.  It allows us to link to images with URLs like “/Images/Beverages.jpg”.  Without the explicit parenthesis, Razor would have looked for a “.jpg” property on the CategoryName (and raised an error).  By being explicit we can clearly denote where the code ends and the text begins. Using Generics and Lambdas Explicit expressions also allow us to use generic types and generic methods within code expressions – and enable us to avoid the <> characters in generics from being ambiguous with tag elements. One More Thing….Intellisense within Attributes We have used code nuggets within HTML attributes in several of the examples above.  One nice feature supported by the Razor code editor within Visual Studio is the ability to still get VB/C# intellisense when doing this. Below is an example of C# code intellisense when using an implicit code nugget within an <a> href=”” attribute: Below is an example of C# code intellisense when using an explicit code nugget embedded in the middle of a <img> src=”” attribute: Notice how we are getting full code intellisense for both scenarios – despite the fact that the code expression is embedded within an HTML attribute (something the existing .aspx code editor doesn’t support).  This makes writing code even easier, and ensures that you can take advantage of intellisense everywhere. Summary Razor enables a clean and concise templating syntax that enables a very fluid coding workflow.  Razor’s ability to implicitly scope code nuggets reduces the amount of typing you need to perform, and leaves you with really clean code. When necessary, you can also explicitly scope code expressions using a @(expression) syntax to provide greater clarity around your intent, as well as to disambiguate code statements from static markup. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • Convert a DVD Movie Directly to AVI with FairUse Wizard 2.9

    - by DigitalGeekery
    Are you looking for a way to backup your DVD movie collection to AVI?  Today we’ll show you how to rip a DVD movie directly to AVI with FairUse Wizard. About FairUse Wizard FairUse Wizard 2.9 uses the DivX, Xvid, or h.264 codec to convert DVD to an AVI file. It comes in both a free version and commercial version. The free, or “Light” version, can create files up 700MB while the commercial version can output a 1400MB file. This will allow you to back up your movies to CD, or even multiple movies on a single DVD. FairUse Wizard states that it does not work on copy protected discs, but we’ve seen it work on all but some of the most recent copy protection. For this tutorial we’re using the free Light Edition to convert a DVD to AVI. They also offer a commercial version that you can get for $29.99 and it offers even more encoding possibilities for converting video to you portable digital devices. Installation and Configuration Download and install FairUse Wizard. (Download link below). Once the install is complete, open FairUse Wizard by going to Start > All Programs >  FairUse Wizard 2 >  FairUse Wizard 2.   FairUse Wizard will open on the new project screen. Select “Create a new project” and type a project name into the text box. This will be used as the file output name.  Ex: A project name of Simpsons Movie will give you an output file of Simpsons Movie.avi.   Next, browse for a destination folder for the output file and temp files. Note that you will need a minimum of 6 GB of free disk space for the conversion process. Note: Much of that 6 GB will be used for temporary files that we will delete after the conversion process.   Click on the Options button at the bottom.   Under Preferences, choose your preferred video codec and file output size. XviD and x264 are installed by default. If you prefer to use DivX, you will have to install it separately. Also note the “Two pass” option. Checking the “Two pass” box will encode your video twice for higher quality, but will take more time. Un-checking the box will speed up the conversion process.   Under Audio track, note that English subtitles are enabled by default, so to remove the subtitles, you will need to change the dropdown list so it shows only a dash (-). You can also select “Use TV Mode” if your primary playback will be on a 4:3 TV screen. Click “Next.” Full Auto Mode vs. Manual Mode You should now be back to the initial screen. Next, we’ll need to determine whether or not we can use “Full Auto Mode” to convert the movie. The difference is that “Full Auto Mode” will automatically perform a few steps that you will otherwise have to do manually. If you choose the “Full Auto Mode” option, FairUse Wizard will look for the video on the DVD with the longest duration and assume it is the chain that it should convert to AVI. It’s possible, however, your disc may contain a few chains of similar size, such as a theatrical cut and director’s cut, and the longest chain may not be the one you wish to convert. Make sure that “Full auto mode” is not checked yet, and click “Next.”   FairUse Wizard will parse the IFO files and display all video chains longer than 60  seconds. In most cases, you will only find that the largest chain is the one closely matching the duration of the movie. In these instances, you can use “Full Auto Mode.” If you find more than one chain that are close in duration to the length of the movie, consult the literature on the DVD case, or search online, to find the actual running time of the movie. If the proper file chain is not the longest chain, you won’t be able to use “Full Auto Mode.”   Full Auto Mode To use “Full Auto Mode,” simply click the “Back” button to return to the initial screen Now, place a check in the “Full auto mode” check box. Click “Next.” You will then be prompted to chose your DVD drive, then click “OK.” FairUse Wizard will parse the IFO files… … and then prompt you to Select your drive that contains the DVD one more time before beginning the conversion process. Click “OK.”   Manual Mode If you cannot (or don’t wish to) use Full Auto Mode, choose the appropriate video chain and click “Next.” FairUse Wizard will first go through the process of indexing the video. Note: If you get a runtime error during this portion of the process, it likely means that FairUse Wizard cannot handle the copy protection, and thus cannot convert the DVD. FairUse Wizard will automatically detect a cropping region. If necessary, you can edit the cropping region by adjusting the cropping region settings to the left. Click “Next.” Next, click “Auto Detect” to choose the proper field combination. Click “OK” on the pop up window that displays your Field Mode. Then click “Next.” This next screen is mainly comprised of settings from the Options screen. You can make changes at this point such as codec or output size. Click “Next” when ready.   Video Conversion Now the video conversion process will begin. This may take a few hours depending on your system’s hardware. Note: There is a check box to “Shutdown computer when done” if you choose to run the conversion overnight or before leaving for work. The first phase will be video encoding… Then the audio… If you chose the “Two Pass” option, your video video will be encoded again on 2nd pass. Then you’re finished. Unfortunately, FairUse Wizard doesn’t clean up after itself very well. After the process is complete, you’ll want to browse to your output directory and delete all the temporary files as they take up a considerable amount of hard drive space. Now you’re ready to enjoy your movie. Conclusion FairUse Wizard is a nice way to backup your DVD movies to good quality .avi files. You can store them on your hard drive, watch them on a media PC, or burn them to disc. Many DVD players even allow for playback of DivX or XviD encoded video from a CD or DVD. For those of you with children, you can burn that AVI file to CD for your kids, and keep your original DVDs stored safely out of harms way. Download Download FairUse Wizard 2.9 LE Similar Articles Productive Geek Tips Kantaris is a Unique Media Player Based on VLCHow to Make/Edit a movie with Windows Movie Maker in Windows VistaAutomatically Mount and View ISO files in Windows 7 Media CenterTune Your ClearType Font Settings in Windows VistaAdd Images and Metadata to Windows 7 Media Center Movie Library TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional Make your Joomla & Drupal Sites Mobile with OSMOBI Integrate Twitter and Delicious and Make Life Easier Design Your Web Pages Using the Golden Ratio Worldwide Growth of the Internet How to Find Your Mac Address Use My TextTools to Edit and Organize Text

    Read the article

  • Using a "white list" for extracting terms for Text Mining

    - by [email protected]
    In Part 1 of my post on "Generating cluster names from a document clustering model" (part 1, part 2, part 3), I showed how to build a clustering model from text documents using Oracle Data Miner, which automates preparing data for text mining. In this process we specified a custom stoplist and lexer and relied on Oracle Text to identify important terms.  However, there is an alternative approach, the white list, which uses a thesaurus object with the Oracle Text CTXRULE index to allow you to specify the important terms. INTRODUCTIONA stoplist is used to exclude, i.e., black list, specific words in your documents from being indexed. For example, words like a, if, and, or, and but normally add no value when text mining. Other words can also be excluded if they do not help to differentiate documents, e.g., the word Oracle is ubiquitous in the Oracle product literature. One problem with stoplists is determining which words to specify. This usually requires inspecting the terms that are extracted, manually identifying which ones you don't want, and then re-indexing the documents to determine if you missed any. Since a corpus of documents could contain thousands of words, this could be a tedious exercise. Moreover, since every word is considered as an individual token, a term excluded in one context may be needed to help identify a term in another context. For example, in our Oracle product literature example, the words "Oracle Data Mining" taken individually are not particular helpful. The term "Oracle" may be found in nearly all documents, as with the term "Data." The term "Mining" is more unique, but could also refer to the Mining industry. If we exclude "Oracle" and "Data" by specifying them in the stoplist, we lose valuable information. But it we include them, they may introduce too much noise. Still, when you have a broad vocabulary or don't have a list of specific terms of interest, you rely on the text engine to identify important terms, often by computing the term frequency - inverse document frequency metric. (This is effectively a weight associated with each term indicating its relative importance in a document within a collection of documents. We'll revisit this later.) The results using this technique is often quite valuable. As noted above, an alternative to the subtractive nature of the stoplist is to specify a white list, or a list of terms--perhaps multi-word--that we want to extract and use for data mining. The obvious downside to this approach is the need to specify the set of terms of interest. However, this may not be as daunting a task as it seems. For example, in a given domain (Oracle product literature), there is often a recognized glossary, or a list of keywords and phrases (Oracle product names, industry names, product categories, etc.). Being able to identify multi-word terms, e.g., "Oracle Data Mining" or "Customer Relationship Management" as a single token can greatly increase the quality of the data mining results. The remainder of this post and subsequent posts will focus on how to produce a dataset that contains white list terms, suitable for mining. CREATING A WHITE LIST We'll leverage the thesaurus capability of Oracle Text. Using a thesaurus, we create a set of rules that are in effect our mapping from single and multi-word terms to the tokens used to represent those terms. For example, "Oracle Data Mining" becomes "ORACLEDATAMINING." First, we'll create and populate a mapping table called my_term_token_map. All text has been converted to upper case and values in the TERM column are intended to be mapped to the token in the TOKEN column. TERM                                TOKEN DATA MINING                         DATAMINING ORACLE DATA MINING                  ORACLEDATAMINING 11G                                 ORACLE11G JAVA                                JAVA CRM                                 CRM CUSTOMER RELATIONSHIP MANAGEMENT    CRM ... Next, we'll create a thesaurus object my_thesaurus and a rules table my_thesaurus_rules: CTX_THES.CREATE_THESAURUS('my_thesaurus', FALSE); CREATE TABLE my_thesaurus_rules (main_term     VARCHAR2(100),                                  query_string  VARCHAR2(400)); We next populate the thesaurus object and rules table using the term token map. A cursor is defined over my_term_token_map. As we iterate over  the rows, we insert a synonym relationship 'SYN' into the thesaurus. We also insert into the table my_thesaurus_rules the main term, and the corresponding query string, which specifies synonyms for the token in the thesaurus. DECLARE   cursor c2 is     select token, term     from my_term_token_map; BEGIN   for r_c2 in c2 loop     CTX_THES.CREATE_RELATION('my_thesaurus',r_c2.token,'SYN',r_c2.term);     EXECUTE IMMEDIATE 'insert into my_thesaurus_rules values                        (:1,''SYN(' || r_c2.token || ', my_thesaurus)'')'     using r_c2.token;   end loop; END; We are effectively inserting the token to return and the corresponding query that will look up synonyms in our thesaurus into the my_thesaurus_rules table, for example:     'ORACLEDATAMINING'        SYN ('ORACLEDATAMINING', my_thesaurus)At this point, we create a CTXRULE index on the my_thesaurus_rules table: create index my_thesaurus_rules_idx on        my_thesaurus_rules(query_string)        indextype is ctxsys.ctxrule; In my next post, this index will be used to extract the tokens that match each of the rules specified. We'll then compute the tf-idf weights for each of the terms and create a nested table suitable for mining.

    Read the article

  • SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

    - by pinaldave
    Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

    - by Pinal Dave
    In SQL Server 2014 Microsoft has introduced a new database engine component called In-Memory OLTP aka project “Hekaton” which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory resident data. In-memory OLTP helps us create memory optimized tables which in turn offer significant performance improvement for our typical OLTP workload. The main objective of memory optimized table is to ensure that highly transactional tables could live in memory and remain in memory forever without even losing out a single record. The most significant part is that it still supports majority of our Transact-SQL statement. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. This engine is designed to ensure higher concurrency and minimal blocking. In-Memory OLTP alleviates the issue of locking, using a new type of multi-version optimistic concurrency control. It also substantially reduces waiting for log writes by generating far less log data and needing fewer log writes. Points to remember Memory-optimized tables refer to tables using the new data structures and key words added as part of In-Memory OLTP. Disk-based tables refer to your normal tables which we used to create in SQL Server since its inception. These tables use a fixed size 8 KB pages that need to be read from and written to disk as a unit. Natively compiled stored procedures refer to an object Type which is new and is supported by in-memory OLTP engine which convert it into machine code, which can further improve the data access performance for memory –optimized tables. Natively compiled stored procedures can only reference memory-optimized tables, they can’t be used to reference any disk –based table. Interpreted Transact-SQL stored procedures, which is what SQL Server has always used. Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables. Interop refers to interpreted Transact-SQL that references memory-optimized tables. Using In-Memory OLTP In-Memory OLTP engine has been available as part of SQL Server 2014 since June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014 hence they are not available with 32-bit editions. Creating Databases Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables, and although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables: CREATE DATABASE InMemoryDB ON PRIMARY(NAME = [InMemoryDB_data], FILENAME = 'D:\data\InMemoryDB_data.mdf', size=500MB), FILEGROUP [SampleDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [InMemoryDB_mod_dir], FILENAME = 'S:\data\InMemoryDB_mod_dir'), (NAME = [InMemoryDB_mod_dir], FILENAME = 'R:\data\InMemoryDB_mod_dir') LOG ON (name = [SampleDB_log], Filename='L:\log\InMemoryDB_log.ldf', size=500MB) COLLATE Latin1_General_100_BIN2; Above example code creates files on three different drives (D:  S: and R:) for the data files and in memory storage so if you would like to run this code kindly change the drive and folder locations as per your convenience. Also notice that binary collation was specified as Windows (non-SQL). BIN2 collation is the only collation support at this point for any indexes on memory optimized tables. It is also possible to add a MEMORY_OPTIMIZED_DATA file group to an existing database, use the below command to achieve the same. ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA; GO ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod; GO Creating Tables There is no major syntactical difference between creating a disk based table or a memory –optimized table but yes there are a few restrictions and a few new essential extensions. Essentially any memory-optimized table should use the MEMORY_OPTIMIZED = ON clause as shown in the Create Table query example. DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY) Memory-optimized table should always be defined with a DURABILITY value which can be either SCHEMA_AND_DATA or  SCHEMA_ONLY the former being the default. A memory-optimized table defined with DURABILITY=SCHEMA_ONLY will not persist the data to disk which means the data durability is compromised whereas DURABILITY= SCHEMA_AND_DATA ensures that data is also persisted along with the schema. Indexing Memory Optimized Table A memory-optimized table must always have an index for all tables created with DURABILITY= SCHEMA_AND_DATA and this can be achieved by declaring a PRIMARY KEY Constraint at the time of creating a table. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified. CREATE TABLE Mem_Table ( [Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), [City] VARCHAR(32) NULL, [State_Province] VARCHAR(32) NULL, [LastModified] DATETIME NOT NULL, ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); Now as you can see in the above query example we have used the clause MEMORY_OPTIMIZED = ON to make sure that it is considered as a memory optimized table and not just a normal table and also used the DURABILITY Clause= SCHEMA_AND_DATA which means it will persist data along with metadata and also you can notice this table has a PRIMARY KEY mentioned upfront which is also a mandatory clause for memory-optimized tables. We will talk more about HASH Indexes and BUCKET_COUNT in later articles on this topic which will be focusing more on Row and Index storage on Memory-Optimized tables. So stay tuned for that as well. Now as we covered the basics of Memory Optimized tables and understood the key things to remember while using memory optimized tables, let’s explore more using examples to understand the Performance gains using memory-optimized tables. I will be using the database which i created earlier in this article i.e. InMemoryDB in the below Demo Exercise. USE InMemoryDB GO -- Creating a disk based table CREATE TABLE dbo.Disktable ( Id INT IDENTITY, Name CHAR(40) ) GO CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id) GO -- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA CREATE TABLE dbo.Memorytable_durable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO -- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only CREATE TABLE dbo.Memorytable_nondurable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_only) GO -- Now insert 100000 records in dbo.Disktable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END The above 3 Inserts took 1.20 minutes, 54 secs, and 2 secs respectively to insert 100000 records on my machine with 8 Gb RAM. This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional business table and memory- optimized tables with Durability SCHEMA_ONLY is even faster as it does not bother persisting its data to disk which makes it supremely fast. Koenig Solutions is one of the few organizations which offer IT training on SQL Server 2014 and all its updates. Now, I leave the decision on using memory_Optimized tables on you, I hope you like this article and it helped you understand  the fundamentals of IN-Memory OLTP . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Koenig

    Read the article

< Previous Page | 39 40 41 42 43 44 45  | Next Page >