Search Results

Search found 76 results on 4 pages for 'cardinality'.

Page 3/4 | < Previous Page | 1 2 3 4  | Next Page >

  • Counting bits set in a .Net BitArray Class

    - by Sam
    I am implementing a library where I am extensively using the .Net BitArray class and need an equivalent to the Java BitSet.Cardinality() method, i.e. a method which returns the number of bits set. I was thinking of implementing it as an extension method for the BitArray class. The trivial implementation is to iterate and count the bits set (like below), but I wanted a faster implementation as I would be performing thousands of set operations and counting the answer. Is there a faster way than the example below? count = 0; for (int i = 0; i < mybitarray.Length; i++) { if (mybitarray [i]) count++; }

    Read the article

  • How to find the entity with the greatest primary key?

    - by simpatico
    I've an entity LearningUnit that has an int primary key. Actually, it has nothing more. Entity Concept has the following relationship with it: @ManyToOne @Size(min=1,max=7) private LearningUnit learningUnit; In a constructor of Concept I need to retrieve the LearningUnit with the greatest primary key. If no LearningUnit exists yet I instantiate one. I then set this.learningUnit to the retrieved/instantied. Finally, I call the empty constructor of Concept in a try-catch block, to have the entitymanager do the cardinality check. If an exception is thrown (I expect one in the case that already another 7 Concepts are referring to the same LearningUnit. In that case, I case instantiate a new LearningUnit with a new greater primary key. Please, also point out, if any, clear pitfalls in my outlined algorithm above.

    Read the article

  • SQL: Optimize insensive SELECTs on DateTime fields

    - by Fedyashev Nikita
    I have an application for scheduling certain events. And all these events must be reviewed after each scheduled time. So basically we have 3 tables: items(id, name) scheduled_items(id, item_id, execute_at - datetime) - item_id column has an index option. reviewed_items(id, item_id, created_at - datetime) - item_id column has an index option. So core function of the application is "give me any items(which are not yet reviewed) for the actual moment". How can I optimize this solution for speed(because it is very core business feature and not micro optimization)? I suppose that adding index to the datetime fields doesn't make any sense because the cardinality or uniqueness on that fields are very high and index won't give any(?) speed-up. Is it correct? What would you recommend? Should I try no-SQL? -- mysql -V 5.075 I use caching where it makes sence.

    Read the article

  • weird index behavior

    - by TasostheGreat
    I have set up my table with an index only on done_status(done_status =INT), when I use EXPLAIN SELECT * FROM reminder WHERE done_status=2 i get this back id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE reminder ALL done_status NULL NULL NULL 5 Using where but when I give this command EXPLAIN SELECT * FROM reminder WHERE done_status=1 that's what I get back: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE reminder ref done_status done_status 4 const 2 first time it shows me it uses 5 rows second time 2 rows I don't think the index works, if I understood it right first time it should give me 3 rows. What do I do wrong? SHOW INDEX FROM reminder: Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment reminder 1 done_status 1 done_status A 5 NULL NULL BTREE

    Read the article

  • T-SQL Improvements And Data Types in ms sql 2008

    - by Aamir Hasan
     Microsoft SQL Server 2008 is a new version released in the first half of 2008 introducing new properties and capabilities to SQL Server product family. All these new and enhanced capabilities can be defined as the classic words like secure, reliable, scalable and manageable. SQL Server 2008 is secure. It is reliable. SQL2008 is scalable and is more manageable when compared to previous releases. Now we will have a look at the features that are making MS SQL Server 2008 more secure, more reliable, more scalable, etc. in details.Microsoft SQL Server 2008 provides T-SQL enhancements that improve performance and reliability. Itzik discusses composable DML, the ability to declare and initialize variables in the same statement, compound assignment operators, and more reliable object dependency information. Table-Valued ParametersInserts into structures with 1-N cardinality problematicOne order -> N order line items"N" is variable and can be largeDon't want to force a new order for every 20 line itemsOne database round-trip / line item slows things downNo ARRAY data type in SQL ServerXML composition/decomposition used as an alternativeTable-valued parameters solve this problemTable-Valued ParametersSQL Server has table variablesDECLARE @t TABLE (id int);SQL Server 2008 adds strongly typed table variablesCREATE TYPE mytab AS TABLE (id int);DECLARE @t mytab;Parameters must use strongly typed table variables Table Variables are Input OnlyDeclare and initialize TABLE variable  DECLARE @t mytab;  INSERT @t VALUES (1), (2), (3);  EXEC myproc @t;Procedure must declare variable READONLY  CREATE PROCEDURE usetable (    @t mytab READONLY ...)  AS    INSERT INTO lineitems SELECT * FROM @t;    UPDATE @t SET... -- no!T-SQL Syntax EnhancementsSingle statement declare and initialize  DECLARE @iint = 4;Compound Assignment Operators  SET @i += 1;Row constructors  DECLARE @t TABLE (id int, name varchar(20));  INSERT INTO @t VALUES    (1, 'Fred'), (2, 'Jim'), (3, 'Sue');Grouping SetsGrouping Sets allow multiple GROUP BY clauses in a single SQL statementMultiple, arbitrary, sets of subtotalsSingle read pass for performanceNested subtotals provide ever better performanceGrouping Sets are an ANSI-standardCOMPUTE BY is deprecatedGROUPING SETS, ROLLUP, CUBESQL Server 2008 - ANSI-syntax ROLLUP and CUBEPre-2008 non-ANSI syntax is deprecatedWITH ROLLUP produces n+1 different groupings of datawhere n is the number of columns in GROUP BYWITH CUBE produces 2^n different groupingswhere n is the number of columns in GROUP BYGROUPING SETS provide a "halfway measure"Just the number of different groupings you needGrouping Sets are visible in query planGROUPING_ID and GROUPINGGrouping Sets can produce non-homogeneous setsGrouping set includes NULL values for group membersNeed to distinguish by grouping and NULL valuesGROUPING (column expression) returns 0 or 1Is this a group based on column expr. or NULL value?GROUPING_ID (a,b,c) is a bitmaskGROUPING_ID bits are set based on column expressions a, b, and cMERGE StatementMultiple set operations in a single SQL statementUses multiple sets as inputMERGE target USING source ON ...Operations can be INSERT, UPDATE, DELETEOperations based onWHEN MATCHEDWHEN NOT MATCHED [BY TARGET] WHEN NOT MATCHED [BY SOURCE]More on MERGEMERGE statement can reference a $action columnUsed when MERGE used with OUTPUT clauseMultiple WHEN clauses possible For MATCHED and NOT MATCHED BY SOURCEOnly one WHEN clause for NOT MATCHED BY TARGETMERGE can be used with any table sourceA MERGE statement causes triggers to be fired onceRows affected includes total rows affected by all clausesMERGE PerformanceMERGE statement is transactionalNo explicit transaction requiredOne Pass Through TablesAt most a full outer joinMatching rows = when matchedLeft-outer join rows = when not matched by targetRight-outer join rows = when not matched by sourceMERGE and DeterminismUPDATE using a JOIN is non-deterministicIf more than one row in source matches ON clause, either/any row can be used for the UPDATEMERGE is deterministicIf more than one row in source matches ON clause, its an errorKeeping Track of DependenciesNew dependency views replace sp_dependsViews are kept in sync as changes occursys.dm_sql_referenced_entitiesLists all named entities that an object referencesExample: which objects does this stored procedure use?sys.dm_sql_referencing_entities 

    Read the article

  • SQL SERVER – Learn SQL Server 2014 Online in a Day – My Latest Pluralsight Course

    - by Pinal Dave
    Click here watch SQL Server 2014 Administration New Features.  SQL Server 2014 was released earlier this year and it has been extremely popular in Microsoft world. Here is the announcement for everyone, who have been asking me to build a tutorial around SQL Server 2014. I have authored latest Pluralsight courses on the subject of SQL Server 2014. This course is 4 hours and 17 minutes long, but the best part is that this course contains all the latest features of SQL Server 2014. I have build this course with the assumption that DBA is familiar with earlier versions of SQL Server and wants to explore and learn new features of SQL Server 2014. The Challenge I Faced The biggest challenge I faced was how to come up with the outline for the course. The reason is that there are so many different features introduced in SQL Server 2014 that is will be difficult to cover each of the features in a single course. I wanted to cover the topics which are the most relevant and useful to developers, but in addition I also wanted to cover the topics which may be useful to develop if they know that they exists in the product. I finally decided to depend on blog readers and few of the SQL Experts. I reached out to selected 20 people via email and gave them a list of the topics which I should be covering in this course. They all work in different organizations and have a good understanding about the need of the DBA and Developers. Based on their feedback, I was able to come up with a very good outline which is currently very popular with Pluralsight library. Lots of people have asked me how was I able to come up with a course content outline so accurately. The credit for the same goes to the developers and DBA, who have voted in the topics and have helped me to build a very solid outline for the course. Outline of the Course Here is a quick outline for the course: Introduction Backup Enhancements Security Enhancements Columnstore Enhancements Online Data Operations Enhancements Enhancements with Microsoft Azure SSD Buffer Pool Extensions Resource Governor IO Miscellaneous Features Online Index Rebuilding Live Plans for Long Running Queries Transaction Durability Cardinality Estimation In Memory OLTP Optimization Well, I had a great fun working on the topics which I have mentioned in the outline. I am very confident that once you start with the course, you will indeed understand how each of the topics builds and presented. I have made sure that each of the topic has a vivid and clear story to begin with. I first explain the story and right after that I explain the concept. Who Should Attend This Course Everyone who has basic knowledge of SQL Server and wants to update themselves with SQL Server 2014. They should attend this course. One thing I have made sure that this course is easy to understand and I have decided complex subject into multiple parts. This way the learning is progressive and anyone with a poor knowledge of the subject can have enough time to understand the presented concept. Screenshot of the Course Here are few of the screenshot of the courses. How to Watch Video Course This course is available at Pluralsight, and you will need a valid login to Pluralsight. If you do not have Pluralsight login, you can quickly sign up for the FREE Trial. Click here watch SQL Server 2014 Administration New Features.  Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLAuthority News, T SQL, Video

    Read the article

  • Algorithm for tracking progress of controller method running in background

    - by SilentAssassin
    I am using Codeigniter framework for PHP on Windows platform. My problem is I am trying to track progress of a controller method running in background. The controller extracts data from the database(MySQL) then does some processing and then stores the results again in the database. The complete aforesaid process can be considered as a single task. A new task can be assigned while another task is running. The newly assigned task will be added in a queue. So if I can track progress of the controller, I can show status for each of these tasks. Like I can show "Pending" status for tasks in the queue, "In Progress" for tasks running and "Done" for tasks that are completed. Main Issue: Now first thing I need to find is an algorithm to track the progress of how much amount of execution the controller method has completed and that means tracking how much amount of method has completed execution. For instance, this PHP script tracks progress of array being counted. Here the current state and state after total execution are known so it is possible to track its progress. But I am not able to devise anything analogous to it in my case. Maybe what I am trying to achieve is programmtically not possible. If its not possible then suggest me a workaround or a completely new approach. If some details are pending you can mention them. Sorry for my ignorance this is my first post here. I welcome you to point out my mistakes. EDIT: Database outline: The URL(s) and keyword(s) are first entered by user which are stored in a database table called link_master and keyword_master respectively. Then keywords are extracted from all the links present in this table and compared with keywords entered by user and their frequency is calculated which is the final result. And the results are stored in another table called link_result. Now sub-links are extracted from the domain links and stored in a table called sub_link_master. Now again the keywords are extracted from these sub-links and the corresponding results are stored in a table called sub_link_result. The number of records cannot be defined beforehand as the number of links on any web page can be different. Only the cardinality of *link_result* table can be known which will be equal to multiplication of number of keyword(s) and URL(s) . I insert multiple records at a time using this resource. Controller outline: The controller extracts keywords from a web page and also extracts keywords from all the links present on that page. There is a method called crawlLink. I used Rolling Curl to extract keywords and web page content. It has callback function which I used for extracting keywords alongwith generating results and extracting valid sub-links. There is a insertResult method which stores results for links and sub-links in the respective tables. Yes, the processing depends on the number of records. The more the number of records, the more time it takes to execute: Consider this scenario: Number of Domain Links = 1 Number of Keywords = 3 Number of Domain Links Result generated = 3 (3 x 1 as described in the question) Number of Sub Links generated = 41 Number of Sub Links Result = 117 (41 x 3 = 123 but some links are not valid or searchable) Approximate time taken for above process to complete = 55 seconds. The above result is for a single link. I want to track the progress of the above results getting stored in database. When all results are stored, the task is complete. If results are getting stored, the task is In Progress. I am not clear how can I track this progress.

    Read the article

  • Rethinking Oracle Optimizer Statistics for P6 Part 2

    - by Brian Diehl
    In the previous post (Part 1), I tried to draw some key insights about the relationship between P6 and Oracle Optimizer Statistics.  The first is that average cardinality has the greatest impact on query optimization and that the particular queries generated by P6 are more likely to use this average during calculations. The second is that these are statistics that are unlikely to change greatly over the life of the application. Ultimately, our goal is to get the best query optimization possible.  Or is it? Stability No application administrator wants to get the call at 9am that their application users cannot get there work done because everything is running slow. This is a possibility with a regularly scheduled nightly collection of statistics. It may not just be slow performance, but a complete loss of service because one or more queries are optimized poorly. Ideally, this should not be the case. The database optimizer should make better decisions with more up-to-date data. Better statistics may give incremental performance benefit. However, this benefit must be balanced against the potential cost of system down time.  It is stability that we ultimately desire and not absolute optimal performance. We do want the benefit from more accurate statistics and better query plans, but not at the risk of an unusable system. As a result, I've developed the following methodology around managing database statistics for the P6 database.  1. No Automatic Re-Gathering - The daily, weekly, or other interval of statistic gathering is unlikely to be beneficial. Quite the opposite. It is more likely to cause problems. 2. Smart Re-Gathering - The time to collect statistics is when things have changed significantly. For a new installation of P6, this is happening more often because the data is growing from a few rows to thousands and more. But for a mature system, the data is not changing significantly from week-to-week. There are times to collect statistics: New releases of the application Changes in the underlying hardware or software versions (ex. new Oracle RDBMS version) When additional user groups are added. The new groups may use the software in significantly different ways. After significant changes in the data. This may be monthly, quarterly or yearly.  3. Always Test - If you take away one thing from this post, it would be to always have a plan to test after changing statistics. In reality, statistics can be collected as often as you desire provided there are tests in place to verify that performance is the same or better. These might be automated tests or simply a manual script of application functions. 4. Have a Way Out - Never change the statistics without a way to return to the previous set. Think of the statistics as one part of the overall application code that also includes the source code--both application and RDBMS. It would be foolish to change to the new code without a way to get back to the previous version. In the final post, I will talk about the actual script I created for P6 PMDB and possible future direction for managing query performance. 

    Read the article

  • Entity Framework: Delete Object and its related entities

    - by Waheed
    Hi, Does anyone know how to delete an object and all of it's related entities. For example i have tables, Products, Category, ProductCategory and productDetails, the productCategory is joining table of both Product and Category. I have red from http://msdn.microsoft.com/en-us/library/bb738580.aspx that Deleting the parent object also deletes all the child objects in the constrained relationship. This result is the same as enabling the CascadeDelete property on the association for the relationship. I am using this code Product productObj = this.ObjectContext.Product.Where(p => p.ProductID.Equals(productID)).First(); if (!productObj.ProductCategory.IsLoaded) productObj.ProductCategory.Load(); if (!productObj.ProductDetails.IsLoaded) productObj.ProductDetails.Load(); //my own methods. base.Delete(productObj); base.SaveAllObjectChanges(); But i am getting error on ObjectContext.SaveChanges(); i.e A relationship is being added or deleted from an AssociationSet 'FK_ProductCategory_Product'. With cardinality constraints, a corresponding 'ProductCategory' must also be added or deleted. Thanks in advance....

    Read the article

  • Data Modeling: Logical Modeling Exercise

    - by swisscheese
    In trying to learn the art of data storage I have been trying to take in as much solid information as possible. PerformanceDBA posted some really helpful tutorials/examples in the following posts among others: is my data normalized? and Relational table naming convention. I already asked a subset question of this model here. So to make sure I understood the concepts he presented and I have seen elsewhere I wanted to take things a step or two further and see if I am grasping the concepts. Hence the purpose of this post, which hopefully others can also learn from. Everything I present is conceptual to me and for learning rather than applying it in some production system. It would be cool to get some input from PerformanceDBA also since I used his models to get started, but I appreciate all input given from anyone. As I am new to databases and especially modeling I will be the first to admit that I may not always ask the right questions, explain my thoughts clearly, or use the right verbage due to lack of expertise on the subject. So please keep that in mind and feel free to steer me in the right direction if I head off track. If there is enough interest in this I would like to take this from the logical to physical phases to show the evolution of the process and share it here on Stack. I will keep this thread for the Logical Diagram though and start new one for the additional steps. For my understanding I will be building a MySQL DB in the end to run some tests and see if what I came up with actually works. Here is the list of things that I want to capture in this conceptual model. Edit for V1.2 The purpose of this is to list Bands, their members, and the Events that they will be appearing at, as well as offer music and other merchandise for sale Members will be able to match up with friends Members can write reviews on the Bands, their music, and their events. There can only be one review per member on a given item, although they can edit their reviews and history will be maintained. BandMembers will have the chance to write a single Comment on Reviews about the Band they are associated with. Collectively as a Band only one Comment is allowed per Review. Members can then rate all Reviews and Comments but only once per given instance Members can select their favorite Bands, music, Merchandise, and Events Bands, Songs, and Events will be categorized into the type of Genre that they are and then further subcategorized into a SubGenre if necessary. It is ok for a Band or Event to fall into more then one Genre/SubGenre combination. Event date, time, and location will be posted for a given band and members can show that they will be attending the Event. An Event can be comprised of more than one Band, and multiple Events can take place at a single location on the same day Every party will be tied to at least one address and address history shall be maintained. Each party could also be tied to more then one address at a time (i.e. billing, shipping, physical) There will be stored profiles for Bands, BandMembers, and general members. So there it is, maybe a bit involved but could be a great learning tool for many hopefully as the process evolves and input is given by the community. Any input? EDIT v1.1 In response to PerformanceDBA U.3) That means no merchandise other than Band merchandise in the database. Correct ? That was my original thought but you got me thinking. Maybe the site would want to sell its own merchandise or even other merchandise from the bands. Not sure a mod to make for that. Would it require an entire rework of the Catalog section or just the identifying relationship that exists with the Band? Attempted a mod to sell both complete albums or song. Either way they would both be in electronic format only available for download. That is why I listed an Album as being comprised of Songs rather then 2 separate entities. U.5) I understand what you bring up about the circular relation with Favorite. I would like to get to this “It is either one Entity with some form of differentiation (FavoriteType) which identifies its treatment” but how to is not clear to me. What am I missing here? u.6) “Business Rules This is probably the only area you are weak in.” Thanks for the honest response. I will readdress these but I hope to clear up some confusion in my head first with the responses I have posted back to you. Q.1) Yes I would like to have Accepted, Rejected, and Blocked. I am not sure what you are referring to as to how this would change the logical model? Q.2) A person does not have to be a User. They can exist only as a BandMember. Is that what you are asking? Minor Issue Zero, One, or More…Oops I admit I forgot to give this attention when building the model. I am submitting this version as is and will address in a future version. I need to read up more on Constraint Checking to make sure I am understanding things. M.4) Depends if you envision OrderPurchase in the future. Can you expand as to what you mean here? EDIT V1.2 In response to PerformanceDBA input... Lessons learned. I was mixing the concept of Identifying / Non-Identifying and Cardinality (i.e. Genre / SubGenre), and doing so inconsistently to make things worse. Associative Tables are not required in Logical Diagrams as their many-to-many relationships can be depicted and then expanded in the Physical Model. I was overlooking the Cardinality in a lot of the relationships The importance of reading through relationships using effective Verb Phrases to reassure I am modeling what I want to accomplish. U.2) In the concept of this model it is only required to track a Venue as a location for an Event. No further data needs to be collected. With that being said Events will take place on a given EventDate and will be hosted at a Venue. Venues will host multiple events and possibly multiple events on a given date. In my new model my thinking was that EventDate is already tied to Event . Therefore, Venue will not need a relationship with EventDate. The 5th and 6th bullets you have listed under U.2) leave me questioning my thinking though. Am I missing something here? U.3) Is it time to move the link between Item and Band up to Item and Party instead? With the current design I don't see a possibility to sell merchandise not tied to the band as you have brought up. U.5) I left as per your input rather than making it a discrete Supertype/Subtype Relationship as I don’t see a benefit of having that type of roll up. Additional Revisions AR.1) After going through the exercise for FavoriteItem, I feel that Item to Review requires a many-to-many relationship so that is indicated. Necessary? Ok here we go for v1.3 I took a few days on this version, going back and forth with my design. Once the logical process is complete, as I want to see if I am on the right track, I will go through in depth what I had learned and the troubles I faced as a beginner going through this process. The big point for this version was it took throwing in some Keys to help see what I was missing in the past. Going through the process of doing a matrix proved to be of great help also. Regardless of anything, if it wasn't for the input given by PerformanceDBA I would still be a lost soul wondering in the dark. Who knows my current design might reaffirm that I still am, but I have learned a lot so I am know I at least have a flashlight in my hand. At this point in time I admit that I am still confused about identifying and non-identifying relationships. In my model I had to use non-identifying relationships with non nulls just to join the relationships I wanted to model. In reading a lot on the subject there seems to be a lot of disagreement and indecisiveness on the subject so I did what I thought represented the right things in my model. When to force (identifying) and when to be free (non-identifying)? Anyone have inputs? EDIT V1.4 Ok took the V1.3 inputs and cleaned things up for this V1.4 Currently working on a V1.5 to include attributes.

    Read the article

  • Quickest algorithm for finding sets with high intersection

    - by conradlee
    I have a large number of user IDs (integers), potentially millions. These users all belong to various groups (sets of integers), such that there are on the order of 10 million groups. To simplify my example and get to the essence of it, let's assume that all groups contain 20 user IDs (i.e., all integer sets have a cardinality of 100). I want to find all pairs of integer sets that have an intersection of 15 or greater. Should I compare every pair of sets? (If I keep a data structure that maps userIDs to set membership, this would not be necessary.) What is the quickest way to do this? That is, what should my underlying data structure be for representing the integer sets? Sorted sets, unsorted---can hashing somehow help? And what algorithm should I use to compute set intersection)? I prefer answers that relate C/C++ (especially STL), but also any more general, algorithmic insights are welcome. Update Also, note that I will be running this in parallel in a shared memory environment, so ideas that cleanly extend to a parallel solution are preferred.

    Read the article

  • Cache Class compilation error using parent-child relationships and cache sql storage

    - by Fred Altman
    I have the global listed below that I'm trying to create a couple of cache classes using sql stoarage for: ^WHEAIPP(1,26,1)=2 ^WHEAIPP(1,26,1,1)="58074^^SMSNARE^58311" 2)="58074^59128^MPHILLIPS^59135" ^WHEAIPP(1,29,1)=2 ^WHEAIPP(1,29,1,1)="58074^^SMSNARE^58311" 2)="58074^59128^MPHILLIPS^59135" ^WHEAIPP(1,93,1)=2 ^WHEAIPP(1,93,1,1)="58884^^SSNARE^58948" 2)="58884^59128^MPHILLIPS^59135" ^WHEAIPP(1,166,1)=2 ^WHEAIPP(1,166,1,1)="58407^^SMSNARE^58420" 2)="58407^59128^MPHILLIPS^59135" ^WHEAIPP(1,324,1)=2 ^WHEAIPP(1,324,1,1)="58884^^SSNARE^58948" 2)="58884^59128^MPHILLIPS^59135" ^WHEAIPP(1,419,1)=3 ^WHEAIPP(1,419,1,1)="59707^^SSNARE^59708" 2)="59707^^MPHILLIPS^59910,58000^^^^" 3)="59707^59981^SSNARE^60117,53241^^^^" The first two subscripts of the global (Hmo and Keen) make a unique entry. The third subscript (Seq) has a property (IppLineCount) which is the number of IppLines in the fourth subscript level (Seq2). I create the class WIppProv below which is the parent class: /// <PRE> /// ============================ /// Generated Class Definition /// Table: WMCA_B_IPP_PROV /// Generated by: FXALTMAN /// Generated on: 05/21/2012 13:46:41 /// Generator: XWESTblClsGenV2 /// ---------------------------- /// </PRE> Class XFXA.MCA.WIppProv Extends (%Persistent, %XML.Adaptor) [ ClassType = persistent, Inheritance = right, ProcedureBlock, StorageStrategy = SQLMapping ] { /// .HMO Property Hmo As %Integer; /// .KEEN Property Keen As %Integer; /// .SEQ Property Seq As %String; Property IppLineCount As %Integer; Index iMaster On (Hmo, Keen, Seq) [ IdKey, Unique ]; Relationship IppLines As XFXA.MCA.WIppProvLine [ Cardinality = many, Inverse = relWIppProv ]; <Storage name="SQLMapping"> <DataLocation>^WHEAIPP</DataLocation> <ExtentSize>1000000</ExtentSize> <SQLMap name="DBMS"> <Data name="IppLineCount"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>1</Piece> </Data> <Global>^WHEAIPP</Global> <PopulationType>full</PopulationType> <Subscript name="1"> <AccessType>Sub</AccessType> <Expression>{Hmo}</Expression> <LoopInitValue>1</LoopInitValue> </Subscript> <Subscript name="2"> <AccessType>Sub</AccessType> <Expression>{Keen}</Expression> </Subscript> <Subscript name="3"> <AccessType>Sub</AccessType> <LoopInitValue>1</LoopInitValue> <Expression>{Seq}</Expression> </Subscript> <Type>data</Type> </SQLMap> <StreamLocation>^XFXA.MCA.WIppProvS</StreamLocation> <Type>%Library.CacheSQLStorage</Type> </Storage> } This class compiles fine. Next I created the WIppProvLine class listed below and made a parent-child relationship between the two: /// Used to represent a single line of IPP data Class XFXA.MCA.WIppProvLine Extends (%Persistent, %XML.Adaptor) [ ClassType = persistent, Inheritance = right, ProcedureBlock, StorageStrategy = SQLMapping ] { /// .CLM_AMT_ALLOWED node: 0 piece: 6<BR> /// This field should be used in conjunction with the Claim Operator field to /// define a whole claim dollar amount at which a particular claim should be /// flagged with a Pend status. Property ClmAmtAllowed As %String; /// .CLM_LINE_AMT_ALLOWED node: 0 piece: 8<BR> /// This field should be used in conjunction with the Clm Line Operator field to /// define a claim line dollar amount at which a particular claim should be flagged /// with a Pend status. Property ClmLineAmtAllowed As %String; /// .CLM_LINE_OP node: 0 piece: 7<BR> /// A new Table/Column Reference that gives the SIU (Special Investigative Unit) /// the ability to look for claim line dollars above, below, or equal to a set /// amount. Property ClmLineOp As %String; /// .CLM_OP node: 0 piece: 5<BR> /// A new Table/Column Reference that gives the SIU (Special Investigative Unit) /// the ability to look for claim dollars above, below, or equal to a set amount. Property ClmOp As %String; Property EffDt As %Date; Property Hmo As %Integer; /// .IPP_REASON node: 0 piece: 10<BR> /// IPP Reason Code Property IppCode As %Integer; Property Keen As %Integer; /// .LAST_CHG_DT node: 0 piece: 4<BR> /// Last Changed Date Property LastChgDt As %Date; /// .PX_DX_CDE_FLAG node: 0 piece: 9<BR> /// A Flag to indicate whether or not Procedure Codes or Diagnosis Codes are to be /// associated with this SIU Flag Type Entry. If the Flag = Y, then control would /// jump to a new screen where the user can enter the necessary codes. Property PxDxCdeFlag As %String; Property Seq As %String; Property Seq2 As %String; Index iMaster On (Hmo, Keen, Seq, Seq2) [ IdKey, PrimaryKey, Unique ]; /// .TERM_DT node: 0 piece: 2<BR> /// Term Date Property TermDt As %Date; /// .USER_INI node: 0 piece: 3 Property UserIni As %String; Relationship relWIppProv As XFXA.MCA.WIppProv [ Cardinality = one, Inverse = IppLines ]; Index relWIppProvIndex On relWIppProv; //Index NewIndex1 On (RelWIppProv, Seq2) [ IdKey, PrimaryKey, Unique ]; <Storage name="SQLMapping"> <ExtentSize>1000000</ExtentSize> <SQLMap name="DBMS"> <ConditionalWithHostVars></ConditionalWithHostVars> <Data name="ClmAmtAllowed"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>6</Piece> </Data> <Data name="ClmLineAmtAllowed"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>8</Piece> </Data> <Data name="ClmLineOp"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>7</Piece> </Data> <Data name="ClmOp"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>5</Piece> </Data> <Data name="EffDt"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>1</Piece> </Data> <Data name="Hmo"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>11</Piece> </Data> <Data name="IppCode"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>10</Piece> </Data> <Data name="LastChgDt"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>4</Piece> </Data> <Data name="PxDxCdeFlag"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>9</Piece> </Data> <Data name="TermDt"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>2</Piece> </Data> <Data name="UserIni"> <Delimiter>"^"</Delimiter> <Node>+0</Node> <Piece>3</Piece> </Data> <Global>^WHEAIPP</Global> <Subscript name="1"> <AccessType>Sub</AccessType> <Expression>{Hmo}</Expression> <LoopInitValue>1</LoopInitValue> </Subscript> <Subscript name="2"> <AccessType>Sub</AccessType> <Expression>{Keen}</Expression> <LoopInitValue>1</LoopInitValue> </Subscript> <Subscript name="3"> <AccessType>Sub</AccessType> <Expression>{Seq}</Expression> <LoopInitValue>1</LoopInitValue> </Subscript> <Subscript name="4"> <AccessType>Sub</AccessType> <Expression>{Seq2}</Expression> <LoopInitValue>1</LoopInitValue> </Subscript> <Type>data</Type> </SQLMap> <StreamLocation>^XFXA.MCA.WIppProvLineS</StreamLocation> <Type>%Library.CacheSQLStorage</Type> </Storage> } When I try to compile this one I get the following error: ERROR #5502: Error compiling SQL Table 'XFXA_MCA.WIppProvLine %msg: Table XFXA_MCA.WIppProvLine has the following unmapped (not defined on the data map) fields: relWIppProv' ERROR #5030: An error occurred while compiling class XFXA.MCA.WIppProvLine Detected 1 errors during compilation in 2.745s. What am I doing wrong? Thanks in Advance, Fred

    Read the article

  • Removing "Using temporary; Using filesort" from this MySQL select+join+group by query

    - by claytontstanley
    I have the following query: select t.Chunk as LeftChunk, t.ChunkHash as LeftChunkHash, q.Chunk as RightChunk, q.ChunkHash as RightChunkHash, count(t.ChunkHash) as ChunkCount from chunksubset as t join chunksubset as q on t.ID = q.ID group by LeftChunkHash, RightChunkHash And the following explain table: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE subsets ref PRIMARY,IDIndex,SubsetIndex SubsetIndex 767 const 522014 "Using where; Using temporary; Using filesort" 1 SIMPLE subsets eq_ref PRIMARY,IDIndex,SubsetIndex PRIMARY 771 sotero.subsets.Id,const 1 "Using where; Using index" 1 SIMPLE c ref IDIndex IDIndex 4 sotero.subsets.Id 12 "Using where" 1 SIMPLE c ref IDIndex IDIndex 4 sotero.subsets.Id 12 note the "using temporary; using filesort". When this query is run, I quickly run out of RAM (presumably b/c of the temp table), and then the HDD kicks in, and the query slows to a halt. I thought it might be an index issue, so I started adding a few that sort of made sense: Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment chunks 0 PRIMARY 1 ChunkId A 17796190 NULL NULL BTREE chunks 1 ChunkHashIndex 1 ChunkHash A 243783 NULL NULL BTREE chunks 1 IDIndex 1 Id A 1483015 NULL NULL BTREE chunks 1 ChunkIndex 1 Chunk A 243783 NULL NULL BTREE chunks 1 ChunkTypeIndex 1 ChunkType A 2 NULL NULL BTREE chunks 1 chunkHashByChunkIDIndex 1 ChunkHash A 243783 NULL NULL BTREE chunks 1 chunkHashByChunkIDIndex 2 ChunkId A 17796190 NULL NULL BTREE chunks 1 chunkHashByChunkTypeIndex 1 ChunkHash A 243783 NULL NULL BTREE chunks 1 chunkHashByChunkTypeIndex 2 ChunkType A 261708 NULL NULL BTREE chunks 1 chunkHashByIDIndex 1 ChunkHash A 243783 NULL NULL BTREE chunks 1 chunkHashByIDIndex 2 Id A 17796190 NULL NULL BTREE But still using the temporary table. The db engine is MyISAM. How can I get rid of the using temporary; using filesort in this query? Just changing to InnoDB w/o explaining the underlying cause is not a particularly satisfying answer. Besides, if the solution is to just add the proper index, then that's much easier than migrating to another db engine.

    Read the article

  • JAXB code generation: how to remove a zero occurrence field?

    - by reef
    Hi all, I use JAXB 2.1 to generate Java classes from several XSD files, and I have a problem related to complex type restriction. On of the restrictions modifies the occurence configuration from minOccurs="0" maxOccurs="unbounded" to minOccurs="0" maxOccurs="0". Thus this field is not needed anymore in the restricted type. But actually JAXB generates the restricted class with a [0..1] cardinality instead of 0. By the way the generation is tuned with <xjc:treatRestrictionLikeNewType / so that a XSD restriction is not mapped to a Java class inheritance. Here is an example: Here is the way a field is defined in a complex type A: <element name="qualifier" type="CR" maxOccurs="unbounded" minOccurs="0"/ Here is the way the same field is restricted in another complex type B that restricts A: <element name="qualifier" type="CR" minOccurs="0" maxOccurs="0"/ In the A generated class I have: @XmlElement(name = "qualifier") protected List<CR qualifiers; And in the B generated class I have: protected CR qualifiers; With my poor understanding of JAXB the absence of the XmlElement annotation tells JAXB not to marshall/unmarshall this field. Am I wrong? If I am right is there a way to tell JAXB not to generate the qualifiers field at all? This would be in my opinion a much better generation as it respects the constraints. Any idea, thougths on the topic? Thanks!!

    Read the article

  • Slow query. Wrong database structure?

    - by Tin
    I have a database with table that contains tasks. Tasks have a lifecycle. The status of the task's lifecycle can change. These state transitions are stored in a separate table tasktransitions. Now I wrote a query to find all open/reopened tasks and recently changed tasks but I already see with a rather small number of tasks (<1000) that execution time has becoming very long (0.5s). Tasks +-------------+---------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------+---------+------+-----+---------+----------------+ | taskid | int(11) | NO | PRI | NULL | auto_increment | | description | text | NO | | NULL | | +-------------+---------+------+-----+---------+----------------+ Tasktransitions +------------------+-----------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------------+-----------+------+-----+-------------------+----------------+ | tasktransitionid | int(11) | NO | PRI | NULL | auto_increment | | taskid | int(11) | NO | MUL | NULL | | | status | int(11) | NO | MUL | NULL | | | description | text | NO | | NULL | | | userid | int(11) | NO | | NULL | | | transitiondate | timestamp | NO | | CURRENT_TIMESTAMP | | +------------------+-----------+------+-----+-------------------+----------------+ Query SELECT tasks.taskid,tasks.description,tasklaststatus.status FROM tasks LEFT OUTER JOIN ( SELECT tasktransitions.taskid,tasktransitions.transitiondate,tasktransitions.status FROM tasktransitions INNER JOIN ( SELECT taskid,MAX(transitiondate) AS lasttransitiondate FROM tasktransitions GROUP BY taskid ) AS tasklasttransition ON tasklasttransition.lasttransitiondate=tasktransitions.transitiondate AND tasklasttransition.taskid=tasktransitions.taskid ) AS tasklaststatus ON tasklaststatus.taskid=tasks.taskid WHERE tasklaststatus.status IS NULL OR tasklaststatus.status=0 or tasklaststatus.transitiondate>'2013-09-01'; I'm wondering if the database structure is best choice performance wise. Could adding indexes help? I already tried to add some but I don't see great improvements. +-----------------+------------+----------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-----------------+------------+----------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | tasktransitions | 0 | PRIMARY | 1 | tasktransitionid | A | 896 | NULL | NULL | | BTREE | | | | tasktransitions | 1 | taskid_date_ix | 1 | taskid | A | 896 | NULL | NULL | | BTREE | | | | tasktransitions | 1 | taskid_date_ix | 2 | transitiondate | A | 896 | NULL | NULL | | BTREE | | | | tasktransitions | 1 | status_ix | 1 | status | A | 3 | NULL | NULL | | BTREE | | | +-----------------+------------+----------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ Any other suggestions?

    Read the article

  • On counting pairs of words that differ by one letter

    - by Quintofron
    Let us consider n words, each of length k. Those words consist of letters over an alphabet (whose cardinality is n) with defined order. The task is to derive an O(nk) algorithm to count the number of pairs of words that differ by one position (no matter which one exactly, as long as it's only a single position). For instance, in the following set of words (n = 5, k = 4): abcd, abdd, adcb, adcd, aecd there are 5 such pairs: (abcd, abdd), (abcd, adcd), (abcd, aecd), (adcb, adcd), (adcd, aecd). So far I've managed to find an algorithm that solves a slightly easier problem: counting the number of pairs of words that differ by one GIVEN position (i-th). In order to do this I swap the letter at the ith position with the last letter within each word, perform a Radix sort (ignoring the last position in each word - formerly the ith position), linearly detect words whose letters at the first 1 to k-1 positions are the same, eventually count the number of occurrences of each letter at the last (originally ith) position within each set of duplicates and calculate the desired pairs (the last part is simple). However, the algorithm above doesn't seem to be applicable to the main problem (under the O(nk) constraint) - at least not without some modifications. Any idea how to solve this?

    Read the article

  • Checksum Transformation

    The Checksum Transformation computes a hash value, the checksum, across one or more columns, returning the result in the Checksum output column. The transformation provides functionality similar to the T-SQL CHECKSUM function, but is encapsulated within SQL Server Integration Services, for use within the pipeline without code or a SQL Server connection. As featured in The Microsoft Data Warehouse Toolkit by Joy Mundy and Warren Thornthwaite from the Kimbal Group. Have a look at the book samples especially Sample package for custom SCD handling. All input columns are passed through the transformation unaltered, those selected are used to generate the checksum which is passed out through a single output column, Checksum. This does not restrict the number of columns available downstream from the transformation, as columns will always flow through a transformation. The Checksum output column is in addition to all existing columns within the pipeline buffer. The Checksum Transformation uses an algorithm based on the .Net framework GetHashCode method, it is not consistent with the T-SQL CHECKSUM() or BINARY_CHECKSUM() functions. The transformation does not support the following Integration Services data types, DT_NTEXT, DT_IMAGE and DT_BYTES. ChecksumAlgorithm Property There ChecksumAlgorithm property is defined with an enumeration. It was first added in v1.3.0, when the FrameworkChecksum was added. All previous algorithms are still supported for backward compatibility as ChecksumAlgorithm.Original (0). Original - Orginal checksum function, with known issues around column separators and null columns. This was deprecated in the first SQL Server 2005 RTM release. FrameworkChecksum - The hash function is based on the .NET Framework GetHash method for object types. This is based on the .NET Object.GetHashCode() method, which unfortunately differs between x86 and x64 systems. For that reason we now default to the CRC32 option. CRC32 - Using a standard 32-bit cyclic redundancy check (CRC), this provides a more open implementation. The component is provided as an MSI file, however to complete the installation, you will have to add the transformation to the Visual Studio toolbox by hand. This process has been described in detail in the related FAQ entry for How do I install a task or transform component?, just select Checksum from the SSIS Data Flow Items list in the Choose Toolbox Items window. Downloads The Checksum Transformation is available for SQL Server 2005, SQL Server 2008 (includes R2) and SQL Server 2012. Please choose the version to match your SQL Server version, or you can install multiple versions and use them side by side if you have more than one version of SQL Server installed. Checksum Transformation for SQL Server 2005 Checksum Transformation for SQL Server 2008 Checksum Transformation for SQL Server 2012 Version History SQL Server 2012 Version 3.0.0.27 – SQL Server 2012 release. Includes upgrade support for both 2005 and 2008 packages to 2012. (5 Jun 2010) SQL Server 2008 Version 2.0.0.27 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: '79764919' . Fix for upgrade mappings between 2005 and 2008. (19 Oct 2010) Version 2.0.0.24 - SQL Server 2008 release. Introduces the new CRC-32 algorithm, which is consistent across x86 and x64.. The default algorithm is now CRC32. (29 Oct 2008) Version 2.0.0.6 - SQL Server 2008 pre-release. This version was released by mistake as part of the site migration, and had known issues. (20 Oct 2008) SQL Server 2005 Version 1.5.0.43 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: '79764919' . (19 Oct 2010) Version 1.5.0.16 - Introduces the new CRC-32 algorithm, which is consistent across x86 and x64. The default algorithm is now CRC32. (20 Oct 2008) Version 1.4.0.0 - Installer refresh only. (22 Dec 2007) Version 1.4.0.0 - Refresh for minor UI enhancements. (5 Mar 2006) Version 1.3.0.0 - SQL Server 2005 RTM. The checksum algorithm has changed to improve cardinality when calculating multiple column checksums. The original algorithm is still available for backward compatibility. Fixed custom UI bug with Output column name not persisting. (10 Nov 2005) Version 1.2.0.1 - SQL Server 2005 IDW 15 June CTP. A user interface is provided, as well as the ability to change the checksum output column name. (29 Aug 2005) Version 1.0.0 - Public Release (Beta). (30 Oct 2004) Screenshot

    Read the article

  • Lawler's Algorithm Implementation Assistance

    - by Richard Knop
    Here is my implemenation of Lawler's algorithm in PHP (I know... but I'm used to it): <?php $jobs = array(1, 2, 3, 4, 5, 6); $jobsSubset = array(2, 5, 6); $n = count($jobs); $processingTimes = array(2, 3, 4, 3, 2, 1); $dueDates = array(3, 15, 9, 7, 11, 20); $optimalSchedule = array(); foreach ($jobs as $j) { $optimalSchedule[] = 0; } $dicreasedCardinality = array(); for ($i = $n; $i >= 1; $i--) { $x = 0; $max = 0; // loop through all jobs for ($j = 0; $j < $i; $j++) { // ignore if $j already is in the $dicreasedCardinality array if (false === in_array($j, $dicreasedCardinality)) { // if the job has no succesor in $jobsSubset if (false === isset($jobs[$j+1]) || false === in_array($jobs[$j+1], $jobsSubset)) { // here I find an array index of a job with the maximum due date // amongst jobs with no sucessor in $jobsSubset if ($x < $dueDates[$j]) { $x = $dueDates[$j]; $max = $j; } } } } // move the job at the end of $optimalSchedule $optimalSchedule[$i-1] = $jobs[$max]; // decrease the cardinality of $jobs $dicreasedCardinality[] = $max; } print_r($optimalSchedule); Now the above returns an optimal schedule like this: Array ( [0] => 1 [1] => 1 [2] => 1 [3] => 3 [4] => 2 [5] => 6 ) Which doesn't seem right to me. The problem might be with my implementation of the algorithm because I am not sure I understand it correctly. I used this source to implement it: http://www.google.com/books?id=aSiBs6PDm9AC&pg=PA166&dq=lawler%27s+algorithm+code&lr=&hl=sk&cd=4#v=onepage&q=&f=false The description there is a little confusing. For example, I didn't quite get how is the subset D defined (I guess it is arbitrary). Could anyone help me out with this? I have been trying to find some sources with simpler explanation of the algorithm but all sources I found were even more complicated (with math proofs and such) so I am stuck with the link above. Yes, this is a homework, if it wasn't obvious. I still have few weeks to crack this but I have spent few days already trying to get how exactly this algorithm works with no success so I don't think I will get any brighter during that time.

    Read the article

  • XSD: xs:sequence & xs:choice combination for xs:extension elements?

    - by bguiz
    Hi, My question is about defining an XML schema that will validate the following sample XML: <rules> <other>...</other> <bool>...</bool> <other>...</other> <string>...</string> <other>...</other> </rules> The order of the child nodes does not matter. The cardinality of the child nodes is 0..unbounded. All the child elements of the rules node have a common base type, rule, like so: <xs:complexType name="booleanRule"> <xs:complexContent> <xs:extension base="rule"> ... </xs:extension> </xs:complexContent> </xs:complexType> <xs:complexType name="stringFilterRule"> <xs:complexContent> <xs:extension base="filterRule"> ... </xs:extension> </xs:complexContent> </xs:complexType> My current attempt at defining the schema for the rules node is below. However, Can I nest xs:choice within xs:sequence? If, where do I specify the maxOccurs="unbounded" attribute? Is there a better way to do this, such as an xs:sequence which specifies only the base type of its child elements? <xs:element name="rules"> <xs:complexType> <xs:sequence> <xs:choice> <xs:element name="bool" type="booleanRule" /> <xs:element name="string" type="stringRule" /> <xs:element name="other" type="someOtherRule" /> </xs:choice> </xs:sequence> </xs:complexType> </xs:element>

    Read the article

  • MySQL forgot about automatically creating an index for a foreign key?

    - by bobo
    After running the following SQL statements, you will see that, MySQL has automatically created the non-unique index question_tag_tag_id_tag_id on the tag_id column for me after the first ALTER TABLE statement has run. But after the second ALTER TABLE statement has run, I think MySQL should also automatically create another non-unique index question_tag_question_id_question_id on the question_id column for me. But as you can see from the SHOW INDEXES statement output, it's not there. Why does MySQL forget about the second ALTER TABLE statement? By the way, since I have already created a unique index question_id_tag_id_idx used by both question_id and tag_id columns. Is creating a separate index for each of them redundant? mysql> DROP DATABASE mydatabase; Query OK, 1 row affected (0.00 sec) mysql> CREATE DATABASE mydatabase; Query OK, 1 row affected (0.00 sec) mysql> USE mydatabase; Database changed mysql> CREATE TABLE question (id BIGINT AUTO_INCREMENT, html TEXT, PRIMARY KEY(id)) ENGINE = INNODB; Query OK, 0 rows affected (0.05 sec) mysql> CREATE TABLE tag (id BIGINT AUTO_INCREMENT, name VARCHAR(10) NOT NULL, UNIQUE INDEX name_idx (name), PRIMARY KEY(id)) ENGINE = INNODB; Query OK, 0 rows affected (0.05 sec) mysql> CREATE TABLE question_tag (question_id BIGINT, tag_id BIGINT, UNIQUE INDEX question_id_tag_id_idx (question_id, tag_id), PRIMARY KEY(question_id, tag_id)) ENGINE = INNODB; Query OK, 0 rows affected (0.00 sec) mysql> ALTER TABLE question_tag ADD CONSTRAINT question_tag_tag_id_tag_id FOREIGN KEY (tag_id) REFERENCES tag(id); Query OK, 0 rows affected (0.10 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE question_tag ADD CONSTRAINT question_tag_question_id_question_id FOREIGN KEY (question_id) REFERENCES question(id); Query OK, 0 rows affected (0.13 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SHOW INDEXES FROM question_tag; +--------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +--------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | question_tag | 0 | PRIMARY | 1 | question_id | A | 0 | NULL | NULL | | BTREE | | | question_tag | 0 | PRIMARY | 2 | tag_id | A | 0 | NULL | NULL | | BTREE | | | question_tag | 0 | question_id_tag_id_idx | 1 | question_id | A | 0 | NULL | NULL | | BTREE | | | question_tag | 0 | question_id_tag_id_idx | 2 | tag_id | A | 0 | NULL | NULL | | BTREE | | | question_tag | 1 | question_tag_tag_id_tag_id | 1 | tag_id | A | 0 | NULL | NULL | | BTREE | | +--------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 5 rows in set (0.01 sec) mysql>

    Read the article

  • XSD: xs:sequence & xs:choice combination for xs:extension elements of a common base type?

    - by bguiz
    Hi, My question is about defining an XML schema that will validate the following XML: <rules> <other>...</other> <bool>...</bool> <other>...</other> <string>...</string> <other>...</other> </rules> The order of the child nodes does not matter. The cardinality of the child nodes is 0..unbounded. All the child elements of the rules node have a common base type, rule, like so: <xs:complexType name="booleanRule"> <xs:complexContent> <xs:extension base="rule"> ... </xs:extension> </xs:complexContent> </xs:complexType> <xs:complexType name="stringFilterRule"> <xs:complexContent> <xs:extension base="filterRule"> ... </xs:extension> </xs:complexContent> </xs:complexType> My current (feeble) attempt at defining the schema for the rules node is below. However, Can I nest xs:choice within xs:sequence? If, where do I specify the maxOccurs="unbounded" attribute? Is there a better way to do this, such as an xs:sequence which specifies only the base type of its child elements? <xs:element name="rules"> <xs:complexType> <xs:sequence> <xs:choice> <xs:element name="bool" type="booleanRule" /> <xs:element name="string" type="stringRule" /> <xs:element name="other" type="someOtherRule" /> </xs:choice> </xs:sequence> </xs:complexType> </xs:element>

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Execution plan warnings–The final chapter

    - by Dave Ballantyne
    In my previous posts (here and here), I showed examples of some of the execution plan warnings that have been added to SQL Server 2012.  There is one other warning that is of interest to me : “Unmatched Indexes”. Firstly, how do I know this is the final one ?  The plan is an XML document, right ? So that means that it can have an accompanying XSD.  As an XSD is a schema definition, we can poke around inside it to find interesting things that *could* be in the final XML file. The showplan schema is stored in the folder Microsoft SQL Server\110\Tools\Binn\schemas\sqlserver\2004\07\showplan and by comparing schemas over releases you can get a really good idea of any new functionality that has been added. Here is the section of the Sql Server 2012 showplan schema that has been interesting me so far : <xsd:complexType name="AffectingConvertWarningType"> <xsd:annotation> <xsd:documentation>Warning information for plan-affecting type conversion</xsd:documentation> </xsd:annotation> <xsd:sequence> <!-- Additional information may go here when available --> </xsd:sequence> <xsd:attribute name="ConvertIssue" use="required"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:enumeration value="Cardinality Estimate" /> <xsd:enumeration value="Seek Plan" /> <!-- to be extended here --> </xsd:restriction> </xsd:simpleType> </xsd:attribute> <xsd:attribute name="Expression" type ="xsd:string" use="required" /></xsd:complexType><xsd:complexType name="WarningsType"> <xsd:annotation> <xsd:documentation>List of all possible iterator or query specific warnings (e.g. hash spilling, no join predicate)</xsd:documentation> </xsd:annotation> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element name="ColumnsWithNoStatistics" type="shp:ColumnReferenceListType" minOccurs="0" maxOccurs="1" /> <xsd:element name="SpillToTempDb" type="shp:SpillToTempDbType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="Wait" type="shp:WaitWarningType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="PlanAffectingConvert" type="shp:AffectingConvertWarningType" minOccurs="0" maxOccurs="unbounded" /> </xsd:choice> <xsd:attribute name="NoJoinPredicate" type="xsd:boolean" use="optional" /> <xsd:attribute name="SpatialGuess" type="xsd:boolean" use="optional" /> <xsd:attribute name="UnmatchedIndexes" type="xsd:boolean" use="optional" /> <xsd:attribute name="FullUpdateForOnlineIndexBuild" type="xsd:boolean" use="optional" /></xsd:complexType> I especially like the “to be extended here” comment,  high hopes that we will see more of these in the future.   So “Unmatched Indexes” was a warning that I couldn’t get and many thanks must go to Fabiano Amorim (b|t) for showing me the way.   Filtered indexes were introduced in Sql Server 2008 and are really useful if you only need to index only a portion of the data within a table.  However,  if your SQL code uses a variable as a predicate on the filtered data that matches the filtered condition, then the filtered index cannot be used as, naturally,  the value in the variable may ( and probably will ) change and therefore will need to read data outside the index.  As an aside,  you could use option(recompile) here , in which case the optimizer will build a plan specific to the variable values and use the filtered index,  but that can bring about other problems.   To demonstrate this warning, we need to generate some test data :   DROP TABLE #TestTab1GOCREATE TABLE #TestTab1 (Col1 Int not null, Col2 Char(7500) not null, Quantity Int not null)GOINSERT INTO #TestTab1 VALUES (1,1,1),(1,2,5),(1,2,10),(1,3,20), (2,1,101),(2,2,105),(2,2,110),(2,3,120)GO and then add a filtered index CREATE INDEX ixFilter ON #TestTab1 (Col1)WHERE Quantity = 122 Now if we execute SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = 122 We will see the filtered index being scanned But if we parameterize the query DECLARE @i INT = 122SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = @i The plan is very different a table scan, as the value of the variable used in the predicate can change at run time, and also we see the familiar warning triangle. If we now look at the properties pane, we will see two pieces of information “Warnings” and “UnmatchedIndexes”. So, handily, we are being told which filtered index is not being used due to parameterization.

    Read the article

  • Query optimization using composite indexes

    - by xmarch
    Many times, during the process of creating a new Coherence application, developers do not pay attention to the way cache queries are constructed; they only check that these queries comply with functional specs. Later, performance testing shows that these perform poorly and it is then when developers start working on improvements until the non-functional performance requirements are met. This post describes the optimization process of a real-life scenario, where using a composite attribute index has brought a radical improvement in query execution times.  The execution times went down from 4 seconds to 2 milliseconds! E-commerce solution based on Oracle ATG – Endeca In the context of a new e-commerce solution based on Oracle ATG – Endeca, Oracle Coherence has been used to calculate and store SKU prices. In this architecture, a Coherence cache stores the final SKU prices used for Endeca baseline indexing. Each SKU price is calculated from a base SKU price and a series of calculations based on information from corporate global discounts. Corporate global discounts information is stored in an auxiliary Coherence cache with over 800.000 entries. In particular, to obtain each price the process needs to execute six queries over the global discount cache. After the implementation was finished, we discovered that the most expensive steps in the price calculation discount process were the global discounts cache query. This query has 10 parameters and is executed 6 times for each SKU price calculation. The steps taken to optimise this query are described below; Starting point Initial query was: String filter = "levelId = :iLevelId AND  salesCompanyId = :iSalesCompanyId AND salesChannelId = :iSalesChannelId "+ "AND departmentId = :iDepartmentId AND familyId = :iFamilyId AND brand = :iBrand AND manufacturer = :iManufacturer "+ "AND areaId = :iAreaId AND endDate >=  :iEndDate AND startDate <= :iStartDate"; Map<String, Object> params = new HashMap<String, Object>(10); // Fill all parameters. params.put("iLevelId", xxxx); // Executing filter. Filter globalDiscountsFilter = QueryHelper.createFilter(filter, params); NamedCache globalDiscountsCache = CacheFactory.getCache(CacheConstants.GLOBAL_DISCOUNTS_CACHE_NAME); Set applicableDiscounts = globalDiscountsCache.entrySet(globalDiscountsFilter); With the small dataset used for development the cache queries performed very well. However, when carrying out performance testing with a real-world sample size of 800,000 entries, each query execution was taking more than 4 seconds. First round of optimizations The first optimisation step was the creation of separate Coherence index for each of the 10 attributes used by the filter. This avoided object deserialization while executing the query. Each index was created as follows: globalDiscountsCache.addIndex(new ReflectionExtractor("getXXX" ) , false, null); After adding these indexes the query execution time was reduced to between 450 ms and 1s. However, these execution times were still not good enough.  Second round of optimizations In this optimisation phase a Coherence query explain plan was used to identify how many entires each index reduced the results set by, along with the cost in ms of executing that part of the query. Though the explain plan showed that all the indexes for the query were being used, it also showed that the ordering of the query parameters was "sub-optimal".  Parameters associated to object attributes with high-cardinality should appear at the beginning of the filter, or more specifically, the attributes that filters out the highest of number records should be placed at the beginning. But examining corporate global discount data we realized that depending on the values of the parameters used in the query the “good” order for the attributes was different. In particular, if the attributes brand and family had specific values it was more optimal to have a different query changing the order of the attributes. Ultimately, we ended up with three different optimal variants of the query that were used in its relevant cases: String filter = "brand = :iBrand AND familyId = :iFamilyId AND departmentId = :iDepartmentId AND levelId = :iLevelId "+ "AND manufacturer = :iManufacturer AND endDate >= :iEndDate AND salesCompanyId = :iSalesCompanyId "+ "AND areaId = :iAreaId AND salesChannelId = :iSalesChannelId AND startDate <= :iStartDate"; String filter = "familyId = :iFamilyId AND departmentId = :iDepartmentId AND levelId = :iLevelId AND brand = :iBrand "+ "AND manufacturer = :iManufacturer AND endDate >=  :iEndDate AND salesCompanyId = :iSalesCompanyId "+ "AND areaId = :iAreaId  AND salesChannelId = :iSalesChannelId AND startDate <= :iStartDate"; String filter = "brand = :iBrand AND departmentId = :iDepartmentId AND familyId = :iFamilyId AND levelId = :iLevelId "+ "AND manufacturer = :iManufacturer AND endDate >= :iEndDate AND salesCompanyId = :iSalesCompanyId "+ "AND areaId = :iAreaId AND salesChannelId = :iSalesChannelId AND startDate <= :iStartDate"; Using the appropriate query depending on the value of brand and family parameters the query execution time dropped to between 100 ms and 150 ms. But these these execution times were still not good enough and the solution was cumbersome. Third and last round of optimizations The third and final optimization was to introduce a composite index. However, this did mean that it was not possible to use the Coherence Query Language (CohQL), as composite indexes are not currently supporte in CohQL. As the original query had 8 parameters using EqualsFilter, 1 using GreaterEqualsFilter and 1 using LessEqualsFilter, the composite index was built for the 8 attributes using EqualsFilter. The final query had an EqualsFilter for the multiple extractor, a GreaterEqualsFilter and a LessEqualsFilter for the 2 remaining attributes.  All individual indexes were dropped except the ones being used for LessEqualsFilter and GreaterEqualsFilter. We were now running in an scenario with an 8-attributes composite filter and 2 single attribute filters. The composite index created was as follows: ValueExtractor[] ve = { new ReflectionExtractor("getSalesChannelId" ), new ReflectionExtractor("getLevelId" ),    new ReflectionExtractor("getAreaId" ), new ReflectionExtractor("getDepartmentId" ),    new ReflectionExtractor("getFamilyId" ), new ReflectionExtractor("getManufacturer" ),    new ReflectionExtractor("getBrand" ), new ReflectionExtractor("getSalesCompanyId" )}; MultiExtractor me = new MultiExtractor(ve); NamedCache globalDiscountsCache = CacheFactory.getCache(CacheConstants.GLOBAL_DISCOUNTS_CACHE_NAME); globalDiscountsCache.addIndex(me, false, null); And the final query was: ValueExtractor[] ve = { new ReflectionExtractor("getSalesChannelId" ), new ReflectionExtractor("getLevelId" ),    new ReflectionExtractor("getAreaId" ), new ReflectionExtractor("getDepartmentId" ),    new ReflectionExtractor("getFamilyId" ), new ReflectionExtractor("getManufacturer" ),    new ReflectionExtractor("getBrand" ), new ReflectionExtractor("getSalesCompanyId" )}; MultiExtractor me = new MultiExtractor(ve); // Fill composite parameters.String SalesCompanyId = xxxx;...AndFilter composite = new AndFilter(new EqualsFilter(me,                   Arrays.asList(iSalesChannelId, iLevelId, iAreaId, iDepartmentId, iFamilyId, iManufacturer, iBrand, SalesCompanyId)),                                     new GreaterEqualsFilter(new ReflectionExtractor("getEndDate" ), iEndDate)); AndFilter finalFilter = new AndFilter(composite, new LessEqualsFilter(new ReflectionExtractor("getStartDate" ), iStartDate)); NamedCache globalDiscountsCache = CacheFactory.getCache(CacheConstants.GLOBAL_DISCOUNTS_CACHE_NAME); Set applicableDiscounts = globalDiscountsCache.entrySet(finalFilter);      Using this composite index the query improved dramatically and the execution time dropped to between 2 ms and  4 ms.  These execution times completely met the non-functional performance requirements . It should be noticed than when using the composite index the order of the attributes inside the ValueExtractor was not relevant.

    Read the article

< Previous Page | 1 2 3 4  | Next Page >