predicates - Page 4 - Developer IT

Use the repository pattern when using PLINQO generated data?

- by Chad

I'm "upgrading" an MVC app. Previously, the DAL was part of the Model, as a series of repositories (named after the entities) using standard LINQ to SQL queries. Now it's a separate project, generated using PLINQO. Since PLINQO generates query extensions based on the properties of the entity, I started using them directly in my controllers... and eliminated the repositories altogether. It's working fine; this is more a question to draw upon your experience. Should I continue down this path, or should I rebuild the repositories (using PLINQO as the DAL within the repository files)?

One benefit of just using the PLINQO-generated data context is that when I need DB access, I make a single reference to the data context. Under the repository pattern, I had to reference each repository when I needed data access, sometimes referencing multiple repositories in a single controller. The big benefit I saw in the repositories was the aptly named query methods (e.g. FindAllProductsByCategoryId(int id)). With the PLINQO code, it's _db.Product.ByCatId(int id), which isn't too bad either.

I like both, but where it gets hairier is when the query uses predicates. I can roll that up into the repository query method. But with the PLINQO code, it would be something like _db.Product.Where(x => x.CatId == 1 && x.OrderId == 1); I'm not so sure I like having code like that in my controllers. What's your take on this?


Help creating a predicate for use with filteredArrayUsingPredicate

- by johnbdh

I am trying to learn how to use predicates, so I am trying to replace the following working code with filteredArrayUsingPredicate:

    [filteredLocations removeAllObjects];
    for (NSString *location in locations) {
        NSRange range = [location rangeOfString:query options:NSCaseInsensitiveSearch];
        if (range.length > 0) {
            [filteredLocations addObject:location];
        }
    }

Instead I am trying:

    [filteredLocations removeAllObjects];
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF contains %@", searchText];
    [filteredLocations addObjectsFromArray:[locations filteredArrayUsingPredicate:predicate]];

I am not getting the same results with the predicate as I am with the for loop and rangeOfString. For example, with rangeOfString a given searchText returns an 8-item array, while the same value returns only 2 items with the predicate. Another example: hono will find honolulu in the locations array with the loop, while the predicate finds nothing. As I understand it, SELF represents the object being evaluated, i.e. the locations array, so I think that is the correct syntax. Any help would be appreciated.

Thanks, John
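One difference between the two snippets above is case sensitivity: the loop passes NSCaseInsensitiveSearch, while a bare `contains` in a predicate format string compares case-sensitively unless it is written `contains[c]`. Whether that accounts for the whole mismatch depends on the poster's data, which is not shown. A minimal Python sketch of the two behaviours, using a made-up locations list (the data and function names are assumptions for illustration, not the poster's code):

```python
def filter_case_sensitive(locations, query):
    # Analogue of "SELF contains %@": exact-case substring match.
    return [loc for loc in locations if query in loc]


def filter_case_insensitive(locations, query):
    # Analogue of the rangeOfString loop with NSCaseInsensitiveSearch.
    needle = query.casefold()
    return [loc for loc in locations if needle in loc.casefold()]


# Hypothetical sample data, not taken from the question.
locations = ["Honolulu", "honolulu harbor", "Hilo", "Kailua-Kona"]

print(filter_case_sensitive(locations, "hono"))
print(filter_case_insensitive(locations, "hono"))
```

With this sample data the case-sensitive filter misses "Honolulu" because of the capital H, while the case-insensitive one keeps both matching entries.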


Prolog adding and removing list element if non present in second list

- by logically

I don't know what I'm missing here. I want to add an element if it is in arg1 but not in arg2, and I want to remove an element if it is in arg1 but not in arg2. I'm using an if condition with an includes predicate that is true if the element is in the arg2 list and false otherwise. I then use the built-in predicates append and select to add or remove. I'm getting false for all my goal queries. I comment and uncomment depending on which predicate I want, add or remove.

    includes([],_).
    includes([P|Z],S) :- memberchk(P,S), includes(Z,S).

    addop([],list,res).
    addop([P|R],list,res) :- includes(P,s0) -> addop(R,list,res) ; append(P,list,res), addop(R,list,res).

    rem([],list,res).
    rem([P|R],list,res) :- includes(P,list) -> rem(R,list,res) ; select(P,list,res), rem(R,list,res).

Thanks for help.


NSFetchedResultsController: changing predicate not working?

- by icerelic

Hi, I'm writing an app with two tables on one screen. The left table is a list of folders and the right table shows a list of files. When a row on the left is tapped, the right table displays the files belonging to that folder. I'm using Core Data for storage. When the folder selection changes, the fetch predicate of the right table's NSFetchedResultsController is changed and a new fetch is performed, then the table data is reloaded. I used the following code snippet:

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"list = %@", self.list];
    [fetchedResultsController.fetchRequest setPredicate:predicate];
    NSError *error = nil;
    if (![[self fetchedResultsController] performFetch:&error]) {
        NSLog(@"Unresolved error %@, %@", error, [error userInfo]);
        abort();
    }
    [table reloadData];

However, the fetch results are still the same. I've NSLog'ed "predicate" before and after the fetch, and it was correct with the updated information. The fetch results stay the same as the initial fetch (when the view is loaded). I'm not very familiar with the way Core Data fetches objects (is there a caching system?), but I've done similar things before (changing predicates, re-fetching data, and refreshing the table) with single table views and everything went well. If someone could give me a hint I would really appreciate it. Thanks in advance.


How do I bind an iTunes style source list to an NSTableView using Core Data?

- by Austin

I have an iTunes-style interface in my application: a source list (NSOutlineView) on the left that contains different libraries and playlists, and an NSTableView on the right side of the interface displaying information for "Presentations". Similar to iTunes, I am showing the same type of information in the table view whether a library or a playlist is selected (title, author, date created, etc.).

I currently have an NSArrayController connected to my NSTableView and was setting the fetch predicate based on what was selected in the source list. This works fine when selecting a library, because I can just set the fetch predicate to filter by the "type" field in my Presentation Core Data entity. When I try to adjust the fetch predicate for a playlist, however, it doesn't look like there is any way to set it, because I've got a table in between Playlists and Presentations to keep track of the order within the playlist. According to the Apple docs, these types of predicates are not doable with Core Data (it basically doesn't support multiple inner joins).

Below is the relevant portion of my data model. Is my data model set up incorrectly? Should I drop the NSArrayController and handle connecting the NSTableView up by hand? I'm trying to figure out if there is a simple fix, or whether this is really a design flaw.


Why linking doesn't work in my Xtext-based DSL?

- by reprogrammer

The following is the Xtext grammar for my DSL.

    Model:
        variableTypes=VariableTypes
        predicateTypes=PredicateTypes
        variableDeclarations=VariableDeclarations
        rules=Rules;
    VariableType: name=ID;
    VariableTypes: 'var types' (variableTypes+=VariableType)+;
    PredicateTypes: 'predicate types' (predicateTypes+=PredicateType)+;
    PredicateType: name=ID '(' (variableTypes+=[VariableType|ID])+ ')';
    VariableDeclarations: 'vars' (variableDeclarations+=VariableDeclaration)+;
    VariableDeclaration: name=ID ':' type=[VariableType|ID];
    Rules: 'rules' (rules+=Rule)+;
    Rule: head=Head ':-' body=Body;
    Head: predicate=Predicate;
    Body: (predicates+=Predicate)+;
    Predicate: predicateType=[PredicateType|ID] '(' (terms+=Term)+ ')';
    Term: variable=Variable;
    Variable: variableDeclaration=[VariableDeclaration|ID];
    terminal WS: (' ' | '\t' | '\r' | '\n' | ',')+;

And the following is a program in the above DSL.

    var types
    Node
    predicate types
    Edge(Node, Node)
    Path(Node, Node)
    vars
    x : Node
    y : Node
    z : Node
    rules
    Path(x, y) :- Edge(x, y)
    Path(x, y) :- Path(x, z) Path(z, y)

When I used the generated Switch class to traverse the EMF object model corresponding to the above program, I realized that the nodes are not linked together properly. For example, the getPredicateType() method on a Predicate node returns null. Having read the Xtext user's guide, my impression is that the default Xtext linking semantics should work for my DSL. But, for some reason, the AST nodes of my DSL don't get linked together properly. Can anyone help me diagnose this problem?


Total row count for pagination using JPA Criteria API

- by ThinkFloyd

I am implementing an "Advanced Search" kind of functionality for an entity in my system, so that the user can search that entity using multiple conditions (eq, ne, gt, lt, like, etc.) on its attributes. I am using JPA's Criteria API to generate the Criteria query dynamically, and then using setFirstResult() and setMaxResults() to support pagination. All was fine up to this point, but now I want to show the total number of results on the results grid, and I did not see a straightforward way to get the total count of a Criteria query. This is what my code looks like:

    CriteriaBuilder builder = em.getCriteriaBuilder();
    CriteriaQuery<Brand> cQuery = builder.createQuery(Brand.class);
    Root<Brand> from = cQuery.from(Brand.class);
    CriteriaQuery<Brand> select = cQuery.select(from);
    .
    .
    // Created many predicates and added to Predicate[] pArray
    .
    .
    select.where(pArray);
    // Added orderBy clause
    TypedQuery typedQuery = em.createQuery(select);
    typedQuery.setFirstResult(startIndex);
    typedQuery.setMaxResults(pageSize);
    List resultList = typedQuery.getResultList();

My result set could be big, so I don't want to load my entities just for the count query. Please tell me an efficient way to get the total count, like the rowCount() method on Criteria (I think it's there in Hibernate's Criteria).
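The general shape of what is being asked for is two queries that share the same predicates: one that selects only COUNT(*), and one that selects the requested page. The sketch below illustrates that shape in Python with the standard-library sqlite3 module; it is not JPA code, and the `brand` table, its columns, and the `fetch_page` helper are assumptions made up for illustration.

```python
import sqlite3


def fetch_page(conn, where_sql, params, start_index, page_size):
    """Return (total_count, rows) for one page, reusing one WHERE clause.

    where_sql must be built by trusted code and use '?' placeholders;
    user-supplied values belong in params, never in the SQL text.
    """
    total = conn.execute(
        f"SELECT COUNT(*) FROM brand WHERE {where_sql}", params
    ).fetchone()[0]
    rows = conn.execute(
        f"SELECT id, name FROM brand WHERE {where_sql} "
        "ORDER BY id LIMIT ? OFFSET ?",
        (*params, page_size, start_index),
    ).fetchall()
    return total, rows


# Hypothetical data: 25 brands named brand-00 .. brand-24.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brand (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO brand (name) VALUES (?)",
    [(f"brand-{i:02d}",) for i in range(25)],
)

total, rows = fetch_page(conn, "name LIKE ?", ("brand-1%",), start_index=5, page_size=3)
print(total, rows)
```

The count query never materialises the matching rows on the client side, which is the property the question is after; only the page query returns entity data.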


How to filter node list based on the contents of another node list

- by ~otakuj462

Hi, I'd like to use XSLT to filter a node list based on the contents of another node list. Specifically, I'd like to filter a node list such that elements with identical id attributes are eliminated from the resulting node list. Priority should be given to one of the two node lists. The way I originally imagined implementing this was to do something like this:

    <xsl:variable name="filteredList1" select="$list1[not($list2[@id_from_list1 = @id_from_list2])]"/>

The problem is that the context node changes in the predicate for $list2, so I don't have access to the attribute @id_from_list1. Due to these scoping constraints, it's not clear to me how I would be able to refer to an attribute from the outer node list using nested predicates in this fashion. To get around the issue of the context node, I've tried to create a solution involving a for-each loop, like the following:

    <xsl:variable name="filteredList1">
      <xsl:for-each select="$list1">
        <xsl:variable name="id_from_list1" select="@id_from_list1"/>
        <xsl:if test="not($list2[@id_from_list2 = $id_from_list1])">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>

But this doesn't work correctly, and it's not clear to me how it fails. Using the above technique, filteredList1 has a length of 1 but appears to be empty. It's strange behaviour, and anyhow, I feel there must be a more elegant approach. I'd appreciate any guidance anyone can offer. Thanks.
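Stated independently of XSLT, the filtering intent above is: keep the items of the first list whose id does not occur in the second list. A minimal Python sketch of that intent with xml.etree.ElementTree follows; the element names, the shared `id` attribute, and the helper name are assumptions for illustration, and it does not address the XSLT context-node issue itself.

```python
import xml.etree.ElementTree as ET


def filter_by_other_list(list1, list2, attr="id"):
    """Keep elements of list1 whose `attr` value is absent from list2."""
    ids_in_list2 = {el.get(attr) for el in list2}
    return [el for el in list1 if el.get(attr) not in ids_in_list2]


# Hypothetical input document with two node lists, <a> and <b>.
doc = ET.fromstring(
    "<root>"
    "<a><item id='1'/><item id='2'/><item id='3'/></a>"
    "<b><item id='2'/><item id='4'/></b>"
    "</root>"
)
list1 = doc.findall("./a/item")
list2 = doc.findall("./b/item")

kept = filter_by_other_list(list1, list2)
print([el.get("id") for el in kept])
```

Collecting the second list's ids into a set first avoids re-scanning that list for every element of the first one.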


MSMQ - Message Queue Abstraction and Pattern

- by Maxim Gershkovich

Hi All, Let me define the problem first and why a message queue has been chosen. I have a data layer that will be transactional and EXTREMELY insert-heavy, and rather than attempt to deal with these issues when they occur, I am hoping to implement my application from the ground up with this in mind. I have decided to tackle this problem by using the Microsoft Message Queue and performing inserts asynchronously as time permits.

However, I quickly ran into a problem. Certain inserts that I perform may need to be recalled (i.e. retrieved) immediately (imagine this is for a POS system: what happens if you need to recall the last transaction, one that still hasn’t been inserted?). The way I decided to tackle this problem is by abstracting the MessageQueue and combining it with my data access layer, thereby creating the illusion of a single set of data being returned to the user of the data layer. (I have considered the other issues that occur in such a scenario, essentially dirty reads and the like, and have concluded that for my purposes I can control them.)

However, this is where things get a little nasty... I’ve worked out how to get the messages back (a trivial enough problem), but where I am stuck is: how do I create a generic (or at least somewhat generic) way of querying my message queue, one where I can minimize the duplication between the SQL queries and the MessageQueue queries? I have considered using LINQ (but have a very limited understanding of the technology) and have also attempted an implementation with Predicates, which so far is pretty smelly.

Are there any patterns for such a problem that I can utilize? Am I going about this the wrong way? Does anyone have any ideas of their own about how I can tackle this problem? Does anyone even understand what I am talking about? :-) Any and ALL input would be highly appreciated and seriously considered… Thanks again.


NSFetchedResultsController on secondary UITableView - how to query data?

- by Jason

I am creating a Core Data based navigation iPhone app with multiple screens. Let's say it is a flash-card application. The data model is very simple, with only two entities: Language and CardSet. There is a one-to-many relationship between the Language entity and the CardSet entities, so each Language may contain multiple CardSets. In other words, Language has a one-to-many relationship Language.cardSets which points to the list of CardSets, and CardSet has a relationship CardSet.language which points to the Language.

There are two screens: (1) an initial TableView screen, which displays the list of languages; and (2) a secondary TableView screen, which displays the list of CardSets in the Language. In the initial screen, which lists the languages, I am using NSFetchedResultsController to keep the list of languages up to date. The screen passes the selected Language to the secondary screen.

On the secondary screen, I am trying to figure out whether I should again use an NSFetchedResultsController to maintain the list of CardSets, or whether I should work through Language.cardSets to simply pull the list out of the object model. The latter makes the most sense programmatically because I already have the Language, but then the list would not automatically be updated on changes. I have looked at the NSFetchedResultsController documentation, and it seems like I can easily create predicates based on attributes, but not relationships. I.e., I can create the following predicate:

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"name LIKE[c] 'Chuck Norris'"];

How can I access my data through the direct relationship, Language.cardSets, and also have the table auto-update using NSFetchedResultsController? Is this possible?


Find the order among tasks in a company by using prolog?

- by Cem

First of all, I wish everyone a happy new year. I have searched and worked on this a lot, but I could not solve this question. I am quite new to Prolog and I must do this homework. The question is as follows:

Write a Prolog program that determines a valid order for the tasks to be carried out in a company. The Prolog program will consist of a set of "before" predicates which denote the order between task pairs. Here is an example:

    before(a,b).
    before(a,e).
    before(d,c).
    before(b,c).
    before(c,e).

Here, task a should be carried out before tasks b and e, d before c, and so on. Hence a valid ordering of the tasks would be [a, b, d, c, e]. The order predicate in your program will be queried as follows:

    ?- order([a,b,c,d,e],X).
    X = [a, b, d, c, e] ;
    X = [a, d, b, c, e] ;
    X = [d, a, b, c, e] ;
    false.

Hint: Try to generate different orders for the tasks (permutations) and then check whether each order is consistent with the "before" relationships given. Even if you can generate only a single valid order, you will get reasonable partial credit.
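The hint above (generate permutations, then check each one against the "before" facts) can be sketched as follows. This is a Python illustration of the generate-and-test idea, not the Prolog program the homework asks for; the names `BEFORE`, `is_consistent`, and `valid_orders` are assumptions chosen for the sketch.

```python
from itertools import permutations

# The "before" facts from the example, as (earlier, later) pairs.
BEFORE = [("a", "b"), ("a", "e"), ("d", "c"), ("b", "c"), ("c", "e")]


def is_consistent(order, before=BEFORE):
    """True if every (x, y) pair has x positioned before y in `order`."""
    position = {task: i for i, task in enumerate(order)}
    return all(position[x] < position[y] for x, y in before)


def valid_orders(tasks, before=BEFORE):
    """Generate every permutation of `tasks` and keep the consistent ones."""
    return [list(p) for p in permutations(tasks) if is_consistent(p, before)]


for order in valid_orders(["a", "b", "c", "d", "e"]):
    print(order)
```

For the five example tasks this checks all 120 permutations and keeps the same three orders listed in the expected query output. Generate-and-test grows factorially with the number of tasks, so it is only practical for small inputs like this one.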


Prolog Beginner: Trivial Example that I cannot get to work.

- by sixtyfootersdude

I have some Prolog. The lessThanTen and example predicates work as expected; however, the exam predicate does not work.

    lessThanTen(9). lessThanTen(8). lessThanTen(7). lessThanTen(6). lessThanTen(5).
    lessThanTen(4). lessThanTen(3). lessThanTen(2). lessThanTen(1). lessThanTen(0).

    example(X) :- X is 5.

    exam(X) :- X is lessThanTen(Y).

Here is the output:

    % swipl
    ...
    ?- [addv1].
    Warning: /.../addv1.pl:17: Singleton variables: [Y]
    % addv1 compiled 0.00 sec, 1,484 bytes
    true.

    ?- lessThanTen(X).
    X = 9 ;
    X = 8 ;
    X = 7 ;
    ...

    ?- example(X).
    X = 5.

    ?- exam(X).
    ERROR: is/2: Arithmetic: `lessThanTen/1' is not a function

    ?- exam(5).
    ERROR: is/2: Arithmetic: `lessThanTen/1' is not a function

I am thinking that the warning I am getting is pretty key.


Joining on NULLs

- by Dave Ballantyne

A problem I see on a fairly regular basis is that of dealing with NULL values. Specifically here, where we are joining two tables on two columns, one of which is ‘optional’, i.e. nullable: a lookup where all the columns are equal, even when NULL. NULLs are a tricky thing to initially wrap your mind around. Statements like “NULL is not equal to NULL and neither is it not not equal to NULL, it’s NULL” can cause a serious brain freeze and leave you a gibbering wreck and needing your mummy.

Before we plod on, time to set up some data to demo against.

    Create table #SourceTable ( Id integer not null, SubId integer null, AnotherCol char(255) not null )
    go
    create unique clustered index idxSourceTable on #SourceTable(id,subID)
    go
    with cteNums as ( select top(1000) number from master..spt_values where type ='P' )
    insert into #SourceTable
    select Num1.number,nullif(Num2.number,0),'SomeJunk'
    from cteNums num1 cross join cteNums num2
    go
    Create table #LookupTable ( Id integer not null, SubID integer null )
    go
    insert into #LookupTable
    Select top(100) id,subid from #SourceTable where subid is not null order by newid()
    go
    insert into #LookupTable
    Select top(3) id,subid from #SourceTable where subid is null order by newid()

If that has run correctly, you will have 1 million rows in #SourceTable and 103 rows in #LookupTable. We now want to join one to the other.

First attempt – Let's just join

    select * from #SourceTable
    join #LookupTable
      on #LookupTable.id = #SourceTable.id
     and #LookupTable.SubID = #SourceTable.SubID

OK, that’s a fail. We had 100 rows back; we didn’t correctly account for the 3 rows that have null values. Remember NULL <> NULL, and the join clause specifies SUBID=SUBID, which for those rows is not true.

Second attempt – Let's deal with those pesky NULLs

    select * from #SourceTable
    join #LookupTable
      on #LookupTable.id = #SourceTable.id
     and isnull(#LookupTable.SubID,0) = isnull(#SourceTable.SubID,0)

OK, that’s the right result, well done, and 99.9% of the time that is where it's left. It is a relatively trivial CPU overhead to wrap ISNULL around both columns and compare that result, so no problems. But, although that’s true, this is a relational database we are using here, not a procedural language. SQL is a declarative language; we are making a request to the engine to get the results we want. How we ask for them can make a ton of difference.

Let's look at the plan for our second attempt, specifically the clustered index seek on #SourceTable. There are 2 predicates: the ‘seek predicate’ and the ‘predicate’. The ‘seek predicate’ describes how SQL Server has been able to use an index. Here, it has been able to navigate the index to resolve where ID=ID. So far so good, but what about the ‘predicate’ (aka residual probe)? This is a row-by-row operation. For each row found in the index matching the seek predicate, the leaf-level nodes have been scanned and tested using this logical condition. In this example [Expr1007] is the result of the IsNull operation on #LookupTable, and that is tested for equality with the IsNull operation on #SourceTable. This residual probe is quite a high overhead; it would be better if we could express our statement slightly differently, to take full advantage of the index and make the test part of the ‘seek predicate’.

Third attempt – X is null and Y is null

So, let's state the query in a slightly different manner:

    select * from #SourceTable
    join #LookupTable
      on #LookupTable.id = #SourceTable.id
     and ( #LookupTable.SubID = #SourceTable.SubID
           or (#LookupTable.SubID is null and #SourceTable.SubId is null) )

So it's slightly wordier and may not be as clear in its intent to the human reader (that is what comments are for), but the key point is that it is now clearer to the query optimizer what our intention is. Let's look at the plan for that query, again specifically the index seek operation on #SourceTable. No ‘predicate’, just a ‘seek predicate’ against the index to resolve both ID and SubID. A subtle difference that can be easily overlooked. But has it made a difference to the performance? Well, yes, a perhaps surprisingly high one. Clever query optimizer, well done.

If you are using a scalar function on a column, you are pretty much guaranteeing that a residual probe will be used. By re-wording the query you may well be able to avoid this and use the index completely to resolve lookups. In terms of performance and scalability, your system will be in a much better position if you can.
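The row-count difference between the first and third attempts can be reproduced on a toy data set. The sketch below uses Python's standard-library sqlite3 rather than SQL Server, so it demonstrates only the NULL-matching semantics of the two join conditions, not the seek-predicate versus residual-predicate behaviour of the query plans. The table names and the four sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (id INTEGER NOT NULL, sub_id INTEGER NULL);
    CREATE TABLE lookup_table (id INTEGER NOT NULL, sub_id INTEGER NULL);
    INSERT INTO source_table VALUES (1, NULL), (1, 2), (2, NULL), (2, 3);
    INSERT INTO lookup_table VALUES (1, 2), (2, NULL);
""")

# First attempt: plain equality. NULL = NULL is not true, so the
# (2, NULL) lookup row finds no partner.
plain_join = conn.execute("""
    SELECT s.id, s.sub_id
    FROM source_table s
    JOIN lookup_table l
      ON l.id = s.id AND l.sub_id = s.sub_id
    ORDER BY s.id
""").fetchall()

# Third attempt: equality OR both sides NULL.
null_aware_join = conn.execute("""
    SELECT s.id, s.sub_id
    FROM source_table s
    JOIN lookup_table l
      ON l.id = s.id
     AND (l.sub_id = s.sub_id
          OR (l.sub_id IS NULL AND s.sub_id IS NULL))
    ORDER BY s.id
""").fetchall()

print(plain_join)
print(null_aware_join)
```

The plain join returns only the row whose sub_id is non-NULL on both sides, while the NULL-aware condition also returns the row where both sub_id values are NULL.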


In MySQL, what is the most effective query design for joining large tables with many to many relatio

- by lighthouse65

In our application, we collect data on automotive engine performance: basically source data on engine performance based on the engine type, the vehicle running it, and the engine design. Currently, the basis for new row inserts is an engine on-off period; we monitor performance variables based on a change in engine state from active to inactive and vice versa. The related engineState table looks like this:

    +---------+-----------+---------------+---------------------+---------------------+-----------------+
    | vehicle | engine    | engine_state  | state_start_time    | state_end_time      | engine_variable |
    +---------+-----------+---------------+---------------------+---------------------+-----------------+
    | 080025  | E01       | active        | 2008-01-24 16:19:15 | 2008-01-24 16:24:45 | 720             |
    | 080028  | E02       | inactive      | 2008-01-24 16:19:25 | 2008-01-24 16:22:17 | 304             |
    +---------+-----------+---------------+---------------------+---------------------+-----------------+

For a specific analysis, we would like to analyze the table content at a row granularity of minutes, rather than the current basis of active / inactive engine state. For this, we are thinking of creating a simple productionMinute table with a row for each minute in the period we are analyzing, and joining the productionMinute and engineState tables on the date-time columns in each table. So if our period of analysis is from 2009-12-01 to 2010-02-28, we would create a new table with 129,600 rows, one for each minute of each day in that three-month period. The first few rows of the productionMinute table:

    +---------------------+
    | production_minute   |
    +---------------------+
    | 2009-12-01 00:00    |
    | 2009-12-01 00:01    |
    | 2009-12-01 00:02    |
    | 2009-12-01 00:03    |
    +---------------------+

The join between the tables would be:

    engineState AS es LEFT JOIN productionMinute AS pm
      ON es.state_start_time <= pm.production_minute
     AND pm.production_minute <= es.state_end_time

This join, however, brings up multiple issues:

- The engineState table has 5 million rows and the productionMinute table has 130,000 rows.
- When an engineState row spans more than one minute (i.e. the difference between es.state_start_time and es.state_end_time is greater than one minute), as is the case in the example above, there are multiple productionMinute table rows that join to a single engineState table row.
- When there is more than one engine in operation during any given minute, also as per the example above, multiple engineState table rows join to a single productionMinute row.

In testing our logic using only a small table extract (one day rather than 3 months for the productionMinute table), the query takes over an hour to run. In researching how to improve performance so that it would be feasible to query three months of data, our thought was to create a temporary table from the engineState one, eliminating any table data that is not critical for the analysis, and to join the temporary table to the productionMinute table. We are also planning on experimenting with different joins, specifically an inner join, to see if that would improve performance.

What is the best query design for joining tables with the many-to-many relationship between the join predicates as outlined above? What is the best join type (left / right, inner)?
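To make the fan-out of this range join concrete, the sketch below expands each engine-state row into the whole minutes it covers, which is the set of productionMinute rows the join condition would match. It is a Python illustration using the two sample rows from the table above; the function name and tuple layout are assumptions, and it does not measure or predict MySQL performance.

```python
from datetime import datetime, timedelta


def minutes_covered(start, end):
    """Yield each whole minute m with start <= m <= end (the join condition)."""
    minute = start.replace(second=0, microsecond=0)
    if minute < start:
        # Round up to the first whole minute at or after `start`.
        minute += timedelta(minutes=1)
    while minute <= end:
        yield minute
        minute += timedelta(minutes=1)


# The two sample rows: (vehicle, engine, state, start, end).
rows = [
    ("080025", "E01", "active",
     datetime(2008, 1, 24, 16, 19, 15), datetime(2008, 1, 24, 16, 24, 45)),
    ("080028", "E02", "inactive",
     datetime(2008, 1, 24, 16, 19, 25), datetime(2008, 1, 24, 16, 22, 17)),
]

# One output row per (minute, state row) pair. A state row that covers no
# whole minute contributes nothing, which corresponds to inner-join behaviour.
per_minute = [
    (m, vehicle, engine, state)
    for vehicle, engine, state, start, end in rows
    for m in minutes_covered(start, end)
]

print(len(per_minute))
for row in per_minute:
    print(row)
```

Each state row contributes one output row per minute it spans, so the size of the joined result grows with the total duration of all state rows rather than with the row count of either table alone.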


Question about DBD::CSV Statement-Functions

- by sid_com

From the SQL::Statement::Functions documentation:

Function syntax

When using SQL::Statement/SQL::Parser directly to parse SQL, functions (either built-in or user-defined) may occur anywhere in a SQL statement that values, column names, table names, or predicates may occur. When using the modules through a DBD or in any other context in which the SQL is both parsed and executed, functions can occur in the same places except that they can not occur in the column selection clause of a SELECT statement that contains a FROM clause.

    # valid for both parsing and executing
    SELECT MyFunc(args);
    SELECT * FROM MyFunc(args);
    SELECT * FROM x WHERE MyFuncs(args);
    SELECT * FROM x WHERE y < MyFuncs(args);

    # valid only for parsing (won't work from a DBD)
    SELECT MyFunc(args) FROM x WHERE y;

Reading this, I would expect that the first SELECT statement of my example shouldn't work and the second should, but it is quite the contrary.

    #!/usr/bin/env perl
    use warnings; use strict;
    use 5.010;
    use DBI;

    open my $fh, '>', 'test.csv' or die $!;
    say $fh "id,name";
    say $fh "1,Brown";
    say $fh "2,Smith";
    say $fh "7,Smith";
    say $fh "8,Green";
    close $fh;

    my $dbh = DBI->connect ( 'dbi:CSV:', undef, undef, { RaiseError => 1, f_ext => '.csv', });
    my $table = 'test';

    say "\nSELECT 1";
    my $sth = $dbh->prepare ( "SELECT MAX( id ) FROM $table WHERE name LIKE 'Smith'" );
    $sth->execute ();
    $sth->dump_results();

    say "\nSELECT 2";
    $sth = $dbh->prepare ( "SELECT * FROM $table WHERE id = MAX( id )" );
    $sth->execute ();
    $sth->dump_results();

outputs:

    SELECT 1
    '7'
    1 rows

    SELECT 2
    Unknown function 'MAX' at /usr/lib/perl5/site_perl/5.10.0/SQL/Parser.pm line 2893.
    DBD::CSV::db prepare failed: Unknown function 'MAX' at /usr/lib/perl5/site_perl/5.10.0/SQL/Parser.pm line 2894.
    [for Statement "SELECT * FROM test WHERE id = MAX( id )"] at ./so_3.pl line 30.
    DBD::CSV::db prepare failed: Unknown function 'MAX' at /usr/lib/perl5/site_perl/5.10.0/SQL/Parser.pm line 2894.
    [for Statement "SELECT * FROM test WHERE id = MAX( id )"] at ./so_3.pl line 30.

Could someone explain this behavior to me?


How to hide certain elements on a page using jQuery

- by Ankur

I am trying to implement something that is similar to a faceted search. My data is a series of objects and relationships. The idea is that you click an object (in this case "95 Theses") and then the possible relationships are displayed (in this case "author"), and clicking the relationship shows the object that matches the relationship (in this case "Martin Luther"). Clicking of objects and relationships (predicates) works fine. What I need to do is allow users to click an object or relationship and have all those that extend from it removed. This is what I thought of adding when an object or relationship 'tag' is clicked (every time I add another object or relationship I increment the global attribute called 'level'):

    if ($(".objHolder, .preHolder").filter("[level>'"+level+"']").filter("[holderId='"+holderId+"']").length) {
        $(".objHolder, .preHolder").filter("[level>'"+level+"']").filter("[holderId='"+holderId+"']").remove();
    }

    <table border="0" cellpadding="4" cellspacing="2">
      <tbody><tr>
        <td class="objHolder" objid="1" holderid="1" level="1">
          <table border="0" cellpadding="4" cellspacing="2">
            <tbody><tr class="objItemRow" objid="1" holderid="1" level="1">
              <td class="objItem" objid="1" holderid="1" level="2" bgcolor="#eeeeee" nowrap="nowrap">95 Theses</td>
            </tr></tbody>
          </table></td>
        <td><img src="images/right.jpg" alt="" height="10" width="16"></td>
        <td class="preHolder" level="2" holderid="1">
          <table border="0" cellpadding="4" cellspacing="2"><tbody>
            <tr><td class="preItem" level="3" subid="1" preid="1" holderid="1" bgcolor="#eeeeee" nowrap="nowrap">author</td></tr>
          </tbody></table></td>
        <td><img src="images/right.jpg" alt="" height="10" width="16"></td>
        <td class="objHolder" level="3" holderid="1">
          <table border="0" cellpadding="4" cellspacing="2"><tbody><tr><td class="objItem" level="4" objid="3" holderid="1" bgcolor="#eeeeee" nowrap="nowrap">Martin Luther</td></tr></tbody></table>
        </td>
      </tr></tbody>
    </table>


How do I call a function name that is stored in a hash in Perl?

- by Ether

I'm sure this is covered in the documentation somewhere but I have been unable to find it... I'm looking for the syntactic sugar that will make it possible to call a method on a class whose name is stored in a hash (as opposed to a simple scalar):

    use strict; use warnings;
    package Foo;
    sub foo { print "in foo()\n" }
    package main;
    my %hash = (func => 'foo');
    Foo->$hash{func};

If I copy $hash{func} into a scalar variable first, then I can call Foo->$func just fine... but what is missing to enable Foo->$hash{func} to work?

(EDIT: I don't mean to do anything special by calling a method on class Foo; this could just as easily be a blessed object (and in my actual code it is). It was just easier to write up a self-contained example using a class method.)

EDIT 2: Just for completeness re the comments below, this is what I'm actually doing (this is in a library of Moose attribute sugar, created with Moose::Exporter):

    # adds an accessor to a sibling module
    sub foreignTable {
        my ($meta, $table, %args) = @_;
        my $class = 'MyApp::Dir1::Dir2::' . $table;
        my $dbAccessor = lcfirst $table;
        eval "require $class" or do { die "Can't load $class: $@" };
        $meta->add_attribute(
            $table,
            is => 'ro',
            isa => $class,
            init_arg => undef, # don't allow in constructor
            lazy => 1,
            predicate => 'has_' . $table,
            default => sub {
                my $this = shift;
                $this->debug("in builder for $class");
                ### here's the line that uses a hash value as the method name
                my @args = ($args{primaryKey} => $this->${\$args{primaryKey}});
                push @args, ( _dbObject => $this->_dbObject->$dbAccessor ) if $args{fkRelationshipExists};
                $this->debug("passing these values to $class -> new: @args");
                $class->new(@args);
            },
        );
    }

I've replaced the marked line above with this:

    my $pk_accessor = $this->meta->find_attribute_by_name($args{primaryKey})->get_read_method_ref;
    my @args = ($args{primaryKey} => $this->$pk_accessor);

PS. I've just noticed that this same technique (using the Moose meta class to look up the coderef rather than assuming its naming convention) cannot also be used for predicates, as Class::MOP::Attribute does not have a similar get_predicate_method_ref accessor. :(
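The underlying idea, looking up a method by a name held in a mapping and then invoking it, is not specific to Perl. As a rough analogue only (Python rather than Perl, with a made-up class and key name), the lookup and the call are two separate steps:

```python
class Foo:
    def foo(self):
        return "in foo()"


# Hypothetical mapping holding the method name, like %hash in the question.
dispatch = {"func": "foo"}

obj = Foo()
# Step 1: resolve the name stored in the mapping to a bound method.
method = getattr(obj, dispatch["func"])
# Step 2: call it.
print(method())
```

getattr raises AttributeError when the stored name does not match any attribute, so a typo in the mapping fails at lookup time rather than silently doing nothing.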


How to store array of NSManagedObjects in an NSManagedObject

- by David Tay

I am loading my app with a property list of data from a web site. This property list file contains an NSArray of NSDictionaries, each of which itself contains an NSArray of NSDictionaries. Basically, I'm trying to load a tableView of restaurant menu categories, each of which contains menu items. My property list file is fine. I am able to load the file and loop through the node structure creating NSEntityDescriptions, and I am able to save to Core Data. Everything works as expected, except that in my menu category managed object I have an NSArray of menu items for that category. Later on, when I fetch the categories, the pointers to the menu items in a category are lost and I get all the menu items.

Am I supposed to be using predicates, or does Core Data keep track of my object graph for me? Can anyone look at how I am loading Core Data and point out the flaw in my logic? I'm pretty good with both SQL and OOP by themselves, but am a little bewildered by ORM. I thought that I should just be able to use aggregation in my managed objects and that the framework would keep track of the pointers for me, but apparently not.

    NSError *error;
    NSURL *url = [NSURL URLWithString:@"http://foo.com"];
    NSArray *categories = [[NSArray alloc] initWithContentsOfURL:url];
    NSMutableArray *menuCategories = [[NSMutableArray alloc] init];
    for (int i=0; i<[categories count]; i++){
        MenuCategory *menuCategory = [NSEntityDescription insertNewObjectForEntityForName:@"MenuCategory" inManagedObjectContext:[self managedObjectContext]];
        NSDictionary *category = [categories objectAtIndex:i];
        menuCategory.name = [category objectForKey:@"name"];
        NSArray *items = [category objectForKey:@"items"];
        NSMutableArray *menuItems = [[NSMutableArray alloc] init];
        for (int j=0; j<[items count]; j++){
            MenuItem *menuItem = [NSEntityDescription insertNewObjectForEntityForName:@"MenuItem" inManagedObjectContext:[self managedObjectContext]];
            NSDictionary *item = [items objectAtIndex:j];
            menuItem.name = [item objectForKey:@"name"];
            menuItem.price = [item objectForKey:@"price"];
            menuItem.image = [item objectForKey:@"image"];
            menuItem.details = [item objectForKey:@"details"];
            [menuItems addObject:menuItem];
        }
        [menuCategory setValue:menuItems forKey:@"menuItems"];
        [menuCategories addObject:menuCategory];
        [menuItems release];
    }
    if (![[self managedObjectContext] save:&error]) {
        NSLog(@"An error occurred: %@", [error localizedDescription]);
    }


How do I restrict concurrent statistics gathering to a small set of tables from a single schema?

- by Maria Colgan

I got an interesting question from one of my colleagues in the performance team last week about how to restrict a concurrent statistics gather to a small subset of tables from one schema, rather than the entire schema. I thought I would share the solution we came up with because it was rather elegant, and took advantage of concurrent statistics gathering, incremental statistics, and the not so well known “obj_filter_list” parameter in the DBMS_STATS.GATHER_SCHEMA_STATS procedure. You should note that the solution outlined below with “obj_filter_list” still applies, even when concurrent statistics gathering and/or incremental statistics gathering is disabled. The reason my colleague had asked the question in the first place was that he wanted to enable incremental statistics for 5 large partitioned tables in one schema. The first time you gather statistics after you enable incremental statistics on a table, you have to gather statistics for all of the existing partitions so that a synopsis may be created for them. If the partitioned table in question is large and contains a lot of partitions, this could take a considerable amount of time. Since my colleague only had the Exadata environment at his disposal overnight, he wanted to re-gather statistics on the 5 partitioned tables as quickly as possible to ensure that it all finished before morning. Prior to Oracle Database 11g Release 2, the only way to do this would have been to write a script with an individual DBMS_STATS.GATHER_TABLE_STATS command for each partition, in each of the 5 tables, as well as another one to gather global statistics on the table. We would then have had to run each script in a separate session and manually manage how many of these sessions could run concurrently. Since each table has over one thousand partitions, that would definitely be a daunting task and would most likely keep my colleague up all night!
In Oracle Database 11g Release 2 we can take advantage of concurrent statistics gathering, which enables us to gather statistics on multiple tables in a schema (or database), and multiple (sub)partitions within a table concurrently. By using concurrent statistics gathering we no longer have to run individual statistics gathering commands for each partition. Oracle will automatically create a statistics gathering job for each partition, and one for the global statistics on each partitioned table. With the use of concurrent statistics, our script can now be simplified to just five DBMS_STATS.GATHER_TABLE_STATS commands, one for each table. This approach would work just fine but we really wanted to get this down to just one command. So how can we do that? You may be wondering why we didn’t just use the DBMS_STATS.GATHER_SCHEMA_STATS procedure with the OPTION parameter set to ‘GATHER STALE’. Unfortunately the statistics on the 5 partitioned tables were not stale, and enabling incremental statistics does not mark the existing statistics stale. Plus, how would we limit the schema statistics gather to just the 5 partitioned tables? So we went to ask one of the statistics developers if there was an alternative way. The developer pointed us to the “obj_filter_list” parameter in the DBMS_STATS.GATHER_SCHEMA_STATS procedure. The “obj_filter_list” parameter allows you to specify a list of objects that you want to gather statistics on within a schema or database. The parameter takes a collection of type DBMS_STATS.OBJECTTAB. Each entry in the collection has 5 fields: the schema name or object owner, the object type (i.e., ‘TABLE’ or ‘INDEX’), the object name, the partition name, and the subpartition name. You don't have to specify all five fields for each entry. Empty fields in an entry are treated as wildcards (similar to the ‘%’ character in LIKE predicates). Each entry corresponds to one set of filter conditions on the objects.
If you have more than one entry, an object is qualified for statistics gathering as long as it satisfies the filter conditions in one entry. You first must create the collection of objects, and then gather statistics for the specified collection. It’s probably easier to explain this with an example. I’m using the SH sample schema but needed a few additional partitioned tables to recreate my colleague’s scenario of 5 partitioned tables. So I created SALES2, SALES3, and COSTS2 as copies of the SALES and COSTS tables respectively (setup.sql). I also deleted statistics on all of the tables in the SH schema beforehand to more easily demonstrate our approach. Step 0. Delete the statistics on the tables in the SH schema. Step 1. Enable concurrent statistics gathering. Remember, this has to be done at the global level. Step 2. Enable incremental statistics for the 5 partitioned tables. Step 3. Create the DBMS_STATS.OBJECTTAB and pass it to the DBMS_STATS.GATHER_SCHEMA_STATS command. Here, you will notice that we defined two variables of DBMS_STATS.OBJECTTAB type. The first, filter_lst, will be used to pass the list of tables we want to gather statistics on, and will be the value passed to the obj_filter_list parameter. The second, obj_lst, will be used to capture the list of tables that have had statistics gathered on them by this command, and will be the value passed to the objlist parameter. In Oracle Database 11g Release 2, you need to specify the objlist parameter in order to get the obj_filter_list parameter to work correctly, due to bug 14539274. We also needed to define the number of objects we would supply in the obj_filter_list. In our case we were specifying 5 tables (filter_lst.extend(5)). Finally, we need to specify the owner name and object name for each of the objects in the list. Once the list definition is complete we can issue the DBMS_STATS.GATHER_SCHEMA_STATS command. Step 4. Confirm statistics were gathered on the 5 partitioned tables.
Here are a couple of other things to keep in mind when specifying the entries for the obj_filter_list parameter. If a field in the entry is empty, i.e., null, it means there is no condition on this field. In the above example, suppose you remove the statement Obj_filter_lst(1).ownname := ‘SH’; You will get the same result, because you have called gather_schema_stats and so there is no need to further specify ownname in the obj_filter_lst. All of the names in the entry are normalized, i.e., uppercased if they are not double quoted. So in the above example, it is OK to use Obj_filter_lst(1).objname := ‘sales’;. However, if you have a table called ‘MyTab’ instead of ‘MYTAB’, then you need to specify Obj_filter_lst(1).objname := ‘”MyTab”’; As I said before, although we have illustrated the usage of the obj_filter_list parameter for partitioned tables, with concurrent and incremental statistics gathering turned on, the obj_filter_list parameter is generally applicable to any gather_database_stats, gather_dictionary_stats and gather_schema_stats command. You can get a copy of the script I used to generate this post here.
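The matching rules described above (empty fields act as wildcards, an object qualifies if it satisfies any one entry, and names are uppercased unless double quoted) can be modelled in a few lines. This is only a sketch of the rules as the post states them, written in Python; it is not Oracle's implementation, and the dict layout and the `normalize`, `entry_matches` and `qualifies` helpers are hypothetical names chosen for illustration.

```python
FIELDS = ("ownname", "objtype", "objname", "partname", "subpartname")


def normalize(name):
    """Uppercase a name unless it is double quoted; None means no condition."""
    if name is None:
        return None
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]  # quoted: keep the case exactly as written
    return name.upper()


def entry_matches(entry, obj):
    """True if obj satisfies every non-empty field of one filter entry."""
    for field in FIELDS:
        wanted = normalize(entry.get(field))
        if wanted is not None and wanted != obj.get(field):
            return False
    return True


def qualifies(filter_list, obj):
    """An object qualifies if it satisfies the conditions of any one entry."""
    return any(entry_matches(entry, obj) for entry in filter_list)


# Hypothetical filter list: one unquoted name, one double-quoted name.
filter_list = [
    {"ownname": "SH", "objname": "sales"},
    {"objname": '"MyTab"'},
]

sales = {"ownname": "SH", "objtype": "TABLE", "objname": "SALES"}
customers = {"ownname": "SH", "objtype": "TABLE", "objname": "CUSTOMERS"}
mixed_case = {"ownname": "SH", "objtype": "TABLE", "objname": "MyTab"}

print(qualifies(filter_list, sales))       # unquoted 'sales' is uppercased
print(qualifies(filter_list, customers))   # matches neither entry
print(qualifies(filter_list, mixed_case))  # quoted name keeps its case
```

In this model the entry with only `objname` set places no condition on the owner, object type, partition or subpartition, which is the wildcard behaviour the post describes for null fields.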


When is a Seek not a Seek?

- by Paul White

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10. One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before. The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return. The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0). Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate. Adding the opaque modulo expression results in SQL Server guessing at the selectivity. As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement. 
For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models. Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054. The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution. We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON:
Table '#Test'. Scan count 63, logical reads 126, physical reads 0.
Table '#Test'. Scan count 1, logical reads 2, physical reads 0.
The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1). It is no coincidence that 126 = 63 * 2, by the way. It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing. There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon. To see the 63 seek operations, you have to click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query. I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101. Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan.
Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value. Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf). We can see the index depth using the INDEXPROPERTY function, or by using a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101). It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169). Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks. It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.
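The logical-read arithmetic can be reproduced with a toy model of a two-level index. This is a sketch in Python, not a description of SQL Server internals: it assumes a root page plus leaf pages that each hold 500 keys (the `LEAF_CAPACITY` figure is an assumption, not a number from the post) and simply counts one page read per page touched.

```python
LEAF_CAPACITY = 500  # assumed keys per leaf page; not a figure from the post


def leaf_page_of(key):
    """Leaf page number holding `key` when keys 1..1000 are packed in order."""
    return (key - 1) // LEAF_CAPACITY


def singleton_seeks(keys):
    """One root-to-leaf descent per key: a root read plus a leaf read each."""
    reads = 0
    rows = []
    for key in keys:
        reads += 1  # root page
        reads += 1  # the leaf page that holds this key
        rows.append(key)
    return rows, reads


def range_seek(start, end, residual):
    """One descent to `start`, then read along the leaf level up to `end`."""
    reads = 1  # root page, read once
    reads += leaf_page_of(end) - leaf_page_of(start) + 1  # leaf pages spanned
    rows = [key for key in range(start, end + 1) if residual(key)]
    return rows, reads


# The IN list: 101..169 without the multiples of 10.
in_list = [key for key in range(101, 170) if key % 10 != 0]

rows_a, reads_a = singleton_seeks(in_list)
rows_b, reads_b = range_seek(101, 169, lambda key: key % 10 > 0)
print(len(in_list), reads_a, reads_b, rows_a == rows_b)
```

Under these assumptions the model counts 63 values in the list, 126 page reads for the singleton seeks and 2 for the range seek, matching the STATISTICS IO figures quoted earlier, while both forms return the same rows.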
Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms. The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek? When it is 63 seeks. © Paul White 2011 twitter: @SQL_kiwi


Freezes (not crashes) with GCD, blocks and Core Data

- by Lukasz

I have recently rewritten my Core Data driven database controller to use Grand Central Dispatch to manage fetching and importing in the background. The controller operates on 2 NSManagedObjectContexts: NSManagedObjectContext *mainMoc, an instance variable for the main thread. This context is used only for quick UI access, either from the main thread or via the dispatch_get_main_queue() global queue. NSManagedObjectContext *bgMoc, for background tasks (importing, and fetching data for the NSFetchedResultsController behind each table). These background tasks are fired ONLY on a user-defined queue: dispatch_queue_t bgQueue (an instance variable in the database controller object). Fetching data for tables is done in the background so as not to block the UI when larger or more complicated predicates are evaluated. Example fetching code for the NSFetchedResultsController in my table view controllers:
-(void)fetchData{
  dispatch_async([CDdb db].bgQueue, ^{
    NSError *error = nil;
    [[self.fetchedResultsController fetchRequest] setPredicate:self.predicate];
    if (self.fetchedResultsController && ![self.fetchedResultsController performFetch:&error]) {
      NSSLog(@"Unresolved error in fetchData %@", error);
    }
    if (!initial_fetch_attampted)initial_fetch_attampted = YES;
    fetching = NO;
    dispatch_async(dispatch_get_main_queue(), ^{
      [self.table reloadData];
      [self.table scrollRectToVisible:CGRectMake(0, 0, 100, 20) animated:YES];
    });
  });
} // end of fetchData function
bgMoc merges with mainMoc on save using NSManagedObjectContextDidSaveNotification:
- (void)bgMocDidSave:(NSNotification *)saveNotification {
  // CDdb - bgMoc didsave - merging changes with main mainMoc
  dispatch_async(dispatch_get_main_queue(), ^{
    [self.mainMoc mergeChangesFromContextDidSaveNotification:saveNotification];
    // Extra notification for some other, potentially interested clients
    [[NSNotificationCenter defaultCenter] postNotificationName:DATABASE_SAVED_WITH_CHANGES object:saveNotification];
  });
}
- (void)mainMocDidSave:(NSNotification *)saveNotification {
  // CDdb - main mainMoc didSave - merging changes with bgMoc
  dispatch_async(self.bgQueue, ^{
    [self.bgMoc mergeChangesFromContextDidSaveNotification:saveNotification];
  });
}
The NSFetchedResultsController delegate has only one method implemented (for simplicity):
- (void)controllerDidChangeContent:(NSFetchedResultsController *)controller {
  dispatch_async(dispatch_get_main_queue(), ^{ [self fetchData]; });
}
This way I am trying to follow Apple's recommendation for Core Data: 1 NSManagedObjectContext per thread. I know this pattern is not completely clean for at least 2 reasons: bgQueue does not necessarily run on the same thread after suspension, but since it is serial, it should not matter much (there are never 2 threads trying to access bgMoc, the NSManagedObjectContext dedicated to it). Sometimes table view data source methods will ask the NSFetchedResultsController for info from bgMoc (since the fetch is done on bgQueue), such as the section count, the number of fetched objects in a section, etc.
Even with these flaws, this approach works pretty well for 95% of the application's running time until ... AND HERE GOES MY QUESTION: Sometimes, seemingly at random, the application freezes but does not crash. It does not respond to any touch, and the only way to bring it back to life is to restart it completely (switching to the background and back does not help). No exception is thrown and nothing is printed to the console (I have breakpoints set for all exceptions in Xcode). I have tried to debug it using Instruments (the Time Profiler especially) to see if there is something heavy going on on the main thread, but nothing is showing up. I am aware that GCD and Core Data are the main suspects here, but I have no idea how to track or debug this. Let me point out that this also happens when I dispatch all the tasks to the queues asynchronously only (using dispatch_async everywhere). This makes me think it is not just a standard deadlock. Are there any hints on how I could get more information about what is going on? Some extra debug flags, Instruments tricks, build settings, etc.? Any suggestions on what could be the cause are very much appreciated, as are pointers on how to implement background fetching for NSFetchedResultsController and background importing in a better way.


Managing highly repetitive code and documentation in Java

- by polygenelubricants

Highly repetitive code is generally a bad thing, and there are design patterns that can help minimize this. However, sometimes it's simply inevitable due to the constraints of the language itself. Take the following example from java.util.Arrays: /** * Assigns the specified long value to each element of the specified * range of the specified array of longs. The range to be filled * extends from index <tt>fromIndex</tt>, inclusive, to index * <tt>toIndex</tt>, exclusive. (If <tt>fromIndex==toIndex</tt>, the * range to be filled is empty.) * * @param a the array to be filled * @param fromIndex the index of the first element (inclusive) to be * filled with the specified value * @param toIndex the index of the last element (exclusive) to be * filled with the specified value * @param val the value to be stored in all elements of the array * @throws IllegalArgumentException if <tt>fromIndex > toIndex</tt> * @throws ArrayIndexOutOfBoundsException if <tt>fromIndex < 0</tt> or * <tt>toIndex > a.length</tt> */ public static void fill(long[] a, int fromIndex, int toIndex, long val) { rangeCheck(a.length, fromIndex, toIndex); for (int i=fromIndex; i<toIndex; i++) a[i] = val; } The above snippet appears in the source code 8 times, with very little variation in the documentation/method signature but exactly the same method body, one for each of the root array types int[], short[], char[], byte[], boolean[], double[], float[], and Object[]. I believe that unless one resorts to reflection (which is an entirely different subject in itself), this repetition is inevitable. I understand that as a utility class, such high concentration of repetitive Java code is highly atypical, but even with the best practice, repetition does happen! Refactoring doesn't always work because it's not always possible (the obvious case is when the repetition is in the documentation). Obviously maintaining this source code is a nightmare. 
A slight typo in the documentation, or a minor bug in the implementation, is multiplied by however many repetitions were made. In fact, the best example happens to involve this exact class: Google Research Blog - Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken (by Joshua Bloch, Software Engineer). The bug is a surprisingly subtle one, occurring in what many thought to be just a simple and straightforward algorithm.
// int mid =(low + high) / 2; // the bug
int mid = (low + high) >>> 1; // the fix
The above line appears 11 times in the source code! So my questions are:
How is this kind of repetitive Java code/documentation handled in practice? How is it developed, maintained, and tested?
Do you start with "the original", make it as mature as possible, and then copy and paste as necessary and hope you didn't make a mistake?
And if you did make a mistake in the original, do you just fix it everywhere, unless you're comfortable with deleting the copies and repeating the whole replication process?
And do you apply this same process to the testing code as well?
Would Java benefit from some sort of limited-use source code preprocessing for this kind of thing?
Perhaps Sun has their own preprocessor to help write, maintain, document and test this kind of repetitive library code?
A comment requested another example, so I pulled this one from Google Collections: com.google.common.base.Predicates lines 276-310 (AndPredicate) vs lines 312-346 (OrPredicate). The source for these two classes is identical, except for:
AndPredicate vs OrPredicate (each appears 5 times in its class)
"And(" vs "Or(" (in the respective toString() methods)
#and vs #or (in the @see Javadoc comments)
true vs false (in apply; ! can be rewritten out of the expression)
-1 /* all bits on */ vs 0 /* all bits off */ in hashCode()
&= vs |= in hashCode()
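To see why the one-line fix matters, the midpoint calculation can be simulated with 32-bit wraparound. The sketch below is in Python, whose integers do not overflow, so Java's `int` behaviour is emulated by hand; the helper names `to_int32`, `java_div2`, `mid_buggy` and `mid_fixed` are illustrative and not part of any library.

```python
def to_int32(x):
    """Wrap an integer into Java's signed 32-bit int range."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def java_div2(x):
    """Java integer division by 2, which truncates toward zero."""
    q = abs(x) // 2
    return -q if x < 0 else q


def mid_buggy(low, high):
    """Emulates (low + high) / 2: the sum wraps before it is divided."""
    return java_div2(to_int32(low + high))


def mid_fixed(low, high):
    """Emulates (low + high) >>> 1: the wrapped sum is shifted as unsigned."""
    return ((low + high) & 0xFFFFFFFF) >> 1


# Two valid int indices whose sum exceeds Integer.MAX_VALUE (2**31 - 1).
low, high = 2**30, 2**30 + 2
print(mid_buggy(low, high), mid_fixed(low, high))
```

For small indices both versions agree, which is why the bug stayed hidden for so long; only when `low + high` passes 2**31 - 1 does the buggy form produce a negative midpoint.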


Can I constrain a template parameter class to implement the interfaces that are supported by other?

- by K. Georgiev

The title is a little vague, so here's the situation: I'm writing code that works with some 'trajectories'. Trajectories are an abstract thing, so I describe them with different interfaces. So I have code like this:
namespace Trajectories {
  public interface IInitial<Atom> { Atom Initial { get; set; } }
  public interface ICurrent<Atom> { Atom Current { get; set; } }
  public interface IPrevious<Atom> { Atom Previous { get; set; } }
  public interface ICount<Atom> { int Count { get; } }
  public interface IManualCount<Atom> : ICount<Atom> { int Count { get; set; } }
  ...
}
Every concrete implementation of a trajectory will implement some of the above interfaces. Here's a concrete implementation of a trajectory:
public class SimpleTrajectory<Atom> : IInitial<Atom>, ICurrent<Atom>, ICount<Atom> {
  // ICount
  public int Count { get; private set; }
  // IInitial
  private Atom initial;
  public Atom Initial { get { return initial; } set { initial = current = value; Count = 1; } }
  // ICurrent
  private Atom current;
  public Atom Current { get { return current; } set { current = value; Count++; } }
}
Now, I want to be able to deduce things about the trajectories, so, for example, I want to support predicates about different properties of some trajectory:
namespace Conditions {
  public interface ICondition<Atom, Trajectory> { bool Test(ref Trajectory t); }
  public class CountLessThan<Atom, Trajectory> : ICondition<Atom, Trajectory> where Trajectory : Trajectories.ICount<Atom> {
    public int Value { get; set; }
    public CountLessThan() { }
    public bool Test(ref Trajectory t) { return t.Count < Value; }
  }
  public class CurrentNormLessThan<Trajectory> : ICondition<Complex, Trajectory> where Trajectory : Trajectories.ICurrent<Complex> {
    public double Value { get; set; }
    public CurrentNormLessThan() { }
    public bool Test(ref Trajectory t) { return t.Current.Norm() < Value; }
  }
}
Now, here's the question: What if I wanted to implement an AND predicate? It would be something like this:
public class And<Atom, CondA, TrajectoryA, CondB, TrajectoryB, Trajectory> : ICondition<Atom, Trajectory>
  where CondA : ICondition<Atom, TrajectoryA>
  where TrajectoryA : // Some interfaces
  where CondB : ICondition<Atom, TrajectoryB>
  where TrajectoryB : // Some interfaces
  where Trajectory : // MUST IMPLEMENT THE INTERFACES FOR TrajectoryA AND THE INTERFACES FOR TrajectoryB
{
  public CondA A { get; set; }
  public CondB B { get; set; }
  public bool Test(ref Trajectory t){ return A.Test(t) && B.Test(t); }
}
How can I say: support only those trajectories for which the arguments of AND are OK? Then I would be able to write: var vand = new CountLessThan(32) & new CurrentNormLessThan(4.0); I think that if I create an overall interface for every subset of interfaces, I could do it, but it will become quite ugly.


Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM 
AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. 
Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). 
Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds an index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table.
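The effect of an eager index spool can be illustrated outside the database. The sketch below, in Python, compares a nested loops join that rescans an unindexed inner input for every outer row with one that reads the inner input once into a temporary index and then seeks into it. It is a toy model under stated assumptions: a `dict` stands in for the temporary index (the real spool builds a B-tree style index in tempdb), and the row data and function names are invented for the example.

```python
def naive_nested_loops(outer, inner, key):
    """Scan the whole inner input for each outer row.

    Returns the joined rows and the number of inner rows touched."""
    touched = 0
    joined = []
    for o in outer:
        for i in inner:
            touched += 1
            if o[key] == i[key]:
                joined.append((o, i))
    return joined, touched


def spooled_nested_loops(outer, inner, key):
    """Read the inner input once into a temporary index, then seek per outer row."""
    index = {}
    touched = 0
    for i in inner:  # the single eager pass over the unindexed input
        touched += 1
        index.setdefault(i[key], []).append(i)
    joined = []
    for o in outer:
        for i in index.get(o[key], []):  # a seek on the temporary index
            joined.append((o, i))
    return joined, touched


# Invented data: 100 customers and 1,000 order headers.
custs = [{"CustomerID": n, "TerritoryID": n % 10} for n in range(1, 101)]
orders = [{"SalesOrderID": n, "CustomerID": (n % 100) + 1} for n in range(1, 1001)]

rows_naive, touched_naive = naive_nested_loops(orders, custs, "CustomerID")
rows_spool, touched_spool = spooled_nested_loops(orders, custs, "CustomerID")
print(touched_naive, touched_spool, rows_naive == rows_spool)
```

Both functions return the same joined rows; the difference is that the inner input is read once rather than once per outer row, which is the saving the post attributes to the spool.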
A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course).

The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other.

The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. Its presence in a plan is often an indication that a useful index is missing.

I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system.

Improving the Hash Join

I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again:

[Image: estimated execution plan using hash joins]

Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory.
This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool.

The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan.

Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins:

    SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID
    FROM #Prods P
    JOIN #Custs C
        JOIN #OrdHeader H
            ON H.CustomerID = C.CustomerID
        JOIN #OrdDetail D
            ON D.SalesOrderID = H.SalesOrderID
        ON P.ProductMainID = D.ProductMainID
        AND P.ProductSubID = D.ProductSubID
        AND P.ProductSubSubID = D.ProductSubSubID
    ORDER BY D.ProductMainID
    OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER);

With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery.
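The nested ON clauses in the forced-order hash join query can be hard to read. As a sketch, the same join shape can be written with explicit parentheses around the inner three-table join, which T-SQL permits. This should be logically equivalent to the query above; I have not checked that it produces an identical plan, so treat it as a readability aid rather than a tested replacement. It assumes the temporary tables from the example exist.

```sql
-- Same join shape as the forced-order query, with the inner joins parenthesized.
SELECT
    P.ProductMainID AS PID,
    P.Name,
    D.OrderQty,
    H.SalesOrderNumber,
    H.OrderDate,
    C.TerritoryID
FROM #Prods AS P
JOIN
(
    -- This three-table join forms one input to the final join with #Prods.
    #Custs AS C
    JOIN #OrdHeader AS H
        ON H.CustomerID = C.CustomerID
    JOIN #OrdDetail AS D
        ON D.SalesOrderID = H.SalesOrderID
)
    ON  P.ProductMainID = D.ProductMainID
    AND P.ProductSubID = D.ProductSubID
    AND P.ProductSubSubID = D.ProductSubSubID
ORDER BY D.ProductMainID
OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER);
```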
Final Thoughts

SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics.

I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index.

Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material.

Paul White
Twitter: @PaulWhiteNZ

Read the article

Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases.

There is a reason it’s called Default

The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms, this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, whichever comes first.

Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB? The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4 MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later.

Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (e.g. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed.
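For reference, here is a sketch of a session with the two default values described above written out explicitly, followed by a query that shows how the engine has divided the memory. The session name DemoSession is hypothetical, and the event and target are simply convenient examples; any event and asynchronous target would do. If the description above holds, total_regular_buffers should report three buffers.

```sql
-- DemoSession is a made-up name; the event and target are arbitrary examples.
CREATE EVENT SESSION [DemoSession] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
ADD TARGET package0.ring_buffer
WITH
(
    MAX_MEMORY = 4 MB,                  -- total event buffer memory (the default)
    MAX_DISPATCH_LATENCY = 30 SECONDS   -- maximum wait before buffers are processed (the default)
);
GO

-- A session only appears in sys.dm_xe_sessions while it is running.
ALTER EVENT SESSION [DemoSession] ON SERVER STATE = START;
GO

-- Inspect how the total memory was split into buffers.
SELECT
    s.name,
    s.total_regular_buffers,
    s.regular_buffer_size,
    s.total_buffer_size
FROM sys.dm_xe_sessions AS s
WHERE s.name = N'DemoSession';
GO

-- Clean up the demo session.
ALTER EVENT SESSION [DemoSession] ON SERVER STATE = STOP;
DROP EVENT SESSION [DemoSession] ON SERVER;
GO
```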
When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full.

I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases…

What event just fired?! How about now?! Now?!

If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on YouTube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant.

Avoid forgotten events by increasing your memory

Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events.
In order to optimize for server performance and help ensure that Extended Events doesn’t block the server, the default behavior is to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the sys.dm_xe_sessions DMV and looking at the dropped_event_count field.

Aside: Should you care if you’re dropping events? Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple of executions. Use your best judgment.

If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine your session to see if you are over-collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected.

Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low. You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section.
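As a sketch of the check-then-resize steps just described: the session name DemoSession is hypothetical and is assumed to already exist and be running, and the 8 MB figure is an arbitrary example rather than a recommendation. Stopping the session before changing the option is a cautious assumption on my part.

```sql
-- Step 1: see whether the (hypothetical) session is dropping events.
SELECT
    s.name,
    s.dropped_event_count,
    s.dropped_buffer_count,
    s.total_buffer_size
FROM sys.dm_xe_sessions AS s
WHERE s.name = N'DemoSession';
GO

-- Step 2: if events are being dropped and the session is not over-collecting,
-- give it more event buffer memory. 8 MB is an arbitrary example value.
ALTER EVENT SESSION [DemoSession] ON SERVER STATE = STOP;
GO

ALTER EVENT SESSION [DemoSession] ON SERVER
WITH (MAX_MEMORY = 8 MB);
GO

ALTER EVENT SESSION [DemoSession] ON SERVER STATE = START;
GO
```

Re-running the first query after the workload has had time to generate events shows whether the larger buffer was enough.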
Some events are too big to fail

A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer.

To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.)

Partitioning: moving your events to a sub-division

Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science…

You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication.
There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used:

- None: 3 buffers (fixed)
- Node: 3 * number_of_nodes
- CPU: 2.5 * number_of_cpus

Here are some examples of what this means for different Node/CPU counts:

Configuration       None         Node          CPU
2 CPUs, 1 Node      3 buffers    3 buffers     5 buffers
6 CPUs, 2 Nodes     3 buffers    6 buffers     15 buffers
40 CPUs, 5 Nodes    3 buffers    15 buffers    100 buffers

Aside: Buffer size on multi-processor computers. As the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration.

The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things:

- Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory.
- Are you over-collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session.
- Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff.

Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (e.g. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact caused by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation.

On the bright side you probably won’t need to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category.

Hey buddy – I think you forgot something

OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis.

- Mike

Search Results

Search found 106 results on 5 pages for 'predicates'.
