Search Results

Search found 4685 results on 188 pages for 'queries'.

Page 28/188 | < Previous Page | 24 25 26 27 28 29 30 31 32 33 34 35 | Next Page >

Would this method work to scale out SQL queries?

- by David

I have a database containing a single huge table. At the moment a query can take anything from 10 to 20 minutes and I need that to go down to 10 seconds. I have spent months trying different products like GridSQL. GridSQL works fine, but is using its own parser which does not have all the needed features. I have also optimized my database in various ways without getting the speedup I need. I have a theory on how one could scale out queries, meaning that I utilize several nodes to run a single query in parallel. The idea is to take an incoming SQL query and simply run it exactly like it is on all the nodes. When the results are returned to a coordinator node, the same query is run on the union of the resultsets. I realize that an aggregate function like average need to be rewritten into a count and sum to the nodes and that the coordinator divides the sum of the sums with the sum of the counts to get the average. What kinds of problems could not easily be solved using this model. I believe one issue would be the count distinct function. Edit: I am getting so many nice suggestions, but none have addressed the method.

Read the article
Why does Hibernate 2nd level cache only cache queries within a session?

- by Synesso

Using a named query in our application and with ehcache as the provider, it seems that the query results are tied to the session within the cache. Any attempt to access the value from the cache for a second time results in a LazyInitializationException We have set lazy = true for the following mapping because this object is also used by another part of the system which does not require the reference... and we want to keep it lean. <class name="domain.ReferenceAdPoint" table="ad_point" mutable="false" lazy="false"> <cache usage="read-only"/> <id name="code" type="long" column="ad_point_id"> <generator class="assigned" /> </id> <property name="name" column="ad_point_description" type="string"/> <set name="synonyms" table="ad_point_synonym" cascade="all-delete-orphan" lazy="true"> <cache usage="read-only"/> <key column="ad_point_id" /> <element type="string" column="synonym_description" /> </set> </class> <query name="find.adpoints.by.heading">from ReferenceAdPoint adpoint left outer join fetch adpoint.synonyms where adpoint.adPointField.headingCode = ?</query> Here's a snippet from our hibernate.cfg.xml <property name="hibernate.cache.provider_class">net.sf.ehcache.hibernate.SingletonEhCacheProvider</property> <property name="hibernate.cache.use_query_cache">true</property> It doesn't seem to make sense that the cache would be constrained to the session. Why are the cached queries not usable outside of the (relatively short-lived) sessions?

Read the article
How can I improve the performance of LinqToSql queries that use EntitySet properties?

- by DanM

I'm using LinqToSql to query a small, simple SQL Server CE database. I've noticed that any operations involving sub-properties are disappointingly slow. For example, if I have a Customer table that is referenced by an Order table, LinqToSql will automatically create an EntitySet<Order> property. This is a nice convenience, allowing me to do things like Customer.Order.Where(o => o.ProductName = "Stopwatch"), but for some reason, SQL Server CE hangs up pretty bad when I try to do stuff like this. One of my queries, which isn't really that complicated takes 3-4 seconds to complete. I can get the speed up to acceptable, even fast, if I just grab the two tables individually and convert them to List<Customer> and List<Order>, then join then manually with my own query, but this is throwing out a lot of what makes LinqToSql so appealing. So, I'm wondering if I can somehow get the whole database into RAM and just query that way, then occasionally save it. Is this possible? How? If not, is there anything else I can do to boost the performance besides resorting to doing all the joins manually? Note: My database in its initial state is about 250K and I don't expect it to grow to more than 1-2Mb. So, loading the data into RAM certainly wouldn't be a problem from a memory point of view. Update Here are the table definitions for the example I used in my question: create table Order ( Id int identity(1, 1) primary key, ProductName ntext null ) create table Customer ( Id int identity(1, 1) primary key, OrderId int null references Order (Id) )

Read the article
Is there a better loop I could write to reduce database queries?

- by dmanexe

Below is some code I've written that is effective, but makes too many database queries. Is there a way I could optimize and reduce the number of queries but have conditional statements still be as effective as below? I pasted the code repeated a few times just for good measure. echo "<h3>Pool Packages</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Pool Packages") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Pool Packages") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Water Features</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Water Features") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Water Features") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Waterfall Rock Work</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE) { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Waterfall Rock Work") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Sheer Descents</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Sheer Descents") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Sheer Descents") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Booster Pump</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Booster Pump") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Booster Pump") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Pool Concrete Decking</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Pool Concrete Decking") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Pool Concrete Decking") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Solar Heating</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Solar Heating") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Solar Heating") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Raised Bond Beam</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Raised Bond Beam") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Raised Bond Beam") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { echo "<li>None</li>"; } endforeach; echo "</ul>"; It goes on beyond this to several more categories, but I don't know how to handle looping through this best. Thanks!

Read the article
How do I execute queries upon DB connection in Rails?

- by sycobuny

I have certain initializing functions that I use to set up audit logging on the DB server side (ie, not rails) in PostgreSQL. At least one has to be issued (setting the current user) before inserting data into or updating any of the audited tables, or else the whole query will fail spectacularly. I can easily call these every time before running any save operation in the code, but DRY makes me think I should have the code repeated in as few places as possible, particularly since this diverges greatly from the ideal of database agnosticism. Currently I'm attempting to override ActiveRecord::Base.establish_connection in an initializer to set it up so that the queries are run as soon as I connect automatically, but it doesn't behave as I expect it to. Here is the code in the initializer: class ActiveRecord::Base # extend the class methods, not the instance methods class << self alias :old_establish_connection :establish_connection # hide the default def establish_connection(*args) ret = old_establish_connection(*args) # call the default # set up necessary session variables for audit logging # call these after calling default, to make sure conn is established 1st db = self.class.connection db.execute("SELECT SV.set('current_user', 'test@localhost')") db.execute("SELECT SV.set('audit_notes', NULL)") # end "empty variable" err ret # return the default's original value end end end puts "Loaded custom establish_connection into ActiveRecord::Base" sycobuny:~/rails$ ruby script/server = Booting WEBrick = Rails 2.3.5 application starting on http://0.0.0.0:3000 Loaded custom establish_connection into ActiveRecord::Base This doesn't give me any errors, and unfortunately I can't check what the method looks like internally (I was using ActiveRecord::Base.method(:establish_connection), but apparently that creates a new Method object each time it's called, which is seemingly worthless cause I can't check object_id for any worthwhile information and I also can't reverse the compilation). However, the code never seems to get called, because any attempt to run a save or an update on a database object fails as I predicted earlier. If this isn't a proper way to execute code immediately on connection to the database, then what is?

Read the article
Best way to run multiple queries per second on database, performance wise?

- by Michael Joell

I am currently using Java to insert and update data multiple times per second. Never having used databases with Java, I am not sure what is required, and how to get the best performance. I currently have a method for each type of query I need to do (for example, update a row in a database). I also have a method to create the database connection. Below is my simplified code. public static void addOneForUserInChannel(String channel, String username) throws SQLException { Connection dbConnection = null; PreparedStatement ps = null; String updateSQL = "UPDATE " + channel + "_count SET messages = messages + 1 WHERE username = ?"; try { dbConnection = getDBConnection(); ps = dbConnection.prepareStatement(updateSQL); ps.setString(1, username); ps.executeUpdate(); } catch(SQLException e) { System.out.println(e.getMessage()); } finally { if(ps != null) { ps.close(); } if(dbConnection != null) { dbConnection.close(); } } } And my DB connection private static Connection getDBConnection() { Connection dbConnection = null; try { Class.forName(DB_DRIVER); } catch (ClassNotFoundException e) { System.out.println(e.getMessage()); } try { dbConnection = DriverManager.getConnection(DB_CONNECTION, DB_USER,DB_PASSWORD); return dbConnection; } catch (SQLException e) { System.out.println(e.getMessage()); } return dbConnection; } This seems to be working fine for now, with about 1-2 queries per second, but I am worried that once I expand and it is running many more, I might have some issues. My questions: Is there a way to have a persistent database connection throughout the entire run time of the process? If so, should I do this? Are there any other optimizations that I should do to help with performance? Thanks

Read the article
What are good design practices when working with Entity Framework

- by AD

This will apply mostly for an asp.net application where the data is not accessed via soa. Meaning that you get access to the objects loaded from the framework, not Transfer Objects, although some recommendation still apply. This is a community post, so please add to it as you see fit. Applies to: Entity Framework 1.0 shipped with Visual Studio 2008 sp1. Why pick EF in the first place? Considering it is a young technology with plenty of problems (see below), it may be a hard sell to get on the EF bandwagon for your project. However, it is the technology Microsoft is pushing (at the expense of Linq2Sql, which is a subset of EF). In addition, you may not be satisfied with NHibernate or other solutions out there. Whatever the reasons, there are people out there (including me) working with EF and life is not bad.make you think. EF and inheritance The first big subject is inheritance. EF does support mapping for inherited classes that are persisted in 2 ways: table per class and table the hierarchy. The modeling is easy and there are no programming issues with that part. (The following applies to table per class model as I don't have experience with table per hierarchy, which is, anyway, limited.) The real problem comes when you are trying to run queries that include one or many objects that are part of an inheritance tree: the generated sql is incredibly awful, takes a long time to get parsed by the EF and takes a long time to execute as well. This is a real show stopper. Enough that EF should probably not be used with inheritance or as little as possible. Here is an example of how bad it was. My EF model had ~30 classes, ~10 of which were part of an inheritance tree. On running a query to get one item from the Base class, something as simple as Base.Get(id), the generated SQL was over 50,000 characters. Then when you are trying to return some Associations, it degenerates even more, going as far as throwing SQL exceptions about not being able to query more than 256 tables at once. Ok, this is bad, EF concept is to allow you to create your object structure without (or with as little as possible) consideration on the actual database implementation of your table. It completely fails at this. So, recommendations? Avoid inheritance if you can, the performance will be so much better. Use it sparingly where you have to. In my opinion, this makes EF a glorified sql-generation tool for querying, but there are still advantages to using it. And ways to implement mechanism that are similar to inheritance. Bypassing inheritance with Interfaces First thing to know with trying to get some kind of inheritance going with EF is that you cannot assign a non-EF-modeled class a base class. Don't even try it, it will get overwritten by the modeler. So what to do? You can use interfaces to enforce that classes implement some functionality. For example here is a IEntity interface that allow you to define Associations between EF entities where you don't know at design time what the type of the entity would be. public enum EntityTypes{ Unknown = -1, Dog = 0, Cat } public interface IEntity { int EntityID { get; } string Name { get; } Type EntityType { get; } } public partial class Dog : IEntity { // implement EntityID and Name which could actually be fields // from your EF model Type EntityType{ get{ return EntityTypes.Dog; } } } Using this IEntity, you can then work with undefined associations in other classes // lets take a class that you defined in your model. // that class has a mapping to the columns: PetID, PetType public partial class Person { public IEntity GetPet() { return IEntityController.Get(PetID,PetType); } } which makes use of some extension functions: public class IEntityController { static public IEntity Get(int id, EntityTypes type) { switch (type) { case EntityTypes.Dog: return Dog.Get(id); case EntityTypes.Cat: return Cat.Get(id); default: throw new Exception("Invalid EntityType"); } } } Not as neat as having plain inheritance, particularly considering you have to store the PetType in an extra database field, but considering the performance gains, I would not look back. It also cannot model one-to-many, many-to-many relationship, but with creative uses of 'Union' it could be made to work. Finally, it creates the side effet of loading data in a property/function of the object, which you need to be careful about. Using a clear naming convention like GetXYZ() helps in that regards. Compiled Queries Entity Framework performance is not as good as direct database access with ADO (obviously) or Linq2SQL. There are ways to improve it however, one of which is compiling your queries. The performance of a compiled query is similar to Linq2Sql. What is a compiled query? It is simply a query for which you tell the framework to keep the parsed tree in memory so it doesn't need to be regenerated the next time you run it. So the next run, you will save the time it takes to parse the tree. Do not discount that as it is a very costly operation that gets even worse with more complex queries. There are 2 ways to compile a query: creating an ObjectQuery with EntitySQL and using CompiledQuery.Compile() function. (Note that by using an EntityDataSource in your page, you will in fact be using ObjectQuery with EntitySQL, so that gets compiled and cached). An aside here in case you don't know what EntitySQL is. It is a string-based way of writing queries against the EF. Here is an example: "select value dog from Entities.DogSet as dog where dog.ID = @ID". The syntax is pretty similar to SQL syntax. You can also do pretty complex object manipulation, which is well explained [here][1]. Ok, so here is how to do it using ObjectQuery< string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); The first time you run this query, the framework will generate the expression tree and keep it in memory. So the next time it gets executed, you will save on that costly step. In that example EnablePlanCaching = true, which is unnecessary since that is the default option. The other way to compile a query for later use is the CompiledQuery.Compile method. This uses a delegate: static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => ctx.DogSet.FirstOrDefault(it => it.ID == id)); or using linq static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet where dog.ID == id select dog).FirstOrDefault()); to call the query: query_GetDog.Invoke( YourContext, id ); The advantage of CompiledQuery is that the syntax of your query is checked at compile time, where as EntitySQL is not. However, there are other consideration... Includes Lets say you want to have the data for the dog owner to be returned by the query to avoid making 2 calls to the database. Easy to do, right? EntitySQL string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)).Include("Owner"); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); CompiledQuery static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet.Include("Owner") where dog.ID == id select dog).FirstOrDefault()); Now, what if you want to have the Include parametrized? What I mean is that you want to have a single Get() function that is called from different pages that care about different relationships for the dog. One cares about the Owner, another about his FavoriteFood, another about his FavotireToy and so on. Basicly, you want to tell the query which associations to load. It is easy to do with EntitySQL public Dog Get(int id, string include) { string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)) .IncludeMany(include); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); } The include simply uses the passed string. Easy enough. Note that it is possible to improve on the Include(string) function (that accepts only a single path) with an IncludeMany(string) that will let you pass a string of comma-separated associations to load. Look further in the extension section for this function. If we try to do it with CompiledQuery however, we run into numerous problems: The obvious static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.Include(include) where dog.ID == id select dog).FirstOrDefault()); will choke when called with: query_GetDog.Invoke( YourContext, id, "Owner,FavoriteFood" ); Because, as mentionned above, Include() only wants to see a single path in the string and here we are giving it 2: "Owner" and "FavoriteFood" (which is not to be confused with "Owner.FavoriteFood"!). Then, let's use IncludeMany(), which is an extension function static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.IncludeMany(include) where dog.ID == id select dog).FirstOrDefault()); Wrong again, this time it is because the EF cannot parse IncludeMany because it is not part of the functions that is recognizes: it is an extension. Ok, so you want to pass an arbitrary number of paths to your function and Includes() only takes a single one. What to do? You could decide that you will never ever need more than, say 20 Includes, and pass each separated strings in a struct to CompiledQuery. But now the query looks like this: from dog in ctx.DogSet.Include(include1).Include(include2).Include(include3) .Include(include4).Include(include5).Include(include6) .[...].Include(include19).Include(include20) where dog.ID == id select dog which is awful as well. Ok, then, but wait a minute. Can't we return an ObjectQuery< with CompiledQuery? Then set the includes on that? Well, that what I would have thought so as well: static readonly Func<Entities, int, ObjectQuery<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, ObjectQuery<Dog>>((ctx, id) => (ObjectQuery<Dog>)(from dog in ctx.DogSet where dog.ID == id select dog)); public Dog GetDog( int id, string include ) { ObjectQuery<Dog> oQuery = query_GetDog(id); oQuery = oQuery.IncludeMany(include); return oQuery.FirstOrDefault; } That should have worked, except that when you call IncludeMany (or Include, Where, OrderBy...) you invalidate the cached compiled query because it is an entirely new one now! So, the expression tree needs to be reparsed and you get that performance hit again. So what is the solution? You simply cannot use CompiledQueries with parametrized Includes. Use EntitySQL instead. This doesn't mean that there aren't uses for CompiledQueries. It is great for localized queries that will always be called in the same context. Ideally CompiledQuery should always be used because the syntax is checked at compile time, but due to limitation, that's not possible. An example of use would be: you may want to have a page that queries which two dogs have the same favorite food, which is a bit narrow for a BusinessLayer function, so you put it in your page and know exactly what type of includes are required. Passing more than 3 parameters to a CompiledQuery Func is limited to 5 parameters, of which the last one is the return type and the first one is your Entities object from the model. So that leaves you with 3 parameters. A pitance, but it can be improved on very easily. public struct MyParams { public string param1; public int param2; public DateTime param3; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where dog.Age == myParams.param2 && dog.Name == myParams.param1 and dog.BirthDate > myParams.param3 select dog); public List<Dog> GetSomeDogs( int age, string Name, DateTime birthDate ) { MyParams myParams = new MyParams(); myParams.param1 = name; myParams.param2 = age; myParams.param3 = birthDate; return query_GetDog(YourContext,myParams).ToList(); } Return Types (this does not apply to EntitySQL queries as they aren't compiled at the same time during execution as the CompiledQuery method) Working with Linq, you usually don't force the execution of the query until the very last moment, in case some other functions downstream wants to change the query in some way: static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public IEnumerable<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name); } public void DataBindStuff() { IEnumerable<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } What is going to happen here? By still playing with the original ObjectQuery (that is the actual return type of the Linq statement, which implements IEnumerable), it will invalidate the compiled query and be force to re-parse. So, the rule of thumb is to return a List< of objects instead. static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public List<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name).ToList(); //<== change here } public void DataBindStuff() { List<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } When you call ToList(), the query gets executed as per the compiled query and then, later, the OrderBy is executed against the objects in memory. It may be a little bit slower, but I'm not even sure. One sure thing is that you have no worries about mis-handling the ObjectQuery and invalidating the compiled query plan. Once again, that is not a blanket statement. ToList() is a defensive programming trick, but if you have a valid reason not to use ToList(), go ahead. There are many cases in which you would want to refine the query before executing it. Performance What is the performance impact of compiling a query? It can actually be fairly large. A rule of thumb is that compiling and caching the query for reuse takes at least double the time of simply executing it without caching. For complex queries (read inherirante), I have seen upwards to 10 seconds. So, the first time a pre-compiled query gets called, you get a performance hit. After that first hit, performance is noticeably better than the same non-pre-compiled query. Practically the same as Linq2Sql When you load a page with pre-compiled queries the first time you will get a hit. It will load in maybe 5-15 seconds (obviously more than one pre-compiled queries will end up being called), while subsequent loads will take less than 300ms. Dramatic difference, and it is up to you to decide if it is ok for your first user to take a hit or you want a script to call your pages to force a compilation of the queries. Can this query be cached? { Dog dog = from dog in YourContext.DogSet where dog.ID == id select dog; } No, ad-hoc Linq queries are not cached and you will incur the cost of generating the tree every single time you call it. Parametrized Queries Most search capabilities involve heavily parametrized queries. There are even libraries available that will let you build a parametrized query out of lamba expressions. The problem is that you cannot use pre-compiled queries with those. One way around that is to map out all the possible criteria in the query and flag which one you want to use: public struct MyParams { public string name; public bool checkName; public int age; public bool checkAge; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where (myParams.checkAge == true && dog.Age == myParams.age) && (myParams.checkName == true && dog.Name == myParams.name ) select dog); protected List<Dog> GetSomeDogs() { MyParams myParams = new MyParams(); myParams.name = "Bud"; myParams.checkName = true; myParams.age = 0; myParams.checkAge = false; return query_GetDog(YourContext,myParams).ToList(); } The advantage here is that you get all the benifits of a pre-compiled quert. The disadvantages are that you most likely will end up with a where clause that is pretty difficult to maintain, that you will incur a bigger penalty for pre-compiling the query and that each query you run is not as efficient as it could be (particularly with joins thrown in). Another way is to build an EntitySQL query piece by piece, like we all did with SQL. protected List<Dod> GetSomeDogs( string name, int age) { string query = "select value dog from Entities.DogSet where 1 = 1 "; if( !String.IsNullOrEmpty(name) ) query = query + " and dog.Name == @Name "; if( age > 0 ) query = query + " and dog.Age == @Age "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); if( !String.IsNullOrEmpty(name) ) oQuery.Parameters.Add( new ObjectParameter( "Name", name ) ); if( age > 0 ) oQuery.Parameters.Add( new ObjectParameter( "Age", age ) ); return oQuery.ToList(); } Here the problems are: - there is no syntax checking during compilation - each different combination of parameters generate a different query which will need to be pre-compiled when it is first run. In this case, there are only 4 different possible queries (no params, age-only, name-only and both params), but you can see that there can be way more with a normal world search. - Noone likes to concatenate strings! Another option is to query a large subset of the data and then narrow it down in memory. This is particularly useful if you are working with a definite subset of the data, like all the dogs in a city. You know there are a lot but you also know there aren't that many... so your CityDog search page can load all the dogs for the city in memory, which is a single pre-compiled query and then refine the results protected List<Dod> GetSomeDogs( string name, int age, string city) { string query = "select value dog from Entities.DogSet where dog.Owner.Address.City == @City "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); oQuery.Parameters.Add( new ObjectParameter( "City", city ) ); List<Dog> dogs = oQuery.ToList(); if( !String.IsNullOrEmpty(name) ) dogs = dogs.Where( it => it.Name == name ); if( age > 0 ) dogs = dogs.Where( it => it.Age == age ); return dogs; } It is particularly useful when you start displaying all the data then allow for filtering. Problems: - Could lead to serious data transfer if you are not careful about your subset. - You can only filter on the data that you returned. It means that if you don't return the Dog.Owner association, you will not be able to filter on the Dog.Owner.Name So what is the best solution? There isn't any. You need to pick the solution that works best for you and your problem: - Use lambda-based query building when you don't care about pre-compiling your queries. - Use fully-defined pre-compiled Linq query when your object structure is not too complex. - Use EntitySQL/string concatenation when the structure could be complex and when the possible number of different resulting queries are small (which means fewer pre-compilation hits). - Use in-memory filtering when you are working with a smallish subset of the data or when you had to fetch all of the data on the data at first anyway (if the performance is fine with all the data, then filtering in memory will not cause any time to be spent in the db). Singleton access The best way to deal with your context and entities accross all your pages is to use the singleton pattern: public sealed class YourContext { private const string instanceKey = "On3GoModelKey"; YourContext(){} public static YourEntities Instance { get { HttpContext context = HttpContext.Current; if( context == null ) return Nested.instance; if (context.Items[instanceKey] == null) { On3GoEntities entity = new On3GoEntities(); context.Items[instanceKey] = entity; } return (YourEntities)context.Items[instanceKey]; } } class Nested { // Explicit static constructor to tell C# compiler // not to mark type as beforefieldinit static Nested() { } internal static readonly YourEntities instance = new YourEntities(); } } NoTracking, is it worth it? When executing a query, you can tell the framework to track the objects it will return or not. What does it mean? With tracking enabled (the default option), the framework will track what is going on with the object (has it been modified? Created? Deleted?) and will also link objects together, when further queries are made from the database, which is what is of interest here. For example, lets assume that Dog with ID == 2 has an owner which ID == 10. Dog dog = (from dog in YourContext.DogSet where dog.ID == 2 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Person owner = (from o in YourContext.PersonSet where o.ID == 10 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == true; If we were to do the same with no tracking, the result would be different. ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog = oDogQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>) (from o in YourContext.PersonSet where o.ID == 10 select o); oPersonQuery.MergeOption = MergeOption.NoTracking; Owner owner = oPersonQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Tracking is very useful and in a perfect world without performance issue, it would always be on. But in this world, there is a price for it, in terms of performance. So, should you use NoTracking to speed things up? It depends on what you are planning to use the data for. Is there any chance that the data your query with NoTracking can be used to make update/insert/delete in the database? If so, don't use NoTracking because associations are not tracked and will causes exceptions to be thrown. In a page where there are absolutly no updates to the database, you can use NoTracking. Mixing tracking and NoTracking is possible, but it requires you to be extra careful with updates/inserts/deletes. The problem is that if you mix then you risk having the framework trying to Attach() a NoTracking object to the context where another copy of the same object exist with tracking on. Basicly, what I am saying is that Dog dog1 = (from dog in YourContext.DogSet where dog.ID == 2).FirstOrDefault(); ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog2 = oDogQuery.FirstOrDefault(); dog1 and dog2 are 2 different objects, one tracked and one not. Using the detached object in an update/insert will force an Attach() that will say "Wait a minute, I do already have an object here with the same database key. Fail". And when you Attach() one object, all of its hierarchy gets attached as well, causing problems everywhere. Be extra careful. How much faster is it with NoTracking It depends on the queries. Some are much more succeptible to tracking than other. I don't have a fast an easy rule for it, but it helps. So I should use NoTracking everywhere then? Not exactly. There are some advantages to tracking object. The first one is that the object is cached, so subsequent call for that object will not hit the database. That cache is only valid for the lifetime of the YourEntities object, which, if you use the singleton code above, is the same as the page lifetime. One page request == one YourEntity object. So for multiple calls for the same object, it will load only once per page request. (Other caching mechanism could extend that). What happens when you are using NoTracking and try to load the same object multiple times? The database will be queried each time, so there is an impact there. How often do/should you call for the same object during a single page request? As little as possible of course, but it does happens. Also remember the piece above about having the associations connected automatically for your? You don't have that with NoTracking, so if you load your data in multiple batches, you will not have a link to between them: ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>)(from dog in YourContext.DogSet select dog); oDogQuery.MergeOption = MergeOption.NoTracking; List<Dog> dogs = oDogQuery.ToList(); ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>)(from o in YourContext.PersonSet select o); oPersonQuery.MergeOption = MergeOption.NoTracking; List<Person> owners = oPersonQuery.ToList(); In this case, no dog will have its .Owner property set. Some things to keep in mind when you are trying to optimize the performance. No lazy loading, what am I to do? This can be seen as a blessing in disguise. Of course it is annoying to load everything manually. However, it decreases the number of calls to the db and forces you to think about when you should load data. The more you can load in one database call the better. That was always true, but it is enforced now with this 'feature' of EF. Of course, you can call if( !ObjectReference.IsLoaded ) ObjectReference.Load(); if you want to, but a better practice is to force the framework to load the objects you know you will need in one shot. This is where the discussion about parametrized Includes begins to make sense. Lets say you have you Dog object public class Dog { public Dog Get(int id) { return YourContext.DogSet.FirstOrDefault(it => it.ID == id ); } } This is the type of function you work with all the time. It gets called from all over the place and once you have that Dog object, you will do very different things to it in different functions. First, it should be pre-compiled, because you will call that very often. Second, each different pages will want to have access to a different subset of the Dog data. Some will want the Owner, some the FavoriteToy, etc. Of course, you could call Load() for each reference you need anytime you need one. But that will generate a call to the database each time. Bad idea. So instead, each page will ask for the data it wants to see when it first request for the Dog object: static public Dog Get(int id) { return GetDog(entity,"");} static public Dog Get(int id, string includePath) { string query = "select value o " + " from YourEntities.DogSet as o " +

Read the article
Developing Schema Compare for Oracle (Part 5): Query Snapshots

- by Simon Cooper

If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug. Problem 1: Debugging users' bug reports When we started the Schema Compare project, we knew we were going to get problems with users' databases - configurations we hadn't considered, features that weren't installed, unicode issues, wierd dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers issues and sort out what strange schema data Oracle was returning. Problem 2: Test execution time Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests would take hours and hours to run, but the test servers would be hammered senseless. The solution To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database. This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query is needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time. They also allow us to easily debug a customers problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data. Query snapshots implementation However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects - what database you're comparing to affects what objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just populating a single database. Furthermore, although the code population queries (eg querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being a deterministic algorithm - given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly. Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier. Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.

Read the article
When should I use Areas in TFS instead of Team Projects

- by Martin Hinshelwood

Well, it depends…. If you are a small company that creates a finite number of internal projects then you will find it easier to create a single project for each of your products and have TFS do the heavy lifting with reporting, SharePoint sites and Version Control. But what if you are not… Update 9th March 2010 Michael Fourie gave me some feedback which I have integrated. Ed Blankenship via @edblankenship offered encouragement and a nice quote. Ewald Hofman gave me a couple of Cons, and maybe a few more soon. Ewald’s company, Avanade, currently uses Areas, but it looks like the manual management is getting too much and the project is getting cluttered. What if you are likely to have hundreds of projects, possibly with a multitude of internal and external projects? You might have 1 project for a customer or 10. This is the situation that most consultancies find themselves in and thus they need a more sustainable and maintainable option. What I am advocating is that we should have 1 “Team Project” per customer, and use areas to create “sub projects” within that single “Team Project”. "What you describe is what we generally do internally and what we recommend. We make very heavy use of area path to categorize the work within a larger project." - Brian Harry, Microsoft Technical Fellow & Product Unit Manager for Team Foundation Server "We tend to use areas to segregate multiple projects in the same team project and it works well." - Tiago Pascoal, Visual Studio ALM MVP "In general, I believe this approach provides consistency [to multi-product engagements] and lowers the administration and maintenance costs. All good." - Michael Fourie, Visual Studio ALM MVP “@MrHinsh BTW, I'm very much a fan of very large, if not huge, team projects in TFS. Just FYI :) Use Areas & Iterations.” Ed Blankenship, Visual Studio ALM MVP This would mean that SSW would have a single Team Project called “SSW” that contains all of our internal projects and consequently all of the Areas and Iteration move down one hierarchy to accommodate this. Where we would have had “\SSW\Sprint 1” we now have “\SSW\SqlDeploy\Sprint1” with “SqlDeploy” being our internal project. At the moment SSW has over 70 internal projects and more than 170 total projects in TFS. This method has long term benefits that help to simplify the support model for companies that often have limited internal support time and many projects. But, there are implications as TFS does not provide this model “out-of-the-box”. These implications stretch across Areas, Iterations, Queries, Project Portal and Version Control. Michael made a good comment, he said: I agree with your approach, assuming that in a multi-product engagement with a client, they are happy to adopt the same process template across all products. If they are not, then it’ll either be easy to convince them or there is a valid reason for having a different template - Michael Fourie, Visual Studio ALM MVP At SSW we have a standard template that we use and this is applied across the board, to all of our projects. We even apply any changes to the core process template to all of our existing projects as well. If you have multiple projects for the same clients on multiple templates and you want to keep it that way, then this approach will not work for you. However, if you want to standardise as we have at SSW then this approach may benefit you as well. Implications around Areas Areas should be used for topological classification/isolation of work items. You can think of this as architecture areas, organisational areas or even the main features of your application. In our scenario there is an additional top level item that represents the Project / Product that we want to chop our Team Project into. Figure: Creating a sub area to represent a product/project is easy. <teamproject> <teamproject>\<Functional Area/module whatever> Becomes: <teamproject> <teamproject>\<ProjectName>\ <teamproject>\<ProjectName>\<Functional Area/module whatever> Implications around Iterations Iterations should be used for chronological classification/isolation of work items. This could include isolated time boxes, milestones or release timelines and really depends on the logical flow of your project or projects. Due to the new level in Area we need to add the same level to Iteration. This is primarily because it is unlikely that the sprints in each of your projects/products will start and end at the same time. This is just a reality of managing multiple projects. Figure: Adding the same Area value to Iteration as the top level item adds flexibility to Iteration. <teamproject>\Sprint 1 Or <teamproject>\Release 1\Sprint 1 Becomes: <teamproject>\<ProjectName>\Sprint 1 Or <teamproject>\<ProjectName>\Release 1\Sprint 1 Implications around Queries Queries are used to filter your work items based on a specified level of granularity. There are a number of queries that are built into a project created using the MSF Agile 5.0 template, but we now have multiple projects and it would be a pain to have to edit all of the work items every time we changed project, and that would only allow one team to work on one project at a time. Figure: The Queries that are created in a normal MSF Agile 5.0 project do not quite suit our new needs. In order for project contributors to be able to query based on their project we need a couple of things. The first thing I did was to create an “_Area Template” folder that has a copy of the project layout with all the queries setup to filter based on the “_Area Template” Area and the “_Sprint template” you can see in the Area and Iteration views. Figure: The template is currently easily drag and drop, but you then need to edit the queries to point at the right Area and Iteration. This needs a tool. I then created an “Areas” folder to hold all of the area specific queries. So, when you go to create a new TFS Sub-Project you just drag “_Area Template” while holding “Ctrl” and drop it onto “Areas”. There is a little setup here. That said I managed it in around 10 minutes which is not so bad, and I can imagine it being quite easy to build a tool to create these queries Figure: These new queries can be configured in around 10 minutes, which includes setting up the Area and Iteration as well. Version Control What about your source code? Well, that is the easiest of the lot. Just create a sub folder for each of your projects/products. Figure: Creating sub folders in source control is easy as “Right click | Create new folder”. <teamproject>\DEV\Main\ Becomes: <teamproject>\<ProjectName>\DEV\Main\ Conclusion I think it is up to each company to make a call on how you want to configure your Team Projects and it depends completely on how many projects/products you are going to have for each customer including yourself. If we decide to utilise this route it will require some configuration to get our 170+ projects into this format, and I will probably be writing some tools to help. Pros You only have one project to upgrade when a process template changes – After going through an upgrade of over 170 project prior to the changes in the RC I can tell you that that many projects is no fun. Standardises your Process Template – You will always have the same Process implementation across projects/products without exception You get tighter control over the permissions – Yes, you can do this on a standard Team Project, but it gets a lot easier with practice. You can “move” work items from one “product” to another – Have we not always wanted to do that. You can rename your projects – Wahoo: everyone wants to do this, now you can. One set of Reporting Services reports to manage – You set an area and iteration to run reports anyway, so you may as well set both. Simplified Check-In Policies– There is only one set of check-in policies per client. This simplifies administration of policies. Simplified Alerts – As alerts are applied across multiple projects this simplifies your alert rules as per client. Cons All of these cons could be mitigated by a custom tool that helps automate creation of “Sub-projects” within Team Projects. This custom tool could create areas, Iteration, permissions, SharePoint and queries. It just does not exist yet :) You need to configure the Areas and Iterations You need to configure the permissions You may need to configure sub sites for SharePoint (depends on your requirement) – If you have two projects/products in the same Team Project then you will not see the burn down for each one out-of-the-box, but rather a cumulative for the Team Project. This is not really that much of a problem as you would have to configure your burndown graphs for your current iteration anyway. note: When you create a sub site to a TFS linked portal it will inherit the settings of its parent site :) This is fantastic as it means that you can easily create sub sites and then set the Area and Iteration path in each of the reports to be the correct one. Every team wants their own customization (via Ewald Hofman) - small teams of 2 persons against teams of 30 – or even outsourcing – need their own process, you cannot allow that because everybody gets the same work item types. note: Luckily at SSW this is not a problem as our template is standardised across all projects and customers. Large list of builds (via Ewald Hofman) – As the build list in Team Explorer is just a flat list it can get very cluttered. note: I would mitigate this by removing any build that has not been run in over 30 days. The build template and workflow will still be available in version control, but it will clean the list. Feedback Now that I have explained this method, what do you think? What other pros and cons can you see? What do you think of this approach? Will you be using it? What tools would you like to support you? Technorati Tags: Visual Studio ALM,TFS Administration,TFS,Team Foundation Server,Project Planning,TFS Customisation

Read the article
Oracle BI Server Modeling, Part 1- Designing a Query Factory

- by bob.ertl(at)oracle.com

Welcome to Oracle BI Development's BI Foundation blog, focused on helping you get the most value from your Oracle Business Intelligence Enterprise Edition (BI EE) platform deployments. In my first series of posts, I plan to show developers the concepts and best practices for modeling in the Common Enterprise Information Model (CEIM), the semantic layer of Oracle BI EE. In this segment, I will lay the groundwork for the modeling concepts. First, I will cover the big picture of how the BI Server fits into the system, and how the CEIM controls the query processing. Oracle BI EE Query Cycle The purpose of the Oracle BI Server is to bridge the gap between the presentation services and the data sources. There are typically a variety of data sources in a variety of technologies: relational, normalized transaction systems; relational star-schema data warehouses and marts; multidimensional analytic cubes and financial applications; flat files, Excel files, XML files, and so on. Business datasets can reside in a single type of source, or, most of the time, are spread across various types of sources. Presentation services users are generally business people who need to be able to query that set of sources without any knowledge of technologies, schemas, or how sources are organized in their company. They think of business analysis in terms of measures with specific calculations, hierarchical dimensions for breaking those measures down, and detailed reports of the business transactions themselves. Most of them create queries without knowing it, by picking a dashboard page and some filters. Others create their own analysis by selecting metrics and dimensional attributes, and possibly creating additional calculations. The BI Server bridges that gap from simple business terms to technical physical queries by exposing just the business focused measures and dimensional attributes that business people can use in their analyses and dashboards. After they make their selections and start the analysis, the BI Server plans the best way to query the data sources, writes the optimized sequence of physical queries to those sources, post-processes the results, and presents them to the client as a single result set suitable for tables, pivots and charts. The CEIM is a model that controls the processing of the BI Server. It provides the subject areas that presentation services exposes for business users to select simplified metrics and dimensional attributes for their analysis. It models the mappings to the physical data access, the calculations and logical transformations, and the data access security rules. The CEIM consists of metadata stored in the repository, authored by developers using the Administration Tool client. Presentation services and other query clients create their queries in BI EE's SQL-92 language, called Logical SQL or LSQL. The API simply uses ODBC or JDBC to pass the query to the BI Server. Presentation services writes the LSQL query in terms of the simplified objects presented to the users. The BI Server creates a query plan, and rewrites the LSQL into fully-detailed SQL or other languages suitable for querying the physical sources. For example, the LSQL on the left below was rewritten into the physical SQL for an Oracle 11g database on the right. Logical SQL Physical SQL SELECT "D0 Time"."T02 Per Name Month" saw_0, "D4 Product"."P01 Product" saw_1, "F2 Units"."2-01 Billed Qty (Sum All)" saw_2 FROM "Sample Sales" ORDER BY saw_0, saw_1 WITH SAWITH0 AS ( select T986.Per_Name_Month as c1, T879.Prod_Dsc as c2, sum(T835.Units) as c3, T879.Prod_Key as c4 from Product T879 /* A05 Product */ , Time_Mth T986 /* A08 Time Mth */ , FactsRev T835 /* A11 Revenue (Billed Time Join) */ where ( T835.Prod_Key = T879.Prod_Key and T835.Bill_Mth = T986.Row_Wid) group by T879.Prod_Dsc, T879.Prod_Key, T986.Per_Name_Month ) select SAWITH0.c1 as c1, SAWITH0.c2 as c2, SAWITH0.c3 as c3 from SAWITH0 order by c1, c2 Probably everybody reading this blog can write SQL or MDX. However, the trick in designing the CEIM is that you are modeling a query-generation factory. Rather than hand-crafting individual queries, you model behavior and relationships, thus configuring the BI Server machinery to manufacture millions of different queries in response to random user requests. This mass production requires a different mindset and approach than when you are designing individual SQL statements in tools such as Oracle SQL Developer, Oracle Hyperion Interactive Reporting (formerly Brio), or Oracle BI Publisher. The Structure of the Common Enterprise Information Model (CEIM) The CEIM has a unique structure specifically for modeling the relationships and behaviors that fill the gap from logical user requests to physical data source queries and back to the result. The model divides the functionality into three specialized layers, called Presentation, Business Model and Mapping, and Physical, as shown below. Presentation services clients can generally only see the presentation layer, and the objects in the presentation layer are normally the only ones used in the LSQL request. When a request comes into the BI Server from presentation services or another client, the relationships and objects in the model allow the BI Server to select the appropriate data sources, create a query plan, and generate the physical queries. That's the left to right flow in the diagram below. When the results come back from the data source queries, the right to left relationships in the model show how to transform the results and perform any final calculations and functions that could not be pushed down to the databases. Business Model Think of the business model as the heart of the CEIM you are designing. This is where you define the analytic behavior seen by the users, and the superset library of metric and dimension objects available to the user community as a whole. It also provides the baseline business-friendly names and user-readable dictionary. For these reasons, it is often called the "logical" model--it is a virtual database schema that persists no data, but can be queried as if it is a database. The business model always has a dimensional shape (more on this in future posts), and its simple shape and terminology hides the complexity of the source data models. Besides hiding complexity and normalizing terminology, this layer adds most of the analytic value, as well. This is where you define the rich, dimensional behavior of the metrics and complex business calculations, as well as the conformed dimensions and hierarchies. It contributes to the ease of use for business users, since the dimensional metric definitions apply in any context of filters and drill-downs, and the conformed dimensions enable dashboard-wide filters and guided analysis links that bring context along from one page to the next. The conformed dimensions also provide a key to hiding the complexity of many sources, including federation of different databases, behind the simple business model. Note that the expression language in this layer is LSQL, so that any expression can be rewritten into any data source's query language at run time. This is important for federation, where a given logical object can map to several different physical objects in different databases. It is also important to portability of the CEIM to different database brands, which is a key requirement for Oracle's BI Applications products. Your requirements process with your user community will mostly affect the business model. This is where you will define most of the things they specifically ask for, such as metric definitions. For this reason, many of the best-practice methodologies of our consulting partners start with the high-level definition of this layer. Physical Model The physical model connects the business model that meets your users' requirements to the reality of the data sources you have available. In the query factory analogy, think of the physical layer as the bill of materials for generating physical queries. Every schema, table, column, join, cube, hierarchy, etc., that will appear in any physical query manufactured at run time must be modeled here at design time. Each physical data source will have its own physical model, or "database" object in the CEIM. The shape of each physical model matches the shape of its physical source. In other words, if the source is normalized relational, the physical model will mimic that normalized shape. If it is a hypercube, the physical model will have a hypercube shape. If it is a flat file, it will have a denormalized tabular shape. To aid in query optimization, the physical layer also tracks the specifics of the database brand and release. This allows the BI Server to make the most of each physical source's distinct capabilities, writing queries in its syntax, and using its specific functions. This allows the BI Server to push processing work as deep as possible into the physical source, which minimizes data movement and takes full advantage of the database's own optimizer. For most data sources, native APIs are used to further optimize performance and functionality. The value of having a distinct separation between the logical (business) and physical models is encapsulation of the physical characteristics. This encapsulation is another enabler of packaged BI applications and federation. It is also key to hiding the complex shapes and relationships in the physical sources from the end users. Consider a routine drill-down in the business model: physically, it can require a drill-through where the first query is MDX to a multidimensional cube, followed by the drill-down query in SQL to a normalized relational database. The only difference from the user's point of view is that the 2nd query added a more detailed dimension level column - everything else was the same. Mappings Within the Business Model and Mapping Layer, the mappings provide the binding from each logical column and join in the dimensional business model, to each of the objects that can provide its data in the physical layer. When there is more than one option for a physical source, rules in the mappings are applied to the query context to determine which of the data sources should be hit, and how to combine their results if more than one is used. These rules specify aggregate navigation, vertical partitioning (fragmentation), and horizontal partitioning, any of which can be federated across multiple, heterogeneous sources. These mappings are usually the most sophisticated part of the CEIM. Presentation You might think of the presentation layer as a set of very simple relational-like views into the business model. Over ODBC/JDBC, they present a relational catalog consisting of databases, tables and columns. For business users, presentation services interprets these as subject areas, folders and columns, respectively. (Note that in 10g, subject areas were called presentation catalogs in the CEIM. In this blog, I will stick to 11g terminology.) Generally speaking, presentation services and other clients can query only these objects (there are exceptions for certain clients such as BI Publisher and Essbase Studio). The purpose of the presentation layer is to specialize the business model for different categories of users. Based on a user's role, they will be restricted to specific subject areas, tables and columns for security. The breakdown of the model into multiple subject areas organizes the content for users, and subjects superfluous to a particular business role can be hidden from that set of users. Customized names and descriptions can be used to override the business model names for a specific audience. Variables in the object names can be used for localization. For these reasons, you are better off thinking of the tables in the presentation layer as folders than as strict relational tables. The real semantics of tables and how they function is in the business model, and any grouping of columns can be included in any table in the presentation layer. In 11g, an LSQL query can also span multiple presentation subject areas, as long as they map to the same business model. Other Model Objects There are some objects that apply to multiple layers. These include security-related objects, such as application roles, users, data filters, and query limits (governors). There are also variables you can use in parameters and expressions, and initialization blocks for loading their initial values on a static or user session basis. Finally, there are Multi-User Development (MUD) projects for developers to check out units of work, and objects for the marketing feature used by our packaged customer relationship management (CRM) software. The Query Factory At this point, you should have a grasp on the query factory concept. When developing the CEIM model, you are configuring the BI Server to automatically manufacture millions of queries in response to random user requests. You do this by defining the analytic behavior in the business model, mapping that to the physical data sources, and exposing it through the presentation layer's role-based subject areas. While configuring mass production requires a different mindset than when you hand-craft individual SQL or MDX statements, it builds on the modeling and query concepts you already understand. The following posts in this series will walk through the CEIM modeling concepts and best practices in detail. We will initially review dimensional concepts so you can understand the business model, and then present a pattern-based approach to learning the mappings from a variety of physical schema shapes and deployments to the dimensional model. Along the way, we will also present the dimensional calculation template, and learn how to configure the many additivity patterns.

Read the article
Understanding and Controlling Parallel Query Processing in SQL Server

Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them.

Read the article
SQLAuthority News – Download Whitepaper – Understanding and Controlling Parallel Query Processing in SQL Server

- by pinaldave

My recently article SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database has received many good comments regarding MAXDOP 1 and MAXDOP 0. I really enjoyed reading the comments as the comments are received from industry leaders and gurus. I was further researching on the subject and I end up on following white paper written by Microsoft. Understanding and Controlling Parallel Query Processing in SQL Server Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them. To review the document, please download the Understanding and Controlling Parallel Query Processing in SQL Server Word document. Note: Above abstract has been taken from here. The real question is what does the parallel queries has made life of DBA much simpler or is it looked at with potential issue related to degradation of the performance? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

Read the article
Is this so bad when using MySQL queries in PHP?

- by alex

I need to update a lot of rows, per a user request. It is a site with products. I could... Delete all old rows for that product, then loop through string building a new INSERT query. This however will lose all data if the INSERT fails. Perform an UPDATE through each loop. This loop currently iterates over 8 items, but in the future it may get up to 15. This many UPDATEs doesn't sound like too good an idea. Change DB Schema, and add an auto_increment Id to the rows. Then first do a SELECT, get all old rows ids in a variable, perform one INSERT, and then a DELETE WHERE IN SET. What is the usual practice here? Thanks

Read the article
How to select the top n from a union of two queries where the resulting order needs to be ranked by individual query?

- by Jedidja

Let's say I have a table with usernames: Id | Name ----------- 1 | Bobby 20 | Bob 90 | Bob 100 | Joe-Bob 630 | Bobberino 820 | Bob Junior I want to return a list of n matches on name for 'Bob' where the resulting set first contains exact matches followed by similar matches. I thought something like this might work SELECT TOP 4 a.* FROM ( SELECT * from Usernames WHERE Name = 'Bob' UNION SELECT * from Usernames WHERE Name LIKE '%Bob%' ) AS a but there are two problems: It's an inefficient query since the sub-select could return many rows (looking at the execution plan shows a join happening before top) (Almost) more importantly, the exact match(es) will not appear first in the results since the resulting set appears to be ordered by primary key. I am looking for a query that will return (for TOP 4) Id | Name --------- 20 | Bob 90 | Bob (and then 2 results from the LIKE query, e.g. 1 Bobby and 100 Joe-Bob) Is this possible in a single query?

Read the article
Can I split an IEnumerable into two by a boolean criteria without two queries?

- by SFun28

Folks, Is it possible to split an IEnumerable into two IEnumerables using LINQ and only a single query/linq statement? In this way, I would avoid iterating through the IEnumerable twice. For example, is it possible to combine the last two statements below so allValues is only traversed once? IEnumerable<MyObj> allValues = ... List<MyObj> trues = allValues.Where( val => val.SomeProp ).ToList(); List<MyObj> falses = allValues.Where( val => !val.SomeProp ).ToList();

Read the article
MySQL create stored procedure fails but all internal queries succeed alone?

- by Mark

Hi all, I just created a simple database in MySQL, and I am learning how to write stored proc's. I'm familiar with M$SQL and as far as I can see the following should work: use mydb; -- -------------------------------------------------------------------------------- -- Routine DDL -- -------------------------------------------------------------------------------- DELIMITER // CREATE PROCEDURE mydb.doStats () BEGIN CREATE TABLE IF NOT EXISTS resultprobability ( ballNumber INT NOT NULL , probability FLOAT NULL, PRIMARY KEY (ballNumber) ); CREATE TABLE IF NOT EXISTS drawProbability ( drawDate DATE NOT NULL , ball1 INT NULL , ball2 INT NULL , ball3 INT NULL , ball4 INT NULL , ball5 INT NULL , ball6 INT NULL , ball7 INT NULL , score FLOAT NULL , PRIMARY KEY (drawDate) ); TRUNCATE TABLE resultprobability; TRUNCATE TABLE drawprobability; INSERT INTO resultprobability (ballNumber, probability) (select resultset.ballNumber ballNumber,(count(0)/(select count(0) from resultset)) probability from resultset group by resultset.ballNumber); INSERT INTO drawProbability (drawDate, ball1, ball2, ball3, ball4, ball5, ball6, ball7, score) (select distinct r.drawDate, a.ballnumber ball1, b.ballnumber ball2, c.ballnumber ball3, d.ballnumber ball4, e.ballnumber ball5, f.ballnumber ball6,g.ballnumber ball7, ((a.probability + b.probability + c.probability + d.probability + e.probability + f.probability + g.probability)/7) score from resultset r inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 1) a on a.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 2) b on b.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 3) c on c.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 4) d on d.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 5) e on e.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 6) f on f.drawdate = r.drawDate inner join (select r.drawDate, r.ballNumber, p.probability from resultset r inner join resultprobability p on p.ballNumber = r.ballNumber where r.appearence = 7) g on g.drawdate = r.drawDate order by score desc); END // DELIMITER ; instead i get the following Executed successfully in 0.002 s, 0 rows affected. Line 1, column 1 Error code 1064, SQL state 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 26 Line 6, column 1 Error code 1064, SQL state 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')) probability from resultset group by resultset.ballNumber); INSERT INTO d' at line 1 Line 31, column 51 Error code 1064, SQL state 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') score from resultset r inner join (select r.drawDate, r.ballNumber, p.probabi' at line 1 Line 39, column 114 Execution finished after 0.002 s, 3 error(s) occurred. What am I doing wrong? I seem to have exhausted my limited mental abilities!

Read the article
Tuning Red Gate: #3 of Lots

- by Grant Fritchey

I'm drilling down into the metrics about SQL Server itself available to me in the Analysis tab of SQL Monitor to see what's up with our two problematic servers. In the previous post I'd noticed that rg-sql01 had quite a few CPU spikes. So one of the first things I want to check there is how much CPU is getting used by SQL Server itself. It's possible we're looking at some other process using up all the CPU Nope, It's SQL Server. I compared this to the rg-sql02 server: You can see that there is a more, consistently low set of CPU counters there. I clearly need to look at rg-sql01 and capture more specific data around the queries running on it to identify which ones are causing these CPU spikes. I always like to look at the Batch Requests/sec on a server, not because it's an indication of a problem, but because it gives you some idea of the load. Just how much is this server getting hit? Here are rg-sql01 and rg-sql02: Of the two, clearly rg-sql01 has a lot of activity. Remember though, that's all this is a measure of, activity. It doesn't suggest anything other than what it says, the number of requests coming in. But it's the kind of thing you want to know in order to understand how the system is used. Are you seeing a correlation between the number of requests and the CPU usage, or a reverse correlation, the number of requests drops as the CPU spikes? See, it's useful. Some of the details you can look at are Compilations/sec, Compilations/Batch and Recompilations/sec. These give you some idea of how the cache is getting used within the system. None of these showed anything interesting on either server. One metric that I like (even though I know it can be controversial) is the Page Life Expectancy. On the average server I expect see a series of mountains as the PLE climbs then drops due to a data load or something along those lines. That's not the case here: Those spikes back in January suggest that the servers weren't really being used much. The PLE on the rg-sql01 seems to be somewhat consistent growing to 3 hours or so then dropping, but the rg-sql02 PLE looks like it might be all over the map. Instead of continuing to look at this high level gathering data view, I'm going to drill down on rg-sql02 and see what it's done for the last week: And now we begin to see where we might have an issue. Memory on this system is getting flushed every 1/2 hour or so. I'm going to check another metric, scans: Whoa! I'm going back to the system real quick to look at some disk information again for rg-sql02. Here is the average disk queue length on the server: and the transfers Right, I think I have a guess as to what's up here. We're seeing memory get flushed constantly and we're seeing lots of scans. The disks are queuing, especially that F drive, and there are lots of requests that correspond to the scans and the memory flushes. In short, we've got queries that are scanning the data, a lot, so we either have bad queries or bad indexes. I'm going back to the server overview for rg-sql02 and check the Top 10 expensive queries. I'm modifying it to show me the last 3 days and the totals, so I'm not looking at some maintenance routine that ran 10 minutes ago and is skewing the results: OK. I need to look into these queries that are getting executed this much. They're generating a lot of reads, but which queries are generating the most reads: Ow, all still going against the same database. This is where I'm going to temporarily leave SQL Monitor. What I want to do is connect up to the server, validate that the Warehouse database is using the F:\ drive (which I'll put money down it is) and then start seeing what's up with these queries. Part 1 of the Series Part 2 of the Series

Read the article
SQL SERVER – How to Force New Cardinality Estimation or Old Cardinality Estimation

- by Pinal Dave

After reading my initial two blog posts on New Cardinality Estimation, I received quite a few questions. Once I receive this question, I felt I should have clarified it earlier few things when I started to write about cardinality. Before continuing this blog, if you have not read it before I suggest you read following two blog posts. SQL SERVER – Simple Demo of New Cardinality Estimation Features of SQL Server 2014 SQL SERVER – Cardinality Estimation and Performance – SQL in Sixty Seconds #072 Q: Does new cardinality will improve performance of all of my queries? A: Remember, there is no 0 or 1 logic when it is about estimation. The general assumption is that most of the queries will be benefited by new cardinality estimation introduced in SQL Server 2014. That is why the generic advice is to set the compatibility level of the database to 120, which is for SQL Server 2014. Q: Is it possible that after changing cardinality estimation to new logic by setting the value to compatibility level to 120, I get degraded performance for few queries? A: Yes, it is possible. However, the number of the queries where this impact should be very less. Q: Can I still run my database in older compatibility level and force few queries to newer cardinality estimation logic? If yes, How? A: Yes, you can do that. You will need to force your query with trace flag 2312 to use newer cardinality estimation logic. USE AdventureWorks2014 GO -- Old Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 110 GO -- Using New Cardinality Estimation SELECT [AddressID],[AddressLine1],[City] FROM [Person].[Address] OPTION(QUERYTRACEON 2312);; -- Using Old Cardinality Estimation SELECT [AddressID],[AddressLine1],[City] FROM [Person].[Address]; GO Q: Can I run my database in newer compatibility level and force few queries to older cardinality estimation logic? If yes, How? A: Yes, you can do that. You will need to force your query with trace flag 9481 to use newer cardinality estimation logic. USE AdventureWorks2014 GO -- NEW Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 120 GO -- Using New Cardinality Estimation SELECT [AddressID],[AddressLine1],[City] FROM [Person].[Address]; -- Using Old Cardinality Estimation SELECT [AddressID],[AddressLine1],[City] FROM [Person].[Address] OPTION(QUERYTRACEON 9481); GO I guess, I have covered most of the questions so far I have received. If I have missed any questions, please send me again and I will include the same. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Get corresponding physical disk drives of mountpoints with WMI queries?

- by Thomas

Is there a way to retrieve a connection between a mountpoint (a volume which is mounted into the file system instead of mounted to a drive letter) and its belonging physical disk drive(s) with WMI? For example I have got a volume mountpoint on a W2K8 server which is mounted to “C:\Data\” and the mountpoint is spreaded on the physical disk drives 2, 4, and 5 of the server (the Data Management of the Server Manager shows that) but I cannot find a way to get this to know by using WMI. Volumes which have got a drive letter can be connected with the WMI-Classes Win32_DiskDrive -- Win32_DiskDriveToDiskPartition -- Win32_DiskPartition -- Win32_LogicalDiskToPartition -- Win32_LogicalDisk – but the problem is, that volume mountpoints aren’t listed in the class Win32_LogicalDisk, they are only listed in Win32_Volume. And I did not find a way to connect the class Win32_Volume with the class Win32_DiskDrive – there are missing some linking classes. Does anyone know a solution?

Read the article
NDepend 4 – First Steps

- by Ricardo Peres

Introduction Thanks to Patrick Smacchia I had the chance to test NDepend 4. I can only say: awesome! This will be the first of a series of posts on NDepend, where I will talk about my discoveries. Keep in mind that I am just starting to use it, so more experienced users may find these too basic, I just hope I don’t say anything foolish! I must say that I am in no way affiliated with NDepend and I never actually met Patrick. Installation No installation program – a curious decision, I’m not against it -, just unzip the files to a folder and run the executable. It will optionally register itself with Visual Studio 2008, 2010 and 11 as well as RedGate’s Reflector; also, it automatically looks for updates. NDepend can either be used as a stand-alone program (with or without a GUI) or from within Visual Studio or Reflector. Getting Started One thing that really pleases me is the Getting Started section of the stand-alone, with links to pages on NDepend’s web site, featuring detailed explanations, which usually include screenshots and small videos (<5 minutes). There’s also an How do I with hierarchical navigation that guides us to through the major features so that we can easily find what we want. Usage There are two basic ways to use NDepend: Analyze .NET solutions, projects or assemblies; Compare two versions of the same assembly. I have so far not used NDepend to compare assemblies, so I will first talk about the first option. After selecting a solution and some of its projects, it generates a single HTML page with an highly detailed report of the analysis it produced. This includes some metrics such as number of lines of code, IL instructions, comments, types, methods and properties, the calculation of the cyclomatic complexity, coupling and lots of others indicators, typically grouped by type, namespace and assembly. The HTML also includes some nice diagrams depicting assembly dependencies, type and method relative proportions (according to the number of IL instructions, I guess) and assembly analysis relating to abstractness and stability. Useful, I would say. Then there’s the rules; NDepend tests the target assemblies against a set of more than 120 rules, grouped in categories Code Quality, Object Oriented Design, Design, Architecture and Layering, Dead Code, Visibility, Naming Conventions, Source Files Organization and .NET Framework Usage. The full list can be configured on the application, and an explanation of each rule can be found on the web site. Rules can be validated, violated and violated in a critical manner, and the HTML will contain the violated rules, their queries – more on this later - and results. The HTML uses some nice JavaScript effects, which allow paging and sorting of tables, so its nice to use. Similar to the rules, there are some queries that display results for a number (about 200) questions grouped as Object Oriented Design, API Breaking Changes (for assembly version comparison), Code Diff Summary (also for version comparison) and Dead Code. The difference between queries and rules is that queries are not classified as passes, violated or critically violated, just present results. The queries and rules are expressed through CQLinq, which is a very powerful LINQ derivative specific to code analysis. All of the included rules and queries can be enabled or disabled and new ones can be added, with intellisense to help. Besides the HTML report file, the NDepend application can be used to explore all analysis results, compare different versions of analysis reports and to run custom queries. Comparison to Other Analysis Tools Unlike StyleCop, NDepend only works with assemblies, not source code, so you can’t expect it to be able to enforce brackets placement, for example. It is more similar to FxCop, but you don’t have the option to analyze at the IL level, that is, other that the number of IL instructions and the complexity. What’s Next In the next days I’ll continue my exploration with a real-life test case. References The NDepend web site is http://www.ndepend.com/. Patrick keeps an updated blog on http://codebetter.com/patricksmacchia/ and he regularly monitors StackOverflow for questions tagged NDepend, which you can find on http://stackoverflow.com/questions/tagged/ndepend. The default list of CQLinq rules, queries and statistics can be found at http://www.ndepend.com/DefaultRules/webframe.html. The syntax itself is described at http://www.ndepend.com/Doc_CQLinq_Syntax.aspx and its features at http://www.ndepend.com/Doc_CQLinq_Features.aspx.

Read the article
How can I speed up queries against tables I cannot add indexes to?

- by RenderIn

I access several tables remotely via DB Link. They are very normalized and the data in each is effective-dated. Of the millions of records in each table, only a subset of ~50k are current records. The tables are internally managed by a commercial product that will throw a huge fit if I add indexes or make alterations to its tables in any way. What are my options for speeding up access to these tables?

Read the article
MySQL & PHP - select/option lists and showing data to users that still allows me to generate queries

- by Andrew Heath

Sorry for the unclear title, an example will clear things up: TABLE: Scenario_victories ID scenid timestamp userid side playdate 1 RtBr001 2010-03-15 17:13:36 7 1 2010-03-10 2 RtBr001 2010-03-15 17:13:36 7 1 2010-03-10 3 RtBr001 2010-03-15 17:13:51 7 2 2010-03-10 ID and timestamp are auto-insertions by the database when the other 4 fields are added. The first thing to note is that a user can record multiple playings of the same scenario (scenid) on the same date (playdate) possibly with the same outcome (side = winner). Hence the need for the unique ID and timestamps for good measure. Now, on their user page, I'm displaying their recorded play history in a <select><option>... list form with 2 buttons at the end - Delete Record and Go to Scenario My script takes the scenid and after hitting a few other tables returns with something more user-friendly like: (playdate) (from scenid) (from side) ######################################################### # 2010-03-10 Road to Berlin #1 -- Germany, Hungary won # # 2010-03-10 Road to Berlin #1 -- Germany, Hungary won # # 2010-03-10 Road to Berlin #1 -- Soviet Union won # ######################################################### [Delete Record] [Go To Scenario] in HTML: <select name="history" size=3> <option>2010-03-10 Road to Berlin #1 -- Germany, Hungary won</option> <option>2010-03-10 Road to Berlin #1 -- Germany, Hungary won</option> <option>2010-03-10 Road to Berlin #1 -- Soviet Union won</option> </select> Now, if you were to highlight the first record and click Go to Scenario there is enough information there for me to parse it and produce the exact scenario you want to see. However, if you were to select Delete Record there is not - I have the playdate and I can parse the scenid and side from what's listed, but in this example all three records would have the same result. I appear to have painted myself into a corner. Does anyone have a suggestion as to how I can get some unique identifying data (ID and/or timestamp) to ride along on this form without showing it to the user? PHP-only please, I must be NoScript compliant!

Read the article
Which kinds of queries is better for querying against conceptual model in Entity Framework?

- by masoud ramezani

There are 3 way for querying against conceptual model in EF : LINQ to Entity Entity SQL Query Builder Methods Which one is better for which situation? Is there any performance issues for these 3 type of querying?

Read the article
Complex Rails queries across multiple tables, unions, and will_paginate. Solved.

- by uberllama

Hi folks. I've been working on a complex "user feed" type of functionality for a while now, and after experimenting with various union plugins, hacking named scopes, and brute force, have arrived at a solution I'm happy with. S.O. has been hugely helpful for me, so I thought I'd post it here in hopes that it might help others and also to get feedback -- it's very possible that I worked on this so long that I walked down an unnecessarily complicated road. For the sake of my example, I'll use users, groups, and articles. A user can follow other users to get a feed of their articles. They can also join groups and get a feed of articles that have been added to those groups. What I needed was a combined, pageable feed of distinct articles from a user's contacts and groups. Let's begin. user.rb has_many :articles has_many :contacts has_many :contacted_users, :through => :contacts has_many :memberships has_many :groups, :through => :memberships contact.rb belongs_to :user belongs_to :contacted_user, :class_name => "User", :foreign_key => "contacted_user_id" article.rb belongs_to :user has_many :submissions has_many :groups, :through => :submissions group.rb has_many :memberships has_many :users, :through => :memberships has_many :submissions has_many :articles, :through => :submissions Those are the basic models that define my relationships. Now, I add two named scopes to the Article model so that I can get separate feeds of both contact articles and group articles should I desire. article.rb # Get all articles by user's contacts named_scope :by_contacts, lambda {|user| {:joins => "inner join contacts on articles.user_id = contacts.contacted_user_id", :conditions => ["articles.published = 1 and contacts.user_id = ?", user.id]} } # Get all articles in user's groups. This does an additional query to get the user's group IDs, then uses those in an IN clause named_scope :by_groups, lambda {|user| {:select => "DISTINCT articles.*", :joins => :submissions, :conditions => {:submissions => {:group_id => user.group_ids}}} } Now I have to create a method that will provide a UNION of these two feeds into one. Since I'm using Rails 2.3.5, I have to use the construct_finder_sql method to render a scope into its base sql. In Rails 3.0, I could use the to_sql method. user.rb def feed "(#{Article.by_groups(self).send(:construct_finder_sql,{})}) UNION (#{Article.by_contacts(self).send(:construct_finder_sql,{})})" end And finally, I can now call this method and paginate it from my controller using will_paginate's paginate_by_sql method. HomeController.rb @articles = Article.paginate_by_sql(current_user.feed, :page => 1) And we're done! It may seem simple now, but it was a lot of work getting there. Feedback is always appreciated. In particular, it would be great to get away from some of the raw sql hacking. Cheers.

Read the article
How to search phrase queries in inverted index structure?

- by Mehdi Amrollahi

If we want to search a query like this "t1 t2 t3" (t1,t2 ,t3 must be queued) in an inverted index structure , which ways could we do ? 1-First we search the "t1" term and find all documents that contains "t1" , then do this work for "t2" and then "t3" . Then find documents that positions of "t1" , "t2" and "t3" are next to each other . 2-First we search the "t1" term and find all documents that contains "t1" , then in all documents that we found , we search the "t2" and next , in the result of this , we find documents that contains "t3" . thanks .

Read the article

Search Results

Search found 4685 results on 188 pages for 'queries'.

Page 28/188 | < Previous Page | 24 25 26 27 28 29 30 31 32 33 34 35 | Next Page >

- by David

- by Synesso

- by DanM

- by dmanexe

- by sycobuny

- by Michael Joell

- by AD

- by Simon Cooper

- by Martin Hinshelwood

- by bob.ertl(at)oracle.com

- by pinaldave

- by alex

- by Jedidja

- by SFun28

- by Mark

- by Grant Fritchey

- by Pinal Dave

- by Thomas

- by Ricardo Peres

- by RenderIn

- by Andrew Heath

- by masoud ramezani

- by uberllama

- by Mehdi Amrollahi

< Previous Page | 24 25 26 27 28 29 30 31 32 33 34 35 | Next Page >