Search Results

Search found 8161 results on 327 pages for 'django queries'.

Page 147/327 | < Previous Page | 143 144 145 146 147 148 149 150 151 152 153 154  | Next Page >

  • Data access strategy for a site like SO - sorted SQL queries and simultaneous updates that affect th

    - by Kaleb Brasee
    I'm working on a Grails web app that would be similar in access patterns to StackOverflow or MyLifeIsAverage - users can vote on entries, and their votes are used to sort a list of entries based on the number of votes. Votes can be placed while the sorted select queries are being performed. Since the selects would lock a large portion of the table, it seems that normal transaction locking would cause updates to take forever (given enough traffic). Has anyone worked on an app with a data access pattern such as this, and if so, did you find a way to allow these updates and selects to happen more or less concurrently? Does anyone know how sites like SO approach this? My thought was to make the sorted selects dirty reads, since it is acceptable if they're not completely up to date all of the time. This is my only idea for possibly improving performance of these selects and updates, but I thought someone might know a better way.

    Read the article

  • Need help on nested loop of queries in php and mysql?

    - by mysqllearner
    Hi, I am trying to get do this: <?php $good_customer = 0; $q = mysql_query("SELECT user FROM users WHERE activated = '1'"); // this gives me about 40k users while($r = mysql_fetch_assoc($q)){ $money_spent = 0; $user = $r['user']; // Do queries on another 20 tables for($i = 1; $i<=20 ; $i++){ $tbl_name = 'data' . $i; $q2 = mysql_query("SELECT money_spent FROM $tbl_name WHERE user = '{$user}'"); while($r2 = mysql_fetch_assoc($q2)){ $money_spend += $r2['money_spent']; } if($money_spend > 1000000){ $good_customer += 1; } } } This is just an example. I am testing on localhost, for single user, it returns very fast. But when I try 1000, it takes forever, not even mentioned 40k users. Anyway to optimise/improve this code? EDIT: By the way, each of the others 20 tables has ~20 - 40k records

    Read the article

  • Efficient way to combine results of two database queries.

    - by ensnare
    I have two tables on different servers, and I'd like some help finding an efficient way to combine and match the datasets. Here's an example: From server 1, which holds our stories, I perform a query like: query = """SELECT author_id, title, text FROM stories ORDER BY timestamp_created DESC LIMIT 10 """ results = DB.getAll(query) for i in range(len(results)): #Build a string of author_ids, e.g. '1314,4134,2624,2342' But, I'd like to fetch some info about each author_id from server 2: query = """SELECT id, avatar_url FROM members WHERE id IN (%s) """ values = (uid_list) results = DB.getAll(query, values) Now I need some way to combine these two queries so I have a dict that has the story as well as avatar_url and member_id. If this data were on one server, it would be a simple join that would look like: SELECT * FROM members, stories WHERE members.id = stories.author_id But since we store the data on multiple servers, this is not possible. What is the most efficient way to do this? Thanks.

    Read the article

  • Is there a Firefox or Chrome plugin, or a standalone program, for monitoring site usage and search queries?

    - by Leigh Caldwell
    I'm running some research on how people search the web for specific types of information. I'd like to be able to set them up with a laptop and browser and then record a history of what they search for and what sites they visit. A Firefox or Chrome plugin would be ideal, but a standalone program is fine too. It doesn't need to be free, just quick and reliable. It doesn't need to be a general PC monitoring program (though that would be OK too) - it's only Web usage I need to track. I've found a few on the Web but am not sure which ones to trust. Your recommendations would be much appreciated.

    Read the article

  • SQLAuthority News – We’re sorry… … but your computer or network may be sending automated queries. To

    - by pinaldave
    I use multiple browser many times when I am working with multiple projects simultaneously. Often I use Google Reader to read few feeds. Recently, I faced the following error and this error will not go. I even restarted my computer and rebooted my network. I am confident that my computer does not have viruses or malware, I could not tackle this error. When I opened Google Reader on another browser, it worked fine. Finally, I found the solution and I want share it with all of you. Error We’re sorry… … but your computer or network may be sending automated queries. To protect our users, we can’t process your request right now. I removed the cookies of Google Reader with the name ‘reader_offline’ as displayed in image below. Once I remove the above mentioned cookie, I could login perfectly fine in Google Reader. I think this message from Google was misleading and inaccurate; however, the solution is easy enough. I just wanted to share this quick tip with everyone who is facing such an issue. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology Tagged: Google

    Read the article

  • How do I determine the cause of a sustained spike in mysql queries/activity?

    - by mattmcmanus
    So this is more of a "I'm trying to learn about how this works" question rather than "there is a serious problem I can't figure out!" question. I'm setting up a VPS and have been tweaking and changing things here and there. I recently installed munin (like two days ago) and yesterday I noticed a significant increase in mysql activity. So now my curiosity is going crazy. How do I setup/access mysql's query log? I have about 5 databases on the server so I want to see which one is getting all the action. Is there anything else I can do to keep a better eye on what's going on? Here are the graphs. As you can tell, it's not that much activity at all but I'm just curious at the change. The sites that are on the server right now do not get a lot of traffic. It's running a couple drupal sites, only one of which is live. The live one hasn't had a spike in traffic and the last spike was 250 visitors so it's barely a spike at all.

    Read the article

  • Can I configure a DNS cache not to forward AAAA queries?

    - by itsadok
    I'm setting up an internal DNS cache because my firewall is having trouble handling all the sessions created by DNS requests. I tried using bind9, dnsmasq and DJB dnscache, they all help reduce the number of requests leaving my network, but there are still a lot of request being made. Looking at the log files, and tcpdump and dnstop outputs, it seems that requests that return SERVFAIL do not get cached at all. And a lot of those failed requests are AAAA requests, which is a shame, because I do not have ipv6 enabled on any server. I've looked at several ways to help the situation, and I think if I could somehow prevent AAAA record requests from being forwarded by the DNS cache, it would reduce the number of requests significantly. The closest thing I found was the filter-aaaa-on-v4 option in BIND9. However, this only removes the record from the server response, and does not prevent it from forwarding it. Any help would be appreciated.

    Read the article

  • Play or Lift: which one is more explicit?

    - by Andrea
    I am going to investigate web development with Scala, and the choice is between learning Lift or Play: probably I will not have enough time to try both, at least at first. Now, many comparisons between the two are available on the internet, but I would like to know how do they compare with respect to being explicit and involving less magic. Let me explain what I mean by example. I have used, to various degrees, CakePHP, symfony2, Django and Grails. I feel a very clear distinction between Django and symfony2, which are very explicit about what you are doing, and Grails and CakePHP, which try to do their best to guess what you are trying to achieve and often feel "magical". Let me give some examples comparing Django and Grails. In Django, views are functions that take a request as input and return a response. You can instantiate explicitly an instance of HttpResponse and populate its body with a string, or you can use shortcut functions to leverage the template system. In any case the return value from your view always has the same type. In contrast, the render method from Grails is highly polymorphic. You can throw a context at it and it will try to render a template which is found by convention using that context. Or you can pass it a pair of a template path and a context and that will work too. Or a string. Or XML. Grails tries hard to make sense of whatever you return from your controller. In the Django ORM, each model class has a static attribute representing the manager for that class. That manager exposes a fluent interface to build querysets. In Grails, you can have a similar functionality by composing detached criteria. Still, the most common way to query objects seems to be the use of runtime-generated methods like FindUserByEmailNotNull or FindPostByDateGreaterThan. I will not go further, but my point is that in Django-like frameworks you have control over the whole flow of the request/response process, while in Grails-like ones I feel I only have to feel the blanks and the framework will manage the rest of the flow for me. This is not to criticize Grails or CakePHP; which type you prefer is mainly a matter of preference. In fact, I happen to like some aspects of Grails, but I feel more comfortable with a framework which does less for me. Back to the point of the question: which one among Play and Lift is more explicit about what you do and which one tries to simplify more what you have to do with a layer of "magic"?

    Read the article

  • apt-get install problem: Errors were encountered while processing: sun-j2sdk1.6

    - by pyeleven
    I have the following problem every time i run apt-get install: for example : installing python-django-south ... Unpacking python-django-south (from .../python-django-south_0.5-2_all.deb) ... Setting up sun-j2sdk1.6 (1.6.0+update22-linux-i586.) ... update-alternatives: error: alternative path /usr/lib/j2sdk1.6-sun/jre/plugin/amd64/ns7/libjavaplugin_oji.so doesn't exist. dpkg: error processing sun-j2sdk1.6 (--configure): subprocess installed post-installation script returned error exit status 2 Setting up python-django-south (0.5-2) ... Processing triggers for python-support ... Errors were encountered while processing: sun-j2sdk1.6 E: Sub-process /usr/bin/dpkg returned an error code (1) What could be the problem? I have 9.10 Ubuntu

    Read the article

  • Komodo Edit - How to disable the 'Linter' for a language?

    - by TM.
    I've been using Komodo Edit to work on a Django project. It works great except for one little annoyance: When I am editing Django template files, Komodo likes to put red squiggly lines underneath the first HTML tag that follows a Django tag, because it thinks it is an invalid HTML doc (although it isn't, it just has Django template tags/filters in it). Note that this red squiggly line is called a "Linter error" in the docs that I can find. Is there some way to turn off this red squiggly for only a specific type of language? It's nice to have when working on Python code but it's annoying to have a red squiggly on every single one of my Django templates.

    Read the article

  • What PowerShell/WSMan clients or queries are consuming more than 1000 requests per 2 seconds?

    - by makerofthings7
    Exchange 2010 remote administration tools are complaining with the following error [txexmb02.ibm.com] Connecting to remote server failed with the following error message : The WS-Management service cannot process the request. The system load quota of 1000 requests per 2 seconds has been exceeded. Send future requests at a slower rate or raise the system quota. The next request from this user will not be approved for at least 558475776 milliseconds. For more information, see the about_Remote_Troubleshooting Help topic. + CategoryInfo : OpenError: (System.Manageme....RemoteRunspace:RemoteRunspace) [], PSRemotingTransportException + FullyQualifiedErrorId : PSSessionOpenFailed VERBOSE: Connecting to TXEXHC02.ibm.com The help document this error referrers to says this is a WS-Man error. We're running SCOM 2007 R2 and am thinking that is increasing the query count, but I need to prove it.

    Read the article

  • [Repost-ish] Impossibly slow queries, Tables indexed, How can I speed it up?

    - by colorfulgrayscale
    Hi guys, I posted a little earlier on here at http://stackoverflow.com/questions/2656837/query-results-taking-too-long-on-200k-database-speed-up-tips asking about slow executing SQL queries. I was told to index the columns; I did. and its still slow (slow as in, i never see the results, both mysql and sqlite freeze up on query). Help would be greatly appreciated. Here is the SQL SELECT equipment.`unitID` AS `equipment_unitID`, equipment.`fleetCode` AS `equipment_fleetCode`, equipment.type AS equipment_type, equipment.tiremap AS equipment_tiremap, tiremap.`TireID` AS `tiremap_TireID`, tiremap.`WorkMap` AS `tiremap_WorkMap`, tiremap.`Position` AS `tiremap_Position`, tiremap.`DepthMap` AS `tiremap_DepthMap`, tiremap.timestamp AS tiremap_timestamp, workreference.`aMap` AS `workreference_aMap`, workreference.`bMap` AS `workreference_bMap`, tirework.`RO` AS `tirework_RO`, tirework.location AS tirework_location, tirework.mileage AS tirework_mileage, tirework.`mechanicCode` AS `tirework_mechanicCode`, tirework.`partNumber` AS `tirework_partNumber`, tirework.`historyID` AS `tirework_historyID`, tirework.workmap AS tirework_workmap, tirework.timestamp AS tirework_timestamp FROM equipment, tiremap, workreference, tirework WHERE equipment.tiremap = tiremap.`TireID` AND tiremap.`WorkMap` = workreference.`aMap` AND workreference.`bMap` = tirework.workmap LIMIT 5 and here is the EXPLAIN for it id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE equipment ALL tiremap 14079 1 SIMPLE tiremap ref PRIMARY,WorkMap,TireID,WorkMap_2 PRIMARY 52 tire.equipment.tiremap 3 1 SIMPLE workreference ref aMap,bMap aMap 52 tire.tiremap.WorkMap 1 1 SIMPLE tirework eq_ref NewIndex1 NewIndex1 52 tire.workreference.bMap 1

    Read the article

  • Would this method work to scale out SQL queries?

    - by David
    I have a database containing a single huge table. At the moment a query can take anything from 10 to 20 minutes and I need that to go down to 10 seconds. I have spent months trying different products like GridSQL. GridSQL works fine, but is using its own parser which does not have all the needed features. I have also optimized my database in various ways without getting the speedup I need. I have a theory on how one could scale out queries, meaning that I utilize several nodes to run a single query in parallel. The idea is to take an incoming SQL query and simply run it exactly like it is on all the nodes. When the results are returned to a coordinator node, the same query is run on the union of the resultsets. I realize that an aggregate function like average need to be rewritten into a count and sum to the nodes and that the coordinator divides the sum of the sums with the sum of the counts to get the average. What kinds of problems could not easily be solved using this model. I believe one issue would be the count distinct function. Edit: I am getting so many nice suggestions, but none have addressed the method.

    Read the article

  • How can I access mainframe data with .Net applications and SQL Queries?

    - by orandov
    We have a large amount of data stored on an IBM mainframe using VSAM files. A lot of this data is dropped on the network every night in the form of text files to be processed and dumped into FoxPro and SQL Server databases. There are also many text files produced nightly by custom applications that get uploaded to the mainframe to keep everything in sync. Keeping the everything in sync is very tricky, to say the least. We are not getting rid of the mainframe any time soon and we would like to replace all the nightly batch processing with real time access to the mainframe data. We would like to be able to: Read data directly from the mainframe and produce reports based on it. Possibly using SQL queries. Read and Write data from custom .Net applications. We are not looking for a new platform to interface with the mainframe like Information Builders offers. We don't want to build application modules or reports with new "Business Intelligence" tools. We already know how to generate reports and write custom applications using SQL,.Net, Visual Studio, etc. All we are looking for is some sort of adapter to connect to our mainframe data. Any ideas are appreciated.

    Read the article

  • How can I improve the performance of LinqToSql queries that use EntitySet properties?

    - by DanM
    I'm using LinqToSql to query a small, simple SQL Server CE database. I've noticed that any operations involving sub-properties are disappointingly slow. For example, if I have a Customer table that is referenced by an Order table, LinqToSql will automatically create an EntitySet<Order> property. This is a nice convenience, allowing me to do things like Customer.Order.Where(o => o.ProductName = "Stopwatch"), but for some reason, SQL Server CE hangs up pretty bad when I try to do stuff like this. One of my queries, which isn't really that complicated takes 3-4 seconds to complete. I can get the speed up to acceptable, even fast, if I just grab the two tables individually and convert them to List<Customer> and List<Order>, then join then manually with my own query, but this is throwing out a lot of what makes LinqToSql so appealing. So, I'm wondering if I can somehow get the whole database into RAM and just query that way, then occasionally save it. Is this possible? How? If not, is there anything else I can do to boost the performance besides resorting to doing all the joins manually? Note: My database in its initial state is about 250K and I don't expect it to grow to more than 1-2Mb. So, loading the data into RAM certainly wouldn't be a problem from a memory point of view. Update Here are the table definitions for the example I used in my question: create table Order ( Id int identity(1, 1) primary key, ProductName ntext null ) create table Customer ( Id int identity(1, 1) primary key, OrderId int null references Order (Id) )

    Read the article

  • Why does Hibernate 2nd level cache only cache queries within a session?

    - by Synesso
    Using a named query in our application and with ehcache as the provider, it seems that the query results are tied to the session within the cache. Any attempt to access the value from the cache for a second time results in a LazyInitializationException We have set lazy = true for the following mapping because this object is also used by another part of the system which does not require the reference... and we want to keep it lean. <class name="domain.ReferenceAdPoint" table="ad_point" mutable="false" lazy="false"> <cache usage="read-only"/> <id name="code" type="long" column="ad_point_id"> <generator class="assigned" /> </id> <property name="name" column="ad_point_description" type="string"/> <set name="synonyms" table="ad_point_synonym" cascade="all-delete-orphan" lazy="true"> <cache usage="read-only"/> <key column="ad_point_id" /> <element type="string" column="synonym_description" /> </set> </class> <query name="find.adpoints.by.heading">from ReferenceAdPoint adpoint left outer join fetch adpoint.synonyms where adpoint.adPointField.headingCode = ?</query> Here's a snippet from our hibernate.cfg.xml <property name="hibernate.cache.provider_class">net.sf.ehcache.hibernate.SingletonEhCacheProvider</property> <property name="hibernate.cache.use_query_cache">true</property> It doesn't seem to make sense that the cache would be constrained to the session. Why are the cached queries not usable outside of the (relatively short-lived) sessions?

    Read the article

  • Is there a better loop I could write to reduce database queries?

    - by dmanexe
    Below is some code I've written that is effective, but makes too many database queries. Is there a way I could optimize and reduce the number of queries but have conditional statements still be as effective as below? I pasted the code repeated a few times just for good measure. echo "<h3>Pool Packages</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Pool Packages") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Pool Packages") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Water Features</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Water Features") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Water Features") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Waterfall Rock Work</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE) { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Waterfall Rock Work") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Sheer Descents</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Sheer Descents") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Sheer Descents") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Booster Pump</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Booster Pump") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Booster Pump") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Pool Concrete Decking</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Pool Concrete Decking") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Pool Concrete Decking") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Solar Heating</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Solar Heating") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Solar Heating") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { } endforeach; echo "</ul>"; echo "<h3>Raised Bond Beam</h3>"; echo "<ul>"; foreach ($items as $item): $this->db->where('id', $item['id']); $query = $this->db->get('items')->row(); if ($item['quantity'] > 1 && $item['quantity'] == TRUE && $query->category == "Raised Bond Beam") { $newprice = $item['quantity'] * $query->price; $totals[] = $newprice; } else { $newprice = $query->price; $totals[] = $newprice; } if ($query->category == "Raised Bond Beam") { echo "<li>" . $query->name . " (QTY: " . $item['quantity'] . " x = " . str_ireplace(" ", "", money_format('%(#10n', $newprice)) . ")</li>"; } else { echo "<li>None</li>"; } endforeach; echo "</ul>"; It goes on beyond this to several more categories, but I don't know how to handle looping through this best. Thanks!

    Read the article

  • How do I execute queries upon DB connection in Rails?

    - by sycobuny
    I have certain initializing functions that I use to set up audit logging on the DB server side (ie, not rails) in PostgreSQL. At least one has to be issued (setting the current user) before inserting data into or updating any of the audited tables, or else the whole query will fail spectacularly. I can easily call these every time before running any save operation in the code, but DRY makes me think I should have the code repeated in as few places as possible, particularly since this diverges greatly from the ideal of database agnosticism. Currently I'm attempting to override ActiveRecord::Base.establish_connection in an initializer to set it up so that the queries are run as soon as I connect automatically, but it doesn't behave as I expect it to. Here is the code in the initializer: class ActiveRecord::Base # extend the class methods, not the instance methods class << self alias :old_establish_connection :establish_connection # hide the default def establish_connection(*args) ret = old_establish_connection(*args) # call the default # set up necessary session variables for audit logging # call these after calling default, to make sure conn is established 1st db = self.class.connection db.execute("SELECT SV.set('current_user', 'test@localhost')") db.execute("SELECT SV.set('audit_notes', NULL)") # end "empty variable" err ret # return the default's original value end end end puts "Loaded custom establish_connection into ActiveRecord::Base" sycobuny:~/rails$ ruby script/server = Booting WEBrick = Rails 2.3.5 application starting on http://0.0.0.0:3000 Loaded custom establish_connection into ActiveRecord::Base This doesn't give me any errors, and unfortunately I can't check what the method looks like internally (I was using ActiveRecord::Base.method(:establish_connection), but apparently that creates a new Method object each time it's called, which is seemingly worthless cause I can't check object_id for any worthwhile information and I also can't reverse the compilation). However, the code never seems to get called, because any attempt to run a save or an update on a database object fails as I predicted earlier. If this isn't a proper way to execute code immediately on connection to the database, then what is?

    Read the article

  • Best way to run multiple queries per second on database, performance wise?

    - by Michael Joell
    I am currently using Java to insert and update data multiple times per second. Never having used databases with Java, I am not sure what is required, and how to get the best performance. I currently have a method for each type of query I need to do (for example, update a row in a database). I also have a method to create the database connection. Below is my simplified code. public static void addOneForUserInChannel(String channel, String username) throws SQLException { Connection dbConnection = null; PreparedStatement ps = null; String updateSQL = "UPDATE " + channel + "_count SET messages = messages + 1 WHERE username = ?"; try { dbConnection = getDBConnection(); ps = dbConnection.prepareStatement(updateSQL); ps.setString(1, username); ps.executeUpdate(); } catch(SQLException e) { System.out.println(e.getMessage()); } finally { if(ps != null) { ps.close(); } if(dbConnection != null) { dbConnection.close(); } } } And my DB connection private static Connection getDBConnection() { Connection dbConnection = null; try { Class.forName(DB_DRIVER); } catch (ClassNotFoundException e) { System.out.println(e.getMessage()); } try { dbConnection = DriverManager.getConnection(DB_CONNECTION, DB_USER,DB_PASSWORD); return dbConnection; } catch (SQLException e) { System.out.println(e.getMessage()); } return dbConnection; } This seems to be working fine for now, with about 1-2 queries per second, but I am worried that once I expand and it is running many more, I might have some issues. My questions: Is there a way to have a persistent database connection throughout the entire run time of the process? If so, should I do this? Are there any other optimizations that I should do to help with performance? Thanks

    Read the article

  • What are good design practices when working with Entity Framework

    - by AD
    This will apply mostly for an asp.net application where the data is not accessed via soa. Meaning that you get access to the objects loaded from the framework, not Transfer Objects, although some recommendation still apply. This is a community post, so please add to it as you see fit. Applies to: Entity Framework 1.0 shipped with Visual Studio 2008 sp1. Why pick EF in the first place? Considering it is a young technology with plenty of problems (see below), it may be a hard sell to get on the EF bandwagon for your project. However, it is the technology Microsoft is pushing (at the expense of Linq2Sql, which is a subset of EF). In addition, you may not be satisfied with NHibernate or other solutions out there. Whatever the reasons, there are people out there (including me) working with EF and life is not bad.make you think. EF and inheritance The first big subject is inheritance. EF does support mapping for inherited classes that are persisted in 2 ways: table per class and table the hierarchy. The modeling is easy and there are no programming issues with that part. (The following applies to table per class model as I don't have experience with table per hierarchy, which is, anyway, limited.) The real problem comes when you are trying to run queries that include one or many objects that are part of an inheritance tree: the generated sql is incredibly awful, takes a long time to get parsed by the EF and takes a long time to execute as well. This is a real show stopper. Enough that EF should probably not be used with inheritance or as little as possible. Here is an example of how bad it was. My EF model had ~30 classes, ~10 of which were part of an inheritance tree. On running a query to get one item from the Base class, something as simple as Base.Get(id), the generated SQL was over 50,000 characters. Then when you are trying to return some Associations, it degenerates even more, going as far as throwing SQL exceptions about not being able to query more than 256 tables at once. Ok, this is bad, EF concept is to allow you to create your object structure without (or with as little as possible) consideration on the actual database implementation of your table. It completely fails at this. So, recommendations? Avoid inheritance if you can, the performance will be so much better. Use it sparingly where you have to. In my opinion, this makes EF a glorified sql-generation tool for querying, but there are still advantages to using it. And ways to implement mechanism that are similar to inheritance. Bypassing inheritance with Interfaces First thing to know with trying to get some kind of inheritance going with EF is that you cannot assign a non-EF-modeled class a base class. Don't even try it, it will get overwritten by the modeler. So what to do? You can use interfaces to enforce that classes implement some functionality. For example here is a IEntity interface that allow you to define Associations between EF entities where you don't know at design time what the type of the entity would be. public enum EntityTypes{ Unknown = -1, Dog = 0, Cat } public interface IEntity { int EntityID { get; } string Name { get; } Type EntityType { get; } } public partial class Dog : IEntity { // implement EntityID and Name which could actually be fields // from your EF model Type EntityType{ get{ return EntityTypes.Dog; } } } Using this IEntity, you can then work with undefined associations in other classes // lets take a class that you defined in your model. // that class has a mapping to the columns: PetID, PetType public partial class Person { public IEntity GetPet() { return IEntityController.Get(PetID,PetType); } } which makes use of some extension functions: public class IEntityController { static public IEntity Get(int id, EntityTypes type) { switch (type) { case EntityTypes.Dog: return Dog.Get(id); case EntityTypes.Cat: return Cat.Get(id); default: throw new Exception("Invalid EntityType"); } } } Not as neat as having plain inheritance, particularly considering you have to store the PetType in an extra database field, but considering the performance gains, I would not look back. It also cannot model one-to-many, many-to-many relationship, but with creative uses of 'Union' it could be made to work. Finally, it creates the side effet of loading data in a property/function of the object, which you need to be careful about. Using a clear naming convention like GetXYZ() helps in that regards. Compiled Queries Entity Framework performance is not as good as direct database access with ADO (obviously) or Linq2SQL. There are ways to improve it however, one of which is compiling your queries. The performance of a compiled query is similar to Linq2Sql. What is a compiled query? It is simply a query for which you tell the framework to keep the parsed tree in memory so it doesn't need to be regenerated the next time you run it. So the next run, you will save the time it takes to parse the tree. Do not discount that as it is a very costly operation that gets even worse with more complex queries. There are 2 ways to compile a query: creating an ObjectQuery with EntitySQL and using CompiledQuery.Compile() function. (Note that by using an EntityDataSource in your page, you will in fact be using ObjectQuery with EntitySQL, so that gets compiled and cached). An aside here in case you don't know what EntitySQL is. It is a string-based way of writing queries against the EF. Here is an example: "select value dog from Entities.DogSet as dog where dog.ID = @ID". The syntax is pretty similar to SQL syntax. You can also do pretty complex object manipulation, which is well explained [here][1]. Ok, so here is how to do it using ObjectQuery< string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); The first time you run this query, the framework will generate the expression tree and keep it in memory. So the next time it gets executed, you will save on that costly step. In that example EnablePlanCaching = true, which is unnecessary since that is the default option. The other way to compile a query for later use is the CompiledQuery.Compile method. This uses a delegate: static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => ctx.DogSet.FirstOrDefault(it => it.ID == id)); or using linq static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet where dog.ID == id select dog).FirstOrDefault()); to call the query: query_GetDog.Invoke( YourContext, id ); The advantage of CompiledQuery is that the syntax of your query is checked at compile time, where as EntitySQL is not. However, there are other consideration... Includes Lets say you want to have the data for the dog owner to be returned by the query to avoid making 2 calls to the database. Easy to do, right? EntitySQL string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)).Include("Owner"); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); CompiledQuery static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet.Include("Owner") where dog.ID == id select dog).FirstOrDefault()); Now, what if you want to have the Include parametrized? What I mean is that you want to have a single Get() function that is called from different pages that care about different relationships for the dog. One cares about the Owner, another about his FavoriteFood, another about his FavotireToy and so on. Basicly, you want to tell the query which associations to load. It is easy to do with EntitySQL public Dog Get(int id, string include) { string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)) .IncludeMany(include); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); } The include simply uses the passed string. Easy enough. Note that it is possible to improve on the Include(string) function (that accepts only a single path) with an IncludeMany(string) that will let you pass a string of comma-separated associations to load. Look further in the extension section for this function. If we try to do it with CompiledQuery however, we run into numerous problems: The obvious static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.Include(include) where dog.ID == id select dog).FirstOrDefault()); will choke when called with: query_GetDog.Invoke( YourContext, id, "Owner,FavoriteFood" ); Because, as mentionned above, Include() only wants to see a single path in the string and here we are giving it 2: "Owner" and "FavoriteFood" (which is not to be confused with "Owner.FavoriteFood"!). Then, let's use IncludeMany(), which is an extension function static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.IncludeMany(include) where dog.ID == id select dog).FirstOrDefault()); Wrong again, this time it is because the EF cannot parse IncludeMany because it is not part of the functions that is recognizes: it is an extension. Ok, so you want to pass an arbitrary number of paths to your function and Includes() only takes a single one. What to do? You could decide that you will never ever need more than, say 20 Includes, and pass each separated strings in a struct to CompiledQuery. But now the query looks like this: from dog in ctx.DogSet.Include(include1).Include(include2).Include(include3) .Include(include4).Include(include5).Include(include6) .[...].Include(include19).Include(include20) where dog.ID == id select dog which is awful as well. Ok, then, but wait a minute. Can't we return an ObjectQuery< with CompiledQuery? Then set the includes on that? Well, that what I would have thought so as well: static readonly Func<Entities, int, ObjectQuery<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, ObjectQuery<Dog>>((ctx, id) => (ObjectQuery<Dog>)(from dog in ctx.DogSet where dog.ID == id select dog)); public Dog GetDog( int id, string include ) { ObjectQuery<Dog> oQuery = query_GetDog(id); oQuery = oQuery.IncludeMany(include); return oQuery.FirstOrDefault; } That should have worked, except that when you call IncludeMany (or Include, Where, OrderBy...) you invalidate the cached compiled query because it is an entirely new one now! So, the expression tree needs to be reparsed and you get that performance hit again. So what is the solution? You simply cannot use CompiledQueries with parametrized Includes. Use EntitySQL instead. This doesn't mean that there aren't uses for CompiledQueries. It is great for localized queries that will always be called in the same context. Ideally CompiledQuery should always be used because the syntax is checked at compile time, but due to limitation, that's not possible. An example of use would be: you may want to have a page that queries which two dogs have the same favorite food, which is a bit narrow for a BusinessLayer function, so you put it in your page and know exactly what type of includes are required. Passing more than 3 parameters to a CompiledQuery Func is limited to 5 parameters, of which the last one is the return type and the first one is your Entities object from the model. So that leaves you with 3 parameters. A pitance, but it can be improved on very easily. public struct MyParams { public string param1; public int param2; public DateTime param3; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where dog.Age == myParams.param2 && dog.Name == myParams.param1 and dog.BirthDate > myParams.param3 select dog); public List<Dog> GetSomeDogs( int age, string Name, DateTime birthDate ) { MyParams myParams = new MyParams(); myParams.param1 = name; myParams.param2 = age; myParams.param3 = birthDate; return query_GetDog(YourContext,myParams).ToList(); } Return Types (this does not apply to EntitySQL queries as they aren't compiled at the same time during execution as the CompiledQuery method) Working with Linq, you usually don't force the execution of the query until the very last moment, in case some other functions downstream wants to change the query in some way: static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public IEnumerable<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name); } public void DataBindStuff() { IEnumerable<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } What is going to happen here? By still playing with the original ObjectQuery (that is the actual return type of the Linq statement, which implements IEnumerable), it will invalidate the compiled query and be force to re-parse. So, the rule of thumb is to return a List< of objects instead. static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public List<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name).ToList(); //<== change here } public void DataBindStuff() { List<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } When you call ToList(), the query gets executed as per the compiled query and then, later, the OrderBy is executed against the objects in memory. It may be a little bit slower, but I'm not even sure. One sure thing is that you have no worries about mis-handling the ObjectQuery and invalidating the compiled query plan. Once again, that is not a blanket statement. ToList() is a defensive programming trick, but if you have a valid reason not to use ToList(), go ahead. There are many cases in which you would want to refine the query before executing it. Performance What is the performance impact of compiling a query? It can actually be fairly large. A rule of thumb is that compiling and caching the query for reuse takes at least double the time of simply executing it without caching. For complex queries (read inherirante), I have seen upwards to 10 seconds. So, the first time a pre-compiled query gets called, you get a performance hit. After that first hit, performance is noticeably better than the same non-pre-compiled query. Practically the same as Linq2Sql When you load a page with pre-compiled queries the first time you will get a hit. It will load in maybe 5-15 seconds (obviously more than one pre-compiled queries will end up being called), while subsequent loads will take less than 300ms. Dramatic difference, and it is up to you to decide if it is ok for your first user to take a hit or you want a script to call your pages to force a compilation of the queries. Can this query be cached? { Dog dog = from dog in YourContext.DogSet where dog.ID == id select dog; } No, ad-hoc Linq queries are not cached and you will incur the cost of generating the tree every single time you call it. Parametrized Queries Most search capabilities involve heavily parametrized queries. There are even libraries available that will let you build a parametrized query out of lamba expressions. The problem is that you cannot use pre-compiled queries with those. One way around that is to map out all the possible criteria in the query and flag which one you want to use: public struct MyParams { public string name; public bool checkName; public int age; public bool checkAge; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where (myParams.checkAge == true && dog.Age == myParams.age) && (myParams.checkName == true && dog.Name == myParams.name ) select dog); protected List<Dog> GetSomeDogs() { MyParams myParams = new MyParams(); myParams.name = "Bud"; myParams.checkName = true; myParams.age = 0; myParams.checkAge = false; return query_GetDog(YourContext,myParams).ToList(); } The advantage here is that you get all the benifits of a pre-compiled quert. The disadvantages are that you most likely will end up with a where clause that is pretty difficult to maintain, that you will incur a bigger penalty for pre-compiling the query and that each query you run is not as efficient as it could be (particularly with joins thrown in). Another way is to build an EntitySQL query piece by piece, like we all did with SQL. protected List<Dod> GetSomeDogs( string name, int age) { string query = "select value dog from Entities.DogSet where 1 = 1 "; if( !String.IsNullOrEmpty(name) ) query = query + " and dog.Name == @Name "; if( age > 0 ) query = query + " and dog.Age == @Age "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); if( !String.IsNullOrEmpty(name) ) oQuery.Parameters.Add( new ObjectParameter( "Name", name ) ); if( age > 0 ) oQuery.Parameters.Add( new ObjectParameter( "Age", age ) ); return oQuery.ToList(); } Here the problems are: - there is no syntax checking during compilation - each different combination of parameters generate a different query which will need to be pre-compiled when it is first run. In this case, there are only 4 different possible queries (no params, age-only, name-only and both params), but you can see that there can be way more with a normal world search. - Noone likes to concatenate strings! Another option is to query a large subset of the data and then narrow it down in memory. This is particularly useful if you are working with a definite subset of the data, like all the dogs in a city. You know there are a lot but you also know there aren't that many... so your CityDog search page can load all the dogs for the city in memory, which is a single pre-compiled query and then refine the results protected List<Dod> GetSomeDogs( string name, int age, string city) { string query = "select value dog from Entities.DogSet where dog.Owner.Address.City == @City "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); oQuery.Parameters.Add( new ObjectParameter( "City", city ) ); List<Dog> dogs = oQuery.ToList(); if( !String.IsNullOrEmpty(name) ) dogs = dogs.Where( it => it.Name == name ); if( age > 0 ) dogs = dogs.Where( it => it.Age == age ); return dogs; } It is particularly useful when you start displaying all the data then allow for filtering. Problems: - Could lead to serious data transfer if you are not careful about your subset. - You can only filter on the data that you returned. It means that if you don't return the Dog.Owner association, you will not be able to filter on the Dog.Owner.Name So what is the best solution? There isn't any. You need to pick the solution that works best for you and your problem: - Use lambda-based query building when you don't care about pre-compiling your queries. - Use fully-defined pre-compiled Linq query when your object structure is not too complex. - Use EntitySQL/string concatenation when the structure could be complex and when the possible number of different resulting queries are small (which means fewer pre-compilation hits). - Use in-memory filtering when you are working with a smallish subset of the data or when you had to fetch all of the data on the data at first anyway (if the performance is fine with all the data, then filtering in memory will not cause any time to be spent in the db). Singleton access The best way to deal with your context and entities accross all your pages is to use the singleton pattern: public sealed class YourContext { private const string instanceKey = "On3GoModelKey"; YourContext(){} public static YourEntities Instance { get { HttpContext context = HttpContext.Current; if( context == null ) return Nested.instance; if (context.Items[instanceKey] == null) { On3GoEntities entity = new On3GoEntities(); context.Items[instanceKey] = entity; } return (YourEntities)context.Items[instanceKey]; } } class Nested { // Explicit static constructor to tell C# compiler // not to mark type as beforefieldinit static Nested() { } internal static readonly YourEntities instance = new YourEntities(); } } NoTracking, is it worth it? When executing a query, you can tell the framework to track the objects it will return or not. What does it mean? With tracking enabled (the default option), the framework will track what is going on with the object (has it been modified? Created? Deleted?) and will also link objects together, when further queries are made from the database, which is what is of interest here. For example, lets assume that Dog with ID == 2 has an owner which ID == 10. Dog dog = (from dog in YourContext.DogSet where dog.ID == 2 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Person owner = (from o in YourContext.PersonSet where o.ID == 10 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == true; If we were to do the same with no tracking, the result would be different. ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog = oDogQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>) (from o in YourContext.PersonSet where o.ID == 10 select o); oPersonQuery.MergeOption = MergeOption.NoTracking; Owner owner = oPersonQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Tracking is very useful and in a perfect world without performance issue, it would always be on. But in this world, there is a price for it, in terms of performance. So, should you use NoTracking to speed things up? It depends on what you are planning to use the data for. Is there any chance that the data your query with NoTracking can be used to make update/insert/delete in the database? If so, don't use NoTracking because associations are not tracked and will causes exceptions to be thrown. In a page where there are absolutly no updates to the database, you can use NoTracking. Mixing tracking and NoTracking is possible, but it requires you to be extra careful with updates/inserts/deletes. The problem is that if you mix then you risk having the framework trying to Attach() a NoTracking object to the context where another copy of the same object exist with tracking on. Basicly, what I am saying is that Dog dog1 = (from dog in YourContext.DogSet where dog.ID == 2).FirstOrDefault(); ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog2 = oDogQuery.FirstOrDefault(); dog1 and dog2 are 2 different objects, one tracked and one not. Using the detached object in an update/insert will force an Attach() that will say "Wait a minute, I do already have an object here with the same database key. Fail". And when you Attach() one object, all of its hierarchy gets attached as well, causing problems everywhere. Be extra careful. How much faster is it with NoTracking It depends on the queries. Some are much more succeptible to tracking than other. I don't have a fast an easy rule for it, but it helps. So I should use NoTracking everywhere then? Not exactly. There are some advantages to tracking object. The first one is that the object is cached, so subsequent call for that object will not hit the database. That cache is only valid for the lifetime of the YourEntities object, which, if you use the singleton code above, is the same as the page lifetime. One page request == one YourEntity object. So for multiple calls for the same object, it will load only once per page request. (Other caching mechanism could extend that). What happens when you are using NoTracking and try to load the same object multiple times? The database will be queried each time, so there is an impact there. How often do/should you call for the same object during a single page request? As little as possible of course, but it does happens. Also remember the piece above about having the associations connected automatically for your? You don't have that with NoTracking, so if you load your data in multiple batches, you will not have a link to between them: ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>)(from dog in YourContext.DogSet select dog); oDogQuery.MergeOption = MergeOption.NoTracking; List<Dog> dogs = oDogQuery.ToList(); ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>)(from o in YourContext.PersonSet select o); oPersonQuery.MergeOption = MergeOption.NoTracking; List<Person> owners = oPersonQuery.ToList(); In this case, no dog will have its .Owner property set. Some things to keep in mind when you are trying to optimize the performance. No lazy loading, what am I to do? This can be seen as a blessing in disguise. Of course it is annoying to load everything manually. However, it decreases the number of calls to the db and forces you to think about when you should load data. The more you can load in one database call the better. That was always true, but it is enforced now with this 'feature' of EF. Of course, you can call if( !ObjectReference.IsLoaded ) ObjectReference.Load(); if you want to, but a better practice is to force the framework to load the objects you know you will need in one shot. This is where the discussion about parametrized Includes begins to make sense. Lets say you have you Dog object public class Dog { public Dog Get(int id) { return YourContext.DogSet.FirstOrDefault(it => it.ID == id ); } } This is the type of function you work with all the time. It gets called from all over the place and once you have that Dog object, you will do very different things to it in different functions. First, it should be pre-compiled, because you will call that very often. Second, each different pages will want to have access to a different subset of the Dog data. Some will want the Owner, some the FavoriteToy, etc. Of course, you could call Load() for each reference you need anytime you need one. But that will generate a call to the database each time. Bad idea. So instead, each page will ask for the data it wants to see when it first request for the Dog object: static public Dog Get(int id) { return GetDog(entity,"");} static public Dog Get(int id, string includePath) { string query = "select value o " + " from YourEntities.DogSet as o " +

    Read the article

  • Developing Schema Compare for Oracle (Part 5): Query Snapshots

    - by Simon Cooper
    If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug. Problem 1: Debugging users' bug reports When we started the Schema Compare project, we knew we were going to get problems with users' databases - configurations we hadn't considered, features that weren't installed, unicode issues, wierd dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers issues and sort out what strange schema data Oracle was returning. Problem 2: Test execution time Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests would take hours and hours to run, but the test servers would be hammered senseless. The solution To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database. This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query is needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time. They also allow us to easily debug a customers problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data. Query snapshots implementation However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects - what database you're comparing to affects what objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just populating a single database. Furthermore, although the code population queries (eg querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being a deterministic algorithm - given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly. Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier. Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.

    Read the article

  • Python virtualenv questions

    - by orokusaki
    I'm using VirtualEnv on Windows XP. I'm wondering if I have my brain wrapped around it correctly. I ran virtualenv ENV and it created C:\WINDOWS\system32\ENV. I then changed my PATH variable to include C:\WINDOWS\system32\ENV\Scripts instead of C:\Python27\Scripts. Then, I checked out Django into C:\WINDOWS\system32\ENV\Lib\site-packages\django-trunk, updated my PYTHON_PATH variable to point the new Django directory, and continued to easy_install other things (which of course go into my new C:\WINDOWS\system32\ENV\Lib\site-packages directory). I understand why I should use VirtualEnv so I can run multiple versions of Django, and other libraries on the same machine, but does this mean that to switch between environments I have to basically change my PATH and PYTHON_PATH variable? So, I go from developing one Django project which uses Django 1.2 in an environment called ENV and then change my PATH and such so that I can use an environment called ENV2 which has the dev version of Django? Is that basically it, or is there some better way to automatically do all this (I could update my path in Python code, but that would require me to write machine-specific code in my application)? Also, how does this process compare to using VirtualEnv on Linux (I'm quite the beginner at Linux).

    Read the article

  • When should I use Areas in TFS instead of Team Projects

    - by Martin Hinshelwood
    Well, it depends…. If you are a small company that creates a finite number of internal projects then you will find it easier to create a single project for each of your products and have TFS do the heavy lifting with reporting, SharePoint sites and Version Control. But what if you are not… Update 9th March 2010 Michael Fourie gave me some feedback which I have integrated. Ed Blankenship via @edblankenship offered encouragement and a nice quote. Ewald Hofman gave me a couple of Cons, and maybe a few more soon. Ewald’s company, Avanade, currently uses Areas, but it looks like the manual management is getting too much and the project is getting cluttered. What if you are likely to have hundreds of projects, possibly with a multitude of internal and external projects? You might have 1 project for a customer or 10. This is the situation that most consultancies find themselves in and thus they need a more sustainable and maintainable option. What I am advocating is that we should have 1 “Team Project” per customer, and use areas to create “sub projects” within that single “Team Project”. "What you describe is what we generally do internally and what we recommend. We make very heavy use of area path to categorize the work within a larger project." - Brian Harry, Microsoft Technical Fellow & Product Unit Manager for Team Foundation Server   "We tend to use areas to segregate multiple projects in the same team project and it works well." - Tiago Pascoal, Visual Studio ALM MVP   "In general, I believe this approach provides consistency [to multi-product engagements] and lowers the administration and maintenance costs. All good." - Michael Fourie, Visual Studio ALM MVP   “@MrHinsh BTW, I'm very much a fan of very large, if not huge, team projects in TFS. Just FYI :) Use Areas & Iterations.” Ed Blankenship, Visual Studio ALM MVP   This would mean that SSW would have a single Team Project called “SSW” that contains all of our internal projects and consequently all of the Areas and Iteration move down one hierarchy to accommodate this. Where we would have had “\SSW\Sprint 1” we now have “\SSW\SqlDeploy\Sprint1” with “SqlDeploy” being our internal project. At the moment SSW has over 70 internal projects and more than 170 total projects in TFS. This method has long term benefits that help to simplify the support model for companies that often have limited internal support time and many projects. But, there are implications as TFS does not provide this model “out-of-the-box”. These implications stretch across Areas, Iterations, Queries, Project Portal and Version Control. Michael made a good comment, he said: I agree with your approach, assuming that in a multi-product engagement with a client, they are happy to adopt the same process template across all products. If they are not, then it’ll either be easy to convince them or there is a valid reason for having a different template - Michael Fourie, Visual Studio ALM MVP   At SSW we have a standard template that we use and this is applied across the board, to all of our projects. We even apply any changes to the core process template to all of our existing projects as well. If you have multiple projects for the same clients on multiple templates and you want to keep it that way, then this approach will not work for you. However, if you want to standardise as we have at SSW then this approach may benefit you as well. Implications around Areas Areas should be used for topological classification/isolation of work items. You can think of this as architecture areas, organisational areas or even the main features of your application. In our scenario there is an additional top level item that represents the Project / Product that we want to chop our Team Project into. Figure: Creating a sub area to represent a product/project is easy. <teamproject> <teamproject>\<Functional Area/module whatever> Becomes: <teamproject> <teamproject>\<ProjectName>\ <teamproject>\<ProjectName>\<Functional Area/module whatever> Implications around Iterations Iterations should be used for chronological classification/isolation of work items. This could include isolated time boxes, milestones or release timelines and really depends on the logical flow of your project or projects. Due to the new level in Area we need to add the same level to Iteration. This is primarily because it is unlikely that the sprints in each of your projects/products will start and end at the same time. This is just a reality of managing multiple projects. Figure: Adding the same Area value to Iteration as the top level item adds flexibility to Iteration. <teamproject>\Sprint 1 Or <teamproject>\Release 1\Sprint 1 Becomes: <teamproject>\<ProjectName>\Sprint 1 Or <teamproject>\<ProjectName>\Release 1\Sprint 1 Implications around Queries Queries are used to filter your work items based on a specified level of granularity. There are a number of queries that are built into a project created using the MSF Agile 5.0 template, but we now have multiple projects and it would be a pain to have to edit all of the work items every time we changed project, and that would only allow one team to work on one project at a time.   Figure: The Queries that are created in a normal MSF Agile 5.0 project do not quite suit our new needs. In order for project contributors to be able to query based on their project we need a couple of things. The first thing I did was to create an “_Area Template” folder that has a copy of the project layout with all the queries setup to filter based on the “_Area Template” Area and the “_Sprint template” you can see in the Area and Iteration views. Figure: The template is currently easily drag and drop, but you then need to edit the queries to point at the right Area and Iteration. This needs a tool. I then created an “Areas” folder to hold all of the area specific queries. So, when you go to create a new TFS Sub-Project you just drag “_Area Template” while holding “Ctrl” and drop it onto “Areas”. There is a little setup here. That said I managed it in around 10 minutes which is not so bad, and I can imagine it being quite easy to build a tool to create these queries Figure: These new queries can be configured in around 10 minutes, which includes setting up the Area and Iteration as well. Version Control What about your source code? Well, that is the easiest of the lot. Just create a sub folder for each of your projects/products.   Figure: Creating sub folders in source control is easy as “Right click | Create new folder”. <teamproject>\DEV\Main\ Becomes: <teamproject>\<ProjectName>\DEV\Main\ Conclusion I think it is up to each company to make a call on how you want to configure your Team Projects and it depends completely on how many projects/products you are going to have for each customer including yourself. If we decide to utilise this route it will require some configuration to get our 170+ projects into this format, and I will probably be writing some tools to help. Pros You only have one project to upgrade when a process template changes – After going through an upgrade of over 170 project prior to the changes in the RC I can tell you that that many projects is no fun. Standardises your Process Template – You will always have the same Process implementation across projects/products without exception You get tighter control over the permissions – Yes, you can do this on a standard Team Project, but it gets a lot easier with practice. You can “move” work items from one “product” to another – Have we not always wanted to do that. You can rename your projects – Wahoo: everyone wants to do this, now you can. One set of Reporting Services reports to manage – You set an area and iteration to run reports anyway, so you may as well set both. Simplified Check-In Policies– There is only one set of check-in policies per client. This simplifies administration of policies. Simplified Alerts – As alerts are applied across multiple projects this simplifies your alert rules as per client. Cons All of these cons could be mitigated by a custom tool that helps automate creation of “Sub-projects” within Team Projects. This custom tool could create areas, Iteration, permissions, SharePoint and queries. It just does not exist yet :) You need to configure the Areas and Iterations You need to configure the permissions You may need to configure sub sites for SharePoint (depends on your requirement) – If you have two projects/products in the same Team Project then you will not see the burn down for each one out-of-the-box, but rather a cumulative for the Team Project. This is not really that much of a problem as you would have to configure your burndown graphs for your current iteration anyway. note: When you create a sub site to a TFS linked portal it will inherit the settings of its parent site :) This is fantastic as it means that you can easily create sub sites and then set the Area and Iteration path in each of the reports to be the correct one. Every team wants their own customization (via Ewald Hofman) - small teams of 2 persons against teams of 30 – or even outsourcing – need their own process, you cannot allow that because everybody gets the same work item types. note: Luckily at SSW this is not a problem as our template is standardised across all projects and customers. Large list of builds (via Ewald Hofman) – As the build list in Team Explorer is just a flat list it can get very cluttered. note: I would mitigate this by removing any build that has not been run in over 30 days. The build template and workflow will still be available in version control, but it will clean the list. Feedback Now that I have explained this method, what do you think? What other pros and cons can you see? What do you think of this approach? Will you be using it? What tools would you like to support you?   Technorati Tags: Visual Studio ALM,TFS Administration,TFS,Team Foundation Server,Project Planning,TFS Customisation

    Read the article

  • 400 error with nginx subdomains over https

    - by aquavitae
    Not sure what I'm doing wrong, but I'm trying to get gunicorn/django through nginx using only https. Here is my nginx configuration: upstream app_server { server unix:/srv/django/app/run/gunicorn.sock fail_timeout=0; } server { listen 80; return 301 https://$host$request_uri; } server { listen 443; server_name app.mydomain.com; ssl on; ssl_certificate /etc/nginx/ssl/nginx.crt; ssl_certificate_key /etc/nginx/ssl/nginx.key; client_max_body_size 4G; access_log /srv/django/app/logs/nginx-access.log; error_log /srv/django/app/logs/nginx-error.log; location /static/ { alias /srv/django/app/data/static/; } location /media/ { alias /wrv/django/app/data/media/; } location / { proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto https; proxy_set_header Host $http_host; proxy_pass http://app_server; } } I get a 400 error on app.mydomain.com, but the app is published on mydomain.com. Is there an error in my configuration?

    Read the article

< Previous Page | 143 144 145 146 147 148 149 150 151 152 153 154  | Next Page >