Search Results

Search found 4788 results on 192 pages for 'adhoc queries'.


  • SQL Server Optimizer Malfunction?

    - by Tony Davis
    There was a sharp intake of breath from the audience when Adam Machanic declared the SQL Server optimizer to be essentially "stuck in 1997". It was during his fascinating "Query Tuning Mastery: Manhandling Parallelism" session at the recent PASS SQL Summit. Paraphrasing somewhat, Adam (blog | @AdamMachanic) offered a convincing argument that the optimizer often delivers flawed plans based on assumptions that are no longer valid with today’s hardware. In 1997, when Microsoft engineers re-designed the database engine for SQL Server 7.0, SQL Server got its initial implementation of a cost-based optimizer. Up to SQL Server 2000, the developer often had to deploy a steady stream of hints in SQL statements to combat the occasionally wilful plan choices made by the optimizer. However, with each successive release, the optimizer has evolved and improved in its decision-making. It is still prone to the occasional stumble when we tackle difficult problems, join large numbers of tables, perform complex aggregations, and so on, but for most of us, most of the time, the optimizer purrs along efficiently in the background. Adam, however, challenged further any assumption that the current optimizer is competent at providing the most efficient plans for our more complex analytical queries, and in particular of offering up correctly parallelized plans. He painted a picture of a present where complex analytical queries have become ever more prevalent; where disk IO is ever faster so that reads from disk come into buffer cache faster than ever; where the improving RAM-to-data ratio means that we have a better chance of finding our data in cache. Most importantly, we have more CPUs at our disposal than ever before. To get these queries to perform, we not only need to have the right indexes, but also to be able to split the data up into subsets and spread its processing evenly across all these available CPUs. 
Improvements such as support for ColumnStore indexes are taking things in the right direction, but, unfortunately, deficiencies in the current Optimizer mean that SQL Server is yet to be able to exploit properly all those extra CPUs. Adam’s contention was that the current optimizer uses essentially the same costing model for many of its core operations as it did back in the days of SQL Server 7, based on assumptions that are no longer valid. One example he gave was a "slow disk" bias that may have been valid back in 1997 but certainly is not on modern disk systems. Essentially, the optimizer assesses the relative cost of serial versus parallel plans based on the assumption that there is no IO cost benefit from parallelization, only CPU. It assumes that a single request will saturate the IO channel, and so a query would not run any faster if we parallelized IO because the disk system simply wouldn’t be able to handle the extra pressure. As such, the optimizer often decides that a serial plan is lower cost, often in cases where a parallel plan would improve performance dramatically. It was challenging and thought provoking stuff, as were his techniques for driving parallelism through query logic based on subsets of rows that define the "grain" of the query. I highly recommend you catch the session if you missed it. I’m interested to hear though, when and how often people feel the force of the optimizer’s shortcomings. Barring mistakes, such as stale statistics, how often do you feel the Optimizer fails to find the plan you think it should, and what are the most common causes? Is it fighting to induce it toward parallelism? Combating unexpected plans, arising from table partitioning? Something altogether more prosaic? Cheers, Tony.

    Read the article

  • The five steps of business intelligence adoption: where are you?

    - by Red Gate Software BI Tools Team
    When I was in Orlando and New York last month, I spoke to a lot of business intelligence users. What they told me suggested a path of BI adoption. The user’s place on the path depends on the size and sophistication of their organisation. Step 1: A company with a database of customer transactions will often want to examine particular data, like revenue and unit sales over the last period for each product and territory. To do this, they probably use simple SQL queries or stored procedures to produce data on demand. Step 2: The results from step one are saved in an Excel document, so business users can analyse them with filters or pivot tables. Alternatively, SQL Server Reporting Services (SSRS) might be used to generate a report of the SQL query for display on an intranet page. Step 3: If these queries are run frequently, or business users want to explore data from multiple sources more freely, it may become necessary to create a new database structured for analysis rather than CRUD (create, retrieve, update, and delete). For example, data from more than one system — plus external information — may be incorporated into a data warehouse. This can become ‘one source of truth’ for the business’s operational activities. The warehouse will probably have a simple ‘star’ schema, with fact tables representing the measures to be analysed (e.g. unit sales, revenue) and dimension tables defining how this data is aggregated (e.g. by time, region or product). Reports can be generated from the warehouse with Excel, SSRS or other tools. Step 4: Not too long ago, Microsoft introduced an Excel plug-in, PowerPivot, which allows users to bring larger volumes of data into Excel documents and create links between multiple tables.  These BISM Tabular documents can be created by the database owners or other expert Excel users and viewed by anyone with Excel PowerPivot. 
Sometimes, business users may use PowerPivot to create reports directly from the primary database, bypassing the need for a data warehouse. This can introduce problems when there are misunderstandings of the database structure or no single ‘source of truth’ for key data. Step 5: Steps three or four are often enough to satisfy business intelligence needs, especially if users are sophisticated enough to work with the warehouse in Excel or SSRS. However, sometimes the relationships between data are too complex or the queries which aggregate across periods, regions etc. are too slow. In these cases, it can be necessary to formalise how the data is analysed and pre-build some of the aggregations. To do this, a business intelligence professional will typically use SQL Server Analysis Services (SSAS) to create a multidimensional model, or “cube”, that more simply represents key measures and aggregates them across specified dimensions. Step five is where our tool, SSAS Compare, becomes useful, as it helps review and deploy changes from development to production. For us at Red Gate, the primary value of SSAS Compare is to establish a dialog with BI users, so we can develop a portfolio of products that support creation and deployment across a range of report and model types. For example, PowerPivot and the new BISM Tabular model create a potential customer base for tools that extend beyond BI professionals. We’re interested in learning where people are in this story, so we’ve created a six-question survey to find out. Whether you’re at step one or step five, we’d love to know how you use BI so we can decide how to build tools that solve your problems. So if you have sixty seconds to spare, tell us on the survey!

    Read the article

  • How to get decent MySQL driver performance in Ruby

    - by Zombies
    I notice that I am getting very poor performance for inserts, queries, or both. The queries themselves are basic and execute with no delay when run directly from mysql. The Ruby script that I wrote is single-threaded, so only one connection is used, and it is never closed unless the script is terminated. Pretty basic: I am just trying to insert a lot of rows. There is a look-up or two to get a surrogate key, or to check for duplicates, but the complexity is just O(n). Also, it isn't like there are millions of records, so again the queries themselves take no time to run. I am using: Ruby 1.9.1; gem/driver: ruby-mysql 2.9.2; MySQL 5.1.37-1ubuntu5.1 (all 32-bit versions on a 32-bit Ubuntu distro). I am getting about 1-2 inserts per second, which is pretty slow. I know a lot of people will suggest changing drivers, but that means I have some refactoring and testing to do. So I would really appreciate any help, but if you do recommend that, please at least say why (e.g. if you have used ruby-mysql x.x.x before and found another MySQL driver to be better). What I would like to know: How can I improve performance with ruby-mysql 2.9.2? If and only if I cannot do this with ruby-mysql 2.9.2, what should I do?
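
The post does not say what causes the slowdown, so the following is only a sketch of one pattern that is often worth checking: running the whole batch in a single transaction and caching surrogate-key look-ups in memory. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `categories` and `items` tables are hypothetical.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, category_id INTEGER, value TEXT)"
    )
    return conn

def load_rows(conn, rows):
    """Insert (category, value) pairs, resolving each category to a surrogate key."""
    key_cache = {}  # category name -> surrogate key, so each name is looked up once
    with conn:  # one transaction for the whole batch: a single commit at the end
        for category, value in rows:
            if category not in key_cache:
                found = conn.execute(
                    "SELECT id FROM categories WHERE name = ?", (category,)
                ).fetchone()
                if found is None:
                    cur = conn.execute(
                        "INSERT INTO categories (name) VALUES (?)", (category,)
                    )
                    key_cache[category] = cur.lastrowid
                else:
                    key_cache[category] = found[0]
            conn.execute(
                "INSERT INTO items (category_id, value) VALUES (?, ?)",
                (key_cache[category], value),
            )
```

Whether this helps in the setup described depends on where the time actually goes (per-statement commits, network round trips, or the driver itself), which would need to be measured.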

    Read the article

  • Lazy loading of Blob properties of one class

    - by Khosro
    Hi, my class contains "summary" and "title" properties, which are Blobs, along with other properties. Code (part of the class):

        public class News extends BaseEntity {
            @Lob
            @Basic(fetch = FetchType.LAZY)
            public String getSummary() { return summary; }

            @Lob
            @Basic(fetch = FetchType.LAZY)
            public String getTitle() { return title; }

            @Temporal(TemporalType.TIMESTAMP)
            public Date getPublishDate() { return publishDate; }
        }

    I instrument this class for lazy loading of the Blob properties using "org.hibernate.tool.instrument.javassist.InstrumentTask". When I write this code to retrieve only the summary of a news item:

        newsDAO.findByid(1L).getSummary();

    Hibernate generates these queries:

        Hibernate: select news0_.id as id1_, news0_.entityVersion as entityVe2_1_, news0_.publishDate as publish15_1_, news0_.url as url1_ from News news0_
        Hibernate: select news_.summary as summary1_, news_.title as title1_ from News news_ where news_.id=?

    I have two questions: 1. I only want to retrieve the "summary" property, not the "title" property, but the Hibernate queries show that it also retrieves "title". Why does this happen (I want lazy loading per property)? It seems that when I load one of the Blob properties, Hibernate loads all of them. Why? (This is my main question.) 2. Why does Hibernate generate two queries to retrieve only the "summary" property of a news item? Khosro.

    Read the article

  • Alternatives to decompiling MS Access MDE files

    - by booyaa
    I've been tasked with finding a suitable tool to decompile MDE files. The MDEs were created by staff who have since left (familiar story, eh?) and we do not have access to the original MDB files. The reason we need access to the original code is that the data source is changing (the backend as well as some of the tables and queries) and we need a way to update the queries. An example of a change: in a SELECT statement, the WHERE clause now looks for zero as a string ("0") rather than as an integer. I'm aware that unless you use the services of people like EverythingAccess.com it's unlikely you will ever get the source code back. My main query is to ask for alternative methods to decompiling code. An example of the kind of method I'm thinking about is to spy on the traffic between the app and the ODBC DSN using tcpdump. I might then be able to write code to translate the data source queries between the old and new systems. Ideally I'd prefer a solution that is application-centric rather than one that analyses all network traffic. I should add one caveat: no doubt most of you are thinking the best solution is to rewrite the code, based on its perceived functionality. This is the option we're not considering (at the moment).

    Read the article

  • Help with fql.multiQuery

    - by Daniel Schaffer
    I'm playing around with the Facebook API's fql.multiQuery method. I'm just using the API Test Console, and trying to get a successful response but can't seem to figure out exactly what it wants. Here's the text I'm entering into the "queries" field: {"tags" : "select subject from photo_tag where subject != 601599551 and pid in ( select pid from photo_tag where subject = 601599551 ) and subject in ( select uid2 from friend where uid1 = 601599551 )", "foo" : "select uid from user where uid = 601599551"} All it'll give me is a queries parameter: array expected. error. I've also tried just about every permutation I could think of involving wrapping the name/query pairs in their own curly braces, adding brackets, adding whitespace, removing whitespace in case it didn't want an associative array (for those watching the edits, I just found out about these wonderful things now... oy), all to no avail. Is there something painfully obvious I'm missing here, or do I need to make like Chuck Norris Jon Skeet and simply will it to do my bidding? Update: A note to anyone finding this question now: The fql.multiquery test console appears to be broken. You can test your query by clicking on the generated url in the test console and manually adding the "queries" parameter into the querystring.
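
The update suggests adding the "queries" parameter to the querystring by hand. A minimal sketch of that encoding step, assuming the parameter is expected to carry the name/query pairs as a JSON object (the post does not confirm which format the API actually accepts), and using a hypothetical helper name `build_queries_param`:

```python
import json
from urllib.parse import parse_qs, urlencode

def build_queries_param(named_queries):
    """Serialize a {name: fql} mapping to JSON and URL-encode it as 'queries=...'."""
    return urlencode({"queries": json.dumps(named_queries)})

# Example: produce a fragment that can be appended to the generated URL.
fragment = build_queries_param({"foo": "select uid from user where uid = 601599551"})
```

URL-encoding matters here because the FQL text contains spaces, quotes, and `=` characters that would otherwise break the querystring.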

    Read the article

  • Several Small, Specific, MySQL Query Cache Questions

    - by Robbie
    I've looked all over the web and in the questions asked here about MySQL caching, and most of them seem very non-specific about a couple of questions that I have about performance and MySQL query caching. Specifically I want answers to these questions; assume for all questions that I have the query cache enabled and it is of type 2, or "DEMAND":
      1. Is the query cache per table, per database, or per server? Meaning, if I have the cache size set to X and have T tables and D databases, will I be caching TX, DX, or X amount of data?
      2. If I have table T1, which I regularly use the SQL_CACHE hint on for SELECT queries, and table T2, which I never do, when I query T2 with a SELECT query will it check through the cache first before performing the query? *Note: I don't want to use SQL_NO_CACHE for all T2 queries.*
      3. Assume the same situation as in question 2. If I alter (INSERT, DELETE) table T2, will any processing be done on the cache?
      4. For answers to 2 and 3, is this processing time negligible if T2 is constantly being altered and is the target of a majority of my SELECT queries?

    Read the article

  • dynamically horizontal scalable key value store

    - by Zubair
    Hi, is there a key-value store that will give me the following:
      - Allow me to simply add and remove nodes, and redistribute the data automatically
      - Allow me to remove nodes and still have 2 extra data nodes to provide redundancy
      - Allow me to store text or images up to 1GB in size
      - Can store small size data up to 100TB of data
      - Fast (so will allow queries to be performed on top of it)
      - Make all this transparent to the client
      - Works on Ubuntu/FreeBSD or Mac
      - Free or open source
    I basically want something I can use a "single", and not have to worry about having memcached, a db, and several storage components, so yes, I do want a database "silver bullet", you could say. Thanks, Zubair
    Answers so far:
      - MogileFS on top of BackBlaze: as far as I can see this is just a filesystem, and after some research it only seems to be appropriate for large image files
      - Tokyo Tyrant: needs LightCloud. This doesn't auto-scale as you add new nodes. I did look into this and it seems it is very fast for queries which fit onto a single node, though
      - Riak: this is one I am looking into myself, but I don't have any results yet
      - Amazon S3: is anyone using this as their sole persistence layer in production? From what I have seen it seems to be used for storage of images, as complex queries are too expensive
      - @shaman suggested Cassandra: definitely one I am looking into
    So far it seems that there is no database or key-value store that fulfills the criteria I mentioned; not even after offering a bounty of 100 points did the question get answered!

    Read the article

  • Should we have a database independent SQL like query language in Django? [closed]

    - by Yugal Jindle
    Note: I know we already have the Django ORM, which keeps things database independent and converts to database-specific SQL queries. Once things start getting complicated, it is preferred to write raw SQL queries for better efficiency. When you write raw SQL queries, your code gets trapped with the database you are using. I also understand it is important to use the full power of your database, which cannot be achieved with the Django ORM alone. My question: Until I use a database-specific feature, why should one be trapped with the database? For instance: we have a query with multiple joins and we decide to write a raw SQL query. Now that makes my website Postgres specific, even though I have not used any Postgres-specific feature. I feel there should be some fake SQL language which can translate to any database's SQL dialect. Even Django's ORM could be built over it, so that if you go outside the ORM but use nothing database specific, you can still remain database independent. I asked the same question to Jacob Kaplan-Moss (in person): he advised me to stay with the database that I like and use its whole power, to which I agree. But my point was not that we should be database independent. My point is that we should be database independent until we use a database-specific feature. Please explain: why should there be a fake SQL layer over the actual SQL?

    Read the article

  • How best to use XPath with very large XML files in .NET?

    - by glenatron
    I need to do some processing on fairly large XML files (large here being potentially upwards of a gigabyte) in C#, including performing some complex XPath queries. The problem I have is that the standard way I would normally do this through the System.XML libraries likes to load the whole file into memory before it does anything with it, which can cause memory problems with files of this size. I don't need to be updating the files at all, just reading them and querying the data contained in them. Some of the XPath queries are quite involved and go across several levels of parent-child type relationship. I'm not sure whether this will affect the ability to use a stream reader rather than loading the data into memory as a block. One way I can see of making it work is to perform the simple analysis using a stream-based approach and perhaps wrap the XPath statements into XSLT transformations that I could run across the files afterward, although it seems a little convoluted. Alternatively, I know that there are some elements that the XPath queries will not run across, so I guess I could break the document up into a series of smaller fragments based on its original tree structure, which could perhaps be small enough to process in memory without causing too much havoc. I've tried to explain my objective here, so if I'm barking up totally the wrong tree in terms of general approach I'm sure you folks can set me right...
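
The fragment idea from the question can be sketched as follows. This is an illustration in Python rather than C#, using the standard library's `xml.etree.ElementTree.iterparse`, which supports only a limited XPath subset. It assumes the queries never cross the boundary of a repeating, non-nested element; the names `query_fragments` and `fragment_tag` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def query_fragments(source, fragment_tag, xpath):
    """Stream the document and run `xpath` inside each <fragment_tag> element."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == fragment_tag:
            # The fragment is fully parsed at its "end" event, so it can be
            # queried in memory like a small standalone tree.
            for match in elem.findall(xpath):
                yield match.text
            # Drop the fragment's children and text so memory use stays
            # roughly bounded by the size of one fragment.
            elem.clear()

sample = b"<root><rec><item><name>a</name></item></rec><rec><item><name>b</name></item></rec></root>"
names = list(query_fragments(io.BytesIO(sample), "rec", "./item/name"))
```

The trade-off is the one the question already anticipates: any query that needs to relate data across two fragments cannot be answered this way without keeping extra state between iterations.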

    Read the article

  • SQL Server insert performance

    - by Jose
    I have an insert query that gets generated like this:

        INSERT INTO InvoiceDetail (LegacyId,InvoiceId,DetailTypeId,Fee,FeeTax,Investigatorid,SalespersonId,CreateDate,CreatedById,IsChargeBack,Expense,RepoAgentId,PayeeName,ExpensePaymentId,AdjustDetailId) VALUES(1,1,2,1500.0000,0.0000,163,1002,'11/30/2001 12:00:00 AM',1116,0,550.0000,850,NULL,@ExpensePay1,NULL);
        DECLARE @InvDetail1 INT;
        SET @InvDetail1 = (SELECT @@IDENTITY);

    This query is generated for only 110K rows. It takes 30 minutes for all of these queries to execute. I checked the query plan and the largest % nodes are: a Clustered Index Insert at 57% query cost, which has a long XML that I don't want to post; and a Table Spool, which is 38% query cost:

        <RelOp AvgRowSize="35" EstimateCPU="5.01038E-05" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="80" Parallel="false" PhysicalOp="Table Spool" EstimatedTotalSubtreeCost="0.0466109">
          <OutputList>
            <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvoiceId" />
            <ColumnReference Database="[SkipPro]" Schema="[dbo]" Table="[InvoiceDetail]" Column="InvestigatorId" />
            <ColumnReference Column="Expr1054" />
            <ColumnReference Column="Expr1055" />
          </OutputList>
          <Spool PrimaryNodeId="3" />
        </RelOp>

    So my question is: what can I do to improve the speed of this thing? I already run ALTER TABLE TABLENAME NOCHECK CONSTRAINTS ALL before the queries and then ALTER TABLE TABLENAME NOCHECK CONSTRAINTS ALL after the queries, and that hardly shaved anything off the time. Note that I am running these queries in a .NET application that uses a SqlCommand object to send the query. I then tried to output the SQL commands to a file and execute it using sqlcmd, but I wasn't getting any updates on how it was doing, so I gave up on that. Any ideas or hints or help?

    Read the article

  • Can't return a List from a Compiled Query.

    - by Andrew
    I was speeding up my app by using compiled queries for queries which were getting hit over and over. I tried to implement it like this:

        Function Select(ByVal fk_id As Integer) As List(Of SomeEntity)
            Using db As New DataContext()
                db.ObjectTrackingEnabled = False
                Return CompiledSelect(db, fk_id)
            End Using
        End Function

        Shared CompiledSelect As Func(Of DataContext, Integer, List(Of SomeEntity)) = _
            CompiledQuery.Compile(Function(db As DataContext, fk_id As Integer) _
                (From u In db.SomeEntities _
                 Where u.SomeLinkedEntity.ID = fk_id _
                 Select u).ToList())

    This did not work and I got this error message:

        Type : System.ArgumentNullException, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
        Message : Value cannot be null. Parameter name: value

    However, when I changed my compiled query to return IQueryable instead of List, like so:

        Function Select(ByVal fk_id As Integer) As List(Of SomeEntity)
            Using db As New DataContext()
                db.ObjectTrackingEnabled = False
                Return CompiledSelect(db, fk_id).ToList()
            End Using
        End Function

        Shared CompiledSelect As Func(Of DataContext, Integer, IQueryable(Of SomeEntity)) = _
            CompiledQuery.Compile(Function(db As DataContext, fk_id As Integer) _
                From u In db.SomeEntities _
                Where u.SomeLinkedEntity.ID = fk_id _
                Select u)

    it worked fine. Can anyone shed any light as to why this is? BTW, compiled queries rock! They sped up my app by a factor of 2.

    Read the article

  • Lucene (.NET) Document structure and performance suggestions.

    - by Josh Handel
    Hello, I am indexing about 100M documents that consist of a few string identifiers and a hundred or so numeric terms. I won't be doing range queries, so I haven't dug too deep into NumericField, but I don't think it's the right choice here. My problem is that query performance degrades quickly when I start adding OR criteria to my query. All my queries are on specific numeric terms. So a document looks like StringField:[someString] and N DataField:[someNumber]. I then query it with something like DataField:((+1 +(2 3)) (+75 +(3 5 52)) (+99 +88 +(102 155 199))). Currently these queries take about 7 to 16 seconds to run on my laptop. I would like to make sure that's really the best they can do. I am open to suggestions on field structure and query structure :-). Thanks, Josh. PS: I have already read over all the other Lucene performance discussions on here, on the Lucene wiki, and at Lucid Imagination... I'm a bit further down the rabbit hole than that...

    Read the article

  • transactions and delete using fluent nhibernate

    - by Will I Am
    I am starting to play with (Fluent) NHibernate and I am wondering if someone can help with the following. I'm sure it's a total noob question. I want to do:

        delete from TABX where name = 'abc'

    where table TABX is defined as: ID int, name varchar(32), ... I built the code based on internet samples:

        using (ITransaction transaction = session.BeginTransaction())
        {
            IQuery query = session.CreateQuery("FROM TABX WHERE name = :uid")
                .SetString("uid", "abc");
            session.Delete(query.List<Person>()[0]);
            transaction.Commit();
        }

    but alas, it's generating two queries (one select and one delete). I want to do this in a single statement, as in my original SQL. What is the correct way of doing this? Also, I noticed that in most samples on the internet, people tend to always wrap all queries in transactions. Why is that? If I'm only running a single statement, that seems like overkill. Do people tend to just mindlessly cut and paste, or is there a reason beyond that? For example, in my query above, if I do manage to get it from two queries down to one, I should be able to remove the begin/commit transaction, no? If it matters, I'm using PostgreSQL for experimenting.
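
At the SQL level, the goal in this question is one parameterized DELETE instead of a SELECT followed by a DELETE. The sketch below shows only that target statement, using Python's built-in sqlite3 module as a stand-in for PostgreSQL; it does not show the NHibernate API, and the helper name `delete_by_name` is hypothetical.

```python
import sqlite3

def delete_by_name(conn, name):
    """Remove all matching rows with a single DELETE; return how many were removed."""
    with conn:  # commits on success, rolls back if the statement raises
        return conn.execute("DELETE FROM tabx WHERE name = ?", (name,)).rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabx (id INTEGER PRIMARY KEY, name VARCHAR(32))")
conn.executemany("INSERT INTO tabx (name) VALUES (?)", [("abc",), ("abc",), ("xyz",)])
removed = delete_by_name(conn, "abc")
```

Note that this deletes every matching row, whereas the code in the question deletes only the first entity returned by the query, so the two are equivalent only when `name` is unique.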

    Read the article

  • Setting Connection Parameters via ADO for MSSQL

    - by taspeotis
    Is it possible to set a connection parameter on a connection to SQL Server and have that variable persist throughout the life of the connection? The parameter must be usable by subsequent queries. We have some old Access reports that use a handful of VBScript functions in the SQL queries (let's call them GetStartDate and GetEndDate) that return global variables. Our application would set these before invoking the query, and then the queries can return information between date ranges specified in our application. We are looking at changing to a ReportViewer control running in local mode, but I don't see any convenient way to use these custom functions in straight T-SQL. I have two concept solutions (not tested yet), but I would like to know if there is a better way. Below is some pseudo code.
      1. Set all variables before running Recordset.OpenForward:
             Connection->Execute("SET @GetStartDate = ...");
             Connection->Execute("SET @GetEndDate = ...");
             // Repeat for all parameters
         Will these variables persist to later calls of Recordset->OpenForward? Can anything reset the variables aside from another SET/SELECT @variable statement?
      2. Create an ADOCommand "factory" that automatically adds parameters to each ADOCommand object I will use to execute SQL:
             // Command has previously been created
             ADOParameter *Parameter1 = Command->CreateParameter("GetStartDate");
             ADOParameter *Parameter2 = Command->CreateParameter("GetEndDate");
             // Set values and attach etc...
    What I would like to know is whether there is something like:
             Connection->SetParameter("GetStartDate", "20090101");
             Connection->SetParameter("GetEndDate", "20100101");
    where these persist for the lifetime of the connection, and the SQL can do something like @GetStartDate to access them. This may be exactly solution #1, if the variables persist throughout the lifetime of the connection.

    Read the article

  • How to query JDO persistent objects in unowned relationship model?

    - by Paul B
    Hello, I'm trying to migrate my app from PHP and an RDBMS (MySQL) to Google App Engine and am having a hard time figuring out the data model and relationships in JDO. In my current app I use a lot of JOIN queries like:

        SELECT users.name, comments.comment FROM users, comments WHERE users.user_id = comments.user_id AND users.email = '[email protected]'

    As I understand it, JOIN queries are not supported in this way, so the only(?) way to store data is using unowned relationships and "foreign" keys. There is documentation regarding that, but no useful examples. So far I have something like this:

        @PersistenceCapable
        public class Users {
            @PrimaryKey
            @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
            private Key key;

            @Persistent
            private String name;

            @Persistent
            private String email;

            @Persistent
            private Set<Key> commentKeys;

            // Accessors...
        }

        @PersistenceCapable
        public class Comments {
            @PrimaryKey
            @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
            private Key key;

            @Persistent
            private String comment;

            @Persistent
            private Date commentDate;

            @Persistent
            private Key userKey;

            // Accessors...
        }

    So, how do I get a list with the commenter's name, comment and date in one query? I see how I probably could get away with 3 queries, but that seems wrong and would create unnecessary overhead. Please help me out with some code examples. -- Paul.
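
Without JOIN support, the join in this question has to be assembled in application code from key look-ups. The sketch below models that with plain in-memory Python objects rather than the JDO or App Engine APIs, so every name in it is hypothetical; it only illustrates that two filtered look-ups (users by email, then comments by user key) are enough to produce the name, comment and date triples.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class User:
    key: int
    name: str
    email: str

@dataclass(frozen=True)
class Comment:
    key: int
    user_key: int  # "foreign" key pointing at User.key (unowned relationship)
    comment: str
    comment_date: date

def comments_by_email(users, comments, email):
    """Emulate the SQL join with two look-ups instead of one JOIN query."""
    # Look-up 1: users filtered by email, indexed by their key.
    names_by_key = {u.key: u.name for u in users if u.email == email}
    # Look-up 2: comments whose user_key refers to one of those users.
    return [
        (names_by_key[c.user_key], c.comment, c.comment_date)
        for c in comments
        if c.user_key in names_by_key
    ]
```

In a real datastore each look-up would be a separate query, so the cost is two round trips rather than one.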

    Read the article

  • MS Access 2003 - Is there a way to programmatically define the data for a chart?

    - by Justin
    So I have some VBA for taking charts built with the Form's Chart Wizard and automatically inserting them into PowerPoint presentation slides. I use those chart forms as subforms within a larger form that has parameters the user can select to determine what is on the chart. The idea is that the user can set the parameters, build the chart to his/her liking, click a button and have it in a ppt slide with the company's background template, blah blah blah... So it works, though it is very bulky in terms of the number of objects I have to use to accomplish this. I use expressions such as the following: like forms!frmMain.Month&* to get the input values into the saved queries, which was fine when I first started, but it went over so well, and they want so many options, that it is driving the number of saved queries/objects up. I need several saved forms with charts because of the number of different types of charts I need this to be able to handle. SO FINALLY TO MY QUESTION: I would much rather do all this on the fly with some VBA. I know how to insert list boxes and text boxes on a form, and I know how to use SQL in VBA to get the values I want from tables/queries; I just don't know if there is some VBA I can use to set the data values of the charts from a resulting recordset:

        Dim rs As DAO.Recordset
        Dim db As DAO.Database
        Dim sql As String
        sql = "SELECT TOP 5 Count(tblMain.TransactionID) AS Total, tblMain.Location FROM tblMain WHERE (((tblMain.Month) = """ & Me.txtMonth & """ )) GROUP BY tblMain.Location ORDER BY Count(tblMain.TransactionID) DESC;"
        Set db = CurrentDb
        Set rs = db.OpenRecordset(sql)
        rs.MoveFirst
        ' some kind of cool code in here to make this recordset the data of the chart in frmChart ("Chart01")

    Thanks for your help, and apologies for the length of the explanation.

    Read the article

  • Mysql InnoDB performance optimization and indexing

    - by Davide C
    Hello everybody, I have 2 databases and I need to link information between two big tables (more than 3M entries each, continuously growing). The 1st database has a table 'pages' that stores various information about web pages, and includes the URL of each one. The column 'URL' is a varchar(512) and has no index. The 2nd database has a table 'urlHops' defined as:

        CREATE TABLE urlHops (
          dest varchar(512) NOT NULL,
          src varchar(512) DEFAULT NULL,
          timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
          KEY dest_key (dest),
          KEY src_key (src)
        ) ENGINE=InnoDB DEFAULT CHARSET=latin1

    Now, I basically need to issue (efficiently) queries like this:

        select p.id,p.URL from db1.pages p, db2.urlHops u where u.src=p.URL and u.dest=?

    At first, I thought of adding an index on pages(URL). But it's a very long column, and I already issue a lot of INSERTs and UPDATEs on the same table (way more than the number of SELECTs I would do using this index). Other possible solutions I thought of are:
      - adding a column to pages, storing the md5 hash of the URL and indexing it; this way I could do queries using the md5 of the URL, with the advantage of an index on a smaller column.
      - adding another table that contains only page id and page URL, indexing both columns. But this is maybe a waste of space, having only the advantage of not slowing down the inserts and updates I execute on 'pages'.
    I don't want to slow down the inserts and updates, but at the same time I would like to be able to do the queries on the URL efficiently. Any advice? My primary concern is performance; if needed, wasting some disk space is not a problem. Thank you, regards, Davide
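
The first option in the question (an indexed md5 column) can be sketched like this. It uses Python's built-in sqlite3 module as a stand-in for MySQL, registers a user-defined `md5` SQL function because SQLite has none built in, and introduces a hypothetical column `url_md5`. The join also compares the full URLs so that a hash collision cannot produce a wrong match. Whether this beats a plain index on the URL column in InnoDB would need to be measured.

```python
import hashlib
import sqlite3

def url_hash(url):
    """Hex md5 digest of a URL; NULL (None) stays NULL."""
    if url is None:
        return None
    return hashlib.md5(url.encode("utf-8")).hexdigest()

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.create_function("md5", 1, url_hash)  # stand-in for MySQL's MD5()
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, URL TEXT, url_md5 TEXT)")
    conn.execute("CREATE INDEX pages_md5 ON pages (url_md5)")  # short, fixed-width key
    conn.execute("CREATE TABLE urlHops (dest TEXT NOT NULL, src TEXT)")
    return conn

def add_page(conn, url):
    """Store the URL together with its hash so the index can be used later."""
    conn.execute("INSERT INTO pages (URL, url_md5) VALUES (?, ?)", (url, url_hash(url)))

def pages_leading_to(conn, dest):
    """Pages whose URL is the source of a hop to `dest`."""
    return conn.execute(
        "SELECT p.id, p.URL FROM urlHops u "
        "JOIN pages p ON p.url_md5 = md5(u.src) AND p.URL = u.src "
        "WHERE u.dest = ? ORDER BY p.id",
        (dest,),
    ).fetchall()
```

The cost on the write path is one extra hash computation and one extra index entry per insert, which is smaller than indexing the 512-character column directly.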

    Read the article

  • Setting Connection Parameters via ADO for SQL Server

    - by taspeotis
    Is it possible to set a connection parameter on a connection to SQL Server and have that variable persist throughout the life of the connection? The parameter must be usable by subsequent queries. We have some old Access reports that use a handful of VBScript functions in the SQL queries (let's call them GetStartDate and GetEndDate) that return global variables. Our application would set these before invoking the query, and then the queries can return information between the date ranges specified in our application. We are looking at changing to a ReportViewer control running in local mode, but I don't see any convenient way to use these custom functions in straight T-SQL. I have two concept solutions (not tested yet), but I would like to know if there is a better way. Below is some pseudo code.

    1. Set all variables before running Recordset.OpenForward:

        Connection->Execute("SET @GetStartDate = ...");
        Connection->Execute("SET @GetEndDate = ...");
        // Repeat for all parameters

       Will these variables persist to later calls of Recordset->OpenForward? Can anything reset the variables aside from another SET/SELECT @variable statement?

    2. Create an ADOCommand "factory" that automatically adds parameters to each ADOCommand object I will use to execute SQL:

        // Command has previously been created
        ADOParameter *Parameter1 = Command->CreateParameter("GetStartDate");
        ADOParameter *Parameter2 = Command->CreateParameter("GetEndDate");
        // Set values and attach etc...

    What I would like to know is whether there is something like:

        Connection->SetParameter("GetStartDate", "20090101");
        Connection->SetParameter("GetEndDate", "20100101");

    These would persist for the lifetime of the connection, and the SQL could use something like @GetStartDate to access them. This may be exactly solution #1, if the variables persist throughout the lifetime of the connection.

    Read the article

  • ways to avoid global temp tables in oracle

    - by Omnipresent
    We just converted our SQL Server stored procedures to Oracle procedures. The SQL Server SPs were highly dependent on session tables (INSERT INTO #table1...), and these tables got converted to global temporary tables in Oracle. We ended up with around 500 GTTs for our 400 SPs. Now we are finding out that working with GTTs in Oracle is considered a last option because of performance and other issues. What other alternatives are there? Collections? Cursors? Our typical use of GTTs is like so:

    Insert into the GTT:

        INSERT INTO some_gtt_1 (column_a, column_b, column_c)
        (SELECT someA, someB, someC
         FROM TABLE_A
         WHERE condition_1 = 'YN756'
           AND type_cd = 'P'
           AND TO_NUMBER(TO_CHAR(m_date, 'MM')) = '12'
           AND (lname LIKE (v_LnameUpper || '%') OR lname LIKE (v_searchLnameLower || '%'))
           AND (e_flag = 'Y' OR it_flag = 'Y' OR fit_flag = 'Y'));

    Update the GTT:

        UPDATE some_gtt_1 a
        SET column_a = (SELECT b.data_a FROM some_table_b b
                        WHERE a.column_b = b.data_b AND a.column_c = 'C')
        WHERE column_a IS NULL OR column_a = ' ';

    and later on get the data out of the GTT. These are just sample queries; in actuality the queries are really complex, with lots of joins and subqueries. I have a three-part question:
    1. Can someone show how to transform the above sample queries to collections and/or cursors?
    2. Since with GTTs you can work natively with SQL, why move away from them? Are they really that bad?
    3. What should the guidelines be on when to use and when to avoid GTTs?

    Read the article

  • Modeling complex hierarchies

    - by jdn
    To gain some experience, I am trying to make an expert system that can answer queries about the animal kingdom. However, I have run into a problem modeling the domain. I originally considered the animal kingdom hierarchy to be drawn like:

        -animal
          -bird
            -carnivore
              -hawk
            -herbivore
              -bluejay
          -mammals
            -carnivores
            -herbivores

    This, I figured, would allow me to easily make queries like "give me all birds", but it would be much more expensive to say "give me all carnivores", so I rewrote the hierarchy to look like:

        -animal
          -carnivore
            -birds
              -hawk
            -mammals
              -xyz
          -herbivores
            -birds
              -bluejay
            -mammals

    But now it will be much slower to query "give me all birds." This is of course a simple example, but it got me thinking that I don't really know how to model complex relationships that are not so strictly hierarchical in nature in the context of writing an expert system to answer queries as stated above. A directed, cyclic graph seems like it could mathematically solve the problem, but storing this in a relational database and maintaining it (updates) would seem like a nightmare to me. I would like to know how people typically model such things. Explanations or pointers to resources to read further would be acceptable and appreciated.
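
One way to picture a non-hierarchical model is to treat "bird/mammal" and "carnivore/herbivore" as independent tags on each animal rather than as levels of a single tree. The sketch below is only an illustration of that idea, not an answer taken from the question: the class `TaxonomyIndex`, the tag names `cls` and `diet`, and the extra animal "wolf" are all hypothetical.

```python
from collections import defaultdict


class TaxonomyIndex:
    """Index entities by independent (dimension, value) tags instead of one fixed tree."""

    def __init__(self):
        # Maps (dimension, value) -> set of entity names carrying that tag.
        self._by_tag = defaultdict(set)

    def add(self, name, **tags):
        """Register an entity under every tag it carries."""
        for dimension, value in tags.items():
            self._by_tag[(dimension, value)].add(name)

    def query(self, **tags):
        """Return the names that match every given tag (set intersection)."""
        sets = [self._by_tag.get((d, v), set()) for d, v in tags.items()]
        if not sets:
            return set()
        return set.intersection(*sets)


def build_example():
    """Populate the index with the animals from the question plus one made-up mammal."""
    index = TaxonomyIndex()
    index.add("hawk", cls="bird", diet="carnivore")
    index.add("bluejay", cls="bird", diet="herbivore")
    index.add("wolf", cls="mammal", diet="carnivore")  # hypothetical extra entry
    return index
```

With this layout, "give me all birds" and "give me all carnivores" are both a single lookup, and combined questions are intersections. In a relational database the same shape would typically be a many-to-many table of (animal, dimension, value) rows with an index on the tag columns.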

    Read the article

  • Using JSF, PrimeFaces and JPA: Create Basic WebApp without using Generated CRUD Classes, Forms, etc

    - by user2774489
    I am trying to build a basic CRUD application with NetBeans 7.4, JSF, PrimeFaces and JPA using MySQL. I have successfully done this by using the NetBeans wizards. Now I want to do it from scratch, with no wizards. There seems to be a lack of support for the combination of JSF, PrimeFaces and JPA. When I say "lack", I mean there is no full example (I might be asking too much) that avoids the CRUD auto-generated templates/classes and shows actual queries being coded and passed to the PrimeFaces DataTables. YouTube is full of non-English examples using Hibernate (not JPA) and other examples that show flashy GUIs with no code. So far I understand you need an @Entity class (which provides the physical structure of the tables), a Controller (serializable) and the .xhtml web page to show the DataTable. What else? Also, I'm not seeing any posts or examples where queries are used with JPA/JSF and that show how they are tied together (in one place). I need to connect the dots here so that I can leverage JSF/JPA to create simple queries to populate my PF DataTables. I've read the blogs and I've googled the intranets until I'm blue in the face. Sending me a list of URLs to read to learn about each product is something I've already done. I get what they do independently, but I am looking for the "How do they all connect" answer, with maybe some basic code examples!! :)

    Read the article

  • Can I clone an IQueryable in linq? For UNION purposes?

    - by user169867
    I have a table of WorkOrders. The table has a PrimaryWorker & PrimaryPay field. It also has a SecondaryWorker & SecondaryPay field (which can be null). I wish to run 2 very similar queries & union them so that the result has a Worker field & a Pay field. So if a single WorkOrder record had both the PrimaryWorker and SecondaryWorker fields populated, I would get 2 records back. The "where clause" part of these 2 queries is very similar and long to construct. Here's a dummy example:

        var q = ctx.WorkOrder.Where(w => w.WorkDate >= StartDt && w.WorkDate <= EndDt);
        if (showApprovedOnly) { q = q.Where(w => w.IsApproved); }
        //...more filters applied

    Now I also have a search flag called "hideZeroPay". If that's true, I don't want to include the record if the worker was paid $0. But obviously for 1 query I need to compare the PrimaryPay field and in the other I need to compare the SecondaryPay field. So I'm wondering how to do this. Can I clone my base query "q", make a primary & a secondary worker query out of it, and then union those 2 queries together? I'd greatly appreciate an example of how to correctly handle this. Thanks very much for any help.
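
In LINQ, each call to Where returns a new query object and leaves the one it was called on unchanged, so the base query "q" can normally be reused for both branches without any explicit cloning. The shape of that approach is sketched below in Python rather than C#, purely as an illustration: the `WorkOrder` dataclass, its field names, and the function `worker_pay_rows` are hypothetical stand-ins for the entity and queries in the question.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class WorkOrder:
    """Hypothetical stand-in for the WorkOrder entity described in the question."""
    work_date: date
    is_approved: bool
    primary_worker: str
    primary_pay: float
    secondary_worker: Optional[str] = None
    secondary_pay: Optional[float] = None


def worker_pay_rows(
    orders: List[WorkOrder],
    start: date,
    end: date,
    approved_only: bool,
    hide_zero_pay: bool,
) -> List[Tuple[str, Optional[float]]]:
    """Build the shared filter once, then derive the primary and secondary rows from it."""
    # Shared filter (the long "where clause"), applied a single time.
    shared = [
        o for o in orders
        if start <= o.work_date <= end and (o.is_approved or not approved_only)
    ]
    # Two projections over the same filtered base.
    primary = [(o.primary_worker, o.primary_pay) for o in shared]
    secondary = [
        (o.secondary_worker, o.secondary_pay)
        for o in shared
        if o.secondary_worker is not None
    ]
    # Concatenate both branches; duplicates are kept, as with LINQ's Concat.
    rows = primary + secondary
    if hide_zero_pay:
        # Each branch is filtered on its own pay value (0 and missing pay are dropped).
        rows = [(worker, pay) for worker, pay in rows if pay]
    return rows
```

Note that LINQ's Union removes duplicate rows while Concat keeps them; if two work orders can produce the same (worker, pay) pair and both should appear, Concat is the closer match to what this sketch does.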

    Read the article

  • Good strategy for copying a "sliding window" of data from a table?

    - by chiborg
    I have a MySQL table from a third-party application that has millions of rows and only one index - the timestamp of each entry. Now I want to do some heavy self-joins and queries on the data using fields other than the timestamp. Doing the query on the original table would bring the database to a crawl, and adding indexes to the table is not an option. Additionally, I only need entries that are newer than one week. My current strategy for doing the queries efficiently is to use a separate table (aux_table) that has the necessary indexes. My questions are:
    1. Is there another way to do the queries?
    2. If not, how do I update the data in the indexed table efficiently?
    So far I have found two approaches for updating aux_table:
    1. Truncate aux_table and insert the desired data from the original table. Not very efficient, because all the indexes must be re-created.
    2. Check for the biggest timestamp in aux_table and insert all entries with a greater or equal timestamp from the original table. Occasionally drop older entries. Copying only entries with a strictly greater timestamp would lose entries (those with the same timestamp that were inserted into the original table after the last update).
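
The second approach can be sketched as an incremental refresh that re-copies the boundary timestamp, so rows sharing that timestamp are neither lost nor duplicated. This is a minimal illustration under stated assumptions: Python's `sqlite3` stands in for MySQL, timestamps are plain integers, and the table name `source`, the column names `ts` and `payload`, and the helper names are hypothetical.

```python
import sqlite3


def make_demo() -> sqlite3.Connection:
    """Create an in-memory source table and an indexed auxiliary copy (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source (ts INTEGER NOT NULL, payload TEXT NOT NULL);
        CREATE TABLE aux_table (ts INTEGER NOT NULL, payload TEXT NOT NULL);
        CREATE INDEX aux_ts ON aux_table (ts);
        CREATE INDEX aux_payload ON aux_table (payload);
        """
    )
    return conn


def refresh_aux(conn: sqlite3.Connection, now: int, window: int) -> None:
    """Bring aux_table up to date with source rows whose ts is within the window."""
    cutoff = now - window
    newest = conn.execute("SELECT MAX(ts) FROM aux_table").fetchone()[0]
    # Start from the newest copied timestamp, or from the window edge on the first run.
    start = cutoff if newest is None else max(newest, cutoff)
    with conn:  # single transaction: commit on success, roll back on error
        # Remove the boundary rows first; rows with the same timestamp may have
        # arrived in source after the previous refresh.
        conn.execute("DELETE FROM aux_table WHERE ts >= ?", (start,))
        conn.execute(
            "INSERT INTO aux_table (ts, payload) "
            "SELECT ts, payload FROM source WHERE ts >= ?",
            (start,),
        )
        # Slide the window forward by dropping rows that are now too old.
        conn.execute("DELETE FROM aux_table WHERE ts < ?", (cutoff,))
```

Because only the rows at or after the boundary timestamp are touched, the indexes on aux_table are maintained incrementally instead of being rebuilt as in the truncate-and-reload approach.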

    Read the article

  • How to use a getter with a nullable?

    - by Desmond Lost
    I am reading a bunch of queries from a database. I had an issue with the queries not closing, so I added a CommandTimeout. Now, the individual queries read from the config file each time they are run. How would I make the code read the int from the config file only once and cache it, using a static nullable and a getter? I was thinking of doing something along the lines of:

        static int? var;
        get { var = null; if (var.HasValue) ... (I don't know how to complete the rest)

    My actual code:

        private object ExecuteQuery(string dbConnStr, bool fixIt)
        {
            object result = false;
            using (SqlConnection connection = new SqlConnection(dbConnStr))
            {
                connection.Open();
                using (SqlCommand sqlCmd = new SqlCommand())
                {
                    AddSQLParms(sqlCmd);
                    sqlCmd.CommandTimeout = 30;
                    sqlCmd.CommandText = _cmdText;
                    sqlCmd.Connection = connection;
                    sqlCmd.CommandType = System.Data.CommandType.Text;
                    sqlCmd.ExecuteNonQuery();
                }
                connection.Close();
            }
            return result;
        }
        }
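
The caching pattern being asked about is "load on first access, then reuse". The draft getter in the question resets the field to null on every call, which would defeat the cache; the usual shape is to check whether a value is already present, load it only if not, and then return it. The sketch below shows that shape in Python rather than C#, as an illustration only: the class `CommandTimeoutSetting`, the `loader` callable standing in for the config-file read, and the `load_count` counter are all hypothetical.

```python
from typing import Callable, Optional


class CommandTimeoutSetting:
    """Caches an integer setting after the first read (None plays the role of a null int?)."""

    def __init__(self, loader: Callable[[], int]):
        # loader is a hypothetical stand-in for reading the value from the config file.
        self._loader = loader
        self._cached: Optional[int] = None
        self.load_count = 0  # only here to make the caching visible

    @property
    def value(self) -> int:
        # Load only when nothing has been cached yet; never reset the cache here.
        if self._cached is None:
            self._cached = self._loader()
            self.load_count += 1
        return self._cached
```

A caller would read `setting.value` wherever the timeout is needed; the loader then runs once, on the first access. This sketch is not thread-safe: if several threads can hit the getter at the same time, the first load would need a lock or a lazy-initialization helper.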

    Read the article
