Search Results

Search found 4815 results on 193 pages for 'parameterized queries'.

Page 53/193 | < Previous Page | 49 50 51 52 53 54 55 56 57 58 59 60  | Next Page >

  • How to optimize my PageRank calculation?

    - by asmaier
    In the book Programming Collective Intelligence I found the following function to compute the PageRank: def calculatepagerank(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() for i in range(iterations): print "Iteration %d" % i for (urlid,) in self.con.execute("select rowid from urllist"): pr=0.15 # Loop through all the pages that link to this one for (linker,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): # Get the PageRank of the linker linkingpr=self.con.execute("select score from pagerank where urlid=%d" % linker).fetchone()[0] # Get the total number of links from the linker linkingcount=self.con.execute("select count(*) from link where fromid=%d" % linker).fetchone()[0] pr+=0.85*(linkingpr/linkingcount) self.con.execute("update pagerank set score=%f where urlid=%d" % (pr,urlid)) self.dbcommit() However, this function is very slow, because of all the SQL queries in every iteration >>> import cProfile >>> cProfile.run("crawler.calculatepagerank()") 2262510 function calls in 136.006 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 136.006 136.006 <string>:1(<module>) 1 20.826 20.826 136.006 136.006 searchengine.py:179(calculatepagerank) 21 0.000 0.000 0.528 0.025 searchengine.py:27(dbcommit) 21 0.528 0.025 0.528 0.025 {method 'commit' of 'sqlite3.Connecti 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler 1339864 112.602 0.000 112.602 0.000 {method 'execute' of 'sqlite3.Connec 922600 2.050 0.000 2.050 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} So I optimized the function and came up with this: def calculatepagerank2(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() inlinks={} numoutlinks={} pagerank={} for (urlid,) in self.con.execute("select rowid from urllist"): inlinks[urlid]=[] numoutlinks[urlid]=0 # Initialize pagerank vector with 1.0 pagerank[urlid]=1.0 # Loop through all the pages that link to this one for (inlink,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): inlinks[urlid].append(inlink) # get number of outgoing links from a page numoutlinks[urlid]=self.con.execute("select count(*) from link where fromid=%d" % urlid).fetchone()[0] for i in range(iterations): print "Iteration %d" % i for urlid in pagerank: pr=0.15 for link in inlinks[urlid]: linkpr=pagerank[link] linkcount=numoutlinks[link] pr+=0.85*(linkpr/linkcount) pagerank[urlid]=pr for urlid in pagerank: self.con.execute("update pagerank set score=%f where urlid=%d" % (pagerank[urlid],urlid)) self.dbcommit() This function is 20 times faster (but uses a lot more memory for all the temporary dictionaries) because it avoids the unnecessary SQL queries in every iteration: >>> cProfile.run("crawler.calculatepagerank2()") 64802 function calls in 6.950 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.004 0.004 6.950 6.950 <string>:1(<module>) 1 1.004 1.004 6.946 6.946 searchengine.py:207(calculatepagerank2 2 0.000 0.000 0.104 0.052 searchengine.py:27(dbcommit) 23065 0.012 0.000 0.012 0.000 {meth 'append' of 'list' objects} 2 0.104 0.052 0.104 0.052 {meth 'commit' of 'sqlite3.Connection 1 0.000 0.000 0.000 0.000 {meth 'disable' of '_lsprof.Profiler' 31298 5.809 0.000 5.809 0.000 {meth 'execute' of 'sqlite3.Connectio 10431 0.018 0.000 0.018 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} But is it possible to further reduce the number of SQL queries to speed up the function even more?

    Read the article

  • using Spring JdbcTemplate for multiple database operations

    - by Joel Carranza
    I like the apparent simplicity of JdbcTemplate but am a little confused as to how it works. It appears that each operation (query() or update()) fetches a connection from a datasource and closes it. Beautiful, but how do you perform multiple SQL queries within the same connection? I might want to perform multiple operations in sequence (for example SELECT followed by an INSERT followed by a commit) or I might want to perform nested queries (SELECT and then perform a second SELECT based on result of each row). How do I do that with JdbcTemplate. Am I using the right class?

    Read the article

  • Database design for very large amount of data

    - by Hossein
    Hi, I am working on a project, involving large amount of data from the delicious website.The data available is at files are "Date,UserId,Url,Tags" (for each bookmark). I normalized my database to a 3NF, and because of the nature of the queries that we wanted to use In combination I came down to 6 tables....The design looks fine, however, now a large amount of data is in the database, most of the queries needs to "join" at least 2 tables together to get the answer, sometimes 3 or 4. At first, we didn't have any performance issues, because for testing matters we haven't had added too much data in the database. No that we have a lot of data, simply joining extremely large tables does take a lot of time and for our project which has to be real-time is a disaster.I was wondering how big companies solve these issues.Looks like normalizing tables just adds complexity, but how does the big company handle large amounts of data in their databases, don't they do the normalization? thanks

    Read the article

  • Avoid implicit conversion from date to timestamp for selects with Oracle 11g using Hibernate

    - by sapporo
    I'm using Hibernate 3.2.1 criteria queries to select rows from an Oracle 11g database, filtering by a timestamp field. The field in question is of type java.util.Date in Java, and DATE in Oracle. It turns out that the field gets mapped to java.sql.Timestamp, and Oracle converts all rows to TIMESTAMP before comparing to the passed in value, bypassing the index and thereby ruining performance. One solution would be to use Hibernate's sqlRestriction() along with Oracle's TO_DATE function. That would fix performance, but requires rewriting the application code (lots of queries). So is there a more elegant solution? Since Hibernate already does type mapping, could it be configured to do the right thing?

    Read the article

  • sql query to incrementally modify where condition until result contains what is required

    - by iamrohitbanga
    I need an sql query to select some items from a table based on some condition which is based on a category field. As an example consider a list of people and I fetch the people from a particular age group from the database. I want to check if the result contains at least one result from each of a number of categories. If not I want to modify the age group by extending it and check the results again. This is repeated until I get an age group for which one result is present for each category. Right now i am doing this by analyzing the results and modifying the sql query. So a number of sql select queries are sent. What is the most efficient way of doing this? I am invoking the select queries from a java program using jdbc. I am using mysql database.

    Read the article

  • fql.multiquery new sdk

    - by Ronald Burris
    I cannot figure out what is wrong with this fql.multiquery and cannot seem to find any examples of the new sdk with fql.multiquery. Ultimately I want to get the page name and page id(s) of the visiting user pages which they both administrator and are fans of. $queries = '{ "page_admin_ids" : "SELECT page_id FROM page_admin WHERE uid = ' . $afid . ' LIMIT 5", "page_fan_ids" : "SELECT page_id FROM page_fan WHERE page_id IN (SELECT page_id FROM #page_admin_ids)", "page_name_and_id" : "SELECT name, page_id FROM page WHERE page_id IN (SELECT page_id FROM #page_fan_ids)" }'; $attachment = array("method"="fql.multiquery","query"=$queries,'access_token'=$access_token); $ret_code = $facebook-api($attachment); print_r($ret_code); die();

    Read the article

  • Best practices for combining Lucene.NET and a relational database?

    - by FlySwat
    I'm working on a project where I will have a LOT of data, and it will be searchable by several forms that are very efficiently expressed as SQL Queries, but it also needs to be searched via natural language processing. My plan is to build an index using Lucene for this form of search. My question is that if I do this, and perform a search, Lucene will then return the ID's of matching documents in the index, I then have to lookup these entities from the relational database. This could be done in two ways (That I can think of so far): N amount of queries (Horrible) Pass all the ID's to a stored procedure at once (Perhaps as a comma delimited parameter). This has the downside of being limited to the max parameter size, and the slow performance of a UDF to split the string into a temporary table. I'm almost tempted to mirror everything into lucenes index, so that I can periodicly generate the index from the backing store, but only need to access it for the frontend. Advice?

    Read the article

  • The subscription model behind CSS selectors?

    - by Martin Kristiansen
    With CSS selectors a query string body > h1.span subscribes to a specific type of nodes in the tree. Does anyone know how this is done? Selectors for transformations, how does the browser select the result set? And is there a trick to making it efficient? I imagine there being some sort of hierarchical type-tree for the entire structure to which the nodes subscribe and which is what is used when doing the selector queries — but this is only a guess. Does anyone know the real answer? Or even more interesting, what would be the best way to do dynamic lookups on a tree based on jQuery/CSS search queries?

    Read the article

  • Help to translate SQL query to Relational Algebra

    - by Mestika
    Hi everyone, I'm having some difficulties with translating some queries to Relational Algebra. I've a great book about Database Design and here is a chapter about Relational Algebra but I still seem to have some trouble creating the right one: Thoes queries I've most difficuelt with is these: SELECT COUNT( cs.student_id ) AS counter FROM course c, course_student cs WHERE c.id = cs.course_id AND c.course_name = 'Introduction to Database Design' SELECT COUNT( cs.student_id ) FROM Course c INNER JOIN course_student cs ON c.id = cs.course_id WHERE c.course_name = 'Introduction to Database Design' and SELECT COUNT( * ) FROM student JOIN grade ON student.f_name = "Andreas" AND student.l_name = "Pedersen" AND student.id = grade.student_id I know the notation can be a bit hard to paste into HTML forum, but maybe just use some common name or the Greek name. Thanks in advance Mestika

    Read the article

  • Problem using SQLDMO/Vb6 against SQL Server 2008

    - by E.J. Brennan
    I have a client, that uses SQLDMO for a portion of a custom application that was written against SQL Server 2000, and they recently upgraded to SQL Server 2008. The majority of the app still runs fine (doesn't use SQLDMO), but the admin functions which rely on SQLDMO stopped working. I installed the SQL2005 backward compatibility pack, and now SQLDMO partially works, i.e. I can run "select" type queries, but any "Update" queries fail with the error message: to connect to the server you must use SQL Server management studio or sql server management objects (SMO) Any thoughts? Should the backward compatibility pack give me ALL the functionality back, or is this a known issue? BTW: I realize SQLDMO has been deprecated and will go away next release, none-the-less I need to do what I can to solve the problem at hand.

    Read the article

  • Convert a Mysql database from latin to UTF-8

    - by Matthieu
    I am converting a website from ISO to UTF-8, so I need to convert the Mysql database too. On internet, I read various solutions, I don't know wich one to choose. Do I really need to convert my varchar columns to binary, then to utf8 like that : ALTER TABLE t MODIFY col BINARY(150); ALTER TABLE t MODIFY col CHAR(150) CHARACTER SET utf8; This is long to do that for each column, of each table, of each database. I have 10 database, wich 20 tables each, wich around 2 - 3 varchar colums (2 queries each column), this gives me around 1000 queries to write ! How come ? How to do ?

    Read the article

  • How to eager load sibling data using LINQ to SQL?

    - by Scott
    The goal is to issue the fewest queries to SQL Server using LINQ to SQL without using anonymous types. The return type for the method will need to be IList<Child1>. The relationships are as follows: Parent Child1 Child2 Grandchild1 Parent Child1 is a one-to-many relationship Child1 Grandchild1 is a one-to-n relationship (where n is zero to infinity) Parent Child2 is a one-to-n relationship (where n is zero to infinity) I am able to eager load the Parent, Child1 and Grandchild1 data resulting in one query to SQL Server. This query with load options eager loads all of the data, except the sibling data (Child2): DataLoadOptions loadOptions = new DataLoadOptions(); loadOptions.LoadWith<Child1>(o => o.GrandChild1List); loadOptions.LoadWith<Child1>(o => o.Parent); dataContext.LoadOptions = loadOptions; IQueryable<Child1> children = from child in dataContext.Child1 select child; I need to load the sibling data as well. One approach I have tried is splitting the query into two LINQ to SQL queries and merging the result sets together (not pretty), however upon accessing the sibling data it is lazy loaded anyway. Adding the sibling load option will issue a query to SQL Server for each Grandchild1 and Child2 record (which is exactly what I am trying to avoid): DataLoadOptions loadOptions = new DataLoadOptions(); loadOptions.LoadWith<Child1>(o => o.GrandChild1List); loadOptions.LoadWith<Child1>(o => o.Parent); loadOptions.LoadWith<Parent>(o => o.Child2List); dataContext.LoadOptions = loadOptions; IQueryable<Child1> children = from child in dataContext.Child1 select child; exec sp_executesql N'SELECT * FROM [dbo].[Child2] AS [t0] WHERE [t0].[ForeignKeyToParent] = @p0',N'@p0 int',@p0=1 exec sp_executesql N'SELECT * FROM [dbo].[Child2] AS [t0] WHERE [t0].[ForeignKeyToParent] = @p0',N'@p0 int',@p0=2 exec sp_executesql N'SELECT * FROM [dbo].[Child2] AS [t0] WHERE [t0].[ForeignKeyToParent] = @p0',N'@p0 int',@p0=3 exec sp_executesql N'SELECT * FROM [dbo].[Child2] AS [t0] WHERE [t0].[ForeignKeyToParent] = @p0',N'@p0 int',@p0=4 I've also written LINQ to SQL queries to join in all of the data in hopes that it would eager load the data, however when the LINQ to SQL EntitySet of Child2 or Grandchild1 are accessed it lazy loads the data. The reason for returning the IList<Child1> is to hydrate business objects. My thoughts are I am either: Approaching this problem the wrong way. Have the option of calling a stored procedure? My organization should not be using LINQ to SQL as an ORM? Any help is greatly appreciated. Thank you, -Scott

    Read the article

  • Authentication on odata service

    - by Toad
    I want to add some authentication to my odata service. Depending on the user calling i want to: filter rows and/or remove columns. I read in scott hanselmans fine blogpost on odata ( http://www.hanselman.com/blog/CreatingAnODataAPIForStackOverflowIncludingXMLAndJSONIn30Minutes.aspx )that it is possible to intercept the incoming queries. If this works i could add some extra filtering. How would this intercepting and altering queries work exactly? I can not find any examples of where and how to do this. (i'm using entitie framework and wcf dataservices (just like scotts example blog)

    Read the article

  • Linq Query Help Needed

    - by Randy Minder
    Say I have the following LINQ queries: var source = from workflow in sourceWorkflowList select new { SubID = workflow.SubID, ReadTime = workflow.ReadTime, ProcessID = workflow.ProcessID, LineID = workflow.LineID }; var target = from workflow in targetWorkflowList select new { SubID = workflow.SubID, ReadTime = workflow.ReadTime, ProcessID = workflow.ProcessID, LineID = workflow.LineID }; var difference = source.Except(target); sourceWorkflowList and targetWorkflowList have the exact same column definitions. But they both contain more columns of data than what is shown in the queries above. Those are just the columns needed for this particular issue. difference contains all rows in sourceWorkflowList that are not contained in targetWorkflowList Now what I would like to do is to remove all rows from sourceWorkflowList that do not exist in difference. Could someone show me a query that would do this? Thanks very much - Randy

    Read the article

  • The advantages and disadvantages of using ORM

    - by JHarley1
    Good Morning, I would like to discuss today the advantages and disadvantages of using ORM (such as ADO.NET). Advantages: Speeds-up Development - eliminates the need for repetitive SQL code. Reduces Development Time. Reduces Development Costs. Overcomes vendor specific SQL differences - the ORM knows how to write vendor specific SQL so you don't have to. Disadvantages: Loss in developer productivity whilst they learn to program with ORM. Developers loose understanding of what the code is actually doing - the developer is more in control using SQL. ORM has a tendency to be slow. ORM fail to compete against SQL queries for complex queries. In summary, I believe that the disadvantages of using an ORM (mainly the reduced time taken to perform repetitive tasks) is far outweighed by the disadvantages of ORM e.g. its difficulty to get to grips with. Can people point out were I am going wrong and suggest any further advantages/disadvantages. Many Thanks, J

    Read the article

  • Is there an extensible SQL like query language that is safe for exposing via a public API?

    - by Lokkju
    I want to expose some spatial (and a few non-spatial) datasets via a public API. The backend store will either be PostgreSQL/PostGIS, sqlite/spatialite, or CouchDB/GeoCouch. My goal is to find a some, preferably standard, way to allow people to make complex spatial queries against the data. I would like it to be a simple GET based request. The idea is to allow safe SQL type queries, without allowing unsafe ones. I would rather modify something that is off the shelf than doing the entire thing myself. I specifically want to support requesting specific fields from a table; joining results; and spatial functions that are already implemented by the underlying datastore. Ideas anyone?

    Read the article

  • VBA - Create ADODB.Recordset from the contents of a spreadsheet

    - by robault
    Hello, I am working on an Excel application that queries a SQL database. The queries can take a long time to run (20-40 min). If I've miss-coded something it can take a long time to error or reach a break point. I can save the results to a sheet fine, it's when I am working with the record sets that things can blow up. Is there a way to load the data into a ADODB.Recordset when I'm debugging to skip querying the database (after the first time)? Would I use something like this? http://stackoverflow.com/questions/2086234/query-excel-worksheet-in-ms-access-vba-using-adodb-recordset

    Read the article

  • Suggest Cassandra data model for an existing schema

    - by Andriy Bohdan
    Hello guys! I hope there's someone who can help me suggest a suitable data model to be implemented using nosql database Apache Cassandra. More of than I need it to work under high loads and large amounts of data. Simplified I have 3 types of objects: Product Tag ProductTag Product: key - string key name - string .... - some other fields Tag: key - string key name - unique tag words ProductTag: product_key - foreign key referring to product tag_key - foreign key referring to tag rating - this is rating of tag for this product Each product may have 0 or many tags. Tag may be assigned to 1 or many products. Means relation between products and tags is many-to-many in terms of relational databases. Value of "rating" is updated "very" often. I need to be run the following queries Select objects by keys Select tags for product ordered by rating Select products by tag order by rating Update rating by product_key and tag_key The most important is to make these queries really fast on large amounts of data, considering that rating is constantly updated.

    Read the article

  • geo-indexing: efficiently calculating proximity based on latitude/longitude

    - by AnC
    My simple web app (WSGI, Python) supports text queries to find items in the database. Now I'd like to extend this to allow for queries like "find all items within 1 mile of {lat,long}". Of course that's a complex job if efficiency is a concern, so I'm thinking of a dedicated external module that does indexing for geo-coordinates - sort of like Lucene would for text. I assume a generic component like this already exists, but haven't been able to find anything so far. Any help would be greatly appreciated.

    Read the article

  • Thumbnails from HTML pages created and used automatically in web application

    - by Jesper Rønn-Jensen
    I am working on a Ruby on Rails app that visualizes product trees. The tree is built of nodes an everything is rendered in HTML/CSS3. Some of the products make several hundred SQL queries as the tree builds up (up to 800 queries on the biggest tree). I'd like to have small thumbnails of each tree to present it on an index page. So rendering each tree once again and modifying CSS to make a tiny representation is an option. But i think it's probably easier to generate thumbnails, crop, cache, and show these on the index page. Any ideas on how to do this? Any links/articles/blog posts that could help me?

    Read the article

  • Fast search in XMl files in .NET (or How to index XML files)

    - by codymanix
    I have to implement a search feature which is able to quickly perform arbitrary complex queries to XML-data. If the user makes a query, all XML files must be searched to find possible matches. The users will have lots of XML-Files (a few 10000 or more) which are typically a few kilobytes in size. All the XML-files have almost the same structure. I already benchmarked XPath, it is too slow for my needs. How can it be done most efficiently? Is is possible to create indexes for the contents of the XML files (preserving content semantics, not just plain fulltext search)? Will it be useful to put the XML data into an (embedded) SQL database and do the queries with SQL? What other possibilities do I have?

    Read the article

< Previous Page | 49 50 51 52 53 54 55 56 57 58 59 60  | Next Page >