aggregate - Page 7 - Developer IT

How to do Linq aggregates when there might be an empty set?

- by Shaul

I have a Linq collection of Things, where Thing has an Amount (decimal) property. I'm trying to do an aggregate on this for a certain subset of Things: var total = myThings.Sum(t => t.Amount); and that works nicely. But then I added a condition that left me with no Things in the result: var total = myThings.Where(t => t.OtherProperty == 123).Sum(t => t.Amount); And instead of getting total = 0 or null, I get an error: System.InvalidOperationException: The null value cannot be assigned to a member with type System.Decimal which is a non-nullable value type. That is really nasty, because I didn't expect that behavior. I would have expected total to be zero, maybe null - but certainly not to throw an exception! What am I doing wrong? What's the workaround/fix?

Read the article

Count, inner join

- by Urosh

I have two tables: DRIVER (Driver_Id,First name,Last name,...) PARTICIPANT IN CAR ACCIDENT (Participant_Id,Driver_Id-foreign key,responsibility-yes or no,...) Now, I need to find out which driver participated in accident where responsibility is 'YES', and how many times. I did this: Select Driver_ID, COUNT (Participant.Driver_ID)as 'Number of accidents' from Participant in car accident where responsibility='YES' group by Driver_ID order by COUNT (Participant.Driver_ID) desc But, I need to add drivers first and last name from the first table(using inner join, I suppose). I don't know how, because it is not contained in either an aggregate function or the GROUP BY clause. Please help :)

Read the article

Exporting de-aggregated data

- by Ben

I'm currently working on a data export feature for a survey application. We are using SQL2k8. We store data in a normalized format: QuestionId, RespondentId, Answer. We have a couple other tables that define what the question text is for the QuestionId and demographics for the RespondentId... Currently I'm using some dynamic SQL to generate a pivot that joins the question table to the answer table and creates an export, its working... The problem is that it seems slow and we don't have that much data (less than 50k respondents). Right now I'm thinking "why am I 'paying' to de-aggregate the data for each query? Why don't I cache that?" The data being exported is based on dynamic criteria. It could be "give me respondents that completed on x date (or range)" or "people that like blue", etc. Because of that, I think I have to cache at the respondent level, find out what respondents are being exported and then select their combined cached de-aggregated data. To me the quick and dirty fix is a totally flat table, RespondentId, Question1, Question2, etc. The problem is, we have multiple clients and that doesn't scale AND I don't want to have to maintain the flattened table as the survey changes. So I'm thinking about putting an XML column on the respondent table and caching the results of a SELECT * FROM Data FOR XML AUTO WHERE RespondentId = x. With that in place, I would then be able to get my export with filtering and XML calls into the XML column. What are you doing to export aggregated data in a flattened format (CSV, Excel, etc)? Does this approach seem ok? I worry about the cost of XML functions on larger result sets (think SELECT RespondentId, XmlCol.value('//data/question_1', 'nvarchar(50)') AS [Why is there air?], XmlCol.RinseAndRepeat)... Is there a better technology/approach for this? Thanks!

Read the article

Aggregating and displaying content from hundreds of RSS feeds

- by Andrew LeClair

I'd like to build a website that aggregates and displays content from hundreds of RSS feeds. The feeds will be from different sites: Twitter, Flickr, Tumblr, etc, so the content will be very heterogenous. In a perfect world — and this is more of a side issue — I would like to allow other people to help manage the list of feeds and assign tags to the content from each individual feed so that you can filter the items that are displayed. What I've tried so far: Google Feeds API – I thought this would be the answer, but unless I'm missing something, the FeedController will only output the collected feed content as separate lists. Is there any way to ask the Google Feeds API to aggregate and sort the content from many RSS feeds before displaying? Yahoo! Pipes – This also seemed like a good solution at first. I setup a Pipe that accesses a list of RSS feeds stored in a Google Doc spreadsheet and then aggregates the content. However, the output leaves a lot to be desired; Tumblr video posts, for example, only show a title and a permalink to the post, the embedded Youtube video is lost. PHP – I've seen this question, which looks like a good approach. I'm less proficient in PHP, so although I'm willing to learn, I'd ideally like to find a different approach. Any thoughts? Thanks.

Read the article

ODI 12c - Aggregating Data

- by David Allan

This posting will look at the aggregation component that was introduced in ODI 12c. For many ETL tool users this shouldn't be a big surprise, its a little different than ODI 11g but for good reason. You can use this component for composing data with relational like operations such as sum, average and so forth. Also, Oracle SQL supports special functions called Analytic SQL functions, you can use a specially configured aggregation component or the expression component for these now in ODI 12c. In database systems an aggregate transformation is a transformation where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning - that's exactly the purpose of the aggregate component. In the image below you can see the aggregate component in action within a mapping, for how this and a few other examples are built look at the ODI 12c Aggregation Viewlet here - the viewlet illustrates a simple aggregation being built and then some Oracle analytic SQL such as AVG(EMP.SAL) OVER (PARTITION BY EMP.DEPTNO) built using both the aggregate component and the expression component. In 11g you used to just write the aggregate expression directly on the target, this made life easy for some cases, but it wan't a very obvious gesture plus had other drawbacks with ordering of transformations (agg before join/lookup. after set and so forth) and supporting analytic SQL for example - there are a lot of postings from creative folks working around this in 11g - anything from customizing KMs, to bypassing aggregation analysis in the ODI code generator. The aggregate component has a few interesting aspects. 1. Firstly and foremost it defines the attributes projected from it - ODI automatically will perform the grouping all you do is define the aggregation expressions for those columns aggregated. In 12c you can control this automatic grouping behavior so that you get the code you desire, so you can indicate that an attribute should not be included in the group by, that's what I did in the analytic SQL example using the aggregate component. 2. The component has a few other properties of interest; it has a HAVING clause and a manual group by clause. The HAVING clause includes a predicate used to filter rows resulting from the GROUP BY clause. Because it acts on the results of the GROUP BY clause, aggregation functions can be used in the HAVING clause predicate, in 11g the filter was overloaded and used for both having clause and filter clause, this is no longer the case. If a filter is after an aggregate, it is after the aggregate (not sometimes after, sometimes having). 3. The manual group by clause let's you use special database grouping grammar if you need to. For example Oracle has a wealth of highly specialized grouping capabilities for data warehousing such as the CUBE function. If you want to use specialized functions like that you can manually define the code here. The example below shows the use of a manual group from an example in the Oracle database data warehousing guide where the SUM aggregate function is used along with the CUBE function in the group by clause. The SQL I am trying to generate looks like the following from the data warehousing guide; SELECT channel_desc, calendar_month_desc, countries.country_iso_code, TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$ FROM sales, customers, times, channels, countries WHERE sales.time_id=times.time_id AND sales.cust_id=customers.cust_id AND sales.channel_id= channels.channel_id AND customers.country_id = countries.country_id AND channels.channel_desc IN ('Direct Sales', 'Internet') AND times.calendar_month_desc IN ('2000-09', '2000-10') AND countries.country_iso_code IN ('GB', 'US') GROUP BY CUBE(channel_desc, calendar_month_desc, countries.country_iso_code); I can capture the source datastores, the filters and joins using ODI's dataset (or as a traditional flow) which enables us to incrementally design the mapping and the aggregate component for the sum and group by as follows; In the above mapping you can see the joins and filters declared in ODI's dataset, allowing you to capture the relationships of the datastores required in an entity-relationship style just like ODI 11g. The mix of ODI's declarative design and the common flow design provides for a familiar design experience. The example below illustrates flow design (basic arbitrary ordering) - a table load where only the employees who have maximum commission are loaded into a target. The maximum commission is retrieved from the bonus datastore and there is a look using employees as the driving table and only those with maximum commission projected. Hopefully this has given you a taster for some of the new capabilities provided by the aggregate component in ODI 12c. In summary, the actions should be much more consistent in behavior and more easily discoverable for users, the use of the components in a flow graph also supports arbitrary designs and the tool (rather than the interface designer) takes care of the realization using ODI's knowledge modules. Interested to know if a deep dive into each component is interesting for folks. Any thoughts?

Read the article

SQL SERVER – DQS Error – Cannot connect to server – A .NET Framework error occurred during execution of user-defined routine or aggregate “SetDataQualitySessions” – SetDataQualitySessionPhaseTwo

- by pinaldave

Earlier I wrote a blog post about how to install DQS in SQL Server 2012. Today I decided to write a second part of this series where I explain how to use DQS, however, as soon as I started the DQS client, I encountered an error that will not let me pass through and connect with DQS client. It was a bit strange to me as everything was functioning very well when I left it last time. The error was very big but here are the first few words of it. Cannot connect to server. A .NET Framework error occurred during execution of user-defined routine or aggregate “SetDataQualitySessions”: System.Data.SqlClient.SqlException (0×80131904): A .NET Framework error occurred during execution of user-defined routine or aggregate “SetDataQualitySessionPhaseTwo”: The error continues – here is the quick screenshot of the error. As my initial attempts could not fix the error I decided to search online and I finally received a wonderful solution from Microsoft Site. The error has happened due to latest update I had installed on .NET Framework 4. There was a mismatch between the Module Version IDs (MVIDs) of the SQL Common Language Runtime (SQLCLR) assemblies in the SQL Server 2012 database and the Global Assembly Cache (GAC). This mismatch was to be resolved for the DQS to work properly. The workaround is specified here in detail. Scroll to subtopic 4.23 Some .NET Framework 4 Updates Might Cause DQS to Fail. The script was very much straight forward. Here are the few things to not to miss while applying workaround. Make sure DQS client is properly closed The NETAssemblies is based on your OS. NETAssemblies for 64 bit machine – which is my machine is “c:\windows\Microsoft.NET\Framework64\v4.0.30319″. If you have Winodws installed on any other drive other than c:\windows do not forget to change that in the above path. Additionally if you have 32 bit version installed on c:\windows you should use path as ”c:\windows\Microsoft.NET\Framework\v4.0.30319″ Make sure that you execute the script specified in 4.23 sections in this article in the database DQS_MAIN. Do not run this in the master database as this will not fix your error. Do not forget to restart your SQL Services once above script has been executed. Once you open the client it will work this time. Here is the script which I have bit modified from original script. I strongly suggest that you use original script mentioned 4.23 sections. However, this one is customized my own machine. /* Original source: http://bit.ly/PXX4NE (Technet) Modifications: -- Added Database context -- Added environment variable @NETAssemblies -- Main script modified to use @NETAssemblies */ USE DQS_MAIN GO BEGIN -- Set your environment variable -- assumption - Windows is installed in c:\windows folder DECLARE @NETAssemblies NVARCHAR(200) -- For 64 bit uncomment following line SET @NETAssemblies = 'c:\windows\Microsoft.NET\Framework64\v4.0.30319\' -- For 32 bit uncomment following line -- SET @NETAssemblies = 'c:\windows\Microsoft.NET\Framework\v4.0.30319\' DECLARE @AssemblyName NVARCHAR(200), @RefreshCmd NVARCHAR(200), @ErrMsg NVARCHAR(200) DECLARE ASSEMBLY_CURSOR CURSOR FOR SELECT name AS NAME FROM sys.assemblies WHERE name NOT LIKE '%ssdqs%' AND name NOT LIKE '%microsoft.sqlserver.types%' AND name NOT LIKE '%practices%' AND name NOT LIKE '%office%' AND name NOT LIKE '%stdole%' AND name NOT LIKE '%Microsoft.Vbe.Interop%' OPEN ASSEMBLY_CURSOR FETCH NEXT FROM ASSEMBLY_CURSOR INTO @AssemblyName WHILE @@FETCH_STATUS = 0 BEGIN BEGIN TRY SET @RefreshCmd = 'ALTER ASSEMBLY [' + @AssemblyName + '] FROM ''' + @NETAssemblies + @AssemblyName + '.dll' + ''' WITH PERMISSION_SET = UNSAFE' EXEC sp_executesql @RefreshCmd PRINT 'Successfully upgraded assembly ''' + @AssemblyName + '''' END TRY BEGIN CATCH IF ERROR_NUMBER() != 6285 BEGIN SET @ErrMsg = ERROR_MESSAGE() PRINT 'Failed refreshing assembly ' + @AssemblyName + '. Error message: ' + @ErrMsg END END CATCH FETCH NEXT FROM ASSEMBLY_CURSOR INTO @AssemblyName END CLOSE ASSEMBLY_CURSOR DEALLOCATE ASSEMBLY_CURSOR END GO Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Error Messages, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article

Sumproduct using Django's aggregation

- by Matthew Rankin

Question Is it possible using Django's aggregation capabilities to calculate a sumproduct? Background I am modeling an invoice, which can contain multiple items. The many-to-many relationship between the Invoice and Item models is handled through the InvoiceItem intermediary table. The total amount of the invoice—amount_invoiced—is calculated by summing the product of unit_price and quantity for each item on a given invoice. Below is the code that I'm currently using to accomplish this, but I was wondering if there is a better way to handle this using Django's aggregation capabilities. Current Code class Item(models.Model): item_num = models.SlugField(unique=True) description = models.CharField(blank=True, max_length=100) class InvoiceItem(models.Model): item = models.ForeignKey(Item) invoice = models.ForeignKey('Invoice') unit_price = models.DecimalField(max_digits=10, decimal_places=2) quantity = models.DecimalField(max_digits=10, decimal_places=4) class Invoice(models.Model): invoice_num = models.SlugField(max_length=25) invoice_items = models.ManyToManyField(Item,through='InvoiceItem') def _get_amount_invoiced(self): invoice_items = self.invoiceitem_set.all() amount_invoiced = 0 for invoice_item in invoice_items: amount_invoiced += (invoice_item.unit_price * invoice_item.quantity) return amount_invoiced amount_invoiced = property(_get_amount_invoiced)

Read the article

Count(*) vs Count(1)

- by Nai

Hi, just wondering if any of you guys use Count(1) over Count(*) and if there is a noticeable difference for SQL Server 2005 in performance? Or is this just a legacy habit that has been brought forward from days gone past?

Read the article

Saving a select count(*) value to an integer (SQL Server)

- by larryq

Hi everyone, I'm having some trouble with this statement, owing no doubt to my ignorance of what is returned from this select statement: declare @myInt as INT set @myInt = (select COUNT(*) from myTable as count) if(@myInt <> 0) begin print 'there's something in the table' end There are records in myTable, but when I run the code above the print statement is never run. Further checks show that myInt is in fact zero after the assignment above. I'm sure I'm missing something, but I assumed that a select count would return a scalar that I could use above?

Read the article

MySQL Query GROUP_CONCAT Over Multiple Rows

- by PeteGO

I'm getting name and address data out of generic question / answer data to create some kind of normalised reporting database. The query I've got uses group_concat and works for individual sets of questions but not for multiple sets. I've tried to simplify what I'm doing by using just forename and surname and just 3 records, 2 for 1 person and 1 for another. In reality though there are more than 300,000 records. Example of results with qs.Id = 1. QuestionSetId Forename Surname ------------------------------------------------------- 1 Bob Jones Example of results with qs.Id IN (1, 2, 3). QuestionSetId Forename Surname ------------------------------------------------------- 3 Bob,Bob,Frank Jones,Jones,Smith What I would like to see for qs.Id IN (1, 2, 3). QuestionSetId Forename Surname ------------------------------------------------------- 1 Bob Jones 2 Bob Jones 3 Frank Smith So how can I make the 2nd example return a separate row for each set of name and address information? I realise the current way the data is stored is "questionable" but I cannot change the way the data is stored. I can get sets of individual answers but not sure how to combine the others. My simplified Schema that I cannot change: CREATE TABLE StaticQuestion ( Id INT NOT NULL, StaticText VARCHAR(500) NOT NULL); CREATE TABLE Question ( Id INT NOT NULL, Text VARCHAR(500) NOT NULL); CREATE TABLE StaticQuestionQuestionLink ( Id INT NOT NULL, StaticQuestionId INT NOT NULL, QuestionId INT NOT NULL, DateEffective DATETIME NOT NULL); CREATE TABLE Answer ( Id INT NOT NULL, Text VARCHAR(500) NOT NULL); CREATE TABLE QuestionSet ( Id INT NOT NULL, DateEffective DATETIME NOT NULL); CREATE TABLE QuestionAnswerLink ( Id INT NOT NULL, QuestionSetId INT NOT NULL, QuestionId INT NOT NULL, AnswerId INT NOT NULL, StaticQuestionId INT NOT NULL); Some example data for only forename and surname. INSERT INTO StaticQuestion (Id, StaticText) VALUES (1, 'FirstName'), (2, 'LastName'); INSERT INTO Question (Id, Text) VALUES (1, 'What is your first name?'), (2, 'What is your forename?'), (3, 'What is your Surname?'); INSERT INTO StaticQuestionQuestionLink (Id, StaticQuestionId, QuestionId, DateEffective) VALUES (1, 1, 1, '2001-01-01'), (2, 1, 2, '2008-08-08'), (3, 2, 3, '2001-01-01'); INSERT INTO Answer (Id, Text) VALUES (1, 'Bob'), (2, 'Jones'), (3, 'Bob'), (4, 'Jones'), (5, 'Frank'), (6, 'Smith'); INSERT INTO QuestionSet (Id, DateEffective) VALUES (1, '2002-03-25'), (2, '2009-05-05'), (3, '2009-08-06'); INSERT INTO QuestionAnswerLink (Id, QuestionSetId, QuestionId, AnswerId, StaticQuestionId) VALUES (1, 1, 1, 1, 1), (2, 1, 3, 2, 2), (3, 2, 2, 3, 1), (4, 2, 3, 4, 2), (5, 3, 2, 5, 1), (6, 3, 3, 6, 2); Just in case SQLFiddle is down here are the 3 queries from the examples I've linked to: 1: - working query but only on 1 set of data. SELECT MAX(QuestionSetId) AS QuestionSetId, GROUP_CONCAT(Forename) AS Forename, GROUP_CONCAT(Surname) AS Surname FROM (SELECT x.QuestionSetId, CASE x.StaticQuestionId WHEN 1 THEN Text END AS Forename, CASE x.StaticQuestionId WHEN 2 THEN Text END AS Surname FROM (SELECT (SELECT link.StaticQuestionId FROM StaticQuestionQuestionLink link WHERE link.Id = qa.QuestionId AND link.DateEffective <= qs.DateEffective AND link.StaticQuestionId IN (1, 2) ORDER BY link.DateEffective DESC LIMIT 1) AS StaticQuestionId, a.Text, qa.QuestionSetId FROM QuestionSet qs INNER JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId INNER JOIN Answer a ON qa.AnswerId = a.Id WHERE qs.Id IN (1)) x) y 2: - working query but undesired results on multiple sets of data. SELECT MAX(QuestionSetId) AS QuestionSetId, GROUP_CONCAT(Forename) AS Forename, GROUP_CONCAT(Surname) AS Surname FROM (SELECT x.QuestionSetId, CASE x.StaticQuestionId WHEN 1 THEN Text END AS Forename, CASE x.StaticQuestionId WHEN 2 THEN Text END AS Surname FROM (SELECT (SELECT link.StaticQuestionId FROM StaticQuestionQuestionLink link WHERE link.Id = qa.QuestionId AND link.DateEffective <= qs.DateEffective AND link.StaticQuestionId IN (1, 2) ORDER BY link.DateEffective DESC LIMIT 1) AS StaticQuestionId, a.Text, qa.QuestionSetId FROM QuestionSet qs INNER JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId INNER JOIN Answer a ON qa.AnswerId = a.Id WHERE qs.Id IN (1, 2, 3)) x) y 3: - working query on multiple sets of data only on 1 field (answer) though. SELECT qs.Id AS QuestionSet, a.Text AS Answer FROM QuestionSet qs INNER JOIN QuestionAnswerLink qalink ON qs.Id = qalink.QuestionSetId INNER JOIN StaticQuestionQuestionLink sqqlink ON qalink.QuestionId = sqqlink.QuestionId INNER JOIN Answer a ON qalink.AnswerId = a.Id WHERE sqqlink.StaticQuestionId = 1 /* FirstName */ AND sqqlink.DateEffective = (SELECT DateEffective FROM StaticQuestionQuestionLink WHERE StaticQuestionId = 1 AND DateEffective <= qs.DateEffective ORDER BY DateEffective DESC LIMIT 1)

Read the article

Concatenation of fields in different rows

- by spender

I'm stuck on an aggregation problem that I can't get to the bottom of. I have some data which is best summarized as follows id |phraseId|seqNum|word ========================= 1 |1 |1 |hello 2 |1 |2 |world 3 |2 |1 |black 4 |2 |2 |and 5 |2 |3 |white I'd like a query that gives back the following data: phraseId|completePhrase ======================== 1 |hello world 2 |black and white Anyone?

Read the article

Java Meta Search Engine API

- by Loki

I'm currently researching Java libraries to help in building a meta type search engine in the sense of being able to replace any given search engine in the back-end of the application or to simultaneously search using multiple search engines. I'm not interested in the GUI part here, just the generalization of search engine APIs and usage. I'd like to know about the common libraries used to achieve this task and if there are any common patterns used in this case. I imagined that this problem is common enough to be able to find plenty of stuff on Google, but it seems like search is a very proprietary domain and not much information is fed back to the community.

Read the article

HOW-TO Include movie showtimes in Google Movies

- by Ofir

I am making the website of a movie theater, and I want Google Movies to index the showtimes I publish on the website. Does anyone know if there's a way to do this?, maybe using microformats? or exposing this information using RSS or something similar?. Thx

Read the article

Django and conditional aggregates

- by piquadrat

I have two models, authors and articles: class Author(models.Model): name = models.CharField('name', max_length=100) class Article(models.Model) title = models.CharField('title', max_length=100) pubdate = models.DateTimeField('publication date') authors = models.ManyToManyField(Author) Now I want to select all authors and annotate them with their respective article count. That's a piece of cake with Django's aggregates. Problem is, it should only count the articles that are already published. According to ticket 11305 in the Django ticket tracker, this is not yet possible. I tried to use the CountIf annotation mentioned in that ticket, but it doesn't quote the datetime string and doesn't make all the joins it would need. So, what's the best solution, other than writing custom SQL?

Read the article

SQL Server: collect values in an aggregation temporarily and reuse in the same query

- by Erwin Brandstetter

How do I accumulate values in t-SQL? AFAIK there is no ARRAY type. I want to reuse the values like demonstrated in this PostgreSQL example using array_agg(). SELECT a[1] || a[i] AS foo ,a[2] || a[5] AS bar -- assuming we have >= 5 rows for simplicity FROM ( SELECT array_agg(text_col ORDER BY text_col) AS a ,count(*)::int4 AS i FROM tbl WHERE id between 10 AND 100 ) x How would I best solve this with t-SQL? Best I could come up with are two CTE and subselects: ;WITH x AS ( SELECT row_number() OVER (ORDER BY name) AS rn ,name AS a FROM #t WHERE id between 10 AND 100 ), i AS ( SELECT count(*) AS i FROM x ) SELECT (SELECT a FROM x WHERE rn = 1) + (SELECT a FROM x WHERE rn = i) AS foo ,(SELECT a FROM x WHERE rn = 2) + (SELECT a FROM x WHERE rn = 5) AS bar FROM i Test setup: CREATE TABLE #t( id INT PRIMARY KEY ,name NVARCHAR(100)) INSERT INTO #t VALUES (3 , 'John') ,(5 , 'Mary') ,(8 , 'Michael') ,(13, 'Steve') ,(21, 'Jack') ,(34, 'Pete') ,(57, 'Ami') ,(88, 'Bob') Is there a simpler way?

Read the article

MYSQL fetch 10 posts, each w/ vote count, sorted by vote count, limited by where clause on posts

- by nibblebot

I want to fetch a set of Posts w/ vote count listed, sorted by vote count (e.g.) Post 1 - Post Body blah blah - Votes: 500 Post 2 - Post Body blah blah - Votes: 400 Post 3 - Post Body blah blah - Votes: 300 Post 4 - Post Body blah blah - Votes: 200 I have 2 tables: Posts - columns - id, body, is_hidden Votes - columns - id, post_id, vote_type_id Here is the query I've tried: SELECT p.*, v.yes_count FROM posts p LEFT JOIN (SELECT post_id, vote_type_id, COUNT(1) AS yes_count FROM votes WHERE (vote_type_id = 1) GROUP BY post_id ORDER BY yes_count DESC LIMIT 0, 10) v ON v.post_id = p.id WHERE (p.is_hidden = 0) ORDER BY yes_count DESC LIMIT 0, 10 Correctness: The above query almost works. The subselect is including votes for posts that have is_hidden = 1, so when I left join it to posts, if a hidden post is in the top 10 (ranked by votes), I can end up with records with NULL on the yes_count field. Performance: I have ~50k posts and ~500k votes. On my dev machine, the above query is running in .4sec. I'd like to stay at or below this execution time. Indexes: I have an index on the Votes table that covers the fields: vote_type_id and post_id EXPLAIN id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY p ALL NULL NULL NULL NULL 45985 Using where; Using temporary; Using filesort 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 10 2 DERIVED votes ref VotingPost VotingPost 4 319881 Using where; Using index; Using temporary; Using filesort

Read the article

How to find the latest row for each group of data

- by Jason

Hi All, I have a tricky problem that I'm trying to find the most effective method to solve. Here's a simplified version of my View structure. Table: Audits AuditID | PublicationID | AuditEndDate | AuditStartDate 1 | 3 | 13/05/2010 | 01/01/2010 2 | 1 | 31/12/2009 | 01/10/2009 3 | 3 | 31/03/2010 | 01/01/2010 4 | 3 | 31/12/2009 | 01/10/2009 5 | 2 | 31/03/2010 | 01/01/2010 6 | 2 | 31/12/2009 | 01/10/2009 7 | 1 | 30/09/2009 | 01/01/2009 There's 3 query's that I need from this. I need to one to get all the data. The next to get only the history data (that is, everything but exclude the latest data item by AuditEndDate) and then the last query is to obtain the latest data item (by AuditEndDate). There's an added layer of complexity that I have a date restriction (This is on a per user/group basis) where certain user groups can only see between certain dates. You'll notice this in the where clause as AuditEndDate<=blah and AuditStartDate=blah Foreach publication, select all the data available. select * from Audits Where auditEndDate<='31/03/10' and AuditStartDate='06/06/2009'; foreach publication, select all the data but Exclude the latest data available (by AuditEndDate) select * from Audits left join (select AuditId as aid, publicationID as pid and max(auditEndDate) as pend from Audit where auditenddate <= '31/03/2009' /* user restrict / group by pid) Ax on Ax.pid=Audit.pubid where pend!=Audits.auditenddate AND auditEndDate<='31/03/10' and AuditStartDate='06/06/2009' / user restrict */ Foreach publication, select only the latest data available (by AuditEndDate) select * from Audits left join (select AuditId as aid, publicationID as pid and max(auditEndDate) as pend from Audit where auditenddate <= '31/03/2009'/* user restrict / group by pid) Ax on Ax.pid=Audit.pubid where pend=Audits.auditenddate AND auditEndDate<='31/03/10' and AuditStartDate='06/06/2009' / user restrict */ So at the moment, query 1 and 3 work fine, but query 2 just returns all the data instead of the restriction. Can anyone help me? Thanks jason

Read the article

SQL GROUP BY - also SELECT not-constant columns

- by Michal Kosek

can someone please help me with this query? I have 2 tables in my database: log_visitors: -------------------- id | host_id log_access: -------------------- visitor | document | timestamp "log_access.visitor" links to "log_visitors.id" Currently, I'm using this query: SELECT log_visitors.host_id , MIM(log_access.timestamp) AS min_timestamp FROM log_access INNER JOIN log_visitors ON (log_access.visitor = log_visitors.id) GROUP BY log_visitors.host_id; to get "MIN(timestamp)" for each "host_id" in the database. HERE'S MY QUESTION: I also need to get "document" for that access with that timestamp... I can't simply add "log_access.document" into SELECT list, since it's not constant and I am not grouping by document... Any ideas?

Read the article

SQL to get distinct statistics

- by Sung Kim

Hi, Suppose I have data in table X: id assign team ---------------------- 1 hunkim A 1 ygg A 2 hun B 2 gw B 2 david B 3 haha A I want to know how many assigns for each id. I can get using: select id, count(distinct assign) from X group by id order by count(distinct assign)desc; It will give me something: 1 2 2 3 3 1 My question is how can I get the average of the all assign counts? In addition, now I want to know the everage per team. So I want to get something like: team assign_avg ------------------- A 1.5 B 3 Thanks in advance!

Read the article

SQL - How to join on similar (not exact) columns

- by BlueRaja

I have two tables which get updated at almost the exact same time - I need to join on the datetime column. I've tried this: SELECT * FROM A, B WHERE ABS(DATEDIFF(second, A.Date_Time, B.Date_Time) = ( SELECT MIN(ABS(DATEDIFF(second, A.Date_Time, B2.Date_Time))) FROM B AS B2 ) But it tells me: Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression. How can I join these tables?

Read the article

dataframe of averagetemperatures with Years in rows and months in columns

- by Maxwell Mkondiwa

I have data with variables minimum and maximum temperatures, month and years(1951-2001). I want to get a dataframe of average temperatures for each month in each year. I want the dataframe to look like this: Year jan feb mar apr may june..... 1951 xx xx 1952 xx 1953 . .

Read the article

composed_of in Rails - when to use it?

- by fig-gnuton

When should you use ActiveRecord's composed_of class method?

Read the article

Testing for existence using SELECT WHERE HAVING and NOT HAVING in a grouped subset

- by IanC

I have data on which I need to count +1 if a particular condition exists or another condition doesn't exist. I'm using SQL Server 2008. I shred the following simplified sample XML into a temp table and validate it: <product type="1"> <param type="1"> <item mode="0" weight="1" /> </param> <param type="2"> <item mode="1" weight="1" /> <item mode="0" weight="0.1" /> </param> <param type="3"> <item mode="1" weight="0.75" /> <item mode="1" weight="0.25" /> </param> </product> The validation in concern is the following rule: For each product type, for each param type, mode may be 0 & (1 || 2). In other words, there may be a 0(s), but then 1s or 2s are required, or there may be only 1(s) or 2(s). There cannot be only 0s, and there cannot be 1s and 2s. The only part I haven't figured out is how to detect if there are only 0s. This seems like a "not having" problem. The validation code (for this part): WITH t1 AS ( SELECT SUM(t.ParamWeight) AS S, COUNT(1) AS C, t.ProductTypeID, t.ParamTypeID, t.Mode FROM @t AS t GROUP BY t.ProductTypeID, t.ParamTypeID, t.Mode ), ... UNION ALL SELECT TOP (1) 1 -- only mode 0 & (1 || 2) is allowed FROM t1 WHERE t1.Mode IN (1, 2) GROUP BY t1.ProductTypeID, t1.ParamTypeID HAVING COUNT(1) > 1 UNION ALL ... ) SELECT @C = COUNT(1) FROM t2 This will show if any mode 1s & 2s are mixed, but not if the group contains only a 0. I'm sure there is a simple solution, but it's evading me right now. EDIT: I thought of a "cheat" that works perfectly. I added the following to the above: SELECT TOP (1) 1 -- only mode 0 & (null || 1 || 2) is allowed FROM t1 GROUP BY t1.ProductTypeID, t1.ParamTypeID HAVING SUM(t1.Mode) = 0 However, I'd still like to know how to do this without cheating.

Read the article

How to avoid timestamp issue in a long query?

- by pingi

Hi, I have the following 2 tables: items: id int primary key bla text events: id_items int num int when timestamp without time zone ble text composite primary key: id_items, num and want to select to each item the most recent event (the newest 'when'). I wrote an request, but I don't know if it could be written more efficiently. Also on PostgreSQL there is a issue with comparing Timestamp objects: 2010-05-08T10:00:00.123 == 2010-05-08T10:00:00.321 so I select with 'MAX(num)' Any thoughts how to make it better? Thanks. SELECT i.*, ea.* FROM items AS i JOIN ( SELECT t.s AS t_s, t.c AS t_c, max(e.num) AS o FROM events AS e JOIN ( SELECT DISTINCT id_item AS s, MAX(when) AS c FROM events GROUP BY s ORDER BY c ) AS t ON t.s = e.id_item AND e.when = t.c GROUP BY t.s, t.c ) AS tt ON tt.t_s = i.id JOIN events AS ea ON ea.id_item = tt.t_s AND ea.cas = tt.t_c AND ea.num = tt.o;

Read the article

How to get top 3 frequencies in MySQL?

- by Amenhotep

Hello, In MySQL I have a table called "meanings" with three columns: "person" (int), "word" (byte, 16 possible values) "meaning" (byte, 26 possible values). A person assigns one or more meanings to each word: person word meaning ------------------- 1 1 4 1 2 19 1 2 7 <-- second meaning for word 2 1 3 5 ... 1 16 2 Then another person, and so on. There will be thousands of persons. I need to find for each of the 16 words the top three meanings (with their frequencies). Something like: word 1: meaning 5 (35% of people), meaning 19 (22% of people), meaning 2 (13% of people) word 2: meaning 8 (57%), meaning 1 (18%), meaning 22 (7%) ... Is it possible to solve this with a single MySQL query? (If this problem is a classic one and has been answered elsewhere, I would appreciate if you could give me a link to the solution.) Thank you very much, ve

Search Results

Search found 754 results on 31 pages for 'aggregate'.

Page 7/31 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >

- by Shaul

- by Urosh

- by Ben

- by Andrew LeClair

- by David Allan

- by pinaldave

- by Matthew Rankin

- by Nai

- by larryq

- by PeteGO

- by spender

- by Loki

- by Ofir

- by piquadrat

- by Erwin Brandstetter

- by nibblebot

- by Jason

- by Michal Kosek

- by Sung Kim

- by BlueRaja

- by Maxwell Mkondiwa

- by fig-gnuton

- by IanC

- by pingi

- by Amenhotep

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >