I recently encountered a problem (again) where I have a query for a report that has two different levels of aggregations of numbers. I know, tisk..tisk..tisk, but in my world this happens with a fairly high frequency. Dealing with these different levels of aggregations in a report is a real problem, with no good solution within SSRS, at least that I have found. But, since this is something I bump into from time to time, I thought I should figure out a way to deal with it that I can use over and over.
SSRS Datasets with different levels of aggregation
Using the AdventureWorks database, let me begin by clarifying the problem. Using a simple query designed to return order information of SubTotal, TaxAmt, Freight, and TotalDue, and also return order detail information of ProductID and ProductLineTotal. Here is the query and the result set.
As you can see the order level numbers for SubTotal, TaxAmt, etc. repeat their values for each detail row. Pretty basic SQL, and a pretty basic result set. When you pull this kind of query and result set into an SSRS report it can get pretty sloppy. With the product information as the detail records and grouping on the sales order the standard aggregations for the order level don’t really work, but the detail aggregations are fine.
Your order level numbers are multiplied by the number of rows of detail. Not really surprising, but definitely a problem.
Now, at this level it would be easy to change the order level numbers in 2 ways. Either use AVG instead of SUM, or simply drop the aggregation of those numbers all together. In the sample below, I dropped the aggregation.
Then the order level numbers are correct, and obviously the sum of the product detail still works fine.
The real problem occurs when you want to add totals at the bottom of your report. You can’t sum the numbers because the same thing will happen with the totals for the report as happened for the totals for the order.
I have tried all sorts of variations to SSRS expressions to get the correct numbers here: Averages, Sums divided by Counts, Sum by ReportItem rather than field, RunningValue. None will work reliably, and most return some sort of error like this:
SSRS Datasets with different levels of aggregation can be resolved by using Row_Number() when generating the dataset.
I have searched high and low for a solution within the confines of SSRS. So far I have not been able to come up with a solution. Usually when I can’t find a solution to a reporting problem, I go back to the data believing that there must be a way to change the dataset so that I can get numbers I can use on the report and have everything come out correctly. Looking back at my first image, there are really only 2 conclusions I could come to: First is to somehow average the order level numbers across the detail so that when they were summed the number was correct. Second is to return only one row of data at the order level and return null or zero (definitely zero) for the rest of the rows corresponding to the product detail. Looking back at my original query, I started by adding only one row with a Row_Number().
WITH cteOrderInfo AS ( SELECT soh.SalesOrderID ,soh.[Status] ,soh.OnlineOrderFlag ,soh.SalesOrderNumber ,soh.AccountNumber ,soh.CustomerID ,soh.SubTotal ,soh.TaxAmt ,soh.Freight ,soh.TotalDue ,sod.ProductID ,sod.LineTotal ,ROW_NUMBER() OVER(PARTITION BY soh.SalesOrderID ORDER BY soh.SalesOrderID) AS [RowNumber] FROM Sales.SalesOrderHeader AS soh INNER JOIN Sales.SalesOrderDetail AS sod ON soh.SalesOrderID = sod.SalesOrderID WHERE soh.SalesOrderID > 75000 ) SELECT SalesOrderID ,[Status] ,OnlineOrderFlag ,SalesOrderNumber ,AccountNumber ,CustomerID ,CASE WHEN RowNumber = 1 THEN SubTotal ELSE 0 END AS SubTotal ,CASE WHEN RowNumber = 1 THEN TaxAmt ELSE 0 END AS TaxAmt ,CASE WHEN RowNumber = 1 THEN Freight ELSE 0 END AS Freight ,CASE WHEN RowNumber = 1 THEN TotalDue ELSE 0 END AS TotalDue ,ProductID ,LineTotal FROM cteOrderInfo
Then I made my original query into a Common Table Expression (CTE). Using the CTE as the basis for my final query I used a simple CASE statement to evaluate if the RowNumber was equal to one, if yes then return the value, if no then return zero.
Without getting into a long explanation of how ROW_NUMBER() works, there are plenty of blogs on this topic, here is what I did. In the OVER section I have PARTITION BY and ORDER BY. The ORDER BY orders the set to be given row numbers by the SalesOrderID. If I ran the query with just the ORDER BY, I would get a result set with a sequence of numbers numbering 1 to the number of total rows returned in the result set. The PARTITION BY creates a break and restart in the sequence so that I get row numbers from 1 to the number of lines for each order, and then the process repeats over again for the next order, and so on.
Taking my CTE with row numbers, and a Case statement to limit order level values to only the first row gives me a result set that looks like this:
With a result set like this I can now go back to my report design, and revert to the original design where all order level information is aggregated with a SUM, and the report totals are also aggregated with a sum.
More importantly, the report totals now come out with the correct numbers.
No more errors about aggregating report items incorrectly. No more errors about nesting or somehow illegally combining aggregations. Nothing but good numbers can be found on the report.
I hope you find this information useful. Thanks for reading.