Expanding on the conversation from this blog written by Arpita Ghosh, How to Differ CALCULATETABLE over FILTER in DAX, I wanted to clearly layout the different combinations you can come across, particularly when used with table variables.
Here I demonstrate 6 ways to look at these:
CalculateTable() with AddColumns() Inside
CalculateTable() with AddColumns() Outside
CalculateTable() with AddColumns() Outside with Calculate()
Filter() with AddColumns() Inside
Filter() with AddColumns() Outside
Filter() with AddColumns() Outside with Calculate()
The Alien Says... Do not use #2 or #3. All other methods can be correct depending on your situation, but #2 and #3 just need to be avoided. They return incorrect results as we can see below.
- Synopsis -
Method | Results in... | Use or Lose |
1) CalculateTable with AddColumns Inside | Rows filtered. Values filtered. | Use it |
2) CalculateTable with AddColumns Outside | Rows unfiltered. Values unfiltered. | Lose it |
3) CalculateTable with AddColumns Outside with Calculate | Rows unfiltered. Values filtered. | Lose it |
4) Filter with AddColumns Inside | Rows filtered. Values unfiltered. | Use it |
5) Filter with AddColumns Outside | Rows filtered. Values unfiltered. | Use it |
6) Filter with AddColumns Outside with Calculate | Rows filtered. Values filtered. | Use it |
Filter - Always returns rows filtered. Make sure to wrap CALCULATE() over any calculations you are doing within the table being filtered on to ensure context transition.
CalculateTable - has several implications:
If you pass in a table as a variable, you need a CALCULATE() for any calculations on that table to ensure context transition.
Any filters used in CalculateTable WILL NOT flow through to a table passed in as a variable. This is because table variables are constants, and as constants, cannot be changed.
Any filters used in CalculateTable WILL flow through to a table directly used in CalculateTable, and Filters will also apply to any calculations (you do not need to wrap it in a CALCULATE()).
See the bottom of the article for an in-depth explanation of why FILTER and CALCULATE return different things.
Here's a diagram to line things up.
What is the difference between FILTER and CALCULATETABLE?
Still confused? Take a look at the examples below to clear things up.
1) CALCULATETABLE() with ADDCOLUMNS() Inside
VAR _CTable =
CALCULATETABLE(
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", COUNTROWS(Products)
),
Products[ColorName] = "Blue"
)
RETURN
_CTable
Returns the following:
ColorName | @countofrows |
Blue | 200 |
2) CALCULATETABLE with ADDCOLUMNS Outside as a Variable (do not use this method!)
VAR _MyTable =
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", COUNTROWS(Products)
)
VAR _CTable =
CALCULATETABLE(
_MyTable,
Products[ColorName] = "Blue"
)
RETURN
_CTable
Returns the following:
ColorName | @countofrows |
Silver | 2517 |
Blue | 2517 |
White | 2517 |
Red | 2517 |
3) CALCULATETABLE with ADDCOLUMNS Outside as a Variable Using Calculate() (do not use this method!)
VAR _MyTable =
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", CALCULATE(COUNTROWS(Products))
)
VAR _CTable =
CALCULATETABLE(
_MyTable,
Products[ColorName] = "Blue"
)
RETURN
_CTable
Returns the following:
ColorName | @countofrows |
Silver | 417 |
Blue | 200 |
White | 505 |
Red | 99 |
4) FILTER with ADDCOLUMNS Inside
VAR _FTable =
FILTER(
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", COUNTROWS(Products)
),
Products[ColorName] = "Blue"
)
RETURN
_FTable
Returns the following:
ColorName | @countofrows |
Blue | 2517 |
5) FILTER with ADDCOLUMNS Outside as a Variable
VAR _MyTable =
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", COUNTROWS(Products)
)
VAR _FTable =
FILTER(
_MyTable,
Products[ColorName] = "Blue"
)
RETURN
_FTable
Returns the following:
ColorName | @countofrows |
Blue | 2517 |
5) FILTER with ADDCOLUMNS Outside as a Variable Using CALCULATE()
VAR _MyTable =
ADDCOLUMNS(
VALUES(Products[ColorName]),
"@countofrows", CALCULATE(COUNTROWS(Products))
)
VAR _FTable =
FILTER(
_MyTable,
Products[ColorName] = "Blue"
)
RETURN
_FTable
Returns the following:
ColorName | @countofrows |
Blue | 200 |
Why does FILTER() properly filter a table variable, whereas CALCULATETABLE() does not?
The first thing you need to understand is that CALCULATETABLE is fundamentally different than FILTER. CALCUALTETABLE leverages the storage engine to modify the filter context. FILTER uses the formula engine to evaluate, row by row, a filter condition. Consider the following:
CALCULATETABLE(
VALUES( Products[Product] ),
Products[ColorName] = "Blue"
)
CALCULATETABLE first modifies the filter context of the Products[ColorName] column to where ColorName is "Blue", THEN evaluates and returns the table (the order of execution is backwards). Furthermore, the table Products cannot be a variable. Consider the following - this code does not work:
VAR _MyProducts = VALUES( Products[Product] )
RETURN
CALCULATETABLE(
_MyProducts ,
Products[ColorName] = "Blue"
)
It does not work because Products[Product] has already been evaluated outside of CALCULATETABLE and saved as a variable. CALCULATETABLE cannot modify the filter context over a variable (variables are constants), so the resulting table contains all rows, unfiltered.
FITLER, on the other hand, does not work by modifying the filter context. It logically scans the table row by row for the condition you give it. Consider the following:
FILTER(
VALUES( Products[Product] ),
Products[ColorName] = "Blue"
)
This returns the Products column filtered for Blue products - same as CALCULATETABLE. Now consider the following:
VAR _MyProducts = VALUES( Products[Product] )
RETURN
FILTER(
_MyProducts ,
Products[ColorName] = "Blue"
)
Because FILTER does not operate by modifying the filter context, it can scan the table (even as a variable) and filter it with no problem. So the result is a table filtered for blue products.
Should I use CALCULATETABLE or FILTER?
Given the option, CALCULATETABLE is generally faster and more efficient than FILTER. The reason is because CALCULATETABLE leverages the storage engine to modify the filter context and apply the filter condition. FILTER on the other hand, use the formula engine to go row by row to apply its filtering conditions, which tend to be slower.
The one you should use depends on the situation. Indeed, FILTER is invoked by CALCULATE at the end of the day. Consider the following:
CALCULATE(
[Sales Amount],
Products[ColorName] = "Blue"
)
This is internally translated to:
CALCULATE(
[Sales Amount],
FILTER(
ALL( Products[ColorName]),
Products[ColorName] = "Blue"
)
)
Why doesn't the engine use CALCUALTETABLE here? Well, CALCULATE is already telling the engine to modify the filter context. It wouldn't make sense to use anything other than FILTER. Technically, you could force it to use CALCULATETABLE, but again, the filter within CALCULATETABLE would then (again) be translated to use FILTER.
In other words, CALCULATE/CALCULATETABLE tell the engine, "hey, use the storage engine to apply this FILTER condition before I make this calculation".
Conclusion
To be clear, I'm not claiming that FILTER is irrelevant; it is very important and DAX uses it in the most essential way; Indeed, FILTER is required for the language to work. This article is to demonstrate how FILTER and CALCULATETABLE are similar, but fundamentally different at the same time. The appropriate function to use is situational, but as long as you understand what you are doing, you should be fine.
Hopefully this clears up some of the differences between CalculateTable and Filter. Cheers!
Further readings
If you're interested in more optimization techniques, check out my other blog post here or check out SQLBI's book Optimizing DAX on Amazon.