
How to Optimize DAX in Power BI

Updated: Jul 22


Optimizing your DAX code can be a challenging (even monumental) task. Not only is it frustrating to wait for measures to finish running, but poorly performing measures also end up costing you money (if you're a business that runs on the Power BI service).


My measure is slow! How do I make my measure run faster? What is the best way to optimize DAX?


Optimizing DAX by Alberto Ferrari and Marco Russo

I recently read Optimizing DAX by Alberto Ferrari and Marco Russo (AKA the DAX gods of our time). Let me start by saying that this book is a treasure trove of information. It's the gift that keeps on giving. Seriously, if you have an interest in learning more DAX, you will not be disappointed. With each page turn comes a surprising fact that changes my understanding of what is going on under the hood.


To quote the very first sentence in the book, "The amount of knowledge required to optimize any measure is massive." Indeed, the more I read, the more I realized how little I know. However, there are some general techniques to try out.







In this post, I've laid out five general ways you can optimize your DAX code to run as efficiently as possible. Keep in mind, however, that performance varies greatly across machines and models. If you have DAX Studio, looking at the server timings and query plans will be very useful as well.


  1. Remove context transition within iterations

  2. Avoid time intelligence functions

  3. Push logic away

  4. Create calculated columns for very complex filters

  5. Use IF.EAGER()



(1) Remove Context Transition Within Iterations

Whenever you reference another measure, you are invoking context transition (an implicit CALCULATE() is placed over the measure). While this is fine in most scenarios, it becomes a huge bottleneck when used within a large iteration (>5000 rows).


BAD

DEFINE MEASURE
Sales[Adjusted Sales]=
	SUMX(
		Sales,
		[Sales Amount] * 1.02
	)

Why it's bad: the [Sales Amount] measure is unnecessarily invoking context transition. We can replace this with the columns available in the row context.


OPTIMIZED

DEFINE MEASURE
Sales[Adjusted Sales]=
	SUMX(
		Sales,
		Sales[Net Price] * Sales[Quantity] * 1.02
	)

Why it's better: we no longer invoke context transition for each row of sales.



(2) Avoid Time Intelligence Functions

Time intelligence is convenient, but behind the scenes it requires materializing each date within the filter context. Writing your own custom logic to define date ranges in your measure can generally improve performance.


REGULAR

DEFINE MEASURE
Sales[Sales Previous Month]=
	CALCULATE(
		[Sales Amount],
		PREVIOUSMONTH( 'Date'[Date] )
	)

Why it's bad: time intelligence functions materialize a row for each date in the filter context. This can sometimes be unnecessary.

OPTIMIZED

DEFINE MEASURE
Sales[Sales Previous Month]=
	VAR CurrentMonthIndex = MAX( 'Date'[Year Month Index] )
	RETURN
		CALCULATE(
			[Sales Amount],
			'Date'[Year Month Index] = CurrentMonthIndex - 1
		)

Why it's better: subtracting 1 from a [Year Month Index] column produces a much simpler query plan, operating at the month level (one row per month) instead of the day level.
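
For reference, the [Year Month Index] column used above is just a sequential integer over months, stored as a calculated column so it's computed at refresh time. A minimal sketch, assuming your Date table has [Year] and [Month Number] columns (hypothetical names):

// Calculated column on the Date table, computed at refresh time
Year Month Index =
	'Date'[Year] * 12 + 'Date'[Month Number]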



(3) Push logic away

Conditional logic like IF and SWITCH cannot be evaluated by the storage engine (which is very fast) and is handled instead by the formula engine (which is slow). Rewriting your IF condition as a standard filter within CALCULATE can have a dramatic effect on performance. If you're unable to do this, "pushing" the logic to the outermost iteration is a best practice (see the sketch at the end of this section).


BAD

DEFINE MEASURE
Sales[Big Quantity Sales]=
	SUMX(
		Sales,
		IF ( Sales[Quantity] > 3, Sales[Net Price] * Sales[Quantity])
	)

Why it's bad: this logic invokes a callback from the storage engine to the formula engine for each row in Sales.

OPTIMIZED

DEFINE MEASURE
Sales[Big Quantity Sales]=
	CALCULATE(
		[Sales Amount],
		Sales[Quantity] > 3
	)

Why it's better: changing the logic to a filter condition allows the storage engine to pull rows accordingly, and thus no callback is needed.
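
When the filter rewrite isn't possible, one way to "push the logic outward" is to evaluate the condition once per distinct value rather than once per row. Here is a sketch of the same measure: iterating VALUES( Sales[Quantity] ) means the IF (and the context transition inside it) runs once per distinct quantity instead of once per sales row.

DEFINE MEASURE
Sales[Big Quantity Sales]=
	SUMX(
		VALUES( Sales[Quantity] ), // one row per distinct quantity
		IF(
			Sales[Quantity] > 3,
			// context transition happens here, but only over a handful of values
			CALCULATE( SUM( Sales[Net Price] ) ) * Sales[Quantity]
		)
	)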



(4) Create a calculated column for very complex filters

If your logic is very complex (uses multiple columns, AND, OR, >, <, etc.), it may be better to create a calculated column that handles the logic at refresh time instead of at query time. Your model size will increase, but that is usually a small price to pay for better query performance.


BAD

DEFINE MEASURE
Sales[Targeted Customer Sales]=
	CALCULATE(
		[Sales Amount],
		FILTER(
			ALL( Customer ),
			FIND( "500", Customer[Phone], 1, -1 ) <> -1
				&& Customer[CustomerType] = "Person"
				&& Customer[YearlyIncome] > 50000
				&& Customer[DateFirstPurchase] >= DATE( 2003, 1, 1 )
		)
	)

Why it's bad: combining multiple columns with complex logic in a filter will typically involve callbacks and potentially a large materialization.

OPTIMIZED

DEFINE MEASURE
Sales[Targeted Customer Sales]=
	CALCULATE(
		[Sales Amount],
		Customer[Targeted Customer] = 1 //Logic pushed to customer table
	)

Why it's better: relying on a single, pre-computed column filter results in a much simpler query plan.
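
For completeness, the pre-computed [Targeted Customer] column might look something like this. It's only a sketch, but it encodes the same logic as the original filter, evaluated once per customer at refresh time:

// Calculated column on the Customer table, computed at refresh time
Targeted Customer =
	INT(
		FIND( "500", Customer[Phone], 1, -1 ) <> -1
			&& Customer[CustomerType] = "Person"
			&& Customer[YearlyIncome] > 50000
			&& Customer[DateFirstPurchase] >= DATE( 2003, 1, 1 )
	)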



(5) Use IF.EAGER()

This tip may or may not improve speed. As always, it depends on your model and the columns you are aggregating by. That being said, if you're testing condition A and returning either B or C, you can try IF.EAGER() instead of the regular IF(). This is particularly helpful if you are slicing and dicing by many columns in your visuals (or by a high-cardinality column).


REGULAR

DEFINE MEASURE
Sales[Average Product Sales]=
	AVERAGEX(
		VALUES ( Product ),
		IF ( [Condition A], [United States Sales], [European Sales] )
	)

Why it's (potentially) bad: this logic results in multiple storage engine queries. Depending on the complexity of the measures in each branch and the distribution of the data, this query plan can result in slow performance.


OPTIMIZED (potentially)

DEFINE MEASURE
Sales[Average Product Sales]=
	AVERAGEX(
		VALUES ( Product ),
		IF.EAGER ( [Condition A], [United States Sales], [European Sales] )
	)

Why it's (potentially) optimized: this logic results in a single storage engine request, with the condition measure and each branch's measure computed at the same time. While it does more work up front, the subsequent logic becomes less work.



Other techniques

Of course, there are many other ways to optimize your DAX. Here are a few of those:

  • Filter columns, not tables (sketched after this list).

  • Aggregate your data to a larger grain.

  • Avoid many-to-many relationships.

  • Avoid filtering on large cardinality columns.

  • Invert any nested iterations.

  • Use SUMX over DISTINCT instead of DISTINCTCOUNT (sketched after this list).
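
To sketch the first and last bullets (Sales[CustomerKey] is a hypothetical column here; benchmark both forms on your own model):

// Filter columns, not tables: a table filter materializes every column,
// while a column filter keeps the materialization small.
// Slower: FILTER( Sales, Sales[Quantity] > 3 )
// Faster: FILTER( ALL( Sales[Quantity] ), Sales[Quantity] > 3 )

// SUMX over DISTINCT instead of DISTINCTCOUNT:
// DISTINCTCOUNT( Sales[CustomerKey] )
// SUMX( DISTINCT( Sales[CustomerKey] ), 1 )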



I only scratched the surface of the techniques described in this article. To really understand WHY they help performance requires understanding the formula engine, the storage engine, and even the query plans. Those topics are outside the scope of this post, but maybe I'll go into those details another day.


Do you have a favorite optimization technique? Or do you have any questions about the ones listed above? Post it in the comments!
