Best Book to Learn Data Visualization

Sep 25, 2018

Updated: Jul 15, 2024

In this blog, I'll attempt to use some of the major principles in Cole Knaflic's book "Storytelling With Data" to create impactful, clean and clear graphs in Power BI. The concept's described in the book are focused on explanatory graphs. This is in sharp contrast to Microsoft's tool, Power BI, where the interactive elements of the graphs drive an exploratory experience.

Explanatory Graphs almost always include an action statement. They describe a course of action or recommendation for their audience. Well prepared explanatory graphs generally highlight a specific section of the graph in order to draw the audience's eyes to that spot.

Exploratory Graphs are usually reserved for the data analyst. They are a tool used to explore possible relationships and correlations. Power BI typically provides these types of graphs due to the interactive nature of how the reports work (as well as a huge selling point).

Power BI offers a unique way for us to tell a story with data. In addition to being able to highlight specific portions of our graphs, we can also drill down into the data. This is like accessing another dimension in terms of story telling with data. Having this extra dimension is great, but it can also be a double edged sword. Some users may be put off by the idea of having to explore the data themselves. Thankfully, Power BI offers some other tools to assist us in these cases as well.

1. Use Enlighten Data Story custom visual with RANKX.

We typically use a text box for explaining parts of our visuals for users to get a better understanding of what's going on.

However, since it's ultimately just static text, we need to go in and manually update it whenever the data changes. To remedy this situation, we can replace it with a special visual which will dynamically update when the data changes, the Enlighten Data Story visual. I'll get to this later on though.

The important part is to come up with a message before hand. In our above message, we want to highlight the company with the largest profit ratio, and also comment on its sales number. We can use the RANKX formula to get a ranking of our segments, but before that, I'll create a new table summarizing the segments (with Power Query). The following M code references the main data table, and summarizes it by sum of sales and sum of profit.

let

    Source = Finance_Data, 
    #"Grouped Rows" = Table.Group(Source, {"Segment"}, {
        {"Sales", each List.Sum([Sales]), type number},
        {"Profit", each List.Sum([Profit]), type number}
     })
in
    #"Grouped Rows"

We now have a summarized table to work with:

It would probably be best to go ahead and add additional columns for averages, min, max and the like here, but for simplicity I just kept it to just sum values.

Back in the report editor, I added a column for Profit Ratio based on this new table (just the profit divided by sales). Now we are ready to add a RANKX formula to get our ranks!

- RANKX -

I wanted to see a column of ranks, so I added another calculated column as opposed to a measure. I'm using the following formula to get the profit ratio ranking:

Profit Ratio Rank = 
    RANKX('Segment Summary', 'Segment Summary'[Profit Ratio], , DESC)

I'm just passing in the table I want to use (Segment Summary) and the column I want to base the rank off of (Profit Ratio). Lastly, I'm specifying that I want to order descending (which is also the default if you do not specify).

I added one more column to get the rank for Sales as well using the same method.

Almost done here... as a reminder, the goal is to create a statement about the segment leader, as well as comment on its sales. I'll create one more calculated column, but it'll return a string.

I want to return a string, because words have a bigger impact than cold hard numbers. I want to say something like "Sales for [Segment] ...is the Biggest, ...is the Smallest, ...is Average", etc.. To do this, I'll calculate the MAX and MEDIAN rank values (I don't need MIN since it's always going to be 1) for Sales rank. I'll then compare those numbers to the row level Sales rank value to determine a string value to output.

Sales Rank Analysis Column =
 
// Max
var maxRank = CALCULATE(
  MAX('Segment Summary'[Sales Rank]), All('Segment Summary'))
 
// Median
var midRank = CALCULATE( 
  MEDIAN('Segment Summary'[Sales Rank]), All('Segment Summary'))
 
// Current Row's Rank
var curRank = 'Segment Summary'[Sales Rank]
 
return
 
if(curRank = 1, "and is Biggest",
    if(curRank = maxRank, "but is Smallest",
    if(curRank = midRank, "and is Average",
    if(curRank > midRank, 
     "and is Above Average", 
     "but is Below Average"
    )))
   )

Alien Tip: That last step is where you can potentially insert action statements based on preset thresholds. For example: If X_Sales < $5K, "X needs more training this year.", else "Y needs more training this year." You could also use a LookUp() function if you have a fact table, which may be more efficient.

- Enlighten Data Custom Visual -

Okay finally! We have our ranks, and we have our partial string output. How can we use these? We could create a measure which concatenates the values with our ranks and such, but there's is a really nice custom visual on the marketplace called "Enlighten Data Story". This is just what we need!

Using this custom visual, we can create the following text!

This Enlighten Data Story visual automatically concatenates the fields we pass in as values. Under the formatting options within the "Story" section, we can enter the text to complete the sentence. By default, it uses the pound symbol (#) to represent the fields you passed in from your data. You can add multiple #'s for as many fields you pass into the visual.

This is a great first step in providing explanatory visuals for our users while retaining a dynamic property for when the data is updated.

2. Remove Clutter

This is another big one that Cole talks about in her book. After reading, I was surprised at myself for how much stuff I included on my own visuals that was unnecessary.

For the most part, the default display for visuals in Power BI do a good job of providing nice, clean looking graphs. However, there are some times when the expected result is less than optimal. I've found that this is particularly true when using a line graph where there are only 2 years of data. You may get something like this; and quite frankly... I never liked the look of it.

It's not bad... But I feel it somehow lacks purpose. I'm not sure what conclusion to draw from looking at it. My initial thought is, "just two random lines..." Of course there is a legend to aid us in what it means, and there are axis numbers, but there's still some disconnect.

- The Problem -

The problem here is clutter. There are too many lines, the y-axis doesn't really help us except for providing a scale of the numbers, and the legend also just kind of blends in with the title. All of this ends up becoming a distraction and results in a graph with no clear focus.

The first thing we can do is remove the gridlines. Because there are only two years, the gridlines tend to compete with the data lines.
I removed the y-axis and added data point labels. This provide a clearer picture while reducing clutter.
I changed the legend to a vertical layout. This provides some contrast to the mostly horizontal lines.
I Increased the font size of the title to provide some since of hierarchy.
I shortened the graph to draw more attention to the slope of the graphs.

We could also add a statement about the graph as well, similar to what we did above, to add more clarity and impact.

Those were two things you can do to add explanatory elements to your dashboard. Let me know what you thought in the comments or if you have any tips you'd like to share! Or, you can check out my other blog post that goes more in depth about storytelling concepts here: 5 Steps to Storytelling with Data

If you are interested in learning more about storytelling with data, please check out Cole's book: Storytelling With Data on Amazon. She also has a blog where she writes about various topics surrounding data visualization at storytellingwithdata.com.