Tableau Create Bins from a Calculated Field

//

In Tableau, do you want to create bins from a calculated field?

This post will show you how to create bins from a measure calculated field and bins from a dimension calculated field.

You will learn two ways to created the bins from a measure calculated field, the first using a FIXED LOD calculation; the second – from the days where FIXED didn’t exist – using a data blend.

How to create bins in Tableau

First of all start with the straightforward, creating standard bins without any calculations.

Right-click on any non-calculated field measure and Create – Bins.

However, if you want variable bin sizes or bins from a calculated field, the answer is to create the bins using a calculated field.

How to Create bins from a measure calculated field in Tableau

In Tableau, there are 2 ways to create static bins from a calculated field. Static bins don’t change when manipulating the data.

The first of these is using a FIXED Level of Detail calculation, which is by far the simplest option.

The second way, which comes from the days before LOD calculations were possible, is using a data blend.

An example scenario: calculate events per listing

To begin we have a simple dataset containing only 3 fields: Month, ListingID and EnquiryCount.

We want to create bins to group listings on the total EnquiryCount per listing.

For example it’s useful to see how many properties had 1-10 enquiries, how many had 11-20 enquiries, etc.

Using this data source, there are two pieces of information to find:

1. The number of listings within each enquiry bin
2. The monthly distribution of enquiries received per bin. For example listings in bin 1 received 20% of enquiries in Month 1, 50% in Month 2 and the remaining 30% in Month 3; Listings in bin 2 received 50% of enquiries in Month 1, 25% in Month 2 and 25% in Month 3; etc.

Firstly I want to calculate number of enquiries per listing over the entire time period. Then I want to use the result of this calculation for the bins.

In other words assume ListingId “1” had an EnquiryCount of 10 in Month 1, 2 enquiries in Month 2 and 8 enquiries in Month 3, Therefore ListingId “1” received 20 enquiries in total so would be in the bin 11-20.

Once I can put each listing in the appropriate bin next I want to see, for each group, what % of total enquiries came in month 1, month 2 and month 3.

Returning to the example, ListingId “1” received 50% of enquiries in Month 1, 10% in Month 2 and 40% in Month 3.

Create bins in Tableau from a calculated field using a FIXED LOD

To begin, create a calculated field to calculate and ‘fix’ the number of enquiries per listing:

`FixedEnqCount: {FIXED [ListingId]: SUM([EnquiryCount])}`

Using a FIXED calculation effectively creates a static value for each listing. Even filters, as long as they are out of Context, don’t affect the output of a FIXED calculation.

Using that calculated field, next create the bins.

Create the bins using the FIXED calculated field

Use the FixedEnqCount field, group the enquiries per listing into different size bins.

In this example they are put into groups of 10, until they reach 40 enquiries, where they are grouped into the same 40+ bin.

Male the bin field, [FixedEnqCountBins], a dimension. A benefit of a FIXED calculation is that it can be either a measure or a dimension. Drag the FIXED bin calculation to dimensions to convert, should it default to a measure.

To answer the both questions, start by putting the bin field dimension on the Columns shelf.

To answer question 1, the number of listings per bin, put the distinct count of [ListingId] on the view.

For question 2, the monthly distribution of enquiries per listing per bin, put the bin dimension on Columns, and sum the EnquiryCount on rows.

Next put the month on the Colour shelf and it’s more or less complete. To show the % of total, simply click the drop down in the EnquiryCount pill on Rows. Select “Quick Table Calculation” – “Percent of Total”. Set the Compute Using to Month.

Create bins from a calculated field using a Tableau data blend

This method is far more complex than using FIXED. This post was first written when FIXED was nothing but a pipe-dream and some things, that are simple today, were very difficult to achieve.

I have to give a huge thanks to Richard Leeke, who provided the solution. My alternative was to pre-calculate the data outside of Tableau.

The challenge without a FIXED option is keeping the bins static. The bins will recalculate and can return a different result to that expected. Data blending is a way to prevent that recalculation.

Create the calculated fields and bins

The first thing to do is create a calculated field to sum the enquiries:

`EnqCountCalc: SUM(EnquiryCount)`

Use the result of this to create the bins in another calculated field.

We will call the bin calculation [EnqCountBins], which is below:

This bin field can be used without any problems for 1 dimensional reporting.

Count the number of listings within each bin

We want to know the number of unique ListingIDs within each bin. To find this create another calculated field:

`ListingCount: COUNTD([ListingID])`

Put the EnqCountBins pill to the Columns shelf and the ListingCount pill to the Rows shelf. That gives the listing count. Both pills should display like AGG([CalcField]).

You should notice the bins don’t display correctly.

To split the listing counts out into bins drag the ListingID pill on to the Level Of Detail and that completes the chart.

The complications begin when we want to use the result of the bin calculation AND perform further calculations. This will affect the result, the bins calculate as we drag and drop.

Calculate the monthly distribution of enquiries per bin

As mentioned earlier I’m trying to calculate, for each Enquiry Count bin, within which months they received enquiries. This is where the real complexity begins.

Manipulating the chart causes the bins to recalculate

To get the enquiry counts per bucket, drag the EnquiryCount pill to Rows shelf in the place of ListingCount.

This should be a SUM(EnquiryCount) calculation.

Next, we want to split the enquiry counts by month.

The obvious thing to do is to drag the Month pill into the Colour shelf.

Do that and you see things start to go wrong.

The bins recalculate dependent on the fields in the view. Now the bin calculation is taking the Month into account.

Check the difference in the bar sizes with Month on Colour vs without Month on Colour.

The above isn’t wanted, but there is a solution: Data Blending. In effect is like a self join in SQL.

Blending the data should prevent the re-calculation of the bins as they are coming from a different data source.

To recap there are two pieces of information to find:

1. The number of listings within each enquiry bin
2. The monthly distribution of enquiries received per bin. For example listings in bin 1 received 20% of enquiries in Month 1, 50% in Month 2 and the remaining 30% in Month 3; Listings in bin 2 received 50% of enquiries in Month 1, 25% in Month 2 and 25% in Month 3; etc.

The solution for Number 1 is above.

Number 2 is far more complex. This is due to the bins recalculating when I bring in the month.

Our bins are a measure and not a dimension, meaning they are not static, they change depending on the view.

To do #2 we need to use data blending, which can be similar to a SQL self join.

Data Blending to make the measure static

To blend (join) data sets first we need at least 2 data connections in the workbook.

For this particular task I want to Self Join. In other words, blend the same data set to itself.

This is because we need to assign a fixed enquiry count bin for each ListingId, we don’t want it to re-calculate.

Therefore, create the bins in one dataset while the calculations are occur out in the other.

Set up the data blend

The first thing to do is right click in the area where the Data connections are stored (top left of your workbook) and select Duplicate.

Now you’ll have two copies of the same data source in your workbook.

Next step is to alter the relationship between the two data sources. The the two data sources should ONLY be connected by ListingID and NOT both ListingID and Month, which is the default join.

The Relationships are set in the Data menu.

Select the Custom option and remove the Month, which is entered as a default join.

Using the blended data sources

The reason for joining the two data sources is to make the bins static in one of the datasets.

Thereby allowing data manipulation in the other data source without affecting the bins.

This solution leans heavily into advanced Table Calculations, which are described further into the article.

The first thing is to place the EnqCountBins pill from secondary source on the Columns shelf. This “locks” in the bins.

All other data will come from the primary data set.

Note the orange tick mark against the EnqCountBins pill. This indicates it’s from a secondary data source of that sheet.

When data blending it is important to know the data is coming from the intended data source.

Sometimes, especially when dealing with identical data sources, it is simple to accidentally use data from the ‘wrong’ data source.

This chart looks exactly the same as when only using 1 data source. The difference is we can now drag the Month pill into the Colour shelf to split the enquiry count by month of enquiry. Note the bars are the same size.

Calculate the % monthly distribution per bin

The next challenge is to show these values as a %.

For each bin we want to know 2 things:

1. How many enquiries happened in total within that bin
2. How many enquiries happened each month in each bin

This means using the Window_Sum table calculation, which sums everything in a partition. This whitepaper on the Tableau site is useful should you want more information on table calculations and partitions.

Number 1, the number of enquiries in total within each bin, is relatively simple:

`EnqCountPerBin: WINDOW_SUM(SUM(EnquiryCount))`

By default this partitions across the table – in other words by each bin, which is exactly what is want.

Number 2, the number of enquiries per month in each bin, is more complicated.

We need to partition by month and also by bin.

For this we also use the FIRST function to ensure only one value is returned per partition.

Without FIRST the same value will return multiple times making the chart look strange, with multiple values piled on top of each other.

EnqCountForMonth:

``````IF (FIRST()==0) THEN
WINDOW_SUM(SUM([EnquiryCount]))
END``````

That formula returns only the first instance of the window_sum; preventing the multiple values on top of each other.

Set the calculation partitions

Unfortunately there’s more to do than just dragging and dropping the calculated fields.

Firstly drag the EnqCountForMonth pill to the Rows shelf – depending on the settings of AGG([EnqCountBins]) this could give you a strange looking graph.

If you are seeing weird results first thing to do is right-click on the AGG([EnqCountsBins]) pill and make sure the option ‘Ignore in Table Calculations’ is NOT selected.

Next thing is to check the [EnqCountForMonth] is calculating along the correct partitions. Right-click into the pill and set it to compute along the ListingId so the description is as follows:

If the description isn’t exactly like this check the ‘Ignore in Table Calculations’ setting again.

Next step is to test the EnqCountPerBin is working properly.

Drag this pill into the Label section.

It should show the total enquiry count for the bin duplicated for each month but to make this work we alter the way it calculates along the partitions.

Again right click into the pill and select Compute Using Advanced. Make sure the settings of this pill are as follows:

If all has gone well the chart will now look like this:

The next step is to create a calculation using these values to get the % of enquiries per month per bin, so each bin will equal 100%.

`PercentPerMonthPerBin: [EnqCountForMonth]/[EnqCountPerBin]`

Finish the chart

Put this new field on the Rows shelf, replacing EnqCountForMonth, and the visualisation is almost complete. Finally format the PercentPerMonthPerBin pill as a %, remove the label and the chart is now complete.

It should look like this:

Both the solution using FIXED and using the Data Blend are below.

Create bins from a dimension in Tableau

We covered this, but not explicitly, in the section creating the bins using a FIXED calculation.

With a numeric dimension there isn’t the option to create bins. However, there is a way to do it. Use a calculated field.

To create the bins use an IF…THEN…ELSEIF…THEN…ELSE…END function.

Example:

``````IF [CalculatedField] < 10 THEN "10"
ELSEIF [CalculatedField] < 20 THEN "20"
ELSEIF [CalculatedField] < 30 THEN "30"
ELSE ">=30"
END``````

2 thoughts on “Tableau Create Bins from a Calculated Field”

1. This problem can also be solved by using a fixed calculation level method. By creating ‘EnqCountCalc’ using a fixed level calculation you can have Tableau create the bins. Here is the Calculation example:

{Fixed ListingID: sum([EnquiryCount])}
This formula will sum all the inquires per ListingID across all the months, and keeping it fixed.

Next you can use Tableau’s menu to create the bins on this calculation.