Category Archives: Tableau

Distance Calculation

I was tasked with an interesting request recently at work. It was to find how many hotels of an external brand were near one of our brands’ hotels. For those not familiar with the hotel industry, we leverage a data provider called STR. So no crazy web scrapers needed. We have all the hotel data required for the task.

The key fields required here are a unique identifier for each hotel and their respective latitude and longitude. My resulting data would need to have a row of every external brand hotel for each internal brand hotel. First, I created two tables in the database. One for all internal and the other for the external brand. Querying both tables together without specifying a join relation results in a row of every external brand hotel for each internal hotel. The only thing I needed to add was a calculation to determine the distance between two lat/lon coordinates.

We’re a Vertica shop. So I checked the reference guide and found the DISTANCEV function. It is as simple as passing in all four coordinates and it returns the distance in kilometers. I just multiplied that by 0.621371 to convert to miles. The task was to find number of external hotels within a 2 mile and 7 mile radius per internal hotel. There are a number of ways to get this but I decided to use Tableau. You never know when requirements will change and a viz will then be asked for. I pass this as a custom SQL query in Tableau’s data window.

screenhunter_43-sep-20-21-32

The output needed to be a list of our internal hotels with two columns. One counting the number of external hotels within 2 miles and another within 7 miles. Using Tableau’s Level of Detail function, I created the following.

if {fixed [choice_id], [external_str_nbr] : sum([distance])} <=2 then 1 else 0 end

For every unique pair of internal and external hotel ids, I check the distance between the two and if it is less than or equal to 2, I sum 1. This is repeated with less than or equal to 7 calculated field.

My end result is as follows:

screenhunter_43-sep-21-09-16

The way I have this setup, under 2 miles are also included in under 7. This was a fun task. I’ve never done anything related to distance measurement. Luckily, there are tools that make it so simple.

 

Share Button

Org Chart in Tableau

Since starting my new position, I’ve been talking about my work and department with my wife. I set out to create an organizational chart to give her more context when I speak about certain things. There are plenty of tools that facilitate org charts. Microsoft alone has Visio and Powerpoint. Heck, you could even use Excel or Word. But hey, I’m a Tableau user, so let make this more interesting.

Org charts in Tableau are not native. I searched the web and came across this thread on Tableau’s Ideas section. It would be interesting if Tableau added this as a Show Me feature. Perhaps they will someday. So I used this thread as a starting point on how to accomplish my goal.

The key to an org chart is connecting points. Let’s also not forget that the data structure is important when it comes to Tableau. Each data point needs two rows of data. You’re drawing a line to connect dots, so you need to have one row for the starting location (direct report) and one row for the end location (manager). Here’s how I set up my data:

ScreenHunter_01 Sep. 12 10.47

I have three individuals in this sample. Name1 is the top of the chart, so note the blank X, Y coordinates in the repeated row. This is because Name1 does not have a line to connect to its manager. I assigned each person an ID and mapped their manager ID. Only records in Order 1 have a Display Name and Display Title. This is used in the Label shelf so the records aren’t displayed twice. The hardest part of this is the X and Y coordinates. Depending on the size and branches of your org chart, you have to play around with their values. On a simple chart, you could probably automate the generation of these with Excel formulas.

The entire data file is stored here.

On to building the chart. Place the X values on the column shelf and the Y values on the row shelf. Go to Analysis and deselect Aggregate Measures. You should have something like this.

ScreenHunter_02 Sep. 12 11.06

Add the Y values to the rows again and enable dual axis with synchronized axis. On the Marks card, I set the first Y values to Line and the second to Square. Now you have something like this. It’s close but obviously needs a little more work.

ScreenHunter_06 Sep. 12 16.48

Each pair of records needs an unique ID. This ID will connect the hierarchy in the proper manner, instead of the mess above. We have this unique value as the ID column. Place the ID column in the Details shelf. From here you’re pretty much done. All that is left is formatting and labeling the values.

ScreenHunter_07 Sep. 12 16.53

Here’s my finished product. I placed this single sheet into a dashboard and set it to legal landscape. Click the image to navigate to the Tableau Public version if you wish to download the workbook.

Organizational Chart

Thanks for tuning in. Hopefully one day, I’ll be the at the top of this chart :).

Share Button

Tableau in the office – episode 3

I’ve been revising some old dashboards at work that definitely are due for a refresh. One of the updates is to demonstrate Year over Year for a given metric. I wanted the calculation for current year to compare Jan 1 to date against previous year. But if the user selects to view YoY of previous years, I wanted to show full year difference, not Jan 1 to date. Below was my approach.

For sample data, I created an Excel file with random values per date from 1/1/2011 to 5/31/2015. You can download a copy of it here. To get my desired result, it will take one parameter and three calculated fields in the Filter shelf.

First, I created a Parameter for Year.

ScreenHunter_01 May. 20 16.07

Then I created a formula to only show the year selected in the Parameter and the year before it. This is a boolean expression and I place it on the Filter shelf, selecting the True result.

[Year] = year([Date]) or [Year] – 1 = year([Date])

In my scenario, I only want to show YoY of a year at a time. And since YoY is created with a table calculation for percent difference from previous, I needed a way to keep the result but hide the previous year in the view since it is blank. I used a simple index() calculation and placed this in the Filter shelf, selecting the value “2”.

Next, the following formula creates a calculated field which again is used in the Filter shelf. I’ve broken down the formula with the //comments.

//If a data point’s date is this year and in past months, keep it.
if [Year] = datepart(‘year’, today())
and datepart(‘month’,[Date]) < datepart(‘month’, today())
then ‘keep’
//if a data point’s date is in this year and in current or future months, filter it.
elseif [Year] = datepart(‘year’, today())
and datepart(‘month’,[Date]) > datepart(‘month’, today())
then ‘filter’
//if a data point’s date is in past years, keep it.
elseif [Year] < datepart(‘year’, today())
then ‘keep’
else ‘filter’
end

Place this calculated field in Filter shelf and select “keep”. Here are the three calculated fields in the Filter shelf. Any field I create strictly for filtering, I name with a “f.” prefix so I know its intended purpose.

ScreenHunter_02 May. 21 09.57

Now I place the Units measure in the Text shelf, and change the Sum(Units) to a Percent Difference table calculation.

ScreenHunter_03 May. 21 10.02

I also created a “label” field so the text changes depending on the Year parameter. This is to make it clear to the audience that YoY for current year is a Jan 1 to date comparison over previous year. But previous year to previous year minus 1 comparisons are Full Year over Full Year.

if [Year] = datepart(‘year’,today()) then “YoY” else “FYoFY” END

Add this “label” field to the Text shelf and lastly, add Year(Date) to the Text shelf too. Once all my text fields are in place, I edit the label as follows:

ScreenHunter_06 May. 21 10.09

The final result is below. Change the Parameter to view the differing YoY or FYoFY calculations. I included a table with the sample data.

Thanks for tuning in!

Share Button

March Madness Upsets

March Madness is one of my favorite sporting events. The excitement, the format, the emotion, the buzzer beaters, it encapsulates everything that is great about sports. So as I was watching the games these past two days, I started to think about all the upsets I have seen in my life. Naturally, a viz idea was born.

The below is summary of all March Madness upsets since 1985, the current format. I grabbed the data from this wikipedia page using Kimono, a great and simple tool to extract structured data from the web. Please note, the lowest seed considered an upset is 11. So no 10 vs. 7 or 9 vs. 8 match ups are present.

 

As for my personal brackets this year, the word of the day is #busted.

Share Button

Tableau in the office – episode 2

Time for another Tableau in the Office post. A quick viz that I worked on was for a department head that wanted to see how much a service had declined in recent years. The decline is due to advances in technology, therefore the service becoming obsolete. I took a look at the data (sample data used here) and created a simple line chart by plotting Quarter (Continuous) with Quantity.

ScreenHunter_01 Jan. 31 15.22

This is pretty straightforward. But I wanted to show the total percent decrease for 2014 Q4 since 2013 Q1. This is where I employed a Table Calculation. Table calcs in Tableau are extremely powerful. They can range for very simple to exceptionally complex. I try to read as much as I can about table calcs and encourage others as well.

I created a calculated field that used the LOOKUP function. This is a great way to isolate exactly what is needed as you can specify the offset. The formula is written as LOOKUP(expression, offset). So LOOKUP(SUM(QTY), FIRST() + 3) would return the fourth value in my partition.

Since I only needed the first and last values, I didn’t require an offset. Here’s how my formula looks:

ScreenHunter_02 Jan. 31 15.33

In pseudocode: (QTY at 2014 Q4 – QTY at 2013 Q1)/ QTY at 2013 Q1

I formatted the value as a percentage and placed this calculated field in the Detail shelf so I can use it in the chart.

ScreenHunter_03 Jan. 31 16.02

Right click on the last mark, select Annotate > Mark

ScreenHunter_04 Jan. 31 16.05

Edit the annotation as desired.

ScreenHunter_05 Jan. 31 16.06

Here’s my final viz. Simple and effective in illustrating the decrease of a service over the last two years.

ScreenHunter_06 Jan. 31 16.08

Thanks for tuning in!

Share Button

Tableau in the office

I’m going to start sharing some things I do with Tableau in my professional life. I’ve done this once with my Time Duration post. Hopefully, “Tableau in the Office” becomes a recurring series and provides ideas/tips/tricks that you can employ in your daily Tableau routine.

This week I was asked by a colleague for some data on their jobs queue. As my division is the content supply chain for Discovery, incoming deliveries must be QC’d to ensure they meet our technical specifications. We call this process, Technical Evaluation or TE. Our media management system tracks these TE jobs with multiple date fields. We have job created date and job completed date. This is a high volume operation so new jobs are created every day as the team completes jobs in the queue.

To illustrate how I solved this, I’ve mocked up data in the following excel file. This file contains three columns: Created Date, Completed Date and Job Number. I randomly generated 1500 dates between January 1, 2014 and October 31, 2014 to populate the Created Date column. Then I populated the Completed Date column by randomly adding 1 to 15 days to the Created Date value.

I connected to my data source using the following custom SQL technique. For excel file, this would be:

SELECT
[Sheet1$].[created date] AS [date],
‘CREATED’ AS [SOURCE],
[Sheet1$].[job_num] AS [job_num]
FROM [Sheet1$]

UNION SELECT
[Sheet1$].[completed date] AS [date],
‘COMPLETED’ AS [SOURCE],
[Sheet1$].[job_num] AS [job_num]
FROM [Sheet1$]

The important thing to note here is that the first select statement labels the Created date as Date and the second labels Completed Date as Date. This way I have both date values in the same column when I bring the data into Tableau. Also, I added a SOURCE field to designate where the data is coming from.

Once in Tableau, I made a bar chart with Stack Marks turned off. Placed SOURCE in the color and size shelf. This is a great way to see how many jobs are being created versus completed for a given date range. I’m filtering out November in the view below.

job dates

This was my take on this problem. Let me know if you have done something similar in a different manner.

Share Button

Tableau UTC Conversion

In one of my earlier posts, I wrote about how to manually convert UTC time to Eastern time, adjusting for the Standard vs Daylight Savings. I’ve found this formula useful for some of SQL work but recently needed to incorporate it into Tableau.

So I decided to share the following calculated field. Simply cut and paste the formula below into your favorite text editor and Replace All the [date_field] with your date column name.

if date([date_field]) > #13-03-2011# and date([date_field]) < #06-11-2011# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #11-03-2012# and date([date_field]) < #04-11-2012# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2013# and date([date_field]) < #03-11-2013# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #09-03-2014# and date([date_field]) < #02-11-2014# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #08-03-2015# and date([date_field]) < #01-11-2015# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #13-03-2016# and date([date_field]) < #06-11-2016# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #12-03-2017# and date([date_field]) < #05-11-2017# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #11-03-2018# and date([date_field]) < #04-11-2018# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2019# and date([date_field]) < #03-11-2019# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #08-03-2020# and date([date_field]) < #01-11-2020# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #14-03-2021# and date([date_field]) < #07-11-2021# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #13-03-2022# and date([date_field]) < #06-11-2022# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #12-03-2023# and date([date_field]) < #05-11-2023# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2024# and date([date_field]) < #03-11-2024# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #09-03-2025# and date([date_field]) < #02-11-2025# then DATEADD(‘hour’,-4,[date_field])
else DATEADD(‘hour’,-5,[date_field]) end

Please note, this converts UTC to US Eastern time zone. So if you’re in another time zone, be sure to adjust the DATEADD values. I hope this helps.

Share Button

Time Duration in Tableau

Time conversion seems to be pretty popular topic for Tableau. I recently worked on a project where the users wanted to display media duration in hh:mm:ss format. I looked around the Tableau forum and saw others have had this issue too. As long as the duration does not exceed 24 hours, this formula works just fine DATETIME(SUM([total dur secs]/86400)). And then use Tableau Date formatting to change to hh:mm:ss. But if duration will exceed 24 hours, this formula will not work. In my case, duration will exceed 24 hours. So here’s how I tackled this problem.

First, I needed to calculate duration in seconds before I could create the required format. Working in media, time codes go down to the frames level. So first I converted frames into aggregated seconds based on the asset frame rate, [total dur secs] =  SUM([PROGRAM_DURATION]/29.97)

Now that my [total dur secs] field is in total seconds we can begin. There are probably many ways to perform this but this was the approach I took.

First, I created a [total hours] calculated field.

IF INT([total dur secs]/60/60) >=1 THEN
STR(IIF(INT[total dur secs]/60/60)> 9, “”, “0”)) + STR(INT[total dur secs]/60/60))
ELSE “”
END

The first line tests if the duration is greater than or equal to one hour. If that is true, then the second line tests if the duration is greater than 9 hours. If that is true, then create a blank, if false, then pad with a zero. Since I’m concatenating, the duration is converted to strings. If the total duration is not greater than an hour, then blank.

Next, I created a [total minutes : seconds] calculated field.

IF INT([total dur secs]/60) >=60 THEN
STR(IIF((INT([total dur secs]/60)-INT([total hours])*60)> 9, “”, “0”)) + STR(INT([total dur secs]/60)-INT([total hours])*60)
+ “:” +
STR(IIF(ROUND([total dur secs]INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]INT([total dur secs]/60)*60))
ELSEIF INT([total dur secs]/60) >=1 THEN
STR(IIF(INT([total dur secs]/60)> 9, “”, “0”)) + STR(INT([total dur secs]/60))
+ “:” +
STR(IIF(ROUND([total dur secs]INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]INT([total dur secs]/60)*60))
ELSE “00:” + STR(IIF(ROUND([total dur secs])> 9, “”, “0”)) + STR(ROUND([total dur secs]))
END

Let’s break down what is happening in the calculated field. The following portion below checks if [total dur secs] is greater than or equal to 60 minutes. Because I want time formatted as hh:mm:ss and I don’t want the minutes portion to exceed 60. So I subtract out [total hours] as minutes. Again, using the IIF to pad the extra zero as needed.

IF INT([total dur secs]/60) >=60 THEN
STR(IIF((INT([total dur secs]/60)-INT([total hours])*60)> 9, “”, “0”)) + STR(INT([total dur secs]/60)-INT([total hours])*60)

The next section below is calculating the seconds portion of the duration.

STR(IIF(ROUND([total dur secs]INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]INT([total dur secs]/60)*60))

I have to round [total dur secs] because I don’t want decimals. Then similar to the minutes calculation, I want to subtract out minutes that exceed 60. The IIF again pads for zeros as necessary. 

The ELSEIF and ELSE basically repeat the same calculation, checking if minutes are greater or equal to 1 or zero respectively.

Lastly, I bring [total hours] and [total minutes : seconds] together in a new calculated field, [Total Duration].

[total hours]+ IIF([total hours]=””,[total minutes : seconds],”:”+ [total minutes : seconds])

So just as I’m about to finish this set of calculations, a new message appears in my inbox. I subscribe to Tableau Zen Master – Jonathan Drummey’s blog, and he just posted a much simpler version of time duration on, Drawing with Numbers. I guess I just prefer the scenic route. Check it out his work though, it is great.

How have you handled time duration formatting in Tableau?

 

Share Button

My First Speaking Engagement

This past week, Alteryx and Tableau hosted a joint event in Bethesda, MD. My Alteryx sales representative called me asking if I would present a use case at the meeting. I was honored and decided to give it a go. This is the first time I’ve been asked to present outside my organization so it was a bit nerve racking. I really wanted to do a great job at telling Discovery’s successes with these tools, with hopes that others too share their experience during the networking happy hour.

If you’re not familiar with Alteryx, you should be. This a great tool that is capable of endless data possibilities. My organization only recently purchased it and it already solved a roadblock I had encountered. This is what my presentation was about. With Alteryx, we were able to join 7 disparate data sources into a single Tableau Data Extract file and create a management dashboard for our Media Operations team. While I won’t go into the entire use case here, a colleague kindly took a few pictures.

My introduction slide into the global reach of Discovery Communications.

presentation3

Discussing one of the final dashboards thanks to our Alteryx and Tableau solution. presentation

Alteryx was kind of enough to post about the event on their blog If you’re able to catch one of the future events, I highly recommend it. 

Like I said, I really enjoyed this experience, even though I was anxious about it all week. I hope to get my Alteryx skills up to where I am with Tableau. I believe we can solve a great deal of data challenges by leveraging both tools in tandem.

I also got to meet a fellow data blogger, which was exciting. Here’s a picture of Emily and me afterwards. Link to her blog.

IMG_20140410_174338

I plan on posting about Alteryx in the future on my data adventures here. Stick around for more!

Share Button

Cloud Data Connection

How many of you out there use cloud storage services such as Box, Dropbox, or Google Drive? Exactly, we all do.

It would be amazing if Tableau Desktop could connect to a data source that is stored in the cloud without the need to download it first. I know at my organization, we are moving more towards cloud storage and away from local storage. I envision the ability to connect via the download url these services provide, but the data is not downloaded. It would act just like a regular local storage connection. Taking this further, be able to publish cloud stored data sources to Tableau Server and set refresh schedules. That would be fantastic.

So if you’re for this idea, please vote for it on Tableau’s site!

http://community.tableausoftware.com/ideas/3204

And share with others!!

Share Button