Post World Cup Viz

So I’m a little late to the party, but I wanted to do a quick write-up about the World Cup. As a Brazilian, I had deep emotions leading up to this tournament.

I made this quick bar chart that visually displays my interest level in the World Cup at different stages.

As you can clearly see, I was all in until Brazil decided to have a churrasco on the field instead of playing defense. You can’t win them all, but I really wanted this one.

Until the next post!

World Cup TV Schedule

OK, so this is not a post about data, but I love soccer and, having been born in Brazil, I’m very excited about today. Below are the TV listings for the group stage and round of 16 games, with all times in US Eastern Time.

Enjoy!

First Round

Thursday, June 12

4 p.m.: Brazil vs. Croatia at Arena Corinthians, Sao Paulo (ESPN)

Friday, June 13

12 p.m.: Mexico vs. Cameroon at Estadio das Dunas, Natal (ESPN2)

3 p.m.: Spain vs. Netherlands at Arena Fonte Nova, Salvador (ESPN)

6 p.m.: Chile vs. Australia at Arena Pantanal, Cuiaba (ESPN2)

Saturday, June 14

12 p.m.: Colombia vs. Greece at Estádio Mineirao, Belo Horizonte (ABC)

3 p.m.: Uruguay vs. Costa Rica at Estádio Castelao, Fortaleza (ABC)

6 p.m.: England vs. Italy at Arena Amazonia, Manaus (ESPN)

9 p.m.: Ivory Coast vs. Japan at Arena Pernambuco, Recife (ESPN)

Sunday, June 15

12 p.m.: Switzerland vs. Ecuador at Estádio Nacional Mane Garrincha, Brasilia (ABC)

3 p.m.: France vs. Honduras at Estádio Beira-Rio, Porto Alegre (ABC)

6 p.m.: Argentina vs. Bosnia-Herzegovina at Estadio do Maracana, Rio de Janeiro (ESPN)

Monday, June 16

12 p.m.: Germany vs. Portugal at Arena Fonte Nova, Salvador (ESPN)

3 p.m.: Iran vs. Nigeria at Arena da Baixada, Curitiba (ESPN)

6 p.m.: Ghana vs. United States at Estadio das Dunas, Natal (ESPN)

Tuesday, June 17

12 p.m.: Belgium vs. Algeria at Estádio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Brazil vs. Mexico at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Russia vs. South Korea at Arena Pantanal, Cuiaba (ESPN)

Wednesday, June 18

12 p.m.: Australia vs. Netherlands at Estadio Beira-Rio, Porto Alegre (ESPN)

3 p.m.: Spain vs. Chile at Estadio do Maracana, Rio de Janeiro (ESPN)

6 p.m.: Cameroon vs. Croatia at Arena Amazonia, Manaus (ESPN)

Thursday, June 19

12 p.m.: Colombia vs. Ivory Coast at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

3 p.m.: Uruguay vs. England at Arena Corinthians, Sao Paulo (ESPN)

6 p.m.: Japan vs. Greece at Estadio das Dunas, Natal (ESPN)

Friday, June 20

12 p.m.: Italy vs. Costa Rica at Arena Pernambuco, Recife (ESPN)

3 p.m.: Switzerland vs. France at Arena Fonte Nova, Salvador (ESPN)

6 p.m.: Honduras vs. Ecuador at Arena da Baixada, Curitiba (ESPN)

Saturday, June 21

12 p.m.: Argentina vs. Iran at Estadio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Germany vs. Ghana at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Nigeria vs. Bosnia Herzegovina at Arena Pantanal, Cuiaba (ESPN)

Sunday, June 22

12 p.m.: Belgium vs. Russia at Estadio do Maracana, Rio de Janeiro (ABC)

3 p.m.: South Korea vs. Algeria at Estadio Beira-Rio, Porto Alegre (ABC)

6 p.m.: United States vs. Portugal at Arena Amazonia, Manaus (ESPN)

Monday, June 23

12 p.m.: Netherlands vs. Chile at Arena Corinthians, Sao Paulo (ESPN2)

12 p.m.: Australia vs. Spain at Arena da Baixada, Curitiba (ESPN)

4 p.m.: Croatia vs. Mexico at Arena Pernambuco, Recife (ESPN)

4 p.m.: Cameroon vs. Brazil at Estadio Nacional Mané Garrincha, Brasilia (ESPN2)

Tuesday, June 24

12 p.m.: Italy vs. Uruguay at Estadio das Dunas, Natal (ESPN)

12 p.m.: Costa Rica vs. England at Estadio Mineirao, Belo Horizonte (ESPN2)

4 p.m.: Japan vs. Colombia at Arena Pantanal, Cuiabá (ESPN)

4 p.m.: Greece vs. Ivory Coast at Estadio Castelao, Fortaleza (ESPN2)

Wednesday, June 25

12 p.m.: Nigeria vs. Argentina at Estadio Beira-Rio, Porto Alegre (ESPN)

12 p.m.: Bosnia Herzegovina vs. Iran at Arena Fonte Nova, Salvador (ESPN2)

4 p.m.: Ecuador vs. France at Estadio do Maracana, Rio de Janeiro (ESPN2)

4 p.m.: Honduras vs. Switzerland at Arena Amazonia, Manaus (ESPN)

Thursday, June 26

12 p.m.: United States vs. Germany at Arena Pernambuco, Recife (ESPN)

12 p.m.: Portugal vs. Ghana at Estadio Nacional Mane Garrincha, Brasilia (ESPN2)

4 p.m.: South Korea vs. Belgium at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: Algeria vs. Russia at Estadio Beira-Rio, Porto Alegre (ESPN2)

Round of 16

Saturday, June 28

12 p.m.: 1A vs. 2B at Estadio Mineirao, Belo Horizonte (ABC)

4 p.m.: 1C vs. 2D at Estadio do Maracana, Rio de Janeiro (ABC)

Sunday, June 29

12 p.m.: 1B vs. 2A at Estadio Castelao, Fortaleza (ESPN)

4 p.m.: 1D vs. 2C at Arena Pernambuco, Recife (ESPN)

Monday, June 30

12 p.m.: 1E vs. 2F at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

4 p.m.: 1G vs. 2H at Estadio Beira-Rio, Porto Alegre (ESPN)

Tuesday, July 1

12 p.m.: 1F vs. 2E at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: 1H vs. 2G at Arena Fonte Nova, Salvador (ESPN)

My First Speaking Engagement

This past week, Alteryx and Tableau hosted a joint event in Bethesda, MD. My Alteryx sales representative called to ask if I would present a use case at the meeting. I was honored and decided to give it a go. This was the first time I’d been asked to present outside my organization, so it was a bit nerve-racking. I really wanted to do a great job of telling the story of Discovery’s successes with these tools, in the hope that others would share their experiences too during the networking happy hour.

If you’re not familiar with Alteryx, you should be. It is a great tool that opens up endless possibilities for working with data. My organization only recently purchased it, and it has already solved a roadblock I had encountered. That is what my presentation was about. With Alteryx, we were able to join seven disparate data sources into a single Tableau Data Extract file and create a management dashboard for our Media Operations team. While I won’t go into the entire use case here, a colleague kindly took a few pictures.

My introduction slide on the global reach of Discovery Communications.

Discussing one of the final dashboards, thanks to our Alteryx and Tableau solution.

Alteryx was kind enough to post about the event on their blog. If you’re able to catch one of their future events, I highly recommend it.

Like I said, I really enjoyed this experience, even though I was anxious about it all week. I hope to get my Alteryx skills up to where I am with Tableau. I believe we can solve a great deal of data challenges by leveraging both tools in tandem.

I also got to meet a fellow data blogger, which was exciting. Here’s a picture of Emily and me afterwards. Link to her blog.

I plan on posting about Alteryx in the future on my data adventures here. Stick around for more!

Simple Data Organization

My brother sent me this gif and I wanted to share it with everyone. Simple but effective. Thanks, Felipe.

datatables

Cloud Data Connection

How many of you out there use cloud storage services such as Box, Dropbox, or Google Drive? Exactly, we all do.

It would be amazing if Tableau Desktop could connect to a data source stored in the cloud without the need to download it first. At my organization, we are moving toward cloud storage and away from local storage. I envision connecting via the download URL these services provide, without the data being downloaded. It would act just like a regular local storage connection. Taking this further, it would let us publish cloud-stored data sources to Tableau Server and set refresh schedules. That would be fantastic.

So if you’re for this idea, please vote for it on Tableau’s site!

http://community.tableausoftware.com/ideas/3204

And share with others!!

The other side of data

This past Sunday, 60 Minutes aired a story on data brokers and the selling of your personal information. The report details the companies that package up to 1,500 unique attributes about an individual and sell them to other companies. Technology has made this practice rather easy to exploit. Simply by tracking your web browsing, these data brokers build a profile of each individual user. To marketers, this is extremely valuable information. Strategically targeted marketing campaigns can not only cut expenses for an organization but also potentially deliver a higher conversion rate.

We all know about the NSA surveillance program that came to light last year. That’s a topic for another blog post. But the fact that corporations are doing the same in the name of profit makes it feel different. In many ways, this isn’t news per se. Most of us use discount cards when shopping at grocery stores or have memberships to Costco/Sam’s Club style stores. Of course these companies are collecting our buying habits. The difference is that data collection is not their main source of income.

These data brokers have no other product or service. Their sole source of income is you. How does that make you feel? Kind of freaks me out a bit, not going to lie.

Manual UTC Conversion

During a project last year I was creating a FACT table from one of many databases in my organization used to track program information. Using SQL Server, one of the fields that I wanted to pull was Air Date/Time of a program. When I first created my fact table, I noticed that the Air Date/Time field was in UTC. I suspect this database was created with a time OFFSET field but I didn’t have schema documentation for it.

I decided to create my own conversion calculation to East Coast time. I’ve been working with SQL for a little while now, but I’m no expert. The challenge I faced was adjusting for standard versus daylight saving time, because UTC does not change. I looked up the dates when we adjust our clocks from 2011 to 2025. The rest was the following SQL CASE statement:

CASE
-- Daylight saving time (EDT, UTC-4): 07:00 UTC on the second Sunday of March to 06:00 UTC on the first Sunday of November
WHEN [Table].[Date Field] BETWEEN '2011-03-13 07:00:00.000' AND '2011-11-06 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2012-03-11 07:00:00.000' AND '2012-11-04 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2013-03-10 07:00:00.000' AND '2013-11-03 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2014-03-09 07:00:00.000' AND '2014-11-02 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2015-03-08 07:00:00.000' AND '2015-11-01 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2016-03-13 07:00:00.000' AND '2016-11-06 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2017-03-12 07:00:00.000' AND '2017-11-05 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2018-03-11 07:00:00.000' AND '2018-11-04 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2019-03-10 07:00:00.000' AND '2019-11-03 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2020-03-08 07:00:00.000' AND '2020-11-01 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2021-03-14 07:00:00.000' AND '2021-11-07 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2022-03-13 07:00:00.000' AND '2022-11-06 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2023-03-12 07:00:00.000' AND '2023-11-05 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2024-03-10 07:00:00.000' AND '2024-11-03 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
WHEN [Table].[Date Field] BETWEEN '2025-03-09 07:00:00.000' AND '2025-11-02 06:00:00.000' THEN DATEADD(hour,-4,[Table].[Date Field])
-- Everything else is standard time (EST, UTC-5)
ELSE DATEADD(hour,-5,[Table].[Date Field]) END AS [Date Name]
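
For comparison, here is a minimal sketch of the same UTC to Eastern conversion done outside the database. It is not what the project used: it assumes Python 3.9 or later with the standard zoneinfo module and time zone data available on the machine, and the function name to_eastern is hypothetical. Instead of listing clock-change dates by hand, it relies on the America/New_York rules in the time zone database.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: air times arrive as naive datetimes that are really UTC,
# like the Air Date/Time field described above.
EASTERN = ZoneInfo("America/New_York")


def to_eastern(utc_naive: datetime) -> datetime:
    """Convert a naive UTC datetime to naive US Eastern wall-clock time."""
    aware_utc = utc_naive.replace(tzinfo=timezone.utc)
    # astimezone applies the -4 or -5 hour offset that is valid on that date.
    return aware_utc.astimezone(EASTERN).replace(tzinfo=None)


summer = to_eastern(datetime(2014, 7, 1, 16, 0))   # falls in daylight saving time
winter = to_eastern(datetime(2014, 12, 1, 16, 0))  # falls in standard time
print(summer, winter)
```

The trade-off is the one discussed in this post: a lookup like this is handy for spot-checking a few values, while the conversion for millions of records still belongs in the database.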

I first created a parameter in Tableau to let the user choose between viewing the data in standard or daylight saving time. But I wanted users to avoid any confusion between the viz and any other place where they can see a program’s air date/time. Secondly, as a Tableau best practice, do the heavy data-lifting calculations before connecting to Tableau. Software like SQL Server is built for this type of data manipulation. Especially when working with millions of records, long-winded calculations can slow Tableau’s performance.

Thanks to my manager for encouraging this post. I hope it saves time for anyone who needs to convert UTC date fields.

Tableau Server Tip

As an administrator of Tableau Server, you’re granted access to six dashboards that show a variety of server activity/performance/utilization information. These dashboards are connected to Tableau’s internal PostgreSQL database. What if you’d like other server users to access these admin dashboards but you do not want to grant those users Administrative rights? There are four easy steps to accomplish this.

  1. Create a PostgreSQL password
  2. Find and copy the tabbed_admin_views workbook
  3. Open workbook and update database connections with your server and password
  4. Publish workbook

Step 1:

  • Log onto your machine that runs Tableau Server and in Command Line navigate to the bin folder
  • Type tabadmin dbpass [password]
  • For [password] input your desired password
  • Type tabadmin configure
  • Type tabadmin restart

You’ve now set a password to the internal PostgreSQL database. You’re free to connect to it from a new workbook and poke around. Note, if you have not done so, you’ll need to install PostgreSQL database drivers.

Step 2:

Thank you to Adolph Barclift, DC-VA Tableau User Group Co-Chair, for the assistance on this step.

While logged in to your machine running Tableau Server, open folder …Tableau\Tableau Server\[version]\wgserver\z5\WEB-INF\admin

Make a copy of tabbed_admin_views.twb and save it to your desired location.

Step 3:

Open the workbook and click OK when prompted to log in. You’ll get an error; click Yes to edit the connection.

Complete the connection dialog with your credentials. Port, Database and Username should remain unchanged.

This workbook has eight separate connections so you’ll have to repeat the process seven more times.

Step 4:

Publish the workbook, embedding credentials, and set your security as desired.

Congratulations! You now have your own administrative dashboards. What’s even better is that you can create custom admin dashboards! Check out Russell Christopher’s series on Tableau History Tables for more details on PostgreSQL schema.

Year over Year Quick Tip

Over the years as a Tableau user, I’ve learned to adjust my vizzes depending on the audience. Year-over-year change is an important metric to many at my organization. Here is a quick and easy way to display it that I use often.

I’m using the Superstore sample data that ships with Tableau. Place Order Date in the Columns shelf and Sales in the Rows shelf, change the Marks to Bar.

Now grab Sales again and drop it on Label in the Marks card. Hover over the right edge of the pill until you see the triangle. Click it and go to Quick Table Calculation > Percent Difference.

And that’s it! Tableau compares each year with the one before it and labels each bar with the YoY change. The earliest year has no prior year to compare against, so its label is blank.
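
To make the arithmetic behind the label concrete, here is a minimal sketch of a percent difference from the previous value, written in Python rather than Tableau. The function name percent_difference and the sales totals are hypothetical (they are not the Superstore figures), and the sketch assumes each year is compared with the one before it.

```python
from typing import Optional


def percent_difference(values: list[float]) -> list[Optional[float]]:
    """Percent difference of each value from the one before it.

    The first entry has no previous value, so it is None (a blank label).
    A previous value of zero also yields None to avoid dividing by zero.
    """
    if not values:
        return []
    result: list[Optional[float]] = [None]
    for previous, current in zip(values, values[1:]):
        if previous == 0:
            result.append(None)
        else:
            result.append((current - previous) / abs(previous))
    return result


# Hypothetical yearly sales totals, oldest year first.
yearly_sales = [100.0, 120.0, 90.0]
print(percent_difference(yearly_sales))  # [None, 0.2, -0.25]
```

Read the output as blank for the first year, then +20% and -25% for the following two.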

I like this view because the viz utilizes the Y axis to display the Sales in dollars, but the label shows the YoY growth. The two metrics combined provide more context to your audience.

One thing to keep in mind: if the current year is not over, filter on months so each year covers the same period. Let’s say it is currently June 2013 and management wants to know how Sales are tracking against previous years.

Hold right-click as you drag Order Date onto the Filters shelf. Filter on Months and include only January to May. Very simple!

Quarterback Evaluation

The single most important position, in possibly all team sports, is the quarterback. Since the early 70s, quarterback performance has been measured by a calculation called Passer Rating. The Passer Rating was commissioned by the NFL in 1971 to standardize performance and have the ability to compare quarterbacks over a single game, season, or career. A team of statisticians used all data from 1960 to 1970 to create this formula, which was approved by the NFL in 1973.

The formula combines five passing statistics into four components. The statistics are attempts, completions, passing yards, passing touchdowns, and interceptions. The four components are as follows:

c = (comp/att – 0.3) x 5

y = (yards/att – 3) x 0.25

t = (td/att) x 20

i = 2.375 – (int/att x 25)

Finally, bring it all together and the Passer Rating = ((c + y + t + i)/6) x 100

I have created an Excel template to quickly calculate passer rating based on the five statistics. You can download it here.

This formula has two constraints. The four variables cannot be less than zero or greater than 2.375. So if any of the calculations yield a result outside those constraints, then the minimum of zero or maximum of 2.375 is used. A perfect passer rating is then 158.3. The creators of the Passer Rating scaled all 1960 – 1970 data between 0 and 2.375 with 1 being statistically average.
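
As a rough illustration, the four components and the clamping rule above can be sketched in a few lines of Python. This is not the Excel template mentioned earlier; the names clamp and passer_rating and the sample stat lines are hypothetical.

```python
def clamp(value: float, low: float = 0.0, high: float = 2.375) -> float:
    """Hold a component inside the 0 to 2.375 range."""
    return max(low, min(high, value))


def passer_rating(att: int, comp: int, yards: int, td: int, ints: int) -> float:
    """NFL passer rating from the five passing statistics."""
    if att <= 0:
        raise ValueError("att must be positive")
    c = clamp((comp / att - 0.3) * 5)      # completion percentage component
    y = clamp((yards / att - 3) * 0.25)    # yards per attempt component
    t = clamp((td / att) * 20)             # touchdown rate component
    i = clamp(2.375 - (ints / att) * 25)   # interception rate component
    return (c + y + t + i) / 6 * 100


# Hypothetical stat line where every component hits the 2.375 cap.
print(round(passer_rating(att=20, comp=20, yards=400, td=5, ints=0), 1))  # 158.3
```

Because each component is clamped before the four are summed, one extreme statistic cannot push the rating past 158.3 or below zero.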

For decades, this was accepted as the measurement of quarterback performance. Is it perfect? No. It does not factor in fumbles, rushing, or sacks but, after all, it is called a “passer” rating. In 2011 ESPN released their own calculation called Total Quarterback Rating, QBR. For a detailed explanation of the QBR, please click here. The CliffsNotes version tells us that the QBR is a weighted calculation that incorporates game context and how those plays transfer to wins. The rating ranges from 0 to 100. Sounds great, show me the algorithm! Ah…that’s where ESPN says hold on, this is proprietary. The QBR formula is not public.

So let’s see how these two rating systems stack up against each other. I pulled “qualified” regular-season data for each rating from ESPN.com for 2008 to 2013. You can explore the dashboard below. Note that, per ESPN, a player qualifies for passer rating with at least 14 pass attempts per team game played. ESPN gave no definition of qualified for the QBR data.

Select the statistic. This applies to the Player Ranking table, the overall average, and the Distribution box-and-whisker plot. The Year filter lets you view the Player Ranking for a single season.


Tableau 8.1 now has a one-click Box and Whisker Plot in the Show Me feature. This is a great way to analyze distributions, and it is simple to read. In this case, QBs in the grey box are considered average, with the median being the value where the light grey and dark grey areas meet. QBs that fall above the box and below the upper whisker are above average. Finally, QBs outside the whiskers are either exceptional or terrible, depending on whether they sit beyond the upper or the lower whisker.

Looking at 2013, Nick Foles had the highest passer rating but Josh McCown had the highest QBR. But I think most will agree that Peyton Manning was the best quarterback this year. He’s number 2 in both statistics. And if you’re building a fantasy football team, you’d probably take 10-15 other quarterbacks before drafting Foles or McCown. My Washington Redskins had a dismal 2013 season. But 2012 was a great year. Robert Griffin III was 3rd in Passer Rating but 5th in QBR. He had an amazing regular season by any standards, incredible for a rookie. Let’s not talk about what happened in the post season. In general, a QB does not vary greatly between the two metrics. Upper tier players will have upper tier Passer Ratings and QBR scores and so on for mid and lower tier.

Since ESPN launched the QBR, it is the only metric they use in their broadcasts. ESPN has its upsides and downsides. Reporting only a non-public metric that they created themselves on television falls under the downsides for me. Obviously they do have Passer Rating on their website, which is where I got the data. Call me a traditionalist, but I prefer Passer Rating, mostly because I can calculate it myself. How about you?
