Category Archives: Uncategorized

Update

It has been a few months since I last posted. I wanted to take a quick minute to give an update. I took some time off from blogging as I began a job search. Like many others in the data blogging community, I’ve been very lucky to have turned my data passion into a career. I have moved to Choice Hotels International as a Tableau Developer and Analyst in the Business Intelligence Group. After 10 years with Discovery Communications, I decided it was time for a new challenge. I will miss many Discovery colleagues but have been making new friends at Choice. Discovery has so many talented individuals so I know everything I was part of there is in great hands.

It’s been a little intimidating moving industries. You take for granted the institutional knowledge that is acquired after 10 years of employment in one location. Now moving to the hospitality industry, I have so much to learn. RevPAR, RevPAR Index, ADR and the like are the foundations of hospitality reporting and analytics. It’s been a fun ride so far and I look forward to much more.

So that is my quick update. I plan on being more active again now that things are settling down. Thanks for listening.


One year in

One year ago, Tableau declared January as Data Blogging Month. As a fan of the data community, I was inspired by that declaration to join in and thus, Data Knight Rises was born with this first post.

So a year in and I must say, “Wow!” Time sure does fly when you’re being a data geek. I have really enjoyed being part of this community. This blog has allowed me to e-meet some great people out there, with hopes of getting to meet them in person as well. The data community further inspires me to do more. I want to say Thank You to everyone who has supported me, taught me, and participated with me during this past year. Some of my most satisfying moments have been a simple thanks comment left on a post. It’s great knowing that what I have produced has made someone’s day easier in their data needs.

These site stats might not blow you away, but I’m really proud of what I’ve been able to accomplish this past year.

[Screenshot of site stats]

So what’s next? There’s so much I want to accomplish, which I believe is a very common sentiment among other data enthusiasts. In order to be more organized, I feel like I should lay out some goals for 2015.

  1. Continue learning and sharing Tableau and Alteryx tips/tricks/wins that I come across.
  2. Dedicate time to Python. I really want to get more into this language as its potential is vast in my professional and personal life.
  3. Community involvement. There is a plethora of data-related gatherings in the DC area that I’ve been meaning to check out. The ones I see are from Data Community DC. Let me know if you know of others.
  4. Have fun! I never want this to feel like a chore or homework.

Thank you again for being a part of my journey and may your 2015 be as fulfilling as you intend!


Post World Cup Viz

So I’m a little late to the party, but I wanted to do a quick write-up about the World Cup. As a Brazilian, I had deep emotions leading up to this tournament.

I made this quick bar chart that visually displays my interest level in the World Cup at different stages.

As you can clearly see, I was all in until Brazil decided to have a churrasco on the field instead of playing defense. You can’t win them all but I really wanted this one.

Until the next post!

World Cup TV Schedule

Ok, so this is not a post about data, but I love soccer and, being born in Brazil, I’m very excited about today. Below is the TV listing for the group play and Round of 16 games, with all times in the US Eastern Time zone.

Enjoy!

First Round

Thursday, June 12

4 p.m.: Brazil vs. Croatia at Arena Corinthians, Sao Paulo (ESPN)

Friday, June 13

12 p.m.: Mexico vs. Cameroon at Estadio das Dunas, Natal (ESPN2)

3 p.m.: Spain vs. Netherlands at Arena Fonte Nova, Salvador (ESPN)

6 p.m.: Chile vs. Australia at Arena Pantanal, Cuiaba (ESPN2)

Saturday, June 14

12 p.m.: Colombia vs. Greece at Estádio Mineirao, Belo Horizonte (ABC)

3 p.m.: Uruguay vs. Costa Rica at Estádio Castelao, Fortaleza (ABC)

6 p.m.: England vs. Italy at Arena Amazonia, Manaus (ESPN)

9 p.m.: Ivory Coast vs. Japan at Arena Pernambuco, Recife (ESPN)

Sunday, June 15

12 p.m.: Switzerland vs. Ecuador at Estádio Nacional Mane Garrincha, Brasilia (ABC)

3 p.m.: France vs. Honduras at Estádio Beira-Rio, Porto Alegre (ABC)

6 p.m.: Argentina vs. Bosnia-Herzegovina at Estadio do Maracana, Rio de Janeiro (ESPN)

Monday, June 16

12 p.m.: Germany vs. Portugal at Arena Fonte Nova, Salvador (ESPN)

3 p.m.: Iran vs. Nigeria at Arena da Baixada, Curitiba (ESPN)

6 p.m.: Ghana vs. United States at Estadio das Dunas, Natal (ESPN)

Tuesday, June 17

12 p.m.: Belgium vs. Algeria at Estádio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Brazil vs. Mexico at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Russia vs. South Korea at Arena Pantanal, Cuiaba (ESPN)

Wednesday, June 18

12 p.m.: Australia vs. Netherlands at Estadio Beira-Rio, Porto Alegre (ESPN)

3 p.m.: Spain vs. Chile at Estadio do Maracana, Rio de Janeiro (ESPN)

6 p.m.: Cameroon vs. Croatia at Arena Amazonia, Manaus (ESPN)

Thursday, June 19

12 p.m.: Colombia vs. Ivory Coast at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

3 p.m.: Uruguay vs. England at Arena Corinthians, Sao Paulo (ESPN)

6 p.m.: Japan vs. Greece at Estadio das Dunas, Natal (ESPN)

Friday, June 20

12 p.m.: Italy vs. Costa Rica at Arena Pernambuco, Recife (ESPN)

3 p.m.: Switzerland vs. France at Arena Fonte Nova, Salvador (ESPN)

6 p.m.: Honduras vs. Ecuador at Arena da Baixada, Curitiba (ESPN)

Saturday, June 21

12 p.m.: Argentina vs. Iran at Estadio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Germany vs. Ghana at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Nigeria vs. Bosnia Herzegovina at Arena Pantanal, Cuiaba (ESPN)

Sunday, June 22

12 p.m.: Belgium vs. Russia at Estadio do Maracana, Rio de Janeiro (ABC)

3 p.m.: South Korea vs. Algeria at Estadio Beira-Rio, Porto Alegre (ABC)

6 p.m.: United States vs. Portugal at Arena Amazonia, Manaus (ESPN)

Monday, June 23

12 p.m.: Netherlands vs. Chile at Arena Corinthians, Sao Paulo (ESPN2)

12 p.m.: Australia vs. Spain at Arena da Baixada, Curitiba (ESPN)

4 p.m.: Croatia vs. Mexico at Arena Pernambuco, Recife (ESPN)

4 p.m.: Cameroon vs. Brazil at Estadio Nacional Mané Garrincha, Brasilia (ESPN2)

Tuesday, June 24

12 p.m.: Italy vs. Uruguay at Estadio das Dunas, Natal (ESPN)

12 p.m.: Costa Rica vs. England at Estadio Mineirao, Belo Horizonte (ESPN2)

4 p.m.: Japan vs. Colombia at Arena Pantanal, Cuiabá (ESPN)

4 p.m.: Greece vs. Ivory Coast at Estadio Castelao, Fortaleza (ESPN2)

Wednesday, June 25

12 p.m.: Nigeria vs. Argentina at Estadio Beira-Rio, Porto Alegre (ESPN)

12 p.m.: Bosnia Herzegovina vs. Iran at Arena Fonte Nova, Salvador (ESPN2)

4 p.m.: Ecuador vs. France at Estadio do Maracana, Rio de Janeiro (ESPN2)

4 p.m.: Honduras vs. Switzerland at Arena Amazonia, Manaus (ESPN)

Thursday, June 26

12 p.m.: United States vs. Germany at Arena Pernambuco, Recife (ESPN)

12 p.m.: Portugal vs. Ghana at Estadio Nacional Mane Garrincha, Brasilia (ESPN2)

4 p.m.: South Korea vs. Belgium at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: Algeria vs. Russia at Estadio Beira-Rio, Porto Alegre (ESPN2)

Round of 16

Saturday, June 28

12 p.m.: 1A vs. 2B at Estadio Mineirao, Belo Horizonte (ABC)

4 p.m.: 1C vs. 2D at Estadio do Maracana, Rio de Janeiro (ABC)

Sunday, June 29

12 p.m.: 1B vs. 2A at Estadio Castelao, Fortaleza (ESPN)

4 p.m.: 1D vs. 2C at Arena Pernambuco, Recife (ESPN)

Monday, June 30

12 p.m.: 1E vs. 2F at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

4 p.m.: 1G vs. 2H at Estadio Beira-Rio, Porto Alegre (ESPN)

Tuesday, July 1

12 p.m.: 1F vs. 2E at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: 1H vs. 2G at Arena Fonte Nova, Salvador (ESPN)


Simple Data Organization

My brother sent me this gif and I wanted to share with everyone. Simple but effective. Thanks Felipe.

datatables

The other side of data

This past Sunday, 60 Minutes aired a story on Data Brokers and the selling of your personal information. The report details the companies that package up to 1,500 unique attributes about an individual and sell them to other companies. Technology has made this practice rather easy to exploit. As you simply browse the web, these data brokers are building a profile of you as an individual user. To marketers, this is extremely valuable information. Strategically targeted marketing campaigns can not only save expenses for an organization but also potentially have a higher conversion rate.

We all know about the NSA surveillance program that came to light last year. That’s a topic for another blog post. But the fact that corporations are doing the same in the name of profit makes it feel different. In many ways, this isn’t news per se. Most of us use discount cards when shopping at grocery stores or have memberships to Costco/Sam’s Club style stores. Of course these companies are collecting our buying habits. The difference is their data collection is not their main source of income.

These data brokers have no other product or service. Their sole source of income is you. How does that make you feel? Kind of freaks me out a bit, not going to lie.


Quarterback Evaluation

The single most important position, in possibly all team sports, is the quarterback. Since the early 70s, quarterback performance has been measured by a calculation called Passer Rating. The Passer Rating was commissioned by the NFL in 1971 to standardize performance and have the ability to compare quarterbacks over a single game, season, or career. A team of statisticians used all data from 1960 to 1970 to create this formula, which was approved by the NFL in 1973.

The formula combines five passing statistics into four components. The statistics are attempts, completions, passing yards, passing touchdowns, and interceptions. The four components are as follows:

c = (comp/att – 0.3) x 5

y = (yards/att – 3) x 0.25

t = (td/att) x 20

i = 2.375 – (int/att x 25)

Finally, bring it all together and the Passer Rating = ((c + y + t + i)/6) x 100

I have created an excel template to quickly calculate passer rating based on the five statistics. You can download it here.

This formula has two constraints. The four variables cannot be less than zero or greater than 2.375. So if any of the calculations yield a result outside those constraints, then the minimum of zero or maximum of 2.375 is used. A perfect passer rating is then 158.3. The creators of the Passer Rating scaled all 1960 – 1970 data between 0 and 2.375 with 1 being statistically average.
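For anyone who would rather script it than use a spreadsheet, here is a minimal Python sketch of the formula and its two constraints as described above. The function and parameter names (clamp, passer_rating, att, comp, yards, td, ints) are my own for illustration and are not from the NFL or the Excel template.

```python
def clamp(value, low=0.0, high=2.375):
    """Hold a component between the formula's floor (0) and ceiling (2.375)."""
    return max(low, min(high, value))


def passer_rating(att, comp, yards, td, ints):
    """Passer Rating from the five passing statistics.

    att = attempts, comp = completions, yards = passing yards,
    td = passing touchdowns, ints = interceptions.
    """
    if att <= 0:
        raise ValueError("attempts must be greater than zero")

    # The four components, each clamped to the 0 - 2.375 range
    c = clamp((comp / att - 0.3) * 5)
    y = clamp((yards / att - 3) * 0.25)
    t = clamp((td / att) * 20)
    i = clamp(2.375 - (ints / att) * 25)

    return (c + y + t + i) / 6 * 100


# Every component maxed out gives the perfect rating
print(round(passer_rating(att=10, comp=10, yards=200, td=5, ints=0), 1))  # 158.3
```

Clamping each component separately, before adding them up, is what keeps one monster statistic from making up for a bad one, and it is why the ceiling works out to 158.3 rather than a round number.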

For decades, this was accepted as the measurement of quarterback performance. Is it perfect? No. It does not factor in fumbles, rushing, or sacks but, after all, it is called a “passer” rating. In 2011 ESPN released their own calculation called Total Quarterback Rating, QBR. For a detailed explanation of the QBR, please click here. The CliffsNotes version tells us that the QBR is a weighted calculation that incorporates game context and how those plays transfer to wins. The rating ranges from 0 to 100. Sounds great, show me the algorithm! Ah…that’s where ESPN says hold on, this is proprietary. The QBR formula is not public.

So let’s see how these two rating systems stack up against each other. I pulled “qualified” regular season data only for each rating from ESPN.com for 2008 to 2013. You can explore the dashboard below. Note, ESPN states qualified data for passer rating means a player must have at least 14 attempts per team’s games played. There was no qualified definition for QBR data. 

Select the statistic. This applies to the Player Ranking table, the overall average, and the Distribution box and whisker plot. The Year filter lets you view the Player Ranking for a single season.


Tableau 8.1 now has a one-click Box and Whisker Plot in the Show Me feature. This is a great function to analyze distributions. It is simple to read. In this case, QBs in the grey box are considered average, with the median being the value where the light grey and dark grey areas meet. QBs that fall above the box and below the upper whisker are above average. Finally, QBs outside the whiskers are either exceptional or terrible, depending on whether they sit beyond the upper or lower whisker.
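If you want to see the same reading of a box plot outside of Tableau, here is a rough Python sketch. It assumes the whiskers reach to the furthest data points within 1.5 times the interquartile range and uses the "inclusive" quartile method from Python's statistics module; Tableau's own quartile math may differ slightly. The function names and the sample ratings are made up for illustration and are not real player data.

```python
import statistics


def box_plot_summary(values, whisker_factor=1.5):
    """Quartiles and whisker ends for a list of ratings.

    Assumption: whiskers stop at the furthest points that are still
    within whisker_factor * IQR of the box.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
    }


def classify(value, summary):
    """Label a rating the way the box plot is read in this post."""
    if value > summary["upper_whisker"]:
        return "exceptional"
    if value < summary["lower_whisker"]:
        return "terrible"
    if value > summary["q3"]:
        return "above average"
    if value < summary["q1"]:
        return "below average"
    return "average"  # inside the grey box


# Made-up ratings, not real data
ratings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(ratings)
print(classify(30, summary))  # exceptional
print(classify(5, summary))   # average
```

The "below average" label for values between the lower whisker and the box is my own addition to mirror the "above average" case.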

Looking at 2013, Nick Foles had the highest passer rating but Josh McCown had the highest QBR. But I think most will agree that Peyton Manning was the best quarterback this year. He’s number 2 in both statistics. And if you’re building a fantasy football team, you’d probably take 10-15 other quarterbacks before drafting Foles or McCown. My Washington Redskins had a dismal 2013 season. But 2012 was a great year. Robert Griffin III was 3rd in Passer Rating but 5th in QBR. He had an amazing regular season by any standards, incredible for a rookie. Let’s not talk about what happened in the post season. In general, a QB does not vary greatly between the two metrics. Upper tier players will have upper tier Passer Ratings and QBR scores and so on for mid and lower tier.

Since ESPN launched the QBR, they only use this metric in their broadcasts. ESPN has its upsides and downsides. I find them only reporting a non-public metric they created on television to fall under the downside. Obviously they do have Passer Rating on their website, where I got the data. Call me a traditionalist, but I prefer Passer Rating, mostly because I can calculate it myself. How about you?


One week in, what I have learned

It’s been a week since I launched this blog and already I have a few notes to pass on to future potential bloggers.

First, I’ll admit that I jumped in without researching blogging platforms at all. My wife has a WordPress blog so I figured I’d just use that. WordPress.com is a fantastic tool to get one up and running in minutes. The .com version has all the basics at a click of the button and it provides tutorial videos. I bought my custom URL and started drafting my first post. It was not until my second post that I realized that .com was not the platform I would need. It is a stripped down version of WordPress.org and most importantly, I could not embed Tableau Public vizzes.

I went to my Google account and created a test Blogger.com site. That platform did allow for embedding Tableau Public vizzes from the generated code. So the decision was Google or self-hosting? I’ll point out that I’m a big Google fan. I’ve had an Android phone since the original Droid launched in 2009 and am currently rocking a Nexus 4. But sometimes Google can be disappointing. They shut down popular products like Google Reader and not so popular products like Google Wave. I didn’t want to start a blogger site then have to move it later down the road.

WordPress.org it was! Now came my difficulties. I would need to convert my .com to .org. WordPress will do all the conversion for a fee of $129. Ouch! I consider myself pretty tech savvy. I’ve picked up decent SQL skills in my professional life and I love hacking my Nexus 4 (I root my phones and I’m currently running the CyanogenMod ROM and the Franco kernel). I decided I’d do it myself. I started reading the conversion steps about exporting and importing your blog, and bought a Bluehost account. I did not know that I could not transfer my custom URL within 6 months of purchase. I would need to point WordPress nameservers to Bluehost’s nameservers. Honestly, I know very little about web hosting, so some of this was a little confusing. After I changed the nameservers, I crossed my fingers and it worked, though it took a day or two to actually be processed. I also saw that I lost my stats and subscribers from my first post. I mentioned this to my wife who quickly reminded me that she moved to self-hosting after a year and lost a lot more than I ever accumulated in two days. I got over it.

In creating the Bluehost account, I used a one-click install of WordPress.org but inadvertently created another WordPress account. Now I have separate accounts for .com and .org. I still need the .com because it has my custom URL and it does not look like you can merge the two. All the features of .com that got me up and running so quickly are not present from the start in the .org platform. Everything needs to be installed as separate plugins. I installed a few plugins and I still could not embed a viz. I finally found an iFrame plugin that worked with Tableau Public content. It is still not perfect when viewing my vizzes from a mobile device; that’s a work in progress.

My advice to anyone considering starting a blog is do your research. There are a plethora of blogging platforms out there. Find the one that best fits your expected needs, then create your necessary account(s).

Welcome to my blog…hope you stick around

Hello peoples of the information superhighway. I’m excited to announce that I have decided to create a blog about data. For the past few years, my involvement in the world of data visualization and analysis has grown exponentially…and I’ve loved every minute of it. So why did I decide to start a data blog? Well, there are a few reasons:

  1. I have discovered a wonderful community of data enthusiasts and wanted to become a part of it. I have learned a great deal from blogs such as DataRemixed, Tableau Love, Health Intelligence and VizNinja just to name a few.
  2. Tableau – A magnificent self-service Business Intelligence tool that has transformed my career. My own data visualizations in this blog (vizzes for us in the biz) will utilize Tableau Public, a free version of the software designated for public data and sharing.
  3. At the encouragement of my lovely wife Hilary and peers at Discovery Communications.
  4. Finally, because I recently found out that January is Data Blogging Month.

I can’t wait to get started. I hope you’ll enjoy what I have to offer and come back to interact with me and my vizzes.

Thanks for stopping by and have a nice data day. (see what I did there)
