Python – Here I Come

My tech journey has brought me to want to learn a programming language. I’ve picked up some SQL for my professional life and really enjoyed writing queries, digging into data and solving problems. I want to learn a language that I can leverage both at work and for personal use. I did some research and decided that Python would be a great fit for my goals. As a data geek, I’ve been seeing Python used for many things I’d like to be able to do. From web scrapping, data processing, statistical analysis to even things that weren’t on my radar like web development.

So where do I start? I looked at the familiar free tools available out there. I started with Codeacademy. but I didn’t really like the format. I tried using the forum but people were just posting the solution code so I wasn’t learning much. I switched over to Udacity and began Intro to Computer Science. This is a class that is taught in Python and the ultimate goal is to build your own search engine. This project sounded fun but again I struggled with the format. The self-paced aspect had me putting it off constantly. Life happens and it became an after thought. I need deadlines.

So now comes Coursera. I spoke to colleague at work who was about to finish the Programming for Everyone (Python) class. He enjoyed to the experience so I decided to check it out. Coursera classes differ in length. But they follow a weekly schedule with deadlines. I believe this will assist me in satisfying my goal. The #PR4E class is taught by a professor at the University of Michigan, Dr. Chuck. Once you start watching the class videos, his passion for teaching and educational technology is clearly evident…and he has the tattoos to prove it. Check out his TEDx talk on MOOCs here.

MOOC stands for Massive Open Online Course. I love how technology has enabled MOOCs to grow and continue to mature. I wish these were available when I was younger and had more free time. I look forward to continue learning in my life from MOOCs.

The Programming for Everyone class started on October 6th and it runs 10 weeks long. But there is still time to join so if you’re interested, click here.

– Future Pythonista

Share Button

Tableau UTC Conversion

In one of my earlier posts, I wrote about how to manually convert UTC time to Eastern time, adjusting for the Standard vs Daylight Savings. I’ve found this formula useful for some of SQL work but recently needed to incorporate it into Tableau.

So I decided to share the following calculated field. Simply cut and paste the formula below into your favorite text editor and Replace All the [date_field] with your date column name.

if date([date_field]) > #13-03-2011# and date([date_field]) < #06-11-2011# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #11-03-2012# and date([date_field]) < #04-11-2012# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2013# and date([date_field]) < #03-11-2013# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #09-03-2014# and date([date_field]) < #02-11-2014# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #08-03-2015# and date([date_field]) < #01-11-2015# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #13-03-2016# and date([date_field]) < #06-11-2016# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #12-03-2017# and date([date_field]) < #05-11-2017# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #11-03-2018# and date([date_field]) < #04-11-2018# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2019# and date([date_field]) < #03-11-2019# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #08-03-2020# and date([date_field]) < #01-11-2020# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #14-03-2021# and date([date_field]) < #07-11-2021# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #13-03-2022# and date([date_field]) < #06-11-2022# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #12-03-2023# and date([date_field]) < #05-11-2023# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #10-03-2024# and date([date_field]) < #03-11-2024# then DATEADD(‘hour’,-4,[date_field])
elseif date([date_field]) > #09-03-2025# and date([date_field]) < #02-11-2025# then DATEADD(‘hour’,-4,[date_field])
else DATEADD(‘hour’,-5,[date_field]) end

Please note, this converts UTC to US Eastern time zone. So if you’re in another time zone, be sure to adjust the DATEADD values. I hope this helps.

Share Button

Time Duration in Tableau

Time conversion seems to be pretty popular topic for Tableau. I recently worked on a project where the users wanted to display media duration in hh:mm:ss format. I looked around the Tableau forum and saw others have had this issue too. As long as the duration does not exceed 24 hours, this formula works just fine DATETIME(SUM([total dur secs]/84600)). And then use Tableau Date formatting to change to hh:mm:ss. But if duration will exceed 24 hours, this formula will not work. In my case, duration will exceed 24 hours. So here’s how I tackled this problem.

First, I needed to calculate duration in seconds before I could create the required format. Working in media, time codes go down to the frames level. So first I converted frames into aggregated seconds based on the asset frame rate, [total dur secs] =  SUM([PROGRAM_DURATION]/29.97)

Now that my [total dur secs] field is in total seconds we can begin. There are probably many ways to perform this but this was the approach I took.

First, I created a [total hours] calculated field.

IF INT([total dur secs]/60/60) >=1 THEN
STR(IIF(INT[total dur secs]/60/60)> 9, “”, “0”)) + STR(INT[total dur secs]/60/60))
ELSE “”
END

The first line tests if the duration is greater than or equal to one hour. If that is true, then the second line tests if the duration is greater than 9 hours. If that is true, then create a blank, if false, then pad with a zero. Since I’m concatenating, the duration is converted to strings. If the total duration is not greater than an hour, then blank.

Next, I created a [total minutes : seconds] calculated field.

IF INT([total dur secs]/60) >=60 THEN
STR(IIF((INT([total dur secs]/60)-INT([total hours])*60)> 9, “”, “0”)) + STR(INT([total dur secs]/60)-INT([total hours])*60)
+ “:” +
STR(IIF(ROUND([total dur secs]-INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]-INT([total dur secs]/60)*60))
ELSEIF INT([total dur secs]/60) >=1 THEN
STR(IIF(INT([total dur secs]/60)> 9, “”, “0”)) + STR(INT([total dur secs]/60))
+ “:” +
STR(IIF(ROUND([total dur secs]-INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]-INT([total dur secs]/60)*60))
ELSE “00:” + STR(IIF(ROUND([total dur secs])> 9, “”, “0”)) + STR(ROUND([total dur secs]))
END

Let’s break down what is happening in the calculated field. The following portion below checks if [total dur secs] is greater than or equal to 60 minutes. Because I want time formatted as hh:mm:ss and I don’t want the minutes portion to exceed 60. So I subtract out [total hours] as minutes. Again, using the IIF to pad the extra zero as needed.

IF INT([total dur secs]/60) >=60 THEN
STR(IIF((INT([total dur secs]/60)-INT([total hours])*60)> 9, “”, “0”)) + STR(INT([total dur secs]/60)-INT([total hours])*60)

The next section below is calculating the seconds portion of the duration.

STR(IIF(ROUND([total dur secs]-INT([total dur secs]/60)*60)> 9, “”, “0”)) +
STR(ROUND([total dur secs]-INT([total dur secs]/60)*60))

I have to round [total dur secs] because I don’t want decimals. Then similar to the minutes calculation, I want to subtract out minutes that exceed 60. The IIF again pads for zeros as necessary. 

The ELSEIF and ELSE basically repeat the same calculation, checking if minutes are greater or equal to 1 or zero respectively.

Lastly, I bring [total hours] and [total minutes : seconds] together in a new calculated field, [Total Duration].

[total hours]+ IIF([total hours]=””,[total minutes : seconds],”:”+ [total minutes : seconds])

So just as I’m about to finish this set of calculations, a new message appears in my inbox. I subscribe to Tableau Zen Master – Jonathan Drummey’s blog, and he just posted a much simpler version of time duration on, Drawing with Numbers. I guess I just prefer the scenic route. Check it out his work though, it is great.

How have you handled time duration formatting in Tableau?

 

Share Button

Beer!

Beer is awesome! What else is new, right? I’ve been thinking about doing a beer post and started looking around the internet for data. If you’re a fan of beer, you’ve noticed the rise in new brewers over the last decade. In fact, Denizens Brewing Co. just opened across the street from my office. I’ve heard rumors our engineering team is looking at ways to boost our wifi signal, allegedly of course. The explosion of craft beer has been great for beer nerds like myself. I love trying a new beer with friends and I keep finding new favorites. I prefer the IPAs but will try just about anything.

Off to the internet I went and I found data from the Beer Institute. The site has a downloadable excel with a wealth of beer information. The bar chart below shows active brewer permits by state and year. Use the slide filter to see the growth in permits issued.

Looking at the growth a little closer you can see YoY for the US as a whole and for each state. My home state of Maryland needs to step it up, while 40% is great, I know we can do better. :) Some states on the other hand are growing fast since 2004. Keep it up!

But the growth isn’t necessarily a great thing if you’re trying to break in the market. I’ve talked to the brewer of a local restaurant in Bethesda, MD and he says we’re going to start seeing many brewers close up shop. Many individuals are entering this space and not all will survive. But in the meantime, I’ll sample their beer.

Clearly California has the most active brewer permits, but it is the most populous state. How does 2012 brewer permits look when population is factored in? I found some data on adult population and blended it with the brewer data.  Note that adult here means 18 years of age or older, not 21, the legal drinking age in the US. I won’t go into it here much but I will say that I believe the drinking age should be 18, but the driving age should be 21.

Vermont has the least adults per active brewer permit and if you live in Mississippi, I’m sorry. I hope your state is importing other craft brews to help you out.

So now that I got you thinking about beer, you’re probably about to stop reading and head for the fridge. Wait!!! In order to help you make a selection, I scraped the web for prize winning beers. Using import.io, a fantastic web scraping tool, I downloaded a table of the Great American Beer Festival winners from 2013.

Use the filter to select the Category to view the Gold, Silver, and Bronze winners. I’m going to have to use this chart next time I’m on a beer run.

Now if you’ll excuse me, I just opened one of my favorites, Loose Cannon from Heavy Seas out of Baltimore. Have a great day!

loose cannon

 

 

Share Button

Tableau Server Tip

Tip time again. And this one focuses on Tableau Server. One of my favorite features of Tableau Server is the ability to publish data sources. I work for a global company and we constantly share data with teams in international offices. This data can range from SQL Server/Oracle data to Excel files stored on a multitude of share drives. Publishing data sources allows our global teams to all access data needed for a given project or report, without needing access to where the underlying data is stored. This has been advantage and allowed collaboration between regions, all while working with a single version of the truth.

Ok, now getting to the tip. Let’s say you’re given a workbook that already contains charts, tables, dashboards, and of course, calculated fields. The workbook is connected to a published data source on your server. You try to edit a calculated field and the only option is Edit Copy, which makes a new calculated field.

ScreenHunter_02 Jul. 24 16.45

This is not ideal because the calculated field you’d like to edit is already embedded in several locations of the workbook. So how do you edit the original field?

Easy, simply right click on the data source name and select Create Local Copy.

ScreenHunter_03 Jul. 24 16.48

Save the data source to your desired folder. You’ll now have a new data source.

ScreenHunter_04 Jul. 24 16.50

Next, swap data sources so the exists workbook sources from the local copy.

Select Data > Replace Data Source…

ScreenHunter_01 Jul. 24 19.59

Now when you right click to edit a calculated field, you’ll be able to edit the original field that is already being used throughout the workbook.

ScreenHunter_05 Jul. 24 16.51

From here, you can proceed with the needed task. Just remember that if you republish the data source, your calculated field edits will be available for others. So depending on your ultimate goal, know if you should overwrite the original published data source or not.

Thanks! And happy Tableau’ing!

Share Button

Post World Cup Viz

So I’m a little late to the party but I wanted to do a quick write up about the World Cup. As a Brazilian, I had deep emotions leading up to this tournament.

I made this quick bar chart that visually displays my interest level in the World Cup at different stages.

As you can clearly see, I was all in until Brazil decided to have a churrasco on the field instead of play defense. You can’t win them all but I really wanted this one.

Until the next post!

 

 

 

Share Button

World Cup TV Schedule

Ok, so this is not a post about data but I love soccer and being born in Brazil, I’m very excited about today. Below is the TV listing for group play and round of 16 games on Eastern Time zone in the US.

Enjoy!

First Round

Thursday, June 12

4 p.m.: Brazil vs. Croatia at Arena Corinthians, Sao Paulo (ESPN)

Friday, June 13

12 p.m.: Mexico vs. Cameroon at Estadio das Dunas, Natal (ESPN2)

3 p.m.: Spain vs. Netherlands at Arena Fonte Nova, Salvador (ESPN)

6 p.m. Chile vs. Australia at Arena Pantanal, Cuiaba (ESPN2)

Saturday, June 14

12 p.m.: Colombia vs. Greece at Estádio Mineirao, Belo Horizonte (ABC)

3 p.m.: Uruguay vs. Costa Rica at Estádio Castelao, Fortaleza (ABC)

6 p.m.: England vs. Italy at Arena Amazonia, Manaus (ESPN)

9 p.m.: Ivory Coast vs. Japan at Arena Pernambuco, Recife (ESPN)

Sunday, June 15

12 p.m.: Switzerland vs. Ecuador at Estádio Nacional Mane Garrincha, Brasilia (ABC)

3 p.m.: France vs. Honduras at Estádio Beira-Rio, Porto Alegre (ABC)

6 p.m.: Argentina vs. Bosnia-Herzegovina at Estadio do Maracana, Rio de Janeiro (ESPN)

Monday, June 16

12 p.m.: Germany vs. Portugal at Arena Fonte Nova, Salvador (ESPN)

3 p.m.: Iran vs. Nigeria at Arena da Baixada, Curitiba (ESPN)

6 p.m.: Ghana vs. United States at Estadio das Dunas, Natal (ESPN)

Tuesday, June 17

12 p.m.: Belgium vs. Algeria at Estádio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Brazil vs. Mexico at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Russia vs. South Korea at Arena Pantanal, Cuiaba (ESPN)

Wednesday, June 18

12 p.m.: Australia vs. Netherlands at Estadio Beira-Rio, Porto Alegre (ESPN)

3 p.m.: Spain vs. Chile at Estadio do Maracana, Rio de Janeiro (ESPN)

6 p.m.: Cameroon vs. Croatia at Arena Amazonia, Manaus (ESPN)

Thursday, June 19

12 p.m.: Colombia vs. Ivory Coast at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

3 p.m.: Uruguay vs. England at Arena Corinthians, Sao Paulo (ESPN)

6 p.m.: Japan vs. Greece at Estadio das Dunas, Natal (ESPN)

Friday, June 20

12 p.m.: Italy vs. Costa Rica at Arena Pernambuco, Recife (ESPN)

3 p.m.: Switzerland vs. France at Arena Fonte Nova, Salvador (ESPN)

6 p.m.: Honduras vs. Ecuador at Arena da Baixada, Curitiba (ESPN)

Saturday, June 21

12 p.m.: Argentina vs. Iran at Estadio Mineirao, Belo Horizonte (ESPN)

3 p.m.: Germany vs. Ghana at Estadio Castelao, Fortaleza (ESPN)

6 p.m.: Nigeria vs. Bosnia Herzegovina at Arena Pantanal, Cuiaba (ESPN)

Sunday, June 22

12 p.m.: Belgium vs. Russia at Estadio do Maracana, Rio de Janeiro (ABC)

3 p.m.: South Korea vs. Algeria at Estadio Beira-Rio, Porto Alegre (ABC)

6 p.m.: United States vs. Portugal at Arena Amazonia, Manaus (ESPN)

Monday, June 23

12 p.m.: Netherlands vs. Chile at Arena Corinthians, Sao Paulo (ESPN2)

12 p.m.: Australia vs. Spain at Arena da Baixada, Curitiba (ESPN)

4 p.m.: Croatia vs. Mexico at Arena Pernambuco, Recife (ESPN)

4 p.m.: Cameroon vs. Brazil at Estadio Nacional Mané Garrincha, Brasilia (ESPN2)

Tuesday, June 24

12 p.m.: Italy vs. Uruguay at Estadio das Dunas, Natal (ESPN)

12 p.m.: Costa Rica vs. England at Estadio Mineirao, Belo Horizonte (ESPN2)

4 p.m.: Japan vs. Colombia at Arena Pantanal, Cuiabá (ESPN)

4 p.m.: Greece vs. Ivory Coast at Estadio Castelao, Fortaleza (ESPN2)

Wednesday, June 25

12 p.m.: Nigeria vs. Argentina at Estadio Beira-Rio, Porto Alegre (ESPN)

12 p.m.: Bosnia Herzegovina vs. Iran at Arena Fonte Nova, Salvador (ESPN2)

4 p.m.: Ecuador vs. France at Estadio do Maracana, Rio de Janeiro (ESPN2)

4 p.m.: Honduras vs. Switzerland at Arena Amazonia, Manaus (ESPN)

Thursday, June 26

12 p.m.: United States vs. Germany at Arena Pernambuco, Recife (ESPN)

12 p.m.: Portugal vs. Ghana at Estadio Nacional Mane Garrincha, Brasilia (ESPN2)

4 p.m.: South Korea vs. Belgium at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: Algeria vs. Russia at Estadio Beira-Rio, Porto Alegre (ESPN2)

Round of 16

Saturday, June 28

12 p.m.: 1A vs. 2B at Estadio Mineirao, Belo Horizonte (ABC)

4 p.m.: 1C vs. 2D at Estadio do Maracana, Rio de Janeiro (ABC)

Sunday, June 29

12 p.m.: 1B vs. 2A at Estadio Castelao, Fortaleza (ESPN)

4 p.m.: 1D vs. 2C at Arena Pernambuco, Recife (ESPN)

Monday, June 30

12 p.m.: 1E vs. 2F at Estadio Nacional Mane Garrincha, Brasilia (ESPN)

4 p.m.: 1G vs. 2H at Estadio Beira-Rio, Porto Alegre (ESPN)

Tuesday, July 1

12 p.m.: 1F vs. 2E at Arena Corinthians, Sao Paulo (ESPN)

4 p.m.: 1H vs. 2G at Arena Fonte Nova, Salvador (ESPN)

Share Button

My First Speaking Engagement

This past week, Alteryx and Tableau hosted a joint event in Bethesda, MD. My Alteryx sales representative called me asking if I would present a use case at the meeting. I was honored and decided to give it a go. This is the first time I’ve been asked to present outside my organization so it was a bit nerve racking. I really wanted to do a great job at telling Discovery’s successes with these tools, with hopes that others too share their experience during the networking happy hour.

If you’re not familiar with Alteryx, you should be. This a great tool that is capable of endless data possibilities. My organization only recently purchased it and it already solved a roadblock I had encountered. This is what my presentation was about. With Alteryx, we were able to join 7 disparate data sources into a single Tableau Data Extract file and create a management dashboard for our Media Operations team. While I won’t go into the entire use case here, a colleague kindly took a few pictures.

My introduction slide into the global reach of Discovery Communications.

presentation3

Discussing one of the final dashboards thanks to our Alteryx and Tableau solution. presentation

Alteryx was kind of enough to post about the event on their blog If you’re able to catch one of the future events, I highly recommend it. 

Like I said, I really enjoyed this experience, even though I was anxious about it all week. I hope to get my Alteryx skills up to where I am with Tableau. I believe we can solve a great deal of data challenges by leveraging both tools in tandem.

I also got to meet a fellow data blogger, which was exciting. Here’s a picture of Emily and me afterwards. Link to her blog.

IMG_20140410_174338

I plan on posting about Alteryx in the future on my data adventures here. Stick around for more!

Share Button

Simple Data Organization

My brother sent me this gif and I wanted to share with everyone. Simple but effective. Thanks Felipe.

datatables

 

 

Share Button

Cloud Data Connection

How many of you out there use cloud storage services such as Box, Dropbox, or Google Drive? Exactly, we all do.

It would be amazing if Tableau Desktop could connect to a data source that is stored in the cloud without the need to download it first. I know at my organization, we are moving more towards cloud storage and away from local storage. I envision the ability to connect via the download url these services provide, but the data is not downloaded. It would act just like a regular local storage connection. Taking this further, be able to publish cloud stored data sources to Tableau Server and set refresh schedules. That would be fantastic.

So if you’re for this idea, please vote for it on Tableau’s site!

http://community.tableausoftware.com/ideas/3204

And share with others!!

Share Button