Published on November 21st, 2012 | by Daniel Castro
Five Q’s on Data Innovation with Mark Whitehorn
Mark Whitehorn is a professor of analytics at the University of Dundee’s School of Computing in Scotland and the author of ten books on business intelligence. I spoke to Mark about how higher-ed programs are adapting to new demands in the era of Big Data.
Castro: What kinds of skills do data scientists need?
Whitehorn: They need to be intelligent! Oh, I see, you want specifics! They need to be good at designing new analytical techniques and be able to code them. The job also includes general skills (e.g., excellent analytical capabilities, machine learning, data mining, statistics, math, algorithm development, writing code, data visualisation, and understanding multi-dimensional database design and implementation) and specific skills such as technologies to handle big data (e.g., Hadoop and related technologies, MapReduce and its implementation on differing software platforms, and NoSQL databases) and knowledge of languages (e.g., SQL, MDX, R, and functional and OOP languages such as Erlang and Java).
General characteristics also include an insatiable curiosity, interdisciplinary interests and excellent communication skills. Duncan Ross, the Director of Data Sciences at Teradata, has said that, “The first and most important trait is curiosity. Insane curiosity. In many walks of life evolution selects against the kind of person who decides to find out what happens ‘if I push that button.’ Data Science selects for it.”
In addition, communication skills are of paramount importance. Data scientists have to be able to explain what the information means and how it was derived from the underlying data.
Kind of implied in all of this (but often not stated explicitly) is that they will often be working with big data.
So, nothing too taxing then…
Castro: Is “data science” a new field or is it just a rebranding of existing fields such as statistics or information science?
Whitehorn: Good question, with a complex answer! I would argue that it certainly is not new. Oh, the term is new, but the job has been around for years. Since the advent of handling data with computers, we have needed people who have the skills that are embraced by the job description.
But it is absolutely not simply a rebranding of existing fields. A data scientist has a mix of skills that are drawn from multiple disciplines (see the answer to the question above). One definition I like is that a data scientist is “a better software engineer than any statistician and a better statistician than any software engineer.” Data scientists are people who love playing with data, seeing the patterns – those people have been around since data was first collected.
Castro: The University of Dundee recently announced that it will launch a new graduate degree in data science. Why did the School of Computing decide to create this new program?
Whitehorn: As discussed above, we believe that data science is not new. It has also been traditionally true that you tend to find people with those skills doing research work in Universities; although the same is, of course, also true for some commercial sectors such as finance, insurance, oil and gas and so on. What has happened recently is that the need for data scientists has started to expand dramatically. Given that the School of Computing is already the leading academic institution for BI in the UK and that we already have extensive data science skills, it was a no-brainer to run the course. (Or, as we say in our restrained, British manner, “it seemed somehow appropriate to consider offering this new course to the World.”)
Castro: As businesses and government agencies increasingly become data-driven organizations, should policymakers be concerned that there will be a shortage of qualified data scientists in the workforce?
Whitehorn: Given that we have just started a course in data science, I am BOUND to say “Yes”! But you can turn that around. Why did we start a course in data science? Because we honestly believe that there is a huge shortage coming. Will there always be a job for “Data Scientists”? I have no idea; it depends on whether the term stays in fashion or not. Is there always going to be a job for people with these skills? Yes; and I believe the requirement will only grow with time. There is absolutely no doubt in my mind that the demand will be far, far greater than any one University course can possibly supply.
Castro: What advice would you offer individuals who are considering entering this profession?
Whitehorn: If you can logically see the potential of data science and want to become a data scientist because you think it would be a good career move and earn you great money, then I suggest you stay as far away from the subject as possible; you’ll hate it as a job.
If, on the other hand, you have always been unaccountably attracted to data in all its forms, and if you have ever started playing with some data early one evening and suddenly found yourself at three in the morning, cold, stiff-jointed but elated because you have cracked the code and the elusive pattern that was hidden in the data is finally revealed on the screen in front of you, then welcome to data science. The world has finally caught up with you and, better still, is prepared to pay you a great salary for doing what you already love.
You can, of course, learn on the job and work your way up. But, again, given my academic position, I am bound to say that I think it is worth gaining a qualification in the subject. I will (inevitably) also tell you that Dundee offers a great course (which we do) but there are also other good programs out there. I had the privilege recently of meeting Professor Diego Klabjan, director of McCormick’s Master of Science in Analytics (MSiA) program and professor of industrial engineering and management sciences at Northwestern University. I was really impressed with the course that he is running there.
Either way, if you are thinking of entering the profession because you love playing with data, go for it.
“5 Q’s on Data Innovation” is part of an ongoing series of interviews for Data Innovation Day by ITIF Senior Analyst Daniel Castro. If you have a suggestion for someone who should be featured, send an email to Daniel Castro at email@example.com.