
A data scientist. Huh?

I’ve been telling people that I’m going to be a data scientist when I graduate in May. That causes a lot of head-scratching, even among highly educated folks. Turns out, people don’t really know what data science is or what a data scientist does. I typically give some examples of AI in action or talk about evaluating the effectiveness of direct mail, and people nod and smile and wish me luck. Numbers, atoms, and telescopes say yes to science.

 

I have a hypothesis. Or, since I am not going to conduct any tests to validate my hypothesis, an opinion!

 

I think people, in general, don’t know what scientists do. Here are my anecdotal pieces of evidence:

  • When I used to say I was an astrophysicist, people would ask if I named stars or looked at planets. There was no conception that unknowns were waiting to be known.
  • High school biology is probably the most advanced science course that many US-educated folks take, and it is very fact-oriented. Biology as a science is more than listing the names of the parts of a cell.
  • “Belief” in science seems to wane in the public zeitgeist when new information shows that an established model needs to change.

Science is not a belief system, and science is not about static facts. Scientists use facts and curiosity to develop questions to investigate by collecting data. New data drives the need to analyze and develop models to describe and frame the data. Sometimes new data causes upheavals in established models. Typically there will be efforts to collect more information to support the old and new models. This is the scientific process.

 
This process of change confuses people who think of science as fixed information. Pluto is a planet. And then it is demoted. People get frustrated and angry at NDGT (Neil deGrasse Tyson).
 
Analysis and constant evaluation of information are what scientists do. With data.
 
All scientists are data scientists.
 
However, data scientists are trained to work with statistics and many strategies to be able to analyze all sorts of information. We learn to build predictive models that can apply across many fields. We create confidence intervals for our work. We are curious, and we investigate just like other scientists.
 

So I think part of the problem in understanding what a data scientist does is that people don’t really understand what scientists, in general, do.

 

I think the other problem is not knowing what “data” means. But… I’ll save that for another post.