The Differences Amongst Types of Data Scientists and Technologies

Being a computer nerd can be cool, but only if you know the new (actually, old) buzzwords: machine learning, deep learning, AI, statistics, IoT, operations research, and applied mathematics.

Categories of data scientists

  • Statisticians: statistical modeling, data reduction, et cetera
  • Mathematicians / operations researchers: typically at the NSA or in defense and military work, where they are usually titled operations researchers
  • Data engineers: Hadoop, APIs, Analytics as a Service, et cetera
  • Machine learning / computer scientists: algorithms, computational complexity
  • Business analysts: ROI optimization, decision sciences, high-level database design
  • Software engineers: production code in a few programming languages
  • Data visualization specialists: specialists with varying backgrounds
  • Spatial data specialists: often specialists with a specific background in data modeled by graphs and in graph databases
  • Hybrid: those strong in a few of the above

AI vs Machine Learning vs Deep Learning vs NLP 

  • AI (artificial intelligence) is a subfield of computer science that was founded in the mid-1950s, and it was (and is) concerned with solving tasks that are easy for humans but hard for computers. In particular, a so-called Strong AI would be a system that can do anything a human can (perhaps excluding purely physical tasks). This is fairly generic and includes all kinds of tasks, such as planning, moving around in the world, recognizing objects and sounds, speaking, translating, performing social or business transactions, and creative work (making art or poetry).
  • Machine learning is concerned with one aspect of this: given some AI problem that can be described in discrete terms (e.g. out of a particular set of actions, which one is the right one), and given a lot of information about the world, figure out what the “correct” action is without having the programmer program it in. Typically some outside process is needed to judge whether the action was correct or not. In mathematical terms, it’s a function: you feed in some input and you want it to produce the right output, so the whole problem is simply to build a model of this mathematical function in some automatic way. To draw a distinction with AI: if I write a very clever program that has human-like behavior, it can be AI, but unless its parameters are automatically learned from data, it’s not machine learning.
  • Deep learning is one kind of machine learning that’s very popular now. It involves a particular kind of mathematical model that can be thought of as a composition of simple blocks of a certain type (function composition), where some of these blocks have parameters that can be adjusted to better predict the final outcome.
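
To make the last two distinctions concrete, here is a minimal Python sketch. It is only an illustration under assumptions: every name in it (hand_written_rule, train, two_block_model, and so on) is made up for this post, the data is invented, and the gradients are estimated with finite differences to keep it dependency-free, whereas real libraries use backpropagation. The first part contrasts a rule the programmer typed in with the same rule learned from examples; the second part builds a model as a composition of simple blocks and adjusts their parameters.

```python
# Illustrative only: every function and variable name below is hypothetical.

def hand_written_rule(x):
    # A programmer-chosen rule: the constants 2 and 1 are typed in, not learned.
    return 2.0 * x + 1.0


def mean_squared_error(model, params, xs, ys):
    # The "outside process" that judges how correct the outputs are.
    return sum((model(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(model, params, xs, ys, steps, lr, eps=1e-6):
    # Gradient descent with finite-difference gradients (an assumption made
    # for brevity; real frameworks compute gradients by backpropagation).
    params = list(params)
    for _ in range(steps):
        grads = []
        for i in range(len(params)):
            up = params[:i] + [params[i] + eps] + params[i + 1:]
            down = params[:i] + [params[i] - eps] + params[i + 1:]
            grads.append((mean_squared_error(model, up, xs, ys)
                          - mean_squared_error(model, down, xs, ys)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# 1. Machine learning: the same linear form, but w and b are learned from data.
def line(params, x):
    w, b = params
    return w * x + b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [hand_written_rule(x) for x in xs]  # examples of the "correct" output
learned_w, learned_b = train(line, [0.0, 0.0], xs, ys, steps=2000, lr=0.05)


# 2. Deep learning: a composition of simple blocks with adjustable parameters.
def relu(z):
    return max(0.0, z)


def two_block_model(params, x):
    w1, b1, w2, b2 = params
    return w2 * relu(w1 * x + b1) + b2  # linear block, then relu, then linear block


xs2 = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys2 = [0.0, 0.0, 0.0, 2.0, 4.0]  # made-up target: 2 * max(0, x)
start = [1.0, 0.0, 1.0, 0.0]
loss_before = mean_squared_error(two_block_model, start, xs2, ys2)
tuned = train(two_block_model, start, xs2, ys2, steps=200, lr=0.02)
loss_after = mean_squared_error(two_block_model, tuned, xs2, ys2)

print(f"learned line: w={learned_w:.3f}, b={learned_b:.3f}")
print(f"two-block loss: {loss_before:.3f} -> {loss_after:.3f}")
```

Note that hand_written_rule and line compute the same kind of function; the only difference is where the numbers come from. Two blocks is far too shallow to count as "deep" in practice, but the structure is the same idea at a small scale.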
