Welcome to the 2020 update of the Self-Taught Data Scientist Curriculum! Smart, scrappy, and resourceful data professionals are more in demand than ever. That’s as true now as it was 3 years ago when I first published this article.

**What’s changed, however?** Plenty! First, the industry is flooded with talent from fresh grads and more mature workers who’ve invested in re-tooling their skill sets. Thankfully though, there is still more demand than supply of data professionals, so if you’re among us then you can join in that *happy dance*.

What else has changed? Well, since then, I’ve gone on to train over 1 million workers on how to do data science and machine learning (with my 5 LinkedIn Learning courses available through this link HERE, and my book Data Science For Dummies HERE).

And while we should be jumping for joy that there is a more data-educated workforce to staff business requirements…

**I’ve seen data initiatives and data professionals STRUGGLING profoundly in 2 big ways:**

- **85% of data projects FAIL** – that’s according to Gartner! It’s been this way… It seems like there are lots of people focusing on how to build data solutions, and very few people focusing on making sure those projects actually generate profits for the company. *Yikes!*
- **With more data professionals on the market, it becomes harder and harder for individual data workers to gain traction as leaders in the industry.**

To solve these 2 problems in one FAST & FUN fell swoop… I’ve recently released Winning With Data. It is chock-full of **data career quick win challenges** that are focused on upgrading a data professional’s skills & visibility with respect to 4 main superpowers: **Data Strategy, Project Management, Thought Leadership, and (Organizational) Leadership.** The whole point of this product is to get you data career wins within 30 days. You can see more about that HERE.

Some data scientists are trained in academia, and that’s fine. For people with degrees in non-quantitative fields, I recommend those formal academic programs. And then there’s the driven data scientist, the dedicated data scientist, **the self-taught data scientist**! These are the people who aren’t afraid to go in deep with data, math, and code. These are the type that love to explore the numbers and know that they don’t need some academic professor forcing assignments down their throat in order to make progress in a field.

**If that’s you, then welcome to the club!**

If you’ve been following along with the Data-Mania blog, then you’ve already researched and identified the skills you need to land a job in data science. If you haven’t gotten that far, worry not – I broke the process down inside this FREE 52-PAGE GUIDE for breaking into data. You can download that and get the whole step-by-step process for free.

*(Hhheeeyyy – Let’s help each other out by crowd-sourcing the research. Why not?! In the comment section, write the title of the specific role you research and the top 5 skills that are needed for this role.)*

Ok, so… you’re (going to be) a self-taught data scientist. You know what skills you need to master. Now let’s take a look at some of the best places you can go online to learn these skills. The common skills you need to acquire include (the obvious):

- **Python (for data scientists and engineers)**
- **R (for data scientists)**
- **Spark (for data scientists and engineers)**
- **Tableau (for data scientists and other analytics professionals)**
- **Hadoop (for data engineers)**
- **SQL (for data scientists and engineers)**
- **NoSQL (for data engineers)**
- **Machine Learning (for data scientists)**
- **Deep Learning (for data scientists)**
- **Natural Language Processing (for data scientists)**

**Helpful note:** If you are deciding which skill to master first, I recommend that you learn the skill that is as versatile as possible (notice how Python, Spark, and Tableau are useful in more than one data niche?).

## Courses for The Self-Taught Data Scientist and Engineer

The listing below is just a small sample of the courses I recommend. These are generalist courses aimed to please the self-taught data scientist or engineer. [*Please note: These are the materials I recommend to clients and to friends, because I believe in the quality of their content. If you purchase through some of these links though, I may earn a small commission on your purchase.*]

### Python

- Python for Data Science Essential Training – Part 1
- Python for Data Science Essential Training – Part 2
- [Python (for Data Engineering)] course

### R

### Spark

### Tableau

### Hadoop

### NoSQL

### SQL

### Machine Learning

### Deep Learning

### Natural Language Processing

You may have noticed the absence of Coursera and other MOOC courses here. My experience is that courses on Udemy, LinkedIn Learning, and Udacity are much more friendly. By that I mean, you don’t feel like a freshman attending a weed-out course. Instead, the courses that I recommend are designed to make it as easy as possible for you to succeed. I mean, that is the goal – right?

**Now that you’ve seen my list of recommended self-training materials, why not recommend a few of your personal favorites in the comments section below!?!**

*Click **HERE** to subscribe for special newsletter-only updates & free LinkedIn Live TV episodes with live Q&A access to Lillian!*

Consider a camera that has numerous parameters that can be set to improve the quality of the image produced, depending on unknown environmental conditions. Training data consists of sample images, a measure of quality, and the parameter set used to configure the camera. How do you train a model that, given a set of example images and their parameters, can predict how the parameters should be set to produce an optimal image? It’s a complex regression problem and I have some ideas but would love your thoughts.

You need to look into using deep learning for this 🙂

Try multiple regression analysis with correlation analysis. The choice of your variables can yield some amazing results. All the best.

Awesome suggestion! Thanks for saying hi!

What I think is missing in your list of skills are some applied math skills such as linear algebra, calculus and at least some hard core statistics exposure enough to understand the base concepts of distributions and probability – don’t forget Bayes.

Thanks for your input. I consider those as standard for anyone with a quantitative degree, more or less…

Derrick, Lilian,

I agree with Derrick!

I think a little revision in maths (linear algebra, statistics and probability,…) is mandatory to really understand the data sciences. (By the way, currently that is what I am doing!)

You two make good points – however, you should already have these if you have a degree in a quantitative area. That was my assumption, at least, when I wrote this. Also the linear algebra used in data science is not much more than what you use in statics and dynamics for engineering, so… I guess this all depends on what type of degree you got.

It is quite a leap to assume that the reader/learner has a degree of any kind. So many people in the world of IT learn their skills because of a passion rather than by attending some formal school.

As someone who is in mid-career, looking to change direction, going back to get a relevant degree simply isn’t an option as I couldn’t then support my family. So I have to learn on the job at a company that currently makes no use of such skills, the idea being that I’m trying to introduce them to this, to improve their use of data and start making strategic decisions influenced by accurate data and by data discovery not considered before. The amount of learning needed does include brushing up on math, as it’s not a skill I’ve ever really needed to put into practice since leaving education.

On a side note, I highly recommend DataCamp.com for some structured learning in either Python or R for data science or data analytics.

Good luck to you Robi

Don’t forget about self-taught bioinformaticians who in a sense become data scientists rather than biologists. The amount of maths and stats in biology courses is negligible so deeper understanding of the subject is definitely recommended.

You know what Anna? You’re absolutely right about that. I guess not all STEM degrees are super quantitative – but doesn’t bio at least require Calculus and Statistics, or no?

I guess that depends on where you get your degree. I got mine in UK and we had some very basic stats classes and no calculus at all.

Oh – yeh, I guess it does depend

Nice list, pretty comprehensive on the tech side I would say. What I also find important is to find a passion for a field, be it health, finance or something like retail and sales. If you’re spending weeks on building something, you want to do that where you feel dedicated. This also made it much easier for me to arrive at questions I want to answer from the data, which then resulted in true value for the client.

I couldn’t agree more. Unfortunately this type of specialization can’t be included in a generic curriculum – but, you are correct!

It would be very useful to detail the actual maths and statistics skills required, for those without a degree that covered maths.

hmmm, well – the math you need would be Calculus, Probability & Stats, and Linear Algebra. But I am not sure you can short-cut out of the quantitative degree plan just by taking these math classes. The thing about quantitative degrees is that they (should) teach you how to solve problems on your own… how to teach yourself quantitative subjects in order to get the solutions to the problems you face. A few math classes are only going to teach the math, but not the applied problem solving skills… make sense?

Do you need a PhD in maths to do data science? If so, data scientists are likely to be in short supply forever, as it takes 6 years to train for a PhD in maths in the UK.

Nope, def don’t need a PhD in math to do data science! Coders and people with practical STEM degrees seem to be forever in short supply, although all sub-sections are constantly growing and evolving.

So what if I do not have a degree that provides the quantitative math skills that seem to be highly beneficial in learning Data Science?

Is there a way to teach oneself these mathematical skills?

Yes, but you’re better off (competitively) in analytics or data viz instead of data science.

Thanks for the reply Lillian! Could you just explain what you mean by analytics and data viz (do you mean visualization?)

Do you teach that?

I do, but only in live training at the moment…

It’s a blurred line, but where I work, the analysts generally work with historical data, and the data scientists tend to work with more real-time data. Like I said though, the line is blurred and a lot of work at different organisations can actually be very similar.

Data visualisation can involve working with business stakeholders to provide solutions to their problems where they need access to data. This often involves creating dashboards in programs like Tableau, Qliksense, RShiny etc. Data visualisation can often be blurred with Business Intelligence roles.

Excellent point, Thomas. Yes, they are fuzzy classes – but for the purpose of hiring and training — we must make some sort of meaningful distinction. Thank you for adding this point to the conversation!

This answers my question.

I’m Spanish and I would like to know if your webinar will be live. What time will it be in Spain? My English level is average. Thank you.

What webinar, sorry?

Probability and statistics. Bayesian methods. Solid tech skills… learning how things operate under the hood (it’s easy enough to pick up any given tool given a solid tech background). Learn how to program, period. Acquire a good understanding of all things data (databases, data structures, data analysis, data modeling, data visualization, ETL processes, etc.). Be more of a thinker, problem solver as opposed to a robotic doer. Acquire a solid business acumen.

Great adds!! There is A LOT to learn, but people have to start somewhere, right?

Hi Lillian,

I really appreciate this perspective that you provide in this article, and I also discovered that Udemy has many of their courses for $10.99 now! Have you ever used datacamp.com? If so, what are your thoughts? I find this site to be user-friendly and also offer a nice selection of courses. Thanks!

– Andrew

I haven’t tried DataCamp, but I LOVE Udemy. We are so lucky now. There were no data science courses on Udemy in 2012.

As someone with a minor in economics and a BA in journalism, where would be a good place to start learning for political polling and such? I’ve taken Calc II at my uni and some advanced micro that emphasized game theory. But I wasn’t required to take probability. I assume all the courses you posted would be helpful once I know what I need to start with.

Have you looked into the work of Nate Silver?

Hi there Lillian, what I’m gathering from your responses to the previous comments here from non-quant degree holders like myself is that if we’re not going back for a quantitative degree, data science is not within reach, or we simply won’t be as good as technical degree holders. Is that correct? I’ve wandered through a few of the articles and PDFs here and that appears to be the sentiment; can you confirm? Thanks for your insight.

Hi Tierra, for people who don’t have technical degrees, I often recommend that they look into a data visualization or analytics role. There are still data roles that could be a good fit, depending on the type of training and experience a person has.

Hi Tierra,

Even if you’re not good at math and programming, you can still become a data scientist. It will just take you more effort and time to do so. With enough hard work and dedication, you can have the skills of an entry level data scientist within a year.

I’m a middle aged mom. I’ve been out of the workforce for over twelve years. Still, that’s not stopping me from trying to become a data scientist myself.

It’s good to ask advice from people like Lillian who have experience in data science but, in the end, it’s up to you to make the final decision on whether or not you want to become a data scientist. With all the online resources available, there are no longer any entry barriers to this field.

If you need encouraging words, I recommend a blog post by James Kobielus called Closing the Gap about the role self-taught data scientists play in closing the data science talent gap.

Best wishes to you.

Love this. Thanks for helping her out Bianca <3

A contrary opinion is that you are only limited by the time you’re willing to invest in learning the skills required to be a data scientist. If your background in math is limited but you’re willing to put in the time needed to learn, Khan Academy provides a complete math curriculum from kindergarten math to multivariate calculus. https://www.khanacademy.org/math . Also, OpenIntro provides a variety of statistics resources geared at the high school level. https://www.openintro.org/stat/

My experience teaching inferential statistics to doctoral nursing candidates shows that when people have a reason to learn something, they’ll invest the time to learn it.

When I coach people regarding the development of data science skills, I encourage them to find a problem they want to solve that is related to their current job role or an interest outside of work, and develop the skills needed to solve that problem.

Hi Lillian,

Thank you for the insightful thoughts and recommendations. I would like to get some advice/recommendations from your side. I’m interested to know which statistical methods do you recommend for the price elasticity model? What is the starting procedure? Is it really important to look at econometrics models?

Thank you.

Edward

Hi Edward, I’d look into some online courses on econometrics. Best of luck!

If you are not strong on the math side of things (like me), I found a few resources that were helpful for me. Paco Nathan has an O’Reilly video series called “Just Enough Math”.

https://learning.oreilly.com/videos/just-enough-math/9781491904077

“many business people need just enough math to take advantage of open source frameworks for big data. This video course from Paco Nathan and Allen Day presents useful areas of advanced math in easy-to-digest morsels. If you’re familiar with high school Algebra 2 and basic statistics, you’re good to go.”

And for refamiliarization and some advanced subjects, Khan Academy has some useful courses to get the basics again, and then learn new things.

https://www.khanacademy.org/math

These resources might help you out if you are not strong or have not “Done Math” for a while.

It makes it easier to understand ML courses if you understand the math, or are familiar with the language of the math.

There are of course many other resources out there: Coursera and edX, to name a few.

Hope this helps someone

This is a great addition!! Thank you Jake!