I recently completed a Statistics for Data Science course at the University of Toronto School of Continuing Studies, and I wanted to share my reflections on the experience.
Overall, the course was mostly interesting, occasionally boring, and always challenging.
I should begin by admitting that I managed to avoid any heavy maths or statistics classes in university, even though I completed science and business degrees. I certainly felt behind when math notation and equations started popping up in class. But where there's a will, there's a way (mostly).
Most lectures were hard to follow because of the pace. I even tried reading ahead for some lectures to be better prepared, but that helped only marginally. As a result, I often sat in class with material that was way over my head, questioning whether spending that weekday night on campus was a better use of my time than self-studying at home. A younger me might have panicked and wondered what I had gotten myself into, or worried that I would be outed as a fake. And I don't think I was the only one, as I saw classmates drop out in the first few weeks.
There is, however, one skill that I've developed since finishing my college studies that came in handy - learning how to learn. So, I tackled this problem like any other and focused on finding the best resources suitable for me.
I started with the required Statistics textbook and found its explanations and sample problems inviting to beginners. The lectures mostly followed the chapters in the textbook, and I actually ignored the course lecture slides for the first three-quarters of the course. I also supplemented with Khan Academy, which does a great job of breaking material down into small snippets with videos. It's incredible that all of this material is available online for free - I highly recommend it. Lastly, I had a friend who was a bit more experienced, whom I would ask for help when needed. Friendships are the biggest reason why I usually recommend in-person courses over online studies.
So...was it worth it? Yes, but your mileage may vary.
In the working world, the satisfaction that you get out of something is a reflection of the effort that you put in. No one is watching over your shoulder, and no one cares if you slack. Also, the rewards are more abstract - there are no incredible fitness transformation photos that you can show everyone after training for 12 weeks.
What I did get out of the course was a revised mental framework for examining decisions, better language for working with data scientists, and, as mentioned, new friendships.
How many of us make decisions every day based on performance reports or trend analyses, or present business cases to internal stakeholders or clients? Would you feel more prepared if you had a framework to critically assess the underlying data before making a decision?
Not to get too technical, but I've stepped back and started considering outcomes in terms of conditional probabilities - the idea at the heart of Bayesian statistics. This change also coincided with a book I was reading, Principles by Ray Dalio, which talks about expected value - weighing risks and rewards against their probabilities. Many of us already make decisions taking probabilities and expected value into account, but I think about these factors more explicitly now.
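To make that a little more concrete, here is a minimal Python sketch of the two ideas: updating a probability with Bayes' rule when new evidence arrives, and weighing payoffs by their probabilities to get an expected value. The scenario and every number in it are made up for illustration, and `bayes_update` and `expected_value` are hypothetical helper names, not something from the course or from any library.

```python
def expected_value(outcomes):
    """Return the sum of probability * payoff over mutually exclusive outcomes.

    `outcomes` is a list of (probability, payoff) pairs whose probabilities
    must sum to 1.
    """
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)


def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# Hypothetical scenario: will a new product launch succeed?
prior = 0.30       # assumed belief before seeing any data
win = 100_000      # assumed payoff if the launch succeeds
loss = -40_000     # assumed payoff if it fails

# Assumed evidence: a pilot study came back positive. Suppose positive pilots
# show up 80% of the time for eventual successes and 20% of the time for
# eventual failures.
posterior = bayes_update(prior, p_evidence_if_true=0.80, p_evidence_if_false=0.20)

ev_before = expected_value([(prior, win), (1 - prior, loss)])
ev_after = expected_value([(posterior, win), (1 - posterior, loss)])

print(f"P(success) before pilot: {prior:.2f}, after pilot: {posterior:.2f}")
print(f"Expected value before pilot: {ev_before:,.0f}, after pilot: {ev_after:,.0f}")
```

The point is not the specific numbers but the habit: write down a prior, ask how much more likely the evidence is if the hypothesis is true than if it is false, and only then weigh the payoffs.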
Another benefit is the ability to better relate to data scientists and, ultimately, develop more effective working relationships. I may not know the intricacies of Markov chain Monte Carlo, but I have been exposed to the topic. Actually, one of my metrics for how well I can relate to a functional role is whether I can get a person to laugh at a joke related to their discipline, whether they work in sales, marketing, client services, development, or data science. I suppose I'm always building up a joke supply.
Lastly, friendships naturally develop over the course of the program through familiar faces and group projects. It's fascinating to meet working professionals from different backgrounds who have opted to drag themselves to a stats class after work for 3 months during the winter. There's a shared bond, and it's certainly one of the most enduring satisfactions of the course.
Kerry's Recommendations for Data Science Statistics Resources
Good
- Great introduction for Stats theory: OpenIntro Statistics (https://www.openintro.org/stat/textbook.php)
- Dynamic introduction for Stats theory: Khan Academy (https://www.khanacademy.org/math/statistics-probability)
- Best resource combining Stats theory and Programming: Probabilistic Programming and Bayesian Methods for Hackers (https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers)
Not Great
- Even though it's written by Allen Downey, I found his book to be too high-level: Think Bayes (https://github.com/AllenDowney/ThinkBayes)