## Introduction

As a student in the Eastern University MS in Data Science program, I’ve been taking courses in Python, R, statistics and databases for the last 9 months. I’m halfway through the program now, and I thought it would help potential students to have more information on what each course I’ve taken so far is about.

This is a continuation of my REVIEW: Eastern University Master of Science in Data Science 2021. I expect to make another post once I’ve gotten further into the program.

### Courses I’ve Completed:

- DTSC-520 Fundamentals of Data Science
- DTSC-550 Introduction to Statistical Modeling
- DTSC-650 Data Analytics in R
- DTSC-575 Principles of Python Programming
- DTSC-660 Data and Database Management with SQL

### Upcoming courses:

- DTSC-670 Foundations of Machine Learning Models (starts August 30)
- DTSC-680 Applied Machine Learning
- DTSC-600 Information Visualization
- DTSC-690 Data Science Capstone: Ethical and Philosophical Issues in Data Science
- DTSC-691 Data Science Capstone: Applied Data Science

## DTSC-520 Fundamentals of Data Science

This course is the first one in the program, so it covers a lot of material. It’s an introduction to Python and the Anaconda distribution, numpy, pandas, matplotlib, seaborn, and then the general principles of data science.

This class is a whirlwind. There are optional coding assignments that you can complete, but it is mostly theory-based. On exams you’ll be asked things like what a specific slice of a string is or how many times a loop will repeat, but you won’t have to submit detailed coding assignments.
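To give a sense of that style of question, here is a minimal sketch in Python. The string and loop values are made up for illustration and are not taken from an actual exam.

```python
# Hypothetical exam-style questions; the values are invented for illustration.
word = "data science"

# Slicing uses a start index (inclusive) and a stop index (exclusive).
first_four = word[0:4]    # characters at indexes 0, 1, 2, 3 -> "data"
last_seven = word[-7:]    # the final seven characters -> "science"
every_other = word[::2]   # every second character, starting at index 0

# How many times does this loop body run?
repeats = 0
for i in range(2, 11, 3):  # i takes the values 2, 5, 8
    repeats += 1           # so the body runs 3 times

print(first_four, last_seven, every_other, repeats)
```

Working through a few of these by hand before running them is good practice for a theory-heavy exam.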

This course has 4 exams currently, each one worth 25% of your grade.

## DTSC-550 Introduction to Statistical Modeling

This course is, again, a theory course, but this one focuses on statistics and R. It goes right from the basics of measures of central tendency into variance, covariance, standard deviation, and hypothesis testing (t-tests, z-tests, and ANOVA).

It also discusses parametric and nonparametric statistical testing and includes optional labs in R. I actually skipped the labs, which was a mistake (!) because it made 650, the course that came after, quite a bit harder than it needed to be.

This course includes 5 exams, each worth 20% of your grade.

## DTSC-650 Data Analytics in R

This course is an introduction to R. It continues the statistical education but focuses on applying all of the concepts you learned in 550 with the R programming language. While 550 briefly touches on linear and logistic regression, 650 goes into depth with how to perform these in R and how to interpret the results.

650 also adds other components like using the AIC to assess model fit, how to interpret R-squared and how to use the Bonferroni correction to adjust p-values.
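As a rough illustration of the Bonferroni correction, here is a minimal sketch. The course does this in R (where `p.adjust` with `method = "bonferroni"` handles it); the Python below is only a stand-in, and the function name `bonferroni_adjust` and the p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three separate tests.
raw = [0.01, 0.04, 0.30]
adjusted = bonferroni_adjust(raw)

# Compare the adjusted values against the usual 0.05 threshold.
significant = [p < 0.05 for p in adjusted]

print(adjusted)
print(significant)
```

With three tests, a raw p-value of 0.04 no longer clears the 0.05 threshold after adjustment, which is the point of the correction: the more tests you run, the stronger the evidence each one needs.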

When I took it, this course was graded 60% on exams and 40% on coding work (CodeGrade assignments plus a final project).

CodeGrade assignments are repeatable coding assignments, where you’re given a dataset and then asked to answer questions on it.

For example, one of the datasets included a series of orders from a pizza place, and you might be asked, “Find the average number of orders delivered by Lisa on Fridays” or “Write a regression predicting whether a customer got wine based on their total order price and day of the week.” These assignments were really enjoyable and helped solidify my knowledge of R, but they ended up taking me a long time because I had never used R until then.
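The first of those questions boils down to a filter followed by an average. Here is a minimal sketch of that logic, shown in Python rather than R; the rows and column names are invented and do not come from the course dataset.

```python
from statistics import mean

# Hypothetical rows: one record per driver per day (not the course data).
deliveries = [
    {"driver": "Lisa", "day": "Friday", "orders": 12},
    {"driver": "Lisa", "day": "Friday", "orders": 18},
    {"driver": "Lisa", "day": "Monday", "orders": 7},
    {"driver": "Omar", "day": "Friday", "orders": 15},
]

# Keep only Lisa's Friday rows, then average the order counts.
lisa_fridays = [
    row["orders"]
    for row in deliveries
    if row["driver"] == "Lisa" and row["day"] == "Friday"
]
average_orders = mean(lisa_fridays)

print(average_orders)
```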

There were 8 CodeGrade assignments making up 30% of the grade, and 4 exams making up 60% of the grade. My average CodeGrade assignment was 40 lines of code.

The last 10% of the grade was a large final project. This was also done in R, and it used a real dataset: https://www.kaggle.com/cdc/behavioral-risk-factor-surveillance-system.

We had to answer a variety of questions and also do our own analysis (exploratory data analysis, regressions, etc.). It was a lot of work but also a lot of fun. My assignment was about 400 lines of R, and I cut some of that down by writing a function to calculate the specific summary statistics I wanted for the variables I had selected.

## DTSC-575 Principles of Python Programming

This course is an introduction to Python. It reviews what you did in 520 and adds object-oriented programming. The first module is basically a review of the Python from 520, with some added material on list comprehensions.
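For anyone who hasn’t seen a list comprehension before, here is a minimal sketch comparing one to the equivalent loop. The numbers are arbitrary.

```python
numbers = [1, 2, 3, 4, 5, 6]

# The long way: loop, test, append.
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n * n)

# The same result as a list comprehension: expression, loop, optional filter.
squares_comp = [n * n for n in numbers if n % 2 == 0]

print(squares_loop, squares_comp)
```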

Module 2 covers strings and string formatting (both also covered in 520), conditionals, the walrus operator, and loops, with the addition of the break and continue statements.
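Here is a minimal sketch of `continue`, `break`, and the walrus operator (`:=`, which needs Python 3.8 or later). The lists are made up for illustration.

```python
# Hypothetical sensor readings; 0 acts as an "end of data" marker.
readings = [3, -1, 8, 0, 12, 5]

kept = []
for value in readings:
    if value < 0:
        continue  # skip this value and move on to the next one
    if value == 0:
        break     # stop the loop entirely
    kept.append(value)

# The walrus operator assigns and tests in one expression, so len(w)
# is computed once and reused in the output tuple.
words = ["pandas", "r", "numpy", "sql"]
long_words = [(w, n) for w in words if (n := len(w)) > 3]

print(kept)
print(long_words)
```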

Module 3 covers writing functions (including parameters and arguments), decorators, exceptions, and lambdas, which are small anonymous functions written as a single expression.
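As a rough sketch of how those pieces fit together, the example below wraps a function in a decorator, handles an exception inside it, and defines a lambda. The names `log_calls` and `safe_divide` are hypothetical and not from the course.

```python
import functools


def log_calls(func):
    """Decorator that records the arguments of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []  # the call history lives on the wrapper itself
    return wrapper


@log_calls
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when dividing by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


# A lambda is an anonymous function limited to a single expression.
square = lambda x: x * x

result_ok = safe_divide(10, 4)
result_bad = safe_divide(1, 0)

print(result_ok, result_bad, square(7), len(safe_divide.calls))
```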

Module 4 covers object-oriented programming: how to create classes and objects, and how to make one class the parent or child of another, which is called inheritance.
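Here is a minimal sketch of inheritance. The class names `Course` and `CodingCourse` are hypothetical; the child class reuses the parent’s setup through `super()` and overrides one method.

```python
class Course:
    """Parent class: holds what every course has."""

    def __init__(self, code, title):
        self.code = code
        self.title = title

    def describe(self):
        return f"{self.code}: {self.title}"


class CodingCourse(Course):
    """Child class: inherits from Course and adds an assignment count."""

    def __init__(self, code, title, assignments):
        super().__init__(code, title)  # reuse the parent's initializer
        self.assignments = assignments

    def describe(self):
        # Extend the parent's description instead of rewriting it.
        return f"{super().describe()} ({self.assignments} coding assignments)"


python_course = CodingCourse("DTSC-575", "Principles of Python Programming", 24)
print(python_course.describe())
print(isinstance(python_course, Course))
```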

Module 5 is called “odds and ends” and it goes over how to do statistics in Python, including how to use the scipy package, and how to do different tests from 550 and 650 in Python including ANOVA, t-test and linear regression.

This course was surprisingly short but packed in a lot of material. There are 24 small CodeGrade assignments, each under 10 lines of code (under 240 lines in total). By contrast, 650 had 8 assignments averaging 40 lines of code each (about 320 lines in total).

I did need to look up the quadratic formula to answer one of these questions, but otherwise it was pretty straightforward.
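For reference, here is a minimal sketch of the quadratic formula in Python. This is not the assignment’s solution; the function name `solve_quadratic` is hypothetical, and `cmath` is used so that a negative discriminant returns complex roots instead of raising an error.

```python
import cmath


def solve_quadratic(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 as complex numbers.

    Uses x = (-b +/- sqrt(b**2 - 4*a*c)) / (2*a).
    """
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return (
        (-b + root_of_discriminant) / (2 * a),
        (-b - root_of_discriminant) / (2 * a),
    )


# x**2 - 3x + 2 = 0 factors as (x - 2)(x - 1).
roots = solve_quadratic(1, -3, 2)
print(roots)
```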

## DTSC-660 Data and Database Management with SQL

I completed 660 and 575 at the same time. In retrospect, that was a bad idea. This course turned out to have 20 hours of video, 5 exams and 4 assignments!

The first two modules focus on the basics of database design. This is a lot of theory and mostly involves just drilling the definitions and trying to understand how they all fit together.

The first assignment involves designing an entity-relationship (ER) diagram for a fictional business. Assignment 2 is designing a relational schema for a fictional business and answering some questions about primary and foreign keys, among others.
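To make primary and foreign keys concrete, here is a minimal sketch of a two-table schema. It uses Python’s built-in `sqlite3` module as a stand-in so it runs without a database server; the course itself uses PostgreSQL, and the `customer` and `purchase` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- uniquely identifies each customer
    name        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
    total       REAL NOT NULL
);
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase (customer_id, total) VALUES (1, 19.99)")

# A purchase that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO purchase (customer_id, total) VALUES (99, 5.00)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
print(orphan_rejected, purchase_count)
```

The foreign key is what ties the two tables together: every row in `purchase` has to reference a real row in `customer`.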

Modules 3-6 were 1000% better than Modules 1 and 2. Starting in Module 3, the professor walks through PostgreSQL syntax and shows you how to accomplish different tasks. This is a comprehensive course (as the 20 hours of video indicate), and you will be well-versed in SQL when you are finished.

Assignment 3 is to write a short SQL query, worth 3%.

Assignment 4 is pretty big. It involves writing a number of SQL queries, some procedures, functions and triggers, all in PostgreSQL. Assignment 4 is worth 20% of the grade.
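As a rough idea of what a query plus a trigger looks like, here is a minimal sketch. It again uses Python’s built-in `sqlite3` module as a stand-in; PostgreSQL’s trigger syntax differs (a trigger there calls a separate trigger function), and the `orders` and `order_audit` tables are hypothetical, not from the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    total    REAL NOT NULL
);
CREATE TABLE order_audit (
    order_id  INTEGER,
    old_total REAL,
    new_total REAL
);
-- Whenever an order's total changes, record the old and new values.
CREATE TRIGGER log_total_change
AFTER UPDATE OF total ON orders
BEGIN
    INSERT INTO order_audit (order_id, old_total, new_total)
    VALUES (OLD.order_id, OLD.total, NEW.total);
END;
""")

conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Ada", 20.0), ("Ada", 35.0), ("Grace", 12.5)],
)

# This update fires the trigger once, for order 2.
conn.execute("UPDATE orders SET total = 40.0 WHERE order_id = 2")

# An aggregate query: total spend per customer.
totals = conn.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
audit = conn.execute(
    "SELECT order_id, old_total, new_total FROM order_audit"
).fetchall()

print(totals)
print(audit)
```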

Finally, the last module, Module 6, is on Git and GitHub. I was really happy to see this module because I wanted to create a GitHub account and start posting my contributions. I still need to do some more project work and get it looking nice; right now it’s just used as a repository for work-in-progress code.

For DTSC-660, all the assignments total up to 44%. The 5 quizzes are worth 56%. There are 6 modules, so one module does not have a quiz, since it has Assignment 4 in it.

## Conclusion

I have really enjoyed this program. I am super excited for DTSC-670, which is the Foundations of Machine Learning course. I’ve already gotten to Chapter 4 in the textbook and hope to get up to Chapter 5 by the time the course starts (the course actually goes up to Chapter 7 in the book).

Hi Dustin!

Thanks for your thorough review. I am wondering if you have any advice for a total newb as far as course load. I am working full time but have few commitments outside of my 40 hour work week schedule. So far, which classes do you think can be reasonably combined and which ones would you recommend taking by themselves?

Hi Shea,

There is a suggested course outline given on the EU website. I’d recommend sticking with one course at a time if you can afford to, so that you can focus on each course. The suggested guideline for 2 courses is:

- Term 1: 520/550
- Term 2: 650/660
- Term 3: 575/670
- Term 4: 600/680
- Term 5: 690/691

If you’re doing one course at a time, you’d follow that same progression; you would just take 520 in Term 1 and 550 in Term 2. The hardest courses are 660 (Intro to Database Design/SQL), 670 (Foundations of Machine Learning) and 680 (Applied Machine Learning). 650 (Data Analytics with R) was probably my favorite course in the program even though I’d never used R before.

The “easiest” courses will differ person to person but I would say 575 and 600 are probably the most manageable. The suggested course outline pairs up one of those most difficult courses with an easier course, you’ll notice.

To reiterate, I would start with one course at a time and only double up if you’re really sure you can handle the workload. Better to finish later than to rush and not learn as deeply as you would like.

Happy learning!

Dustin

Hi Dustin!

Thank you for taking the time to provide such in depth info on this program. I was wondering how much time there is between the 7 week sessions. Thanks in advance!

Hi Antonia,

Most sessions have a week between them. This winter session actually has a month, and a few sessions have no gaps at all. You can see the full breakdown here: https://www.eastern.edu/about/offices-centers/office-registrar/academic-calendars/2020-2023-accelerated-7-week-schedule

Hope this helps,

Dustin

Hi Dustin,

Thanks for putting this together. You mention a textbook for DTSC-670, can you share the name of it?

Thanks

Hi Joe,

The textbook for 670/680 is Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 2nd Edition by Aurélien Géron. It’s pretty inexpensive to buy and a great reference.

Dustin

Hello, I just signed up to start January 10th. Wondering if you have any additional reviews of any courses? This Blog definitely helped me decide that I wanted to apply. How far along are you now?

Hi Scooby,

This is the most recent review I’ve written. I just completed DTSC-680 last term. On January 10 I’ll start DTSC-600 (Information Visualization) and DTSC-690 (the first of the two capstone courses). I’m really enjoying the program and I’ll definitely write more once I’m done with the program in April.

Dustin

Thanks for getting back with me and good luck!