Statistics For Data Science

This unit highlights the importance of statistical methods and tools for today’s data scientists, business managers, and data analysts, and demonstrates how to apply these methods to business problems using real-world data. The quantitative skills that students learn in this unit are useful at every stage of the data science journey, from data understanding to model building to communicating results. In this unit, students learn how to model and analyse the relationships within business data; how to identify the appropriate statistical technique in diverse business environments; how to interpret results in the context of the business problem; and how to forecast using business data. The unit is taught through data-driven examples, exercises, and business case studies.

Unit learning outcomes:

  • Build a strong quantitative skill set for business decision making.
  • Create statistical models for studying relationships amongst business variables.
  • Evaluate underlying theories, concepts, assumptions and arguments in business-related fields.
  • Manage, analyse, evaluate and use information efficiently and effectively.
  • Present coherent arguments when recommending solutions.
  • Communicate confidently and coherently to a professional standard, both orally and in writing.
  • Understand the relevance of statistics in driving data science solutions to real-world problems.

List of topics:

  1. Introduction to statistics for data scientists
  2. Probability and random variables
  3. Discrete and continuous probability distributions
  4. Statistical inference using sampling distributions, the central limit theorem (CLT) and confidence intervals
  5. Hypothesis testing: one-sample and two-sample tests
  6. Simple linear regression
  7. Multiple linear regression

The unit focuses mostly on applications of statistics that are relevant to data science.
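As a rough illustration of the kind of applied work topics 4 and 6 point to, the sketch below computes an approximate confidence interval for a mean and fits a simple linear regression using only the Python standard library. The data, the variable names (`ad_spend`, `sales`) and the helper functions are hypothetical and chosen purely for illustration; they are not taken from the unit materials, and the unit itself may use different tools.

```python
import math
import statistics

# Hypothetical illustrative data (not from the unit): monthly advertising
# spend and sales, both in thousands of dollars.
ad_spend = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0, 25.0, 28.0]
sales = [110.0, 118.0, 131.0, 139.0, 152.0, 158.0, 171.0, 185.0]


def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean.

    Uses the normal (CLT-based) approximation. For a sample this small,
    a t-distribution critical value would usually be more appropriate.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


def fit_simple_linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


low, high = mean_confidence_interval(sales)
slope, intercept = fit_simple_linear_regression(ad_spend, sales)

print(f"Approximate 95% CI for mean sales: ({low:.1f}, {high:.1f})")
print(f"Fitted line: sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

In a business setting, the slope would be read as the estimated change in sales associated with one additional unit of advertising spend, which is the kind of interpretation in context that the unit emphasises.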