Training Course Content – Data Science Basic

Course : Data Science – Basic

Course Duration : 25 Hours

Batches on:

1 August 2017

1 September 2017

R Programming Software

R is a programming language and software environment for statistical computing and graphics. It is widely used by statisticians and data miners for developing statistical software and analyzing data. Polls, surveys of data miners, and studies of scholarly literature databases show that R’s popularity has increased substantially in recent years.

Introduction to R

  • Principle and software paradigm
  • Description of R interface
  • Advantages of R
  • Drawbacks of R

Advanced Data Manipulation in R (packages such as dplyr, plyr, sqldf, and MASS)

  • Importing and exporting data from .txt files and .xls-like files
  • Advanced data manipulation
  • Accessing variables and management of subsets in data
  • Working with characters, text and dates

Module 1 : Fundamentals of Statistics

  • Types of variables, measures of central tendency and dispersion
  • Variable Distributions and Probability Distributions
  • Normal Distribution and Properties
  • Central Limit Theorem and Application

Module 2 : Statistical Significance Tests

  • Hypothesis testing: null/alternative hypothesis formulation
  • Z-test and t-test
  • Analysis of variance (ANOVA)
  • Chi-square test
  • Correlation

Module 3 : Data Preparation

  • Need for data preparation
  • Outlier treatment
  • Missing values treatment
  • Multicollinearity

Module 4 : Predictive Modeling & Time Series Analysis

  • Basics of regression analysis
  • Linear regression
  • Logistic regression
  • Interpretation of results
  • Multivariate regression modeling

Module 5 : Machine Learning Algorithms

  • Decision tree
  • Naïve Bayes algorithm
  • K-NN classification & regression

Case Study Projects:

  • Customer Marketing Response Predictive Modeling
  • Patient Satisfaction Analysis
  • Call Centre Effectiveness Predictive Modeling
  • Customer Segmentation for Cross-Sell/Up-Sell Modeling