Data Science with SAS

Description

The Data Science with SAS training at Massive Tech covers everything from basic statistical concepts to advanced analytics and predictive modeling techniques using SAS (Statistical Analysis System). The training is designed with specific industry segments in mind. SAS is preferred for reporting, analytics and predictive modeling, while tools like R and Python have an edge in advanced data science, machine learning and AI applications.

Prerequisites

This course suits participants from backgrounds such as Engineering, Finance, Maths, Statistics and Business Management who want to learn SAS & R for advanced analytics job roles.

Key Features
  • LIVE Instructor-led Classes
  • 24x7 on-demand technical support for assignments, queries, quizzes, projects, etc.
  • Flexibility to attend classes at a time convenient to you.
  • Server access to Massive Tech's Management System until you land your dream career.
  • A huge database of Interview Questions
  • Professional Resume Preparation
  • Earn a Skill Certificate
  • Enroll today and get the advantage.
Curriculum
  • Analytics World
    • Introduction to Analytics
    • Concepts of ETL
    • SAS in advanced analytics
  • Introduction and Walk through
    • Getting Started
    • Software Installation
    • Introduction to GUI
    • Different components of the language
    • All programming windows
    • Concepts of Libraries and Creating Libraries
    • Variable Attributes
    • Importing Data and Entering data manually
  • Understanding Data sets
    • Description portion of a Dataset (Proc Contents)
    • Data Portion of a Dataset
    • Variable Names and Values
    • Data Libraries
  • Understanding Data Step Processing
    • Data Step and Proc Step
    • Data step execution
    • Compilation and execution phase
    • Input buffer and concept of PDV
  • Importing Raw Data Files
    • Column Input, List Input and Formatted Input methods
    • Delimiters, Reading missing and non-standard values
    • Reading one to many and many to one records
    • Reading Hierarchical files
    • Creating raw data files and put statement
    • Formats / Informats
  • Importing and Exporting Data (Fixed Format / Delimited)
    • Proc Import / Delimited text files
    • Proc Export / Exporting Data
    • Datalines / Cards
  • Atypical importing cases (mixing different style of inputs)
    • Reading Multiple Records per Observation
    • Reading Mixed Record Types
    • Sub-setting from a Raw Data File
    • Multiple Observation per Record
    • Reading Hierarchical Files
  • Understanding and Exploring Data
    • Introduction to basic Procedures – Proc Contents, Proc Print
    • Operators and Operands
    • Conditional Statements
    • Difference between WHERE and IF Statements and limitations of the WHERE Statement
    • Labels, Commenting
    • System options (OBS, FIRSTOBS, NOOBS, etc.)
  • Data Manipulation
    • Proc Sort – with options / De-Duping
    • Accumulator variable and By-Group processing
    • Explicit Output Statements
    • Nesting Do loops
    • Do While and Do until Statements
    • Array elements and Range
  • Combining Datasets
    • Concatenation
    • Interleaving
    • Proc Append
    • One to One Merging
    • Match Merging
    • IN= option – Controlling the merge with indicator variables
  • Introduction to Databases
  • Introduction to Proc SQL
  • Basics of General SQL language
  • Creating table and Inserting Values
  • Retrieve & Summarize data
  • Group, Sort & Filter
  • Using Joins (Full, Inner, Left, Right and Outer)
  • Reporting and summary analysis
  • Concept of Indexes and creating Indexes (simple and composite)
  • Connecting SAS to external Databases
  • Implicit and Explicit pass through method
  • Macro Parameters and Variables
  • Different ways of creating Macros
  • Defining and Calling a Macro
  • Using CALL SYMPUT and SYMGET
  • Macro options (MPRINT, SYMBOLGEN, MLOGIC, MERROR, SERROR)
  • Basic Statistics – Measures of Central Tendency and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square
  • Introduction to Predictive Modeling
  • Types of Business problems – Mapping of Techniques
  • Different Phases of Predictive Modeling
  • Need of Data preparation
  • Data Audit Report and Its importance
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values – Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor Analysis & PCA
  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioural Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling
  • Interpretation of results – Implementation on new data
  • Linear Regression: Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Validation of Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results – Business Validation – Implementation on new data
  • Logistic Regression: Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, etc.)
  • Validation of Logistic Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers, etc.)
  • Interpretation of Results – Business Validation – Implementation on new data
  • Time Series Forecasting: Introduction – Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques (Pattern-based – Pattern-less)
  • Basic Techniques – Averages, Smoothing, etc.
  • Advanced Techniques – AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
  • Statistical learning vs. Machine learning
  • Major Classes of Learning Algorithms – Supervised vs. Unsupervised Learning
  • Concept of Overfitting and Underfitting (Bias-Variance Trade-off) & Performance Metrics
  • Types of Cross-validation (Train & Test, Bootstrapping, K-Fold validation, etc.)
  • Recursive Partitioning (Decision Trees)
  • Ensemble Models (Random Forest, Bagging & Boosting)
  • K-Nearest Neighbours
