Live Training in Data Science, Analytics, and Big Data


I have been delivering live training in data science and big data for 5 years, with courses for professionals of all skill levels. The courses are listed below. I can adapt these offerings to your organization's specific needs, and may be able to add custom modules to tailor the training to your requirements.

Course Topics:

The big data landscape overview

  • What is Big Data?
    • Big data vs. its predecessors
    • How big data relates to data analytics and data science
    • The big data paradigm
  • Big data professional roles
  • Overview of ways big data projects benefit businesses and industries
  • The Hadoop ecosystem and architecture
    • Overview of Hadoop, MapReduce, YARN & Spark
  • Other technologies in the big data paradigm
    • Overview of MPP, in-memory appliances, Apache Spark (redo), NoSQL, Apache Lucene, Hive / Pig, HBase, Cassandra, Kafka, Sqoop, Oozie, RDBMSs

Big data project planning

  • Conceptualizing how a big data project can meet organizational needs
    • Considering relevant use cases
      • Netflix, LinkedIn, Experian, Shell Oil, Facebook, Google for Education, ETL off-loading, Enterprise search, Orbitz, Dell SecureWorks
    • Best practices in metrics selection
  • Assessing the current state of your organization
    • Assembling data teams
  • Finalizing your implementation plan
  • Implementing a data-driven solution

Analytical methods for problem-solving

  • Workshop: A data-driven approach to driving improvements across the business
    • Pinpointing the problem
    • Assessing the problem
    • Analyzing alternative solutions
    • Implementing your solution
  • Getting to know data science and analytics roles and objectives
  • Introduction to data analytics
  • Basic math and statistics for data science
  • Statistical algorithms in data science
  • Deriving value from location data with Geographic Information Systems (GIS)
  • Free analytics applications

Basic data science mechanics

  • The benefits of object-oriented programming
    • Programming in Python
  • Structured Query Language (SQL) in analytics and data science
  • Data presentation workshop

Introduction to machine learning

  • Getting to know machine learning
  • Classification algorithms
  • Regression algorithms
  • Clustering algorithms
  • Linear algebra algorithms
  • Mathematical methods: multi-criteria decision-making (MCDM)
  • Recommendation systems
  • The ethics of artificial intelligence

Training Activities:

This course is highly interactive. It includes numerous hands-on, game-like activities and demos, 14 use cases, and 20+ written activities.

Training Hours:

40 hours

Skill-Level:

Beginner

(*) Optional Training Extension:

* These course modules offer students the option to access and deploy complimentary Python coding demonstrations.

* This course is accompanied by an optional certification exam.

Course Topics:

Data Munging Basics

  • Filtering and selecting data
  • Treating missing values
  • Removing duplicates
  • Concatenating and transforming data
  • Grouping and data aggregation

Data Visualization

  • Creating standard plots (line, bar, pie)
  • Defining elements of a plot
  • Plot formatting
  • Creating labels and annotations
  • Creating visualizations from time series data
  • Constructing histograms, box plots, and scatter plots

Basic Math and Statistics

  • Using NumPy to perform arithmetic operations on data
  • Generating summary statistics using pandas and scipy
  • Summarizing categorical data using pandas
  • Starting with parametric methods in pandas and scipy
  • Delving into non-parametric methods using pandas and scipy
  • Transforming dataset distributions

Dimensionality Reduction

  • Introduction to machine learning
  • Exploratory factor analysis
  • Principal component analysis (PCA)

Outlier Analysis

  • Extreme value analysis using univariate methods
  • Multivariate analysis for outlier detection
  • A linear projection method for multivariate data

Cluster Analysis

  • K-means method
  • Hierarchical methods
  • Instance-based learning with k-Nearest Neighbors

Network Analysis with NetworkX

  • Working with graph objects
  • The basics of drawing graph objects
  • Simulating a social network (i.e., directed network analysis)
  • Generating stats on nodes and inspecting graphs

Basic Algorithmic Learning

  • Linear Regression
  • Logistic Regression
  • Naive Bayes Classifiers

Web-based Data Visualizations with Plotly

  • Basic charts
  • Statistical charts
  • Plotly maps
  • Generating reports and dashboards

Web Scraping with Beautiful Soup

  • Working with objects
  • Data parsing
  • Web scraping in practice

Training Activities:

This course involves lectures, live in-class Python coding, student practice problems, and capstone projects.

Training Hours:

40 hours

Skill-Level:

Beginner

Course Topics:

Simple Approaches to Recommender Systems

  • Introducing core concepts of recommendation systems
  • Popularity-based recommenders
  • Evaluating similarity based on correlation

Machine Learning Based Recommendation Systems

  • Classification-based collaborative filtering
  • Model-based collaborative filtering systems
  • Content-Based Recommender Systems
  • Evaluating recommendation systems

Training Activities:

This course involves lectures, live in-class Python coding, and student practice problems.

Training Hours:

8 hours

Skill-Level:

Intermediate

Course Topics:

Data Munging Basics

  • Filtering and selecting data
  • Treating missing values
  • Removing duplicates
  • Concatenating and transforming data
  • Grouping and data aggregation

Data Visualization

  • Creating standard plots (line, bar, pie)
  • Defining elements of a plot
  • Plot formatting
  • Creating labels and annotations
  • Creating visualizations from time series data
  • Constructing histograms, box plots, and scatter plots

Basic Math and Statistics

  • Performing arithmetic operations on data
  • Generating summary statistics
  • Summarizing categorical data using pandas
  • Starting with parametric methods
  • Delving into non-parametric methods
  • Transforming dataset distributions

Outlier Analysis

  • Extreme value analysis using univariate methods
  • Multivariate analysis for outlier detection

Introduction to Machine Learning

  • Introduction to machine learning
  • Linear regression
  • Logistic regression

Web-based Data Visualizations with Plotly

  • Basic charts
  • Statistical charts
  • Plotly maps

Training Activities:

This course involves lectures, live in-class R coding, student practice problems, and capstone projects.

Training Hours:

24 hours

Skill-Level:

Beginner

Course Topics:

Logistic Regression and SoftMax

  • Intro to Logistic Regression
  • Exploratory analysis of Titanic dataset
  • Treating missing values: Titanic dataset
  • One-hot encoding
  • Generating predictions
  • Confusion matrix & k-fold cross validation
  • The softmax function on the iris dataset

Gradient Descent

  • Logistic regression with stochastic gradient descent (SGD) on the Titanic dataset

Artificial Neural Networks

  • Installing TensorFlow
  • Getting Familiar with TensorFlow
  • The Perceptron
  • Building a deep learning model
  • Deploying and evaluating a deep learning model
  • Things to look out for with deep learning in practice

Training Activities:

This course involves lectures, live in-class Python coding, and student practice problems.

Training Hours:

12 hours

Skill-Level:

Intermediate / Advanced

Course Topics:

Identifying the different types of data visualizations

  • Seeing the three different data visualization types in action
  • Diving into data storytelling
  • Diving into data showcasing
  • Displaying data art

Following 3 simple steps to design for your audience

  • Step 1: Brainstorming to better understand your intended audience
  • Step 2: Clarifying your data visualization’s purpose
  • Step 3: Choosing the most functional design style for your purpose
  • Designing with your purpose in mind

Identifying the different types of data graphics

  • Standard chart graphics
  • Comparative graphics
  • Statistical plots
  • Topology structures
  • Spatial plots and maps

Choosing the best data graphics for your project

  • Choosing graphics for data storytelling
  • Choosing graphics for data showcasing
  • Choosing graphics for data art
  • 4 steps to choosing the best data graphics for your project
  • Testing out data graphics

Other elements of effective data viz

  • Using context in data visualizations
  • Seeing data storytelling in action

Training Activities:

This workshop is highly interactive, with hands-on, game-like activities and practice problems.

Training Hours:

4 hours

Skill-Level:

Beginner

For more information about any of these courses or to make a training booking, please email me through the contact form below.
