E23-09: Non-Linear Machine Learning for Calibration and Classification

One-Day Course
Sunday, Nov. 12, 8:30am – 5:00pm

Dr. Barry Wise, Eigenvector Research, Inc., Manson, WA
Manuel A. Palacios, Eigenvector Research

COURSE DESCRIPTION:
While linear machine learning methods, such as PLS regression, work in a very wide range of problems of chemical and biological interest, there are times when the relationships between variables are complex and require non-linear modeling methods. Many non-linear machine learning methods have been developed; we will focus on a few that we have found quite useful. The course begins with a discussion of linearizing transforms. Augmenting with non-linear transforms, e.g. polynomials, is discussed next. Locally Weighted Regression (LWR), Artificial Neural Networks (ANNs, including Deep-learning Networks) and Support Vector Machines (SVMs) are then considered, with SVMs covered for both regression and classification. Boosted regression and classification trees (XGBoost) are then covered. The course concludes with segments on how to choose a method and how to implement models online. The course includes hands-on computer time for participants to work example problems using PLS_Toolbox or Solo.

WHO SHOULD ATTEND:
Chemists, life scientists and engineers who want to be able to analyze their own laboratory or process data or develop their own data models. The course is especially well suited for those with an interest in process analytical technology (PAT) in the pharmaceutical industries, metabolomics, and systems biology. The course serves individuals who need exploratory data analysis or who develop predictive models such as analytical instrument calibrations, sample classification, and soft sensor models.

TOPICS:

  1. Introduction 
    – Why non-linear methods?
    – How linear methods deal with non-linear data
  2. Factor-based transforms
    – PCA Scores and Augmenting
    – Polynomial PLS
  3. Locally Weighted Regression
    – Weighted Regression
    – Distance Measures
    – Basing Models on PCA Scores
  4. Support Vector Machines
    – SVM basics
    – Kernel functions
    – Classification Models
    – Regression Models
  5. Artificial Neural Networks
    – ANN structures
    – Training procedures
    – Avoiding overfitting
    – Deep-learning networks using Sklearn and TensorFlow
  6. Gradient Boosted Decision Trees
    – Intro to decision trees
    – Classification and Regression Ensemble Models
    – XGBoost
  7. Choosing the right method
    – Prediction skill
    – Computational performance
    – Deployment options

INSTRUCTOR(S) BIOGRAPHIES:

Barry M. Wise, Ph.D. Barry Wise received B.S. degrees in Chemistry and Chemical Engineering from the University of Washington in 1982. After 3 years with Battelle Pacific Northwest National Laboratories, he returned to UW, where he received M.S. (1987) and Ph.D. (1991) degrees in Chemical Engineering. Dr. Wise has authored more than 50 scientific publications, patents and book chapters, and a popular chemometrics software package, PLS_Toolbox, used by scientists and engineers world-wide. Dr. Wise is President and co-founder of Eigenvector Research, Inc., which delivers chemometrics software, training and consulting. Thousands have attended Wise’s chemometrics short courses at conferences and at Eigenvector University, in person and online. Dr. Wise was named winner of the Eastern Analytical Symposium Chemometrics Award for 2001 in recognition of his significant contributions to the field. Dr. Wise received the Wold Medal in Gold in 2019 for his deep commitment to the proliferation of chemometrics.