Short Course 1, Part 1: Functional Data Analysis
Speaker Bio:
Tom Donnelly works as a Systems Engineer for JMP Statistical Discovery, supporting users of JMP software in the Defense and Aerospace sector. He has been actively using and teaching Design of Experiments (DOE) methods for the past 40 years to develop and optimize products, processes, and technologies. Donnelly joined JMP in 2008 after working as an analyst for the Modeling, Simulation & Analysis Branch of the US Army’s Edgewood Chemical Biological Center (now DEVCOM CBC). There, he used DOE to develop, test, and evaluate technologies for detection, protection, and decontamination of chemical and biological agents. Prior to working for the Army, Tom was for 20 years a partner in the first DOE software company, where he taught over 300 industrial short courses to engineers and scientists. Tom received his PhD in Physics from the University of Delaware.
Abstract:
Are you currently NOT USING YOUR ENTIRE DATA STREAM to inform decisions?
Sensors that stream data (e.g., temperature, pressure, vibration, flow, force, proximity, humidity, intensity, or concentration), as well as radar, sonar, chromatography, NMR, Raman, NIR, and mass spectroscopy, all measure a signal versus a longitudinal component such as wavelength, frequency, energy, distance, or, in many cases, time. Are you just using select points, peaks, or thresholds in your curve or spectral data to evaluate performance? This course will show you how to use the complete data stream to improve your process knowledge and make better predictions.
Curves and spectra are fundamental to understanding many scientific and engineering processes. They are created by many types of test and manufacturing processes, as well as measurement and detection technologies. Any response varying over a continuum is functional data.
Functional Data Analysis (FDA) uses functional principal components analysis (FPCA) to break curve or spectral data into two parts: FPC Scores and Shape Components. The FPC Scores are scalar quantities (or weights) that explain function-to-function variation. The Shape Components explain the longitudinal variation. FPC Scores can then be used with a wide range of traditional modeling and machine learning methods to extract more information from curves or spectra.
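As a rough illustration of the idea (not JMP's implementation), the decomposition can be sketched in a few lines of Python using a singular value decomposition of the sampled curves; the curve data and component count below are purely illustrative.

```python
# Minimal sketch of functional PCA via SVD: decompose a set of sampled curves
# into a mean curve, shape components (eigenfunctions), and per-curve FPC scores.
import numpy as np

def fpca_svd(Y, n_components=2):
    """Y: array of shape (n_curves, n_points), each row one sampled curve."""
    mean_curve = Y.mean(axis=0)
    centered = Y - mean_curve                      # remove the mean function
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    shapes = Vt[:n_components]                     # shape components (eigenfunctions)
    scores = centered @ shapes.T                   # scalar FPC scores per curve
    explained = (s**2 / np.sum(s**2))[:n_components]
    return mean_curve, shapes, scores, explained

# Illustrative data: 30 noisy bell-shaped curves sampled at 100 points
t = np.linspace(0, 1, 100)
Y = np.array([np.exp(-((t - 0.5) / w)**2) + 0.02 * np.random.randn(t.size)
              for w in np.random.uniform(0.1, 0.3, 30)])
mean_curve, shapes, scores, explained = fpca_svd(Y, n_components=2)
print(scores.shape, explained)   # (30, 2) and variance explained per component
```

Each curve is then approximately the mean curve plus its FPC scores multiplied by the shape components, which is what lets the scalar scores stand in for the whole curve in downstream modeling.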
When these functional data are used as part of a designed experiment, the curves and spectra can be well predicted as functions of the experimental factors. Curves and spectra can also be used to optimize or “reverse engineer” factor settings. In machine learning applications, functional data analysis uses the whole curve or spectrum to predict outcomes better than “landmark” or summary statistical analyses of individual peaks, slopes, or thresholds.
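To make the designed-experiment workflow concrete, here is a hedged Python sketch of the modeling step: regressing FPC scores on experimental factors and reconstructing a predicted curve at a new factor setting. The factors, scores, and shape components below are made-up placeholders, not data from the course case studies.

```python
# Hedged sketch: model FPC scores as functions of DOE factors, then predict a
# whole curve at new factor settings. All data here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
X = rng.uniform(-1, 1, size=(20, 2))                  # 20 runs, 2 coded factors

# Pretend FPCA already produced a mean curve, 2 shape components, and scores
mean_curve = np.exp(-((t - 0.5) / 0.2)**2)
shapes = np.vstack([np.sin(np.pi * t), np.cos(np.pi * t)])
scores = X @ np.array([[0.8, 0.1], [-0.2, 0.5]]) + 0.05 * rng.standard_normal((20, 2))

# Fit a simple linear model for each FPC score (intercept + two factors)
X1 = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X1, scores, rcond=None)

# Predict the whole curve at a new factor combination
x_new = np.array([1.0, 0.25, -0.5])                   # intercept, factor 1, factor 2
pred_curve = mean_curve + (x_new @ beta) @ shapes
print(pred_curve.shape)                               # (100,)
```

Running the same prediction over a grid of factor settings, and searching for the setting whose predicted curve best matches a target shape, is the “reverse engineering” idea described above.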
References and links will be provided for open-source tools to do FDA, but in this course JMP Pro 18 software will be used to demonstrate analyses and to illustrate multiple case studies. See how a functional model is created by fitting a B-spline, P-spline, Fourier, or wavelet basis model to the data. One can also perform functional principal components analysis directly on the data, without fitting a basis function model first. Direct Models include several Singular Value Decomposition (SVD) approaches as well as Multivariate Curve Resolution (MCR).
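For those who prefer an open-source analogue of the basis-fitting step, a minimal SciPy sketch of a least-squares B-spline fit to one noisy curve might look like the following; the knot placement and data are illustrative only, and JMP Pro's fitting options differ.

```python
# Minimal sketch: fit a cubic B-spline basis model to a single noisy curve.
import numpy as np
from scipy.interpolate import make_lsq_spline

x = np.linspace(0, 10, 200)
y = np.sin(x) + 0.1 * np.random.randn(x.size)          # noisy signal

k = 3                                                   # cubic B-spline
interior = np.linspace(1, 9, 8)                         # interior knot locations
t = np.concatenate(([x[0]] * (k + 1), interior, [x[-1]] * (k + 1)))
spline = make_lsq_spline(x, y, t, k)                    # least-squares B-spline fit

y_smooth = spline(x)                                    # evaluate the functional model
```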
Curve or spectral data are often messy. Several data preprocessing techniques will be presented. Methods to clean up (remove, filter, reduce), transform (center, standardize, rescale), and align data (line up peaks, dynamic time warping) will be demonstrated. Correction methods specific to spectral data, including Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC), Savitzky-Golay filtering, and Baseline Correction, will be shown.
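As a small open-source illustration of two of these corrections, the following Python sketch applies SNV scaling and Savitzky-Golay smoothing to a set of simulated spectra; the data and filter settings are illustrative only.

```python
# Hedged sketch of two common spectral preprocessing steps: Standard Normal
# Variate (SNV) scaling of each spectrum, then Savitzky-Golay smoothing.
import numpy as np
from scipy.signal import savgol_filter

def snv(spectra):
    """Center and scale each row (spectrum) to zero mean and unit variance."""
    return (spectra - spectra.mean(axis=1, keepdims=True)) / spectra.std(axis=1, keepdims=True)

# Illustrative data: 10 spectra at 500 wavelengths with baseline offsets and noise
rng = np.random.default_rng(7)
wl = np.linspace(400, 2500, 500)
spectra = (np.exp(-((wl - 1200) / 150)**2)[None, :]
           + rng.uniform(0.0, 0.5, (10, 1))            # additive baseline shift
           + 0.02 * rng.standard_normal((10, 500)))    # measurement noise

corrected = snv(spectra)                               # remove scatter/offset effects
smoothed = savgol_filter(corrected, window_length=11, polyorder=3, axis=1)
```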
Case studies will be used to demonstrate the methods discussed above.
Session Materials: https://dataworks.testscience.org/wp-content/uploads/formidable/23/FDA-Course_all-materials-1.zip
Session Materials Website: https://drive.google.com/drive/folders/15uBMoFTLtH0sKmITjbpgv4PpnUYwxHgJ?usp=sharing