Anomaly detection

This exercise explores PCA models for anomaly detection and diagnosis. The data are on tabs 1, 2, and 3 of the attached Excel file, which contains three matrices (data, test1, and test2) with 21 variables each. The matrices contain data collected from a Slurry Red Ceramic Melter (SRCM): the first twenty variables are temperature readings from thermowells throughout the melter, and the 21st variable is the glass level. The temperature readings are in degrees Celsius (°C), while the glass level reading is in inches. Here, we will use principal component analysis to determine whether the data in test1 and test2 are consistent with the data in data.

Q1. Briefly describe PCA and how it is used for anomaly detection.
a. Describe the use of the Q- and T²-statistics for detecting anomalies.
b. Give the equations for calculating the Q- and T²-statistics.
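
For reference, one common way to write the two statistics is given here. The notation is an assumption, not taken from the assignment: x is an autoscaled observation (21 x 1), P_k holds the first k loading vectors as columns, and Lambda_k is the diagonal matrix of the corresponding eigenvalues lambda_1, ..., lambda_k of the training covariance matrix.

```latex
\begin{align}
  \mathbf{t} &= \mathbf{P}_k^{\top}\mathbf{x}, \\
  T^2 &= \mathbf{t}^{\top}\boldsymbol{\Lambda}_k^{-1}\mathbf{t}
       = \sum_{a=1}^{k}\frac{t_a^2}{\lambda_a}, \\
  Q &= \lVert \mathbf{x} - \mathbf{P}_k\mathbf{t} \rVert^2
     = \mathbf{x}^{\top}\left(\mathbf{I} - \mathbf{P}_k\mathbf{P}_k^{\top}\right)\mathbf{x}.
\end{align}
```

In this form T² measures variation inside the retained PC subspace, while Q (the squared prediction error) measures the part of x that the model does not reconstruct.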

Q2. Develop a PCA model of the normal operating data contained in data.
a. Determine on your own how many PCs should be included in the model.
b. Describe how you made this decision and why it makes sense.
c. Using this model and the fault-free data in data, determine thresholds for the Q- and T²-statistics. Explain how you selected the thresholds.
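
A minimal sketch of one way to approach Q2 follows. Everything in it is an assumption made for illustration: the training matrix is a synthetic stand-in for data (the Excel file is not read here), the number of PCs is chosen by a 90% cumulative-variance rule, and the thresholds are empirical 99th percentiles of the training statistics. Scree plots or cross-validation are common alternatives for choosing the number of PCs, and F-distribution (T²) or Jackson-Mudholkar (Q) limits are common alternatives for the thresholds. The helper names fit_pca and t2_q are hypothetical.

```python
import numpy as np

def fit_pca(X, var_target=0.90):
    """Fit PCA on autoscaled data, keeping the fewest PCs that reach var_target."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std                       # autoscale: zero mean, unit variance
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # eigh returns ascending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum_var = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum_var, var_target)) + 1, X.shape[1])
    return {"mean": mean, "std": std, "P": eigvecs[:, :k],
            "lam": eigvals[:k], "k": k, "cum_var": cum_var}

def t2_q(model, X):
    """Return the T2 and Q statistics for every row of X."""
    Z = (X - model["mean"]) / model["std"]     # scale with the training mean/std
    T = Z @ model["P"]                         # scores in the retained PC space
    t2 = np.sum(T**2 / model["lam"], axis=1)
    q = np.sum((Z - T @ model["P"].T) ** 2, axis=1)   # squared residual
    return t2, q

# Synthetic stand-in for the fault-free matrix (400 samples, 21 variables).
rng = np.random.default_rng(0)
n, m, n_latent = 400, 21, 3
W = rng.normal(size=(n_latent, m))
X_train = rng.normal(size=(n, n_latent)) @ W + 0.1 * rng.normal(size=(n, m))

model = fit_pca(X_train)
t2, q = t2_q(model, X_train)

# Empirical thresholds: 99th percentile of each statistic on the training data.
t2_lim, q_lim = np.percentile(t2, 99), np.percentile(q, 99)
print(f"k = {model['k']}, T2 limit = {t2_lim:.2f}, Q limit = {q_lim:.3f}")
```

Percentile limits make no distributional assumption but need enough fault-free samples to estimate the tail; with few samples a parametric limit may be the safer choice.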

Q3. Project the data in test1 and test2 into the PC space that you found in Q2. Evaluate the T²- and Q-statistics to identify which, if any, observations are anomalous.
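
The projection step in Q3 can be sketched as follows, under stated assumptions: the training and test matrices are synthetic stand-ins, k = 3 is an illustrative choice rather than a recommendation, the limits are empirical 99th percentiles, and the +20 bias on sensor index 4 is a hypothetical fault injected only so that there is something to detect. The key point is that test data are scaled with the training mean and standard deviation, not their own.

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 21, 3                                  # k = 3 is an illustrative choice
W = rng.normal(size=(k, m))

def simulate(n):
    """Synthetic stand-in for melter data: 3 latent factors plus sensor noise."""
    return rng.normal(size=(n, k)) @ W + 0.1 * rng.normal(size=(n, m))

X_train, X_test = simulate(400), simulate(100)
X_test[80:, 4] += 20.0                        # hypothetical fault on sensor index 4

# Fit the model on training data only (SVD of the autoscaled matrix).
mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
Z_train = (X_train - mean) / std
_, s, Vt = np.linalg.svd(Z_train, full_matrices=False)
P = Vt[:k].T                                  # loadings, shape (21, k)
lam = s[:k] ** 2 / (len(X_train) - 1)         # variances of the retained scores

def monitor(X):
    """Project X onto the training PC space and return (T2, Q) per row."""
    Z = (X - mean) / std                      # training mean/std, not the test set's
    T = Z @ P
    t2 = np.sum(T**2 / lam, axis=1)
    q = np.sum((Z - T @ P.T) ** 2, axis=1)
    return t2, q

t2_tr, q_tr = monitor(X_train)
t2_lim, q_lim = np.percentile(t2_tr, 99), np.percentile(q_tr, 99)

t2_te, q_te = monitor(X_test)
flagged = np.flatnonzero((t2_te > t2_lim) | (q_te > q_lim))
print("flagged test rows:", flagged)
```

A single-sensor bias like the one injected here breaks the correlation structure learned from the training data, so it tends to show up in Q; a shift that follows the normal correlation pattern but is unusually large would tend to show up in T² instead.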

Q4. For observations that are flagged as anomalies, look at the contributions to the T²- or Q-statistic (whichever is appropriate) to determine which sensor readings might be causing the anomalies. Visually inspect the raw signals from these sensors to see if the faults are apparent.
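
A sketch of a contribution calculation for a single flagged observation is given here, again on synthetic stand-in data with a hypothetical +20 bias on sensor index 4 and an illustrative k = 3. The Q contributions are the squared residuals per variable. For T², several conventions exist; the one used here, c_j = z_j [P Lambda^-1 P^T z]_j, is one choice among them, picked because the contributions sum exactly to T² (individual terms can be negative).

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 400, 21, 3                          # k = 3 is an illustrative choice
W = rng.normal(size=(k, m))
X_train = rng.normal(size=(n, k)) @ W + 0.1 * rng.normal(size=(n, m))

# One synthetic observation with a hypothetical fault on sensor index 4.
x_fault = rng.normal(size=k) @ W + 0.1 * rng.normal(size=m)
x_fault[4] += 20.0

mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
_, s, Vt = np.linalg.svd((X_train - mean) / std, full_matrices=False)
P = Vt[:k].T
lam = s[:k] ** 2 / (n - 1)

def contributions(x):
    """Per-variable contributions to Q and T2 for one observation x."""
    z = (x - mean) / std
    t = P.T @ z                               # scores
    resid = z - P @ t                         # part of z the model cannot explain
    q_contrib = resid**2                      # sums to Q
    t2_contrib = z * (P @ (t / lam))          # sums to T2 (one of several conventions)
    return q_contrib, t2_contrib

q_c, t2_c = contributions(x_fault)
suspect = int(np.argmax(q_c))
print("largest Q contribution comes from sensor index", suspect)
```

Plotting q_c as a bar chart over the 21 sensors, then plotting the raw signal of the largest contributors against time, is a natural way to carry out the visual inspection the question asks for.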

© SolutionLibrary Inc. solutionlibary.com November 15, 2019, 4:41 am 9836dcf9d7 https://solutionlibrary.com/statistics/multivariate-time-series-and-survival-analysis/anomaly-detection-jct0