Principal Component Analysis for Enhancement of Infrared Spectra Monitoring

by

Ricky Lance Haney

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
August 6, 2011

Copyright 2011 by Ricky Lance Haney

Approved by

Jeffrey Fergus, Chair, Professor of Materials Engineering
Ruel (Tony) Overfelt, Professor of Mechanical Engineering
Bart Prorok, Associate Professor of Materials Engineering
Curtis Shannon, Professor of Chemistry and Biochemistry

Abstract

The issue of air quality within the aircraft cabin is receiving increasing attention from both pilot and flight attendant unions. This attention stems from exposure events caused by poor air quality, in which the air in some cases may have contained toxic oil components carried by bleed air that flows from outside the aircraft through the engines and into the aircraft cabin. Significant short- and long-term medical issues for aircraft crew have been attributed to exposure. The need for air quality monitoring is especially evident because aircraft currently carry no sensors to monitor the air quality and potentially harmful gas levels (detect-to-warn sensors), much less systems to monitor and purify the air (detect-to-treat sensors) within the aircraft cabin.

The specific purpose of this research is to use a mathematical technique called principal component analysis (PCA), in conjunction with principal component regression (PCR) and proportionality constant calculations (PCC), to reduce complex, multi-component infrared (IR) spectral data sets to a smaller data set from which the concentrations of the individual components are determined. Use of PCA can significantly simplify data analysis and improve the ability to determine concentrations of individual target species in gas mixtures where significant band overlap occurs in the IR spectral region.
Application of this numerical technique to IR spectrum analysis is important for improving the performance of commercial sensors that airlines and aircraft manufacturers could potentially use for multi-gas component monitoring in an aircraft cabin environment.

The approach of this research is two-fold: a PCA application to compare simulation and experimental results, together with the corresponding PCR and PCC to determine quantitatively the component concentrations within a mixture. The experimental data sets consist of both two- and three-component systems that could potentially be present as air contaminants in an aircraft cabin. In addition, experimental data sets are analyzed for a hydrogen peroxide (H2O2) aqueous solution mixture to determine H2O2 concentrations at various levels that could be produced during use of a vapor phase hydrogen peroxide (VPHP) decontamination system. After the PCA application to two- and three-component systems, the analysis technique is extended to the monitoring of potential bleed air contaminants from engine oil combustion. Simulation data sets created from database spectra were used to predict gas components and concentrations in unknown engine oil samples at high temperatures, as well as in time-evolved gases from the heating of engine oils.

Acknowledgments

I would like to express great appreciation to Dr. Jeffrey Fergus, who was instrumental in my decision to attend Auburn University and pursue a Ph.D. in materials engineering. Dr. Fergus has also provided invaluable insight into this research effort as well as significant funding support, along with Dr. Ruel (Tony) Overfelt, throughout my time at Auburn. I would like to express sincere appreciation to Dr. Overfelt for allowing me to work within the Air Transportation Center of Excellence (ACER), which has provided access to all the necessary resources for this research effort. I would like to thank Dr.
Curtis Shannon, who guided the research effort towards the use of principal component analysis at the proposal stage. Dr. Shannon has also been vital in furthering my understanding of infrared spectroscopy. Additionally, I would like to thank Dr. Bart Prorok for serving on my Ph.D. committee and providing insight to further this research effort during weekly group meetings.

I would like to acknowledge fellow students who have helped tremendously with this research effort. Mobbassar Hassan Sk was instrumental in the data collection and analysis of the hydrogen peroxide in aqueous solution samples. John Andress helped significantly with the use of the FTIR instrument and with the collection and analysis of numerous gas samples, both for this research effort and for commercial sensor testing. Amanda Neer's work with the data collection of the engine oil samples was critical in completion of this dissertation.

Most of all, I must thank my wonderful wife, Lacey, without whom the completion of this dissertation would not have been possible. She was always there to help me and always believed in my abilities to finish this research effort.

This project was funded by the U.S. Federal Aviation Administration (FAA) Office of Aerospace Medicine through the National Air Transportation Center of Excellence for Research in the Intermodal Transport Environment (RITE), Cooperative Agreement 07-C-RITE-AU. Although the FAA has sponsored this project, it neither endorses nor rejects the findings of this research.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
List of Abbreviations and Mathematical Symbols
Chapter 1: Introduction
    Research Purpose
    Research Approach
    Organization of Dissertation
Chapter 2: Commercial Aircraft Systems Overview
    Commercial Aircraft Background
    Vapor Phase Hydrogen Peroxide (VPHP) Aircraft Cabin Decontamination
    Potential Environmental Air Contaminants within the Aircraft Cabin
    Potential Bleed Air Contaminants within the Aircraft Cabin
    Laboratory Test Environment
Chapter 3: Fourier Transform Infrared (FTIR) Spectroscopy Overview
    FTIR Theoretical Background
    FTIR Measurement Technique
    IR Characteristics of CO/CO2/H2O
        Carbon Monoxide (CO)
        Carbon Dioxide (CO2)
        Water (H2O)
Chapter 4: Principal Component Analysis (PCA)
    PCA Theoretical Background
    Principal Component Regression (PCR)
    Proportionality Constant Calculation (PCC)
    Application to FTIR Spectroscopy Data
Chapter 5: PCA Application to FTIR Spectroscopy Data of Vapor Phase Hydrogen Peroxide (VPHP) Aircraft Cabin Decontamination Events
    Discussion of Analyzed Data Sets
    PCA Results and Discussion
        Data Set 1
        Data Set 2
Chapter 6: PCA Application to FTIR Spectroscopy Data of Potential Environment Air Contaminants within the Aircraft Cabin
    Discussion of Analyzed Data Sets
    PCA Results and Discussion: 2-Component Systems
        CH2O/C3H4O Simulation Data Set
        CO/CO2 Simulation Data Set
        CO/CO2 Experimental Data Set
    PCA Results and Discussion: 3-Component Systems
        CH2O/C3H4O/H2O Simulation Data Set
        CO/CO2/H2O Experimental Data Set
Chapter 7: PCA Application to FTIR Spectroscopy Data of Potential Bleed Air Contaminants within the Aircraft Cabin
    Discussion of Analyzed Data Sets
    PCA Results and Discussion
        Data Set 1 - Engine Oil Samples at Temperatures of Greatest Mass Loss
        Data Set 2 - Simulated CH4O/CH2O/CO2 Gas Mixtures & Engine Oil Samples at Temperatures of Greatest Mass Loss
        Data Set 3 - Simulated CH4O/CH2O/CO2/CO/H2O Gas Mixtures & BP Turbo Oil 2380 Engine Oil Time-Evolved Samples
Chapter 8: Conclusions
Chapter 9: Future Work
References
Appendix A: Principal Component Analysis MATLAB® Source Code

List of Tables

Table 2-1: Relationship between Altitude and Pressure with the Typical Aircraft Cabin Pressure Highlighted
Table 2-2: Limits on Contaminants that Could Potentially Be Found in Aircraft Cabin Air
Table 3-1: Relationship between I/I0, %T, and A
Table 4-1: CH2O/C3H4O Simplified Pure Spectra for Illustration of PCA Application to FTIR Spectroscopy Data
Table 4-2: CH2O/C3H4O Spectra Compositions for Calibration Data Set, [XC](10 x 7), Ordered from Lowest to Highest Amount of CH2O in the Gas Mixture
Table 4-3: CH2O/C3H4O Spectra Compositions for Prediction Data Set, [XP](5 x 7), Ordered from Lowest to Highest Amount of CH2O in the Gas Mixture
Table 4-4: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques
Table 5-1: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; H2O2 in Aqueous Solution Data Set 1
Table 5-2: H2O2 in Aqueous Solution Spectra Compositions for Prediction Data Set 2, [XP](5 x 601)
Table 5-3: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; H2O2 in Aqueous Solution Data Set 2
Table 6-1: CH2O/C3H4O Spectra Compositions for Calibration Data Set, [XC](8 x 2655), Ordered from Low to High Concentration of CH2O
Table 6-2: CH2O/C3H4O Spectra Compositions for Prediction Data Set, [XC](8 x 2655), Ordered from Low to High Concentration of CH2O
Table 6-3: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CH2O/C3H4O Gas Mixtures
Table 6-4: CO/CO2 Spectra Compositions for Calibration Data Set, [XC](10 x 2636)
Table 6-5: CO/CO2 Spectra Compositions for Prediction Data Set, [XP](5 x 2636)
Table 6-6: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CO/CO2 Gas Mixtures
Table 6-7: CO/CO2 Spectra Compositions for Calibration Data Set, [XC](8 x 5001)
Table 6-8: CO/CO2 Spectra Compositions for Prediction Data Set, [XP](4 x 5001)
Table 6-9: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CO/CO2 Experimental Gas Mixtures
Table 6-10: CH2O/C3H4O/H2O Spectra Compositions for Calibration Data Set, [XC](8 x 2655), Ordered from Low to High Concentration of CH2O
Table 6-11: CH2O/C3H4O/H2O Spectra Compositions for Prediction Data Set, [XP](3 x 2655), Ordered from Low to High Concentration of CH2O
Table 6-12: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CH2O/C3H4O Gas Mixtures
Table 6-13: CO/CO2/H2O Spectra Compositions for Calibration Data Set, [XC](8 x 1001)
Table 6-14: CO/CO2/H2O Spectra Compositions for Prediction Data Set, [XP](8 x 1001)
Table 6-15: Errors Associated with the Principal Component Regression (PCR) Analysis Technique; CO/CO2/H2O Experimental Gas Mixtures
Table 7-1: Calibration Data Set 2 Composed of Simulated Spectra of Various Concentrations of Methanol (CH4O), Formaldehyde (CH2O), and Carbon Dioxide (CO2)
Table 7-2: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobil Jet Oil II, and Aeroshell Turbine Oil 560
Table 7-3: RMSE for Predicted Spectra based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobil Jet Oil II, and Aeroshell Turbine Oil 560
Table 7-4: PCR Calculated Concentrations of CH4O (Modified), CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobil Jet Oil II, and Aeroshell Turbine Oil 560
Table 7-5: Comparison of RMSE using the Standard CH4O Spectra and the Modified CH4O Spectra for Predicted Spectra based on PCR Calculated Concentrations
Table 7-6: Truncated (3200 - 1600 cm^-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobil Jet Oil II, and Aeroshell Turbine Oil 560
Table 7-7: Comparison of RMSE using the Standard CH4O Spectra, the Modified CH4O Spectra, and the Truncated Spectra for Predicted Spectra based on PCR Calculated Concentrations
Table 7-8: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380
Table 7-9: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 274
Table 7-10: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for Mobil Jet Oil II
Table 7-11: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for Aeroshell Turbine Oil 560
Table 7-12: Calibration Data Set 2 Composed of Simulated Spectra of Various Concentrations of Methanol (CH4O), Formaldehyde (CH2O), Carbon Dioxide (CO2), Carbon Monoxide (CO), and Water (H2O)
Table 7-13: Comparison of RMSE using the Full and the Truncated Spectra, for the PCR Calculated Concentrations

List of Figures

Figure 1-1: Commercial Aircraft Bleed Air System
Figure 2-1: Characteristics of Ideal CO2 Partial Pressure Sensor; Measures pCO2 = 0.39 at PT = 77 kPa (7000 ft Altitude) for pCO2 = 0.51 at PT = 101 kPa (Sea Level)
Figure 2-2: Recirculation System Outside Air Changes per Hour for Aircraft Compared with Other Environments
Figure 2-3: Cabin Ventilation System Illustrating Typical Aircraft Recirculated Airflow
Figure 2-4: Laboratory Test Environment, Control Module
Figure 2-5: Laboratory Test Environment, Commercial Sensor Analysis Module
Figure 2-6: Laboratory Test Environment, FTIR Gas Analysis Module with Spectrum GX FTIR (Perkin Elmer, Shelton, CT, USA) with M-5-22-V Variable Pathlength Long Path Gas Cell (Infrared Analysis, Inc., Anaheim, CA, USA)
Figure 2-7: Internal Gold Plated Mirrors of the Variable Pathlength Long Path Gas Cell Highlighting the Path of the IR Beam within the Cell
Figure 2-8: System Block Diagram Illustrating Gas Flows through the Commercial Sensor Analysis Module to the FTIR Gas Analysis Module then Out through the Vacuum Pump
Figure 2-9: Theoretical Gas Flow Model Calculations Compared to Experimental Results for 800 ppm CO Test Gas in N2 with Total Inflow of 500 sccm with Vacuum Pressure (Pvac) = Atmospheric Pressure (Pa)
Figure 2-10: Theoretical Gas Flow Model Calculations for FTIR Gas Analysis Module in Series with the Commercial Gas Sensor Analysis Module
Figure 3-1: Energy Diagram Highlighting Transitions from Electronic Ground State to Electronic Excited State with Rotational and Vibrational Transitions
Figure 3-2: Beam Path from IR Source Highlighting the use of a Fixed Mirror and Moving Mirror to Construct an Interferogram Time Domain Signal Containing the IR Source Information
Figure 3-3: Typical Atmospheric Background Spectra Highlighting CO2 and H2O Interference, 16 Co-added Scans with Resolution 0.5 cm^-1
Figure 3-4: CO Fundamental Vibrational Mode, k1 = 2143 cm^-1
Figure 3-5: IR Absorbance Spectra for CO from QASoft Database
Figure 3-6: CO2 First Fundamental Vibrational Mode, k1 = 1340 cm^-1
Figure 3-7: CO2 Second Fundamental Vibrational Frequency, k2 = 667 cm^-1
Figure 3-8: CO2 Third Fundamental Vibrational Frequency, k3 = 2350 cm^-1
Figure 3-9: IR Absorbance Spectra for CO2 from QASoft Database
Figure 3-10: H2O Second Fundamental Vibrational Frequency, k2 = 1595 cm^-1
Figure 3-11: IR Absorbance Spectra for H2O from QASoft Database
Figure 4-1: Variance-Covariance Matrix, [Z](p x p), Calculated from Mean Centered Data Matrix, [XM](n x p)
Figure 4-2: Formaldehyde (CH2O) and Acrolein (C3H4O) Complete Pure Spectra
Figure 4-3: CH2O/C3H4O Simplified Pure Spectra for Illustration of PCA Application to FTIR Spectroscopy Data
Figure 4-4: CH2O/C3H4O Calibration Data Set, [XC](10 x 7); Note - Only 5 of 10 Calibration Spectra Shown
Figure 4-5: CH2O/C3H4O Prediction Data Set, [XP](5 x 7)
Figure 4-6: CH2O/C3H4O Calibration Data Set, Mean Centered, [XM](10 x 7); Note - Only 5 of 10 Calibration Spectra Shown
Figure 4-7: CH2O/C3H4O Variance-Covariance Matrix, [Z](7 x 7)
Figure 4-8: CH2O/C3H4O Eigenvalues from Solving Equation 4.2
Figure 4-9: SCREE Plot Indicating Eigenvalues for Each Principal Component; Principal Components 1 and 2 Explain 77.1% and 22.9% of the Total Calibration Data Set Variance
Figure 4-10: CH2O/C3H4O Loadings Matrix, [V](7 x 7)
Figure 4-11: CH2O/C3H4O Scores Matrix, [S](10 x 10)
Figure 4-12: CH2O/C3H4O Estimates of Regression Coefficients from Calibration Data Set, [b](7 x 1)
Figure 4-13: Principal Component Regression (PCR) for CH2O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 1.1 x 10^-4 ppm, RMSE Prediction = 1.1 x 10^-4 ppm
Figure 4-14: PCA Separated CH2O Spectra in CH2O/C3H4O Gas Mixtures
Figure 4-15: PCA Separated C3H4O Spectra in CH2O/C3H4O Gas Mixtures
Figure 4-16: PCC C3H4O Calibration Data Set; Note - Only 5 of 10 Calibration Spectra Shown
Figure 4-17: PCC C3H4O Prediction Data Set
Figure 4-18: PCC CH2O Calibration Data Set; Note - Only 5 of 10 Calibration Spectra Shown
Figure 4-19: PCC CH2O Prediction Data Set
Figure 4-20: PCC CH2O Calibration Data Set with No Baseline Correction
Figure 4-21: PCC Calibration Data Set with Baseline Correction
Figure 4-22: PCC CH2O Prediction Data Set; RMSE Calibration = 2.6 x 10^-2 ppm, RMSE Prediction = 2.6 x 10^-2 ppm
Figure 5-1: H2O2 in Aqueous Solution Calibration Data Set 1, [XC](8 x 801); Note - Only 4 of 8 Calibration Spectra Shown, 0%, 10%, 20%, 30% H2O2
Figure 5-2: H2O2 in Aqueous Solution Prediction Data Set 1, [XP](1 x 801); 63.7% H2O2
Figure 5-3: SCREE Plot Indicating Eigenvalues for Each Principal Component for H2O2 in Aqueous Solution Data Set 1; Principal Components 1 and 2 Explain 67.5% and 28.5% of the Total Calibration Data Set Variance
Figure 5-4: H2O2 in Aqueous Solution Mixtures - Reduced Principal Component Loadings, [V*]; V-1 represents the variable in the original data set contributing the most variance within the spectra, the H2O component, V-2 represents the variable in the original data set contributing the second most variance, the H2O2 component
Figure 5-5: H2O2 in Aqueous Solution Mixtures - Estimates of Regression Coefficients for wt.% of H2O2 from Calibration Data Set 1 as Function of Wavenumber, [bH2O2](801 x 1)
Figure 5-6: Principal Component Regression (PCR) - H2O2 Concentrations for H2O2 in Aqueous Solution Mixtures; RMSE Calibration = 2.1 wt.%, RMSE Prediction = 12.0 wt.%
Figure 5-7: H2O2 RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set for H2O2 in Aqueous Solution Mixtures Data Set 1
Figure 5-8: PCC H2O2 Calibration Data Set 1; Note - Only 3 of 8 Calibration Spectra Shown, 0%, 20%, 30% H2O2
Figure 5-9: PCC H2O2 Prediction Data Set 1; 63.7% H2O2
Figure 5-10: PCC H2O Calibration Data Set 1; Note - Only 2 of 8 Calibration Spectra Shown, 70%, 100% H2O
Figure 5-11: PCC H2O Prediction Data Set 1; 36.3% H2O
Figure 5-12: PCC H2O2 Calibration Data Set 1 with No Baseline Correction
Figure 5-13: PCC H2O Calibration Data Set 1 with No Baseline Correction
Figure 5-14: PCC H2O2 Calibration Data Set 1 with Baseline Correction
Figure 5-15: PCC H2O Calibration Data Set 1 with Baseline Correction
Figure 5-16: PCC Model for wt.% of H2O2 in Aqueous Solution with Variable Total Amount in Data Set 1; kH2O2 = 0.19, kH2O = 0.67; RMSE Calibration = 4.1 wt.%, RMSE Prediction = 5.2 wt.%
Figure 5-17: SCREE Plot Indicating Eigenvalues for Each Principal Component for H2O2 in Aqueous Solution Data Set 2; Principal Components 1 and 2 Explain 61% and 34% of the Total Calibration Data Set Variance
Figure 5-18: Principal Component Regression (PCR) - H2O2 Concentrations for H2O2 in Aqueous Solution with Variable Total Amount in Data Set 2; RMSE Calibration = 10.7 wt.%, RMSE Prediction = 3.9 wt.%
Figure 5-19: PCC Model for wt.% of H2O2 in Aqueous Solution with Variable Total Amount in Data Set 2; kH2O2 = 0.65, kH2O = 0.40; RMSE Calibration = 5.9 wt.%, RMSE Prediction = 6.0 wt.%
Figure 6-1: CH2O/C3H4O Gas Mixtures Calibration Data Set, [XC](8 x 2655); Note - Only 3 of 8 Calibration Spectra Shown
Figure 6-2: CH2O/C3H4O Gas Mixtures Prediction Data Set, [XP](3 x 2655)
Figure 6-3: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH2O/C3H4O Gas Mixtures Data Set; Principal Components 1 and 2 Explain 98.4% and 1.6% of the Total Calibration Data Set Variance
Figure 6-4: Principal Component Regression (PCR) - CH2O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 0.0 ppm, RMSE Prediction = 0 ppm
Figure 6-5: Principal Component Regression (PCR) - C3H4O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 0.0 ppm, RMSE Prediction = 0 ppm
Figure 6-6: PCA Separated CH2O Spectra in CH2O/C3H4O Gas Mixtures
Figure 6-7: PCA Separated C3H4O Spectra in CH2O/C3H4O Gas Mixtures
Figure 6-8: PCC CH2O Calibration and Prediction Data Sets in CH2O/C3H4O Gas Mixtures; kCH2O = 0.08; RMSE Calibration = 1 ppm, RMSE Prediction = 2 ppm
Figure 6-9: PCC C3H4O Calibration and Prediction Data Sets in CH2O/C3H4O Gas Mixtures; kC3H4O = 0.32; RMSE Calibration = 0 ppm, RMSE Prediction = 2 ppm
Figure 6-10: CO/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 2636); Note - Only 4 of 10 Calibration Spectra Shown
Figure 6-11: CO/CO2 Gas Mixtures Prediction Data Set, [XP](5 x 2636)
Figure 6-12: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2 Gas Mixtures Data Set; Principal Components 1 and 2 Explain 99.1% and 0.9% of the Total Calibration Data Set Variance
Figure 6-13: Principal Component Regression (PCR) - CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm
Figure 6-14: Principal Component Regression (PCR) - CO2 Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm
Figure 6-15: PCA Separated CO Spectra in CO/CO2 Gas Mixtures
Figure 6-16: PCA Separated CO2 Spectra in CO/CO2 Gas Mixtures
Figure 6-17: PCC CO Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO = 0.75; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm
Figure 6-18: PCC CO2 Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO2 = 0.06; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm
Figure 6-19: CO/CO2 Gas Mixtures Calibration Data Set, [XC](8 x 5001); Note - Only 4 of 8 Calibration Spectra Shown
Figure 6-20: CO/CO2 Gas Mixtures Prediction Data Set, [XP](4 x 5001)
Figure 6-21: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2 Gas Mixtures Data Set; Principal Components 1 and 2 Explain 97.9% and 0.9% of the Total Calibration Data Set Variance
Figure 6-22: Principal Component Regression (PCR) - CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 92 ppm, RMSE Prediction = 49 ppm
Figure 6-23: Principal Component Regression (PCR) - CO2 Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 0.7 ppm, RMSE Prediction = 0.8 ppm
90 Figure 6-24: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2 Gas Mixtures ......................................................................................................................... 91 Figure 6-25: Principal Component Regression (PCR) with 3 Principal Components ? CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 16 ppm, RMSE Prediction = 14 ppm ....................................................................................................... 92 Figure 6-26: Principal Component Regression (PCR) with 4 Principal Components ? CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 8 ppm, RMSE Prediction = 14 ppm ....................................................................................................... 92 Figure 6-27: CO/CO2 Gas Mixtures ? Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 2 Principal Components Used .................................................................. 93 Figure 6-28: CO/CO2 Gas Mixtures ? Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 3 Principal Components Used .................................................................. 94 Figure 6-29: CO/CO2 Gas Mixtures ? Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 4 Principal Components Used .................................................................. 94 Figure 6-30: PCA Separated CO Spectra in CO/CO2 Gas Mixtures Represented by V-2 ......... 95 Figure 6-31: PCA Separated CO2 Spectra in CO/CO2 Gas Mixtures Represented by V-1 ........ 
96 Figure 6-32: PCC CO Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO = 20; RMSE Calibration = 204 ppm, RMSE Prediction = 194 ppm ....................... 97 xviii Figure 6-33: PCC CO2 Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO2 = 0.25; RMSE Calibration = 1 ppm, RMSE Prediction = 1 ppm .......................... 97 Figure 6-34: Formaldehyde (CH2O), Acrolein (C3H4O), and Water (H2O) Pure Spectra for Simulated Data Sets Illustrating Spectral Overlap Between All Three Components .... 99 Figure 6-35: CH2O/C3H4O/H2O Gas Mixtures Calibration Data Set, [XC](8 x 2655); Note ? Only 3 of 8 Calibration Spectra Shown ....................................................................... 100 Figure 6-36: CH2O/C3H4O/H2O Gas Mixtures Prediction Data Set, [XP](3 x 2655) ................... 101 Figure 6-37: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH2O/C3H4O/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 71.9%, 26.9%, and 1.2%, respectively, of the Total Calibration Data Set Variance ....................................................................................................................... 102 Figure 6-38: Principal Component Regression (PCR) ? CH2O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm ....................................................................................................... 103 Figure 6-39: Principal Component Regression (PCR) ? H2O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm ....................................................................................................... 103 Figure 6-40: Principal Component Regression (PCR) ? C3H4O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm ........................................................................................................ 
104 Figure 6-41: PCA Separated CH2O Spectra in CH2O/C3H4O/H2O Gas Mixtures .................. 105 Figure 6-42: PCA Separated H2O Spectra in CH2O/C3H4O/H2O Gas Mixtures ..................... 105 Figure 6-43: PCA Separated C3H4O Spectra in CH2O/C3H4O/H2O Gas Mixtures ................. 106 Figure 6-44: PCC CH2O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kCH2O = 0.08; RMSE Calibration = 1 ppm, RMSE Prediction = 2 ppm ...... 107 Figure 6-45: PCC H2O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kH2O = 1.25; RMSE Calibration = 9 ppm, RMSE Prediction = 6 ppm ....... 107 Figure 6-46: PCC C3H4O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kC3H4O = 0.39; RMSE Calibration = 1 ppm, RMSE Prediction = 1 ppm .... 108 Figure 6-47: CO/CO2/H2O Gas Mixtures Calibration Data Set, [XC](8 x 1001); Note ? Only 2 of 8 Calibration Spectra Shown ................................................................................... 110 xix Figure 6-48: CO/CO2/H2O Gas Mixtures Prediction Data Set, [XP](4 x 1001); Note ? Only 2 of 4 Prediction Spectra Shown ..................................................................................... 111 Figure 6-49: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 69.7%, 12.1%, and 7.7% of the Total Calibration Data Set Variance ......................... 112 Figure 6-50: Principal Component Regression (PCR) ? CO Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 14 ppm, RMSE Prediction = 18 ppm ..................................................................................................... 113 Figure 6-51: Principal Component Regression (PCR) ? CO2 Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 2 ppm, RMSE Prediction = 4 ppm ....................................................................................................... 
113 Figure 6-52: Principal Component Regression (PCR) ? H2O Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 157 ppm, RMSE Prediction = 519 ppm ................................................................................................... 114 Figure 6-53: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2/H2O Gas Mixtures ......................................................................................... 115 Figure 6-54: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 73.4%, 12.6%, and 6.2% of the Total Calibration Data Set Variance ......................... 117 Figure 6-55: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2/H2O Gas Mixtures with the IR Spectral Data Truncated to Include Only Contributions from CO and CO2 (2500 cm-1 to 2075 cm-1) for PCA .......................................................... 118 Figure 7-1: FTIR Scans of Engine Oil Samples at Temperatures of Greatest Mass Loss used for PCA; BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 at 306 ?C (583 ?F), 301 ?C (574 ?F), 306 ?C, and 326 ?C (619 ?F), respectively .................................................................................................................. 121 Figure 7-2: SCREE Plot Indicating Eigenvalues for Each Principal Component for Engine Oil Data Set 1; Principal Components 1, 2, and 3 Explain 76.1%, 21.6%, and 2.3% of the Total Engine Oil Data Set 1 Variance ............................................................... 
122 Figure 7-3: Mass Spectrometry (MS) Data of Engine Oil Samples at Temperatures of Greatest Mass Loss; BP Turbo Oil 274, and Mobile Jet Oil II at 301 ?C (574 ?F), and 306 ?C (578 ?F), respectively ................................................................................ 123 xx Figure 7-4: Mass Spectrometry (MS) Data Database Files for Formaldehyde (CH2O), Methanol (CH4O), and Carbon Dioxide (CO2) ............................................................ 123 Figure 7-5: Formaldehyde (CH2O), Methanol (CH4O), and Carbon Dioxide (CO2) Pure Spectra for Simulated Data Set Illustrating Spectral Overlap Between CH2O and CH4O ............................................................................................................................ 124 Figure 7-6: CH4O/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 1763); Note ? Only 3 of 10 Calibration Spectra Shown ..................................................................... 125 Figure 7-7: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH4O/CH2O/CO2 Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 86.8%, 95.4%, and 4.6%, respectively, of the Total Calibration Data Set Variance ....................................................................................................................... 126 Figure 7-8: Principal Component Regression (PCR) ? CH4O Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0.0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 206, 221, 214, and 292 ppm, respectively ............... 127 Figure 7-9: Principal Component Regression (PCR) ? CH2O Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0.0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 145, 190, 142, and 263 ppm, respectively ............... 127 Figure 7-10: Principal Component Regression (PCR) ? 
CO2 Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0.0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 30, 49, 20, and 63 ppm, respectively ....................... 128 Figure 7-11: Predicted Spectra for BP Turbo Oil 2380 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.4% ................... 129 Figure 7-12: Predicted Spectra for BP Turbo Oil 274 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.1% ................... 129 Figure 7-13: Predicted Spectra for Mobile Jet Oil II based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7% ................... 130 Figure 7-14: Predicted Spectra for Aeroshell Turbine Oil 560 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.9% ................... 130 xxi Figure 7-15: Illustration of Mutual Interactions within the Hydroxyl Group via Hydrogen Bonded Methanol (CH4O) Clusters that Leads to O-H Bong Lengthening and C-O Bond Shortening Explaining the Observed Red and Blue Wavenumber Shifts, Respectively, within the CH4O IR Spectra Component of the Engine Oil at High Temperature ................................................................................................................... 132 Figure 7-16: Comparison of Actual Methanol (CH4O) Spectra to Modified CH4O Spectra with the O-H Stretching Bands Red Shifted and the C-O Stretching Bands Blue Shifted Bands Due to High Temperature Disturbance of Hydrogen Bonds ................ 134 Figure 7-17: CH4O (Modified)/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 1763); Note ? Only 3 of 10 Calibration Spectra Shown ................................... 135 Figure 7-18: Modified Predicted Spectra for BP Turbo Oil 2380 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.1% ................... 
136 Figure 7-19: Modified Predicted Spectra for BP Turbo Oil 274 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7% ................... 136 Figure 7-20: Modified Predicted Spectra for Mobile Jet Oil II based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.2% ................... 137 Figure 7-21: Modified Predicted Spectra for Aeroshell Turbine Oil 560 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.4% .............................................................................................................. 137 Figure 7-22: Truncated (3200 ? 1600 cm-1) CH4O/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 830); Note ? Only 3 of 10 Calibration Spectra Shown % ................. 138 Figure 7-23: Modified Predicted Spectra for BP Turbo Oil 2380 based on Truncated (3200 ? 1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.1% ............................................................................................. 139 Figure 7-24: Modified Predicted Spectra for BP Turbo Oil 274 based on Truncated (3200 ? 1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7% ............................................................................................. 140 Figure 7-25: Modified Predicted Spectra for Mobile Jet Oil II based on Truncated (3200 ? 1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.3% ............................................................................................. 140 Figure 7-26: Modified Predicted Spectra for Aeroshell Turbine Oil 560 based on Truncated (3200 ? 1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.4% ............................................................................................. 
141 xxii Figure 7-27: Methanol (CH4O), Formaldehyde (CH2O), Carbon Dioxide (CO2), Carbon Monoxide (CO), and Water (H2O) Pure Spectra for Simulated Data Set Illustrating Spectral Overlap Between CH4O, CH2O, and H2O ..................................................... 143 Figure 7-28: BP Turbo Oil 2390 Time Evolved Spectra Used for Prediction Data Set, [XP](20 x 1763); Note ? Only 2 of 20 Prediction Spectra Shown, Time 5 minutes and 90 minutes .................................................................................................................. 144 Figure 7-29: Principal Component Regression (PCR) Calculated Gas Concentrations for CH4O, CH2O, and CO of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off ...................... 145 Figure 7-30: Principal Component Regression (PCR) Calculated Gas Concentrations for CO2 of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off ............................................................. 146 Figure 7-31: Principal Component Regression (PCR) Calculated Gas Concentrations for H2O of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off ............................................................. 146 Figure 7-32: Predicted Spectra for BP Turbo Oil 2380 at Time = 10 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 2.7% .............................................................................................................. 147 Figure 7-33: Predicted Spectra for BP Turbo Oil 2380 at Time = 30 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 8.4% ............................................................................................................................. 
148 Figure 7-34: Predicted Spectra for BP Turbo Oil 2380 at Time = 60 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 9.2% .............................................................................................................. 148 Figure 7-35: Predicted Spectra for BP Turbo Oil 2380 at Time = 90 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 9.0% .............................................................................................................. 149 Figure 7-36: Principal Component Regression (PCR) Calculated Gas Concentrations for CH4O, CH2O, and CO of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off ...................... 150 Figure 7-37: Principal Component Regression (PCR) Calculated Gas Concentrations for H2O of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off ............................................................. 150 xxiii Figure 7-38: Predicted Spectra for BP Turbo Oil 2380 at Time = 10 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 2.2% ................................................................ 151 Figure 7-39: Predicted Spectra for BP Turbo Oil 2380 at Time = 30 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.2% ................................................................ 152 Figure 7-40: Predicted Spectra for BP Turbo Oil 2380 at Time = 60 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.8% ................................................................ 152 Figure 7-41: Predicted Spectra for BP Turbo Oil 2380 at Time = 90 min. 
based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.5% ................................................................ 153 xxiv List of Abbreviations and Mathematical Symbols ACER Airliner Cabin Environment Research Center AMU Atomic Mass Unit BTI Business Travel Index CFD Computational Fluid Dynamics CH2O Formaldehyde CH4O Methanol C3H4O Acrolein CO Carbon Monoxide CO2 Carbon Dioxide FAA Federal Aviation Administration FTIR Fourier Transform Infrared H2O Water H2O2 Hydrogen Peroxide HEPA High Efficiency Particulate Air IR Infrared MS Mass Spectrometry NBTA National Business Travel Association O3 Ozone OHRCA Occupational Health Research Consortium in Aviation xxv OSHA Occupational Safety and Health Administration Pa Pascal PCA Principal Component Analysis PCC Proportionality Constant Calculation PCR Principal Component Regression PPM Parts per Million PSI Pound per Square Inch RMSE Root Mean Square Error RMSEC Root Mean Square Error Calibration RMSEP Root Mean Square Error Prediction SARS Severe Acute Respiratory Syndrome SCCM Standard Cubic Centimeter S/N Signal-to-Noise STP Standard Temperature and Pressure SVD Singular Value Decomposition TCP Tricresyl Phosphate TGA Thermogravimetric Analysis TWA Time Weighted Average VPHP Vapor Phase Hydrogen Peroxide XPS X-ray Photoelectron Spectroscopy 1 Chapter 1: Introduction In today?s turbulent airline industry, a company?s survival is highly dependent on maintaining strict operational cost controls. With the current global economic downturn not expected to recover for some time, the operational cost driver is becoming even more important in order for a company to maintain expected overall profit margins and in some cases to reduce massive operational net losses with declining revenues. 
The Business Travel Index (BTI) indicated that overall business travel, a major revenue source for the airline industry, would grow only slightly in 2010, by 3.8% compared to 2009 [1]. Michael McCormick, National Business Travel Association (NBTA) Executive Director and COO, recently said in an interview, "we're looking forward to the end of 2012 - when the industry should see a return to peak levels" [1]. In the long term, the Federal Aviation Administration (FAA) has projected that one billion passengers will be flying per year in 2023 and that the number of passengers will continue to grow [2]. Even though the industry has a strong focus on operational cost controls to maintain profitability, this should not come at the expense of safety. Terrorist attacks utilizing aircraft, such as those against the World Trade Center on September 11, 2001, are a global safety concern that the airline industry has worked hard to prevent and that has also required significant contributions from many national governments in the form of national defense intelligence gathering and information sharing [3, 4]. In addition, both domestic and international transmission of diseases on aircraft, such as severe acute respiratory syndrome (SARS) and the recent outbreak of H1N1 flu in Latin America, is a major concern as well [5-7]. These two focus points regarding safety currently receive the majority of media attention, and rightfully so, but there is also a safety risk that many passengers as well as a number of airline employees are just beginning to realize. The issue of air quality within the aircraft cabin is receiving increasing attention from both pilot and flight attendant unions [7]. This is due to exposure events caused by poor air quality that in some cases may have contained toxic oil components due to bleed air that flows from outside the aircraft through the engines into the aircraft cabin [8-10].
The system that supplies bleed air to the aircraft is shown schematically in Figure 1-1.

Figure 1-1: Commercial Aircraft Bleed Air System [11]

Exposure to contaminated air has been suggested as the primary cause of significant short- and long-term medical issues for aircraft crew in cases where a leak in the bleed air system was thought to have occurred during flight. In 2009, the FAA Office of Aviation Medicine collaborated with the Occupational Health Research Consortium in Aviation (OHRCA) as well as the Airliner Cabin Environment Research Center (ACER) to create a guide for health care providers to deal with this specific issue [12]. The need for air quality monitoring is especially evident in the fact that aircraft currently carry no sensors to monitor the cabin air quality and potentially harmful gas levels (detect-to-warn sensors), much less systems to monitor and purify the cabin air (detect-to-treat sensors) [13]. In 2009, the FAA estimated the total number of aircraft in the U.S. commercial fleet was 7,132, with 3,666 mainline passenger aircraft and 2,612 regional carrier aircraft [7]. With this many aircraft currently in service, it is not feasible to replace the current fleet with new model aircraft that do not utilize a bleed air system, such as the Boeing 787 Dreamliner [14].

Research Purpose

The specific purpose of this research is to utilize a mathematical technique called principal component analysis (PCA) in conjunction with principal component regression (PCR) and proportionality constant calculations (PCC) to simplify complex, multi-component infrared (IR) spectra data sets into a reduced data set used for determination of the concentrations of the individual components [15]. This can significantly simplify data analysis as well as improve the ability to determine concentrations of individual target species in gas mixtures where significant band overlap occurs in the IR spectrum region.
PCR is a mathematical technique that determines component concentrations of a prediction data set based on multivariate regression of a calibration data set. For PCC, the total integrated intensity of an IR absorbance band is assumed to increase linearly with the concentration of the corresponding component in a mixture. Based on this assumption, one can determine the proportionality constant for each of the individual components in a calibration data set. With the proportionality constants known for each component, a relationship to calculate the prediction data set component concentrations can be derived. In some cases where the overall volume is not constant, the PCC technique is expanded to perform the analysis on volume and integrated area fractions instead of components. Application of this analytical numerical technique to IR spectrum analysis is important in improving performance of commercial sensors that airlines and aircraft manufacturers could potentially use in an aircraft cabin environment for multi-gas component monitoring.

Research Approach

The approach of this research is to utilize PCA with both simulation and experimental results in conjunction with PCR and PCC to determine quantitatively the component concentrations within a mixture. In the simulation data sets, pure spectra from the QASoft database (Infrared Analysis, Inc., Anaheim, CA, USA) are used. To form a simulated mixture, pure spectra are added together and different multiplication factors are applied to achieve a range of component concentrations. The simulated data sets consist of various spectra of two and three targeted component systems. The experimental data sets consist of both two and three targeted component systems that could potentially be present as air contaminants in an aircraft cabin.
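The simulated-mixture construction, the PCR step, and the PCC linearity assumption described above can be illustrated with a minimal Python sketch. This is not the MATLAB code of Appendix A, and it rests on several assumptions: the two Gaussian bands are made-up stand-ins for database pure spectra, the function names (gaussian_band, simulate_mixtures, pcr_fit, pcr_predict, pcc_fit) are hypothetical, PCR is implemented here as an SVD of the mean-centered calibration spectra followed by least-squares regression on the scores, and PCC is shown only for a single isolated band.

```python
import numpy as np

def gaussian_band(wavenumbers, center, width):
    """Hypothetical absorbance band shape, used only to fabricate test spectra."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)

def simulate_mixtures(pure_spectra, concentrations):
    """Add pure spectra together, each scaled by its multiplication factor."""
    return concentrations @ pure_spectra  # (n_mixtures x n_wavenumbers)

def pcr_fit(x_cal, c_cal, n_components):
    """PCA of the mean-centered spectra via SVD, then regression on the scores."""
    x_mean, c_mean = x_cal.mean(axis=0), c_cal.mean(axis=0)
    _, _, vt = np.linalg.svd(x_cal - x_mean, full_matrices=False)
    loadings = vt[:n_components].T            # (n_wavenumbers x k)
    scores = (x_cal - x_mean) @ loadings      # (n_samples x k)
    q, *_ = np.linalg.lstsq(scores, c_cal - c_mean, rcond=None)
    return x_mean, c_mean, loadings @ q       # regression coefficients per wavenumber

def pcr_predict(model, x_new):
    """Apply the calibration model to new spectra."""
    x_mean, c_mean, coeffs = model
    return (x_new - x_mean) @ coeffs + c_mean

def rmse(predicted, actual):
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def pcc_fit(areas, concentrations):
    """Least-squares slope through the origin for area = k * concentration."""
    return float(areas @ concentrations / (concentrations @ concentrations))

# Two overlapping, made-up bands standing in for database pure spectra.
wavenumbers = np.linspace(2000.0, 2500.0, 501)
pure = np.vstack([gaussian_band(wavenumbers, 2200.0, 40.0),
                  gaussian_band(wavenumbers, 2260.0, 40.0)])

# Assumed concentrations (arbitrary units) for calibration and prediction sets.
c_cal = np.array([[10, 90], [20, 50], [40, 70], [60, 10],
                  [80, 30], [90, 60], [30, 20], [50, 100]], dtype=float)
c_pred = np.array([[25, 75], [55, 45], [70, 15]], dtype=float)
x_cal = simulate_mixtures(pure, c_cal)
x_pred = simulate_mixtures(pure, c_pred)

model = pcr_fit(x_cal, c_cal, n_components=2)
rmse_cal = rmse(pcr_predict(model, x_cal), c_cal)
rmse_pred = rmse(pcr_predict(model, x_pred), c_pred)

# PCC on a single isolated band: integrated area grows linearly with amount.
step = wavenumbers[1] - wavenumbers[0]
single_conc = np.array([10.0, 20.0, 40.0, 80.0])
single_areas = (single_conc[:, None] * pure[0]).sum(axis=1) * step
k = pcc_fit(single_areas, single_conc)
pcc_estimate = (35.0 * pure[0]).sum() * step / k  # area of an "unknown" divided by k

print(f"PCR RMSE calibration = {rmse_cal:.2e}, prediction = {rmse_pred:.2e}")
print(f"PCC proportionality constant k = {k:.3f}, estimate = {pcc_estimate:.1f}")
```

Because these simulated mixtures are noise-free linear combinations of two spectra, two principal components capture all of the variance and both RMSE values are zero to within floating-point error. Experimental spectra with noise or baseline drift would not behave this way and would generally call for a deliberate choice of the number of components.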
In addition, experimental data sets are analyzed for a hydrogen peroxide (H2O2) aqueous solution mixture to determine H2O2 concentrations at various levels that could be produced during use of a vapor phase hydrogen peroxide (VPHP) decontamination system. After the PCA application to two and three component systems, the analysis technique is further expanded to include the monitoring of potential bleed air contaminants from engine oil combustion, in which a simulation data set is utilized to predict gas components and concentrations in unknown engine oil samples. For the analysis of combusted aircraft engine oils, a simulated data set is used for PCA to determine regression coefficients for PCR, which are then applied to the experimentally obtained data.

Organization of Dissertation

This dissertation contains a systematic utilization of PCA in conjunction with Fourier Transform infrared (FTIR) spectroscopy data for components that are applicable to the airline industry. Chapter 2 provides a detailed background of commercial aircraft systems that are applicable to this research. The background information includes a discussion of environmental air contaminants and bleed air contaminants that could potentially be present in aircraft cabins due to circulation of outside air or of combusted engine oils. Also within Chapter 2, the experimental aircraft cabin simulation environment is detailed. Next, Chapter 3 presents a detailed overview of FTIR spectroscopy. This discussion provides theoretical background on the FTIR spectroscopy technology as well as details on the experimental procedure and materials used in the collection of FTIR spectroscopy data. Chapter 3 also provides in-depth background on the characteristic IR modes of vibration of the key molecules of interest to this research: CO, CO2, and H2O. Chapter 4 then provides a detailed mathematical background of PCA as well as the associated PCR and PCC techniques.
Chapter 5 highlights the application of PCA to FTIR spectroscopy data from solutions to calculate H2O2 concentrations in an aqueous solution that could potentially be present during a VPHP decontamination event. The PCC technique for a variable volume of solution is utilized in this analysis. Chapter 6 then uses PCA to identify the individual component spectra within a multi-component system with both simulated and experimental FTIR spectroscopy data sets. Chapter 6 includes the application of PCA to the monitoring of concentrations of environmental air contaminants. Chapter 7 concludes the PCA of FTIR spectroscopy data with an application of the technique to the monitoring of potential bleed air contaminants within the aircraft cabin, which requires the use of simulated data sets to predict the components and concentrations of gas species. The PCR technique is used to determine the quantitative concentrations of the individual components of the system; these concentrations are used to calculate predicted spectra for the oils, which are then compared to the original spectra to determine the experimental error. Chapter 8 summarizes the research findings and contains concluding remarks about the potential scientific impact of the results found within this dissertation. Chapter 9 contains a brief discussion of potential future work that could further both scientific and engineering understanding of PCA application to FTIR spectroscopy. The references cited are found at the end of this dissertation. Appendix A contains the MATLAB® source code used to perform the matrix manipulations for PCA.

Chapter 2: Commercial Aircraft Systems Overview

Commercial Aircraft Background

Within the typical aircraft, numerous systems are responsible for maintaining a stable and desirable cabin environment.
This project focuses on the main subsystems of the aircraft environmental control system (ECS) relating to cabin air quality, which are the bleed air controls, air conditioning pack, mix manifold, recirculation devices, and the cabin vents [11, 16]. The ECS takes in outside air through the bleed air controls while the aircraft is in operation. This outside air is compressed to 220 kPa (32 psi) and rises to a temperature of 160 °C (320 °F). This system, shown in Figure 1-1, has a number of valves and heat exchangers that condition the air to a desirable temperature and pressure for the other flight systems. During flight, air entering the bleed air system could potentially have high concentrations of ozone (O3) due to elevated atmospheric concentrations at the flight altitude. Typical levels of O3 in the outside air range from 0.5 to 1.0 parts per million (ppm) [11, 17, 18]. Most of this O3 dissociates as it passes through the compression stages of the engine and the catalytic converter, but significant and harmful amounts have been measured in simulated aircraft cabin environments [17, 18]. After air leaves the bleed air system, it enters the air conditioning (AC) packs, where it is cooled to a temperature of about 15 °C (59 °F) and decompressed to a pressure of 78-82 kPa before continuing to the mix manifold. This pressure range corresponds to the typical aircraft cabin altitude setting of 6,000-8,000 feet above sea level. Table 2-1 summarizes the changes in pressure in relation to altitude. Note that the air entering the mix manifold is not monitored for harmful gas concentrations. In addition to O3 that may still be present, the CO2 and CO proportional concentrations in the air are unchanged from outside levels.
Table 2-1: Relationship between Altitude and Pressure with the Typical Aircraft Cabin Pressure Highlighted

The change in total pressure has an effect that is described by Dalton's Law of Partial Pressures, shown in Equation 2.1, in which the partial pressure of a gas, $p_i$, is the product of the mole fraction of the gas, $X_i$, and the total pressure of the gas mixture, $P_T$.

$$p_i = X_i P_T \quad \text{(Dalton's Law of Partial Pressures)} \tag{2.1}$$

Because total pressure changes with altitude, the measured concentration of a gas of interest is affected whenever the sensor measurement principle is based on partial pressure, and the readings of an ideal gas partial pressure sensor change accordingly. As shown in Figure 2-1, an ideal partial pressure sensor used to measure CO2 levels at approximately 7,000 ft altitude, where the total pressure is approximately 77% that of sea level, would read pCO2 = 0.39 kPa for a true concentration of pCO2 = 0.51 kPa measured at standard temperature and pressure (STP).

Figure 2-1: Characteristics of Ideal CO2 Partial Pressure Sensor; Measures pCO2 = 0.39 at PT = 77 kPa (7000 ft Altitude) for pCO2 = 0.51 at PT = 101 kPa (Sea Level)

The HEPA filters present in the recirculation system are similar to those used in critical wards of a hospital and, when in relatively new condition, remove 99.97% of bacteria and viruses, at a particle size of 0.3 µm, that are produced or brought on board the aircraft by passengers [11, 16]. These filters, however, do not filter harmful gases such as CO or CO2 that may be present. The system attempts to control the levels of gases that may be present due to internal contamination of air within the aircraft through dilution with high quantities of outside air, as highlighted by the 10-15 outside air changes per hour shown in Figure 2-2 in comparison to hospital delivery and operating rooms as well as the typical building [16].
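The altitude effect described by Equation 2.1 can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, the 77 kPa cabin pressure is taken from the Figure 2-1 caption, and the sensor is assumed to be ideal.

```python
P_SEA_LEVEL_KPA = 101.325  # 1 atm in kPa


def partial_pressure(mole_fraction, total_pressure_kpa):
    """Equation 2.1: p_i = X_i * P_T."""
    return mole_fraction * total_pressure_kpa


def sea_level_equivalent(reading_kpa, cabin_pressure_kpa):
    """Rescale a partial pressure reading to its sea level equivalent."""
    return reading_kpa * P_SEA_LEVEL_KPA / cabin_pressure_kpa


# Mole fraction of CO2 that gives pCO2 = 0.51 kPa at sea level.
x_co2 = 0.51 / P_SEA_LEVEL_KPA

# The same mole fraction read by an ideal sensor at a 77 kPa cabin pressure.
cabin_reading = partial_pressure(x_co2, 77.0)
corrected = sea_level_equivalent(cabin_reading, 77.0)
print(f"cabin reading: {cabin_reading:.2f} kPa, corrected: {corrected:.2f} kPa")
```

The mole fraction is unchanged by altitude, so a reading from a partial pressure sensor has to be rescaled by the ratio of sea level pressure to cabin pressure before it can be compared with sea level equivalent limits.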
Figure 2-2: Recirculation System Outside Air Changes per Hour for Aircraft Compared with Other Environments [16]

The final subsystem the air passes through as it reaches the passengers is the cabin ventilation system shown in Figure 2-3. The airflow is directed from overhead air supply nozzles and extracted through return air grilles where the sidewall meets the floor along the length of the cabin. The air here has a typical temperature of 18-30 °C (64-86 °F) and a relative humidity of 10-20% [19]. For nearly all commercial aircraft, there are currently no sensors or monitoring for potentially harmful gases within the ventilation system, but recent computational fluid dynamics (CFD) simulation work has been conducted to determine the optimal position for sensors, when and if they are installed, to ensure the earliest possible warning to both flight crew and passengers [20].

Figure 2-3: Cabin Ventilation System Illustrating Typical Aircraft Recirculated Airflow [19]

Vapor Phase Hydrogen Peroxide (VPHP) Aircraft Cabin Decontamination

An increasing awareness of aircraft cabin sterilization for biological and chemical contaminants has led to the development of full-scale methods using VPHP [21, 22, 23]. VPHP at concentrations greater than 80 ppm has been shown to have sporicidal effects, while the typical aircraft cabin sterilization uses VPHP concentrations in the range of 150-600 ppm [21]. In addition, the typical VPHP process contains an initial dehumidification step that reduces the relative humidity to less than 10% [21]. Concentrations of the initial liquid condensing from the H2O2-H2O vapor can be as high as 50-75 wt.% H2O2 even though the original flash vaporized liquid is only 35 wt.%, and these high H2O2 concentrations in the condensate have been shown to increase susceptibility to hydrogen embrittlement for 4340 high strength steel [24]. For VPHP, three major processing parameters affect inactivation of microorganisms.
These three factors are sterilant concentration, exposure time, and percent saturation. Although commercial systems are available, monitoring the H2O2 and H2O conditions during operation requires specialized sensors, such as those from Analytical Technology, Inc. (Collegeville, PA, USA). These sensors, as well as others, are primarily used to monitor H2O2 and H2O in the gas phase. Although the occurrence of microcondensation can be detected with optical dew point sensors, accurate monitoring of the concentrations of condensates is not routinely practiced [25].

Potential Environmental Air Contaminants within the Aircraft Cabin

During flight operations, the air quality of an aircraft cabin is critical to crew and passenger safety and comfort. However, as noted previously, there are currently no environmental monitoring sensors present in the aircraft cabin. Because the aircraft cabin air is diluted with high quantities of outside air, toxic gases such as CO, CO2, and O3 are assumed to be below harmful levels. Even so, recent aircraft cabin studies have shown that formaldehyde (CH2O) has been specifically detected as a reaction product of ozone-initiated chemistry due to ozone reactions with human skin oils, hair, and clothing as well as the fabric within the aircraft cabin [26]. In [17], both CH2O and acrolein (C3H4O) resulting from O3 interactions were detected at concentrations exceeding their Occupational Safety and Health Administration (OSHA) recommended exposure limits. Table 2-2 highlights the limits that the FAA and OSHA currently place on some contaminants of interest that could potentially be found in aircraft cabin air [27]. In Table 2-2, the time weighted average (TWA) is the average concentration in a normal 8-hour workday and a 40-hour workweek. In addition, the ppm levels are sea level equivalent values.
Table 2-2: Limits on Contaminants that Could Potentially Be Found in Aircraft Cabin Air [27]

Contaminant             FAA Limit    OSHA Permissible Exposure Limit
Carbon Monoxide (CO)    50 ppm       50 ppm
Carbon Dioxide (CO2)    5000 ppm     5000 ppm
Ozone (O3)              0.1 ppm      0.1 ppm
Formaldehyde (CH2O)     N/A          0.75 ppm (TWA)
Acrolein (C3H4O)        N/A          0.1 ppm

* TWA: Average concentration in a normal 8-hour workday and a 40-hour workweek

Potential Bleed Air Contaminants within Aircraft Cabin

A bleed-air system within an aircraft is very beneficial in that the compressed air it produces can be used as a major power source for many environmental control systems, from de-icing the wings of a plane to pressurizing the cabin. A drawback of this system, though, is that it has the potential to allow contaminated air from the environment during taxiing operations, as well as noxious gases due to possible leaks of engine oil, hydraulic fluid, de-icing fluid, etc., into the aircraft cabin. In addition to forming due to O3 reactions, CH2O and C3H4O have been shown to form when engine oil is burned [28]. The oils and hydraulic fluids used in aircraft are also known to contain toxic chemicals, such as the irritant phenyl-alpha-naphthylamine and the neurotoxin tricresyl phosphate (TCP) [10, 29]. In 2000, measurement of various gases and volatile compounds from various engine oils showed that oil pyrolyzed at 525 °C (977 °F) generated significant amounts of CO2 and CO in excess of 100 ppm [30]. In addition, TCP was found within the samples using a gas chromatography (GC) laboratory measurement technique [30].

Laboratory Test Environment

The experimental setup consists of three major modules. The first is the Control Module, shown in Figure 2-4, which is responsible for control of pressure and the flow of both inert carrier gases and test gases of interest. The pressure setting within the system allows for testing of sensors at the various altitudes that are encountered in the airplane cabin environment.
The gas lines in the system are rated for vacuum pressures of 15 inches of mercury (380 mm Hg), or about 50% of atmospheric pressure (50.5 kPa), which corresponds to altitudes up to 12,000 feet (3,700 meters). The flow meters allow precise control of these gases and allow custom mixing ratios for sensor performance testing.

Figure 2-4: Laboratory Test Environment, Control Module

The second module, the Commercial Sensor Analysis Module, shown in Figure 2-5, is an enclosed, vacuum-sealed Plexiglas (PMMA) chamber with a total volume of 42.4 liters. With this module, an environment replicating the various airplane cabin conditions can be maintained to test commercial sensor performance with regard to detection of the gases of interest.

Figure 2-5: Laboratory Test Environment, Commercial Sensor Analysis Module

The final module, shown in Figure 2-6, is the FTIR Gas Analysis Module, which is used as the standard in evaluating commercial sensor performance as well as for independent gas analysis studies. This module contains a Spectrum GX FTIR System (Perkin Elmer, Shelton, CT, USA), as well as an M-5-22-V variable pathlength long path gas cell (Infrared Analysis, Inc., Anaheim, CA, USA). The optical path is folded in a volume of 8.5 liters, while the cell path length is determined by the number of passes times the base path length (56 cm). The FTIR spectrometer can scan over a wavenumber range from 10,000 cm-1 to 400 cm-1 with spectral resolutions from 64 cm-1 to 0.5 cm-1. The IR radiation is produced by a temperature stabilized wire coil that operates at 1350 K. The windows in the variable pathlength long path gas cell are made of potassium chloride (KCl) and are 4 mm thick. The detector for the IR beam is a fast recovery deuterated triglycine sulfate (FR-DTGS) module, which is standard for the mid-IR region of interest.
Figure 2-6: Laboratory Test Environment, FTIR Gas Analysis Module with Spectrum GX FTIR (Perkin Elmer, Shelton, CT, USA) with M-5-22-V Variable Pathlength Long Path Gas Cell (Infrared Analysis, Inc., Anaheim, CA, USA)

Shown in Figure 2-7 are the internal gold plated mirrors of the variable pathlength long path gas cell. Adjustments to the mirrors allow for multiple passes of the IR beam within the gas cell. Using a viewing window located on the top of the long path gas cell, a laser can be used to see the number of passes that the IR beam will make. The number of passes that the IR beam traverses in the long path gas cell is 4 times the number of laser dots found on the bottom row of the gold plated mirror when looking through the viewing window. The base pathlength of the M-5-22-V variable pathlength long path gas cell is 0.56 m, with a minimum of 4 passes and a maximum of 64 passes. These adjustments allow the pathlength within the long path gas cell to be varied from 2.24 m (4 passes) to 35.84 m (64 passes).

Figure 2-7: Internal Gold Plated Mirrors of the Variable Pathlength Long Path Gas Cell Highlighting the Path of the IR Beam within the Cell

With the flow controls on the Control Module and the known volumes of the Commercial Sensor Analysis Module and the FTIR Gas Analysis Module, a theoretical mixing model based on a differential equation can be constructed from the simplified system diagram shown in Figure 2-8. This model gives the expected concentration of gas at a given time during an experiment. The gases within the modules are assumed to be well mixed; this is accomplished using fans in the Commercial Sensor Analysis Module and by placing the gas input of the FTIR Gas Analysis Module sufficiently far away from the output.
Figure 2-8: System Block Diagram Illustrating Gas Flows through the Commercial Sensor Analysis Module to the FTIR Gas Analysis Module then Out through the Vacuum Pump

The basis for this theoretical gas flow model is the assumption that the change in concentration of a test gas within the Commercial Sensor Analysis Module, $dC/dt$, equals the inflow rate of the test gas, $F_{in}$, minus the outflow rate of the test gas (Equation 2.2). The total outflow, $F_{out}$, is multiplied by the concentration of the test gas at a given time, $C$, divided by the total volume of the module, $V$. This multiplication is necessary because the total outflow volume includes the carrier gases as well as the test gas. In addition, the outflow is a function of the ratio of atmospheric pressure, $P_a$ (1 atm = 101.325 kPa), to the vacuum pressure applied to the system, $P_{vac}$. The inflow rates for the test and carrier gases are set with the flow controllers.

$$\frac{dC}{dt} = F_{in} - F_{out}\left(\frac{C}{V}\right)\left(\frac{P_a}{P_{vac}}\right) \tag{2.2}$$

The second step in the derivation of the theoretical model is to use algebra and separation of variables, producing Equations 2.3 and 2.4.

$$\frac{dC}{dt} = -\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)\left[C - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right)\right] \tag{2.3}$$

$$\frac{dC}{C - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right)} = -\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)dt \tag{2.4}$$

Integrating Equation 2.4 gives Equation 2.5, with $k$ being a constant of integration.

$$\ln\left[C - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right)\right] = -\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)t + k \tag{2.5}$$

Next, taking the exponential of each side of Equation 2.5 produces the relationship shown in Equation 2.6 (with the constant $e^{k}$ again written as $k$), and further simplification gives the concentration as a function of time in Equation 2.7.

$$C - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right) = k\exp\left[-\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)t\right] \tag{2.6}$$

$$C = V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right) + k\exp\left[-\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)t\right] \tag{2.7}$$

From Equation 2.7 and the initial condition, the concentration of the test gas in the chamber, $C(0)$, at time $t = 0$, the value of the integration constant, $k$, can be found as shown in Equation 2.8.

$$k = C(0) - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right) \tag{2.8}$$

The final relationship for the test gas concentration within the chamber at a given time, $t$, is then given by Equation 2.9. An initial concentration of the test gas, $C(0)$, can be present before the gas flow begins, and this value must be measured either with a sensor in the Commercial Sensor Analysis Module or with the FTIR in the FTIR Gas Analysis Module.

$$C = V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right) + \left[C(0) - V\left(\frac{F_{in}}{F_{out}}\right)\left(\frac{P_{vac}}{P_a}\right)\right]\exp\left[-\left(\frac{F_{out}}{V}\right)\left(\frac{P_a}{P_{vac}}\right)t\right] \tag{2.9}$$

This model was used to predict the expected CO concentration within the FTIR Gas Analysis Module at a given time. Figure 2-9 compares the calculations derived from the simplified model with the FTIR measured CO concentrations for the given test parameters. For this particular experiment, the inflow gas was 800 ppm CO in N2, which corresponds to a CO flow rate of 0.40 standard cubic centimeters per minute (sccm) with the flow controller set to 500 sccm.

Figure 2-9: Theoretical Gas Flow Model Calculations Compared to Experimental Results for 800 ppm CO Test Gas in N2 with Total Inflow of 500 sccm with Vacuum Pressure (Pvac) = Atmospheric Pressure (Pa)

For the case when the FTIR Gas Analysis Module is in series with the Commercial Sensor Analysis Module, the calculated outflow from the Commercial Sensor Analysis Module is used as the inflow to the model for the FTIR Gas Analysis Module. This produces a lagging effect for both the theoretical and measured test gas concentrations in the FTIR Gas Analysis Module during the initial ramp up to the final steady-state test gas concentration.
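The closed-form solution in Equation 2.9 and the series arrangement can be sketched numerically. The Python below is a minimal illustration, not the author's implementation. It assumes that flows are in sccm (cm^3/min), volumes are in cm^3, time is in minutes, and that C is the volume of test gas in the module, so that C/V is a dimensionless fraction. The function names, the Euler time step, and the choice of Pvac = Pa for the series case are all assumptions made for this example.

```python
import math


def module_amount(t_min, f_in, f_out, volume, c0=0.0, p_vac=101.325, p_a=101.325):
    """Equation 2.9: test gas volume (cm^3) in one well-mixed module at time t."""
    steady = volume * (f_in / f_out) * (p_vac / p_a)
    rate = (f_out / volume) * (p_a / p_vac)
    return steady + (c0 - steady) * math.exp(-rate * t_min)


def series_ppm(t_end_min, f_in, f_out, v_chamber, v_cell, dt=0.01):
    """Explicit Euler integration of a chamber feeding a downstream cell.

    Assumes Pvac = Pa and zero initial test gas in both modules.
    Returns (chamber ppm, cell ppm) at t_end_min.
    """
    c_chamber = c_cell = 0.0
    for _ in range(int(round(t_end_min / dt))):
        out_chamber = f_out * c_chamber / v_chamber  # test gas leaving chamber
        out_cell = f_out * c_cell / v_cell           # test gas leaving cell
        c_chamber += (f_in - out_chamber) * dt
        c_cell += (out_chamber - out_cell) * dt      # chamber outflow feeds cell
    return 1e6 * c_chamber / v_chamber, 1e6 * c_cell / v_cell


F_IN, F_OUT = 0.40, 500.0            # sccm: CO flow and total flow from the text
V_CELL, V_CHAMBER = 8500.0, 42400.0  # cm^3: 8.5 L gas cell and 42.4 L chamber

cell_alone_ppm = 1e6 * module_amount(60.0, F_IN, F_OUT, V_CELL) / V_CELL
chamber_ppm, cell_ppm = series_ppm(30.0, F_IN, F_OUT, V_CHAMBER, V_CELL)
print(f"cell alone after 60 min: {cell_alone_ppm:.0f} ppm")
print(f"series after 30 min: chamber {chamber_ppm:.0f} ppm, cell {cell_ppm:.0f} ppm")
```

Under these assumptions a single module approaches the 800 ppm inlet concentration exponentially, while in the series case the gas cell concentration stays below the chamber concentration throughout the ramp up, which is the lagging effect described above.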
This lagging effect is shown graphically for the theoretical models in Figure 2-10.

Figure 2-10: Theoretical Gas Flow Model Calculations for FTIR Gas Analysis Module in Series with the Commercial Sensor Analysis Module

Chapter 3: Fourier Transform Infrared (FTIR) Spectroscopy Overview

FTIR Theoretical Background

FTIR analysis relies on the principle that all polyatomic molecules and heteronuclear diatomic molecules absorb IR radiation. When a molecule interacts with an IR source, it experiences a vibrational transition due to photon absorption, illustrated in red in Figure 3-1, which places the molecule in a higher energy state [31-33]. In addition, when a gas molecule absorbs IR energy, rotational transitions can occur in conjunction with the vibrational transitions. These rotational and vibrational transitions produce a number of relatively closely spaced absorption lines [34]. The total energy within a molecule, $E_{tot}$, is defined as the sum of three components (Equation 3.1): the energy due to rotation of the molecule, $E_{rot}$, the energy due to vibration of the atoms, $E_{vib}$, and the energy due to the motion of electrons, $E_{e^-}$.

$$E_{tot} = E_{rot} + E_{vib} + E_{e^-} \tag{3.1}$$

Figure 3-1: Energy Diagram Highlighting Transitions from Electronic Ground State to Electronic Excited State with Rotational and Vibrational Transitions

The physical properties of a molecule determine the pattern of absorption, defined by the IR wavelengths at which the species absorbs. The major physical properties that help define a characteristic IR absorption spectrum for a molecule are the number of atoms, the bond angles, and the bond strengths [31]. Each IR absorption band, which is due to a particular vibrational energy change, is composed of a number of relatively closely spaced absorption lines, and these components can be related to simultaneous rotational changes that accompany vibrational energy changes [31, 32].
For N atomic nuclei within a molecule, there are 3N-6 vibrational degrees of freedom for a nonlinear molecule and 3N-5 vibrational degrees of freedom for a linear molecule [33]. These degrees of freedom give the number of vibrational modes expected within a molecular structure. The absorption frequency depends on the molecular vibrational frequency, while the intensity of the molecular absorption depends on how efficiently IR energy is transferred to the molecule.

FTIR Measurement Technique

The FTIR measurement is based on a Michelson interferometer, which produces a time domain measurement from the path difference between two beams derived from a single IR source. The two beams are recombined before interacting with the IR absorbing species, producing a specialized signal called an interferogram [35]. The interferometer uses a beam splitter that transmits about 50% and reflects about 50% of the incoming IR radiation, dividing it into two beams. One beam reflects off a fixed flat mirror and the other reflects off a mirror that is allowed to move. When the two beams are recombined after reflecting off their respective mirrors, they interfere with each other (Figure 3-2). Because the path of the beam reflected from the moving mirror is constantly changing, the data points that make up the interferogram signal contain all the information from the IR source within a very short time domain signal. Once the interferogram interacts with the sample, the resulting signal is converted to the frequency domain through a mathematical technique called a Fourier transform.
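The conversion from interferogram to spectrum can be illustrated with a synthetic signal. The NumPy sketch below is a simplified illustration under stated assumptions. The source is reduced to two hypothetical monochromatic lines, the interferogram is sampled against optical path difference rather than time, and the number of points and the sampling step are chosen only so that the spectral points fall on whole wavenumbers. A real instrument also applies steps such as apodization and phase correction, which are not shown.

```python
import numpy as np

N_POINTS = 8192          # number of interferogram samples (hypothetical)
STEP_CM = 1.0 / 8192.0   # optical path difference per sample in cm (hypothetical)
# Hypothetical source: wavenumber (cm^-1) -> relative intensity.
LINES = {2143.0: 1.0, 667.0: 0.5}

# Each monochromatic line contributes a cosine in optical path difference.
opd = np.arange(N_POINTS) * STEP_CM
interferogram = np.zeros(N_POINTS)
for wavenumber, intensity in LINES.items():
    interferogram += intensity * np.cos(2.0 * np.pi * wavenumber * opd)

# The Fourier transform recovers the spectrum as a function of wavenumber.
spectrum = np.abs(np.fft.rfft(interferogram))
wavenumbers = np.fft.rfftfreq(N_POINTS, d=STEP_CM)  # axis in cm^-1

strongest = float(wavenumbers[np.argmax(spectrum)])
print(f"strongest line recovered at {strongest:.0f} cm^-1")
```

With these settings the spectral point spacing is 1/(N_POINTS x STEP_CM) = 1 cm-1, which shows why a longer maximum path difference is needed for finer spectral resolution.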
Figure 3-2: Beam Path from IR Source Highlighting the use of a Fixed Mirror and Moving Mirror to Construct an Interferogram Time Domain Signal Containing the IR Source Information [35]

The intensity of the power transmitted through the sample, $I$, is compared to the intensity of the IR radiation incident on the sample, $I_0$, and a percent transmittance value, $\%T$, is calculated for each frequency (Equation 3.2). For quantitative analysis of a spectrum, this transmittance value is converted to a unitless absorbance value, $A$ (Equation 3.3). Absorbance values are ideal for quantitative studies because, as shown by the Beer-Lambert Law in Equation 3.4, absorbance is directly proportional to the concentration of the light-absorbing species [31]. In Equation 3.4, $\varepsilon$ is the molar absorptivity, $b$ is the pathlength that the light travels, and $c$ is the concentration of the light-absorbing species. Table 3-1 highlights the relationship between $I/I_0$, $\%T$, and $A$.

$$\%T = \frac{I}{I_0} \times 100\% \tag{3.2}$$

$$A = -\log_{10}(T) = \log_{10}\left(\frac{I_0}{I}\right) \tag{3.3}$$

$$A = \varepsilon b c \quad \text{(Beer-Lambert Law)} \tag{3.4}$$

Table 3-1: Relationship between I/I0, %T, and A

I/I0      %T      A
1         100     0
0.1       10      1
0.01      1       2
0.001     0.1     3
0.0001    0.01    4

It is necessary to collect a background spectrum before taking scans of the sample to remove instrument and atmospheric characteristics. It is important to note that the signal-to-noise (S/N) ratio is determined by both the sample spectrum and the background spectrum, and typically it is necessary to have as many background co-added scans (scans averaged over a given spectral range) as sample co-added scans to obtain the best S/N. The S/N determines the weakest feature that can be confidently identified within a spectrum and is proportional to the square root of the number of scans taken for a sample or background. As an example, to double the S/N value of four co-added scans, it is necessary to collect 16 co-added scans.
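The conversions in Equations 3.2 and 3.3 and the co-added scan example can be reproduced with a short calculation. The Python sketch below is illustrative only; the function names are hypothetical, and the scan estimate assumes that S/N grows with the square root of the number of co-added scans, which is consistent with the 4 to 16 scan example above.

```python
import math


def percent_transmittance(i_over_i0):
    """Equation 3.2: %T from the intensity ratio I/I0."""
    return i_over_i0 * 100.0


def absorbance(i_over_i0):
    """Equation 3.3: A = -log10(I/I0)."""
    return -math.log10(i_over_i0)


def scans_for_snr_gain(current_scans, gain):
    """Scans needed to multiply S/N by `gain`, assuming S/N ~ sqrt(scans)."""
    return math.ceil(current_scans * gain ** 2)


# Reproduce the rows of Table 3-1.
for ratio in (1.0, 0.1, 0.01, 0.001, 0.0001):
    print(f"I/I0 = {ratio:<7g} %T = {percent_transmittance(ratio):<6g} "
          f"A = {absorbance(ratio):.0f}")

print("scans to double S/N of 4 scans:", scans_for_snr_gain(4, 2))
```

Each tenfold drop in transmitted intensity adds one absorbance unit, so at 3 absorbance units only 0.1% of the incident radiation reaches the detector.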
When analyzing samples for strongly absorbing species such as CO2, it is not necessary to take as many scans, but to analyze species that do not absorb significantly in the IR region, it is necessary to have a high S/N value. Figure 3-3 shows a background scan of 16 co-added spectra, taken at a resolution of 0.5 cm-1 over a wavenumber range of 2500-1000 cm-1. This spectrum highlights the typical atmospheric interference of CO2 in the wavenumber range of 2400-2250 cm-1 and H2O (water vapor) in the wavenumber range of 2100-1300 cm-1.

Figure 3-3: Typical Atmospheric Background Spectra Highlighting CO2 and H2O Interference, 16 Co-added Scans with Resolution 0.5 cm-1

The most generally accepted resolution for gas analysis is 0.5 cm-1 because this takes advantage of detailed fine structure in the bands of gaseous molecules and widens the range over which absorption is valid. In addition, the S/N is proportional to the resolution squared [35]. The resulting spectrum is then analyzed by comparison to known databases. The database used in this study is from QASoft and contains molecular absorption spectra for 386 gases, with most of them being in an inert gas, such as N2, to maintain a total pressure of 1 atmosphere. The purpose of the background gas is to establish the total pressure of the system as close to atmospheric pressure as possible while limiting background gas interference. Since N2 is inactive in the IR region, it is a desirable gas for this purpose. This database covers the region of 3700 cm-1 to 500 cm-1, which is the fundamental IR region where rotations and vibrations of molecules give rise to IR absorption.

IR Characteristics of CO/CO2/H2O

Carbon Monoxide (CO)

CO is a heteronuclear, linear, diatomic molecule and thus should have a single characteristic mode of vibration (3N-5 = 3x2-5 = 1) [31, 33]. This vibrational mode, which is along the chemical bond, shown in Figure 3-4, has a characteristic vibrational frequency of 2143 cm-1. The CO spectrum from the QASoft
database is shown in Figure 3-5.

Figure 3-4: CO Fundamental Vibrational Mode, k1 = 2143 cm-1

Figure 3-5: IR Absorbance Spectra for CO from QASoft Database

Carbon Dioxide (CO2)

CO2 is a heteronuclear, linear, tri-atomic molecule that has four characteristic modes of vibration, based on the calculated 4 degrees of freedom (3N-5 = 3x3-5 = 4) [31]. Figure 3-6 shows the CO2 first fundamental vibrational mode, k1, which occurs with symmetrical motion of the oxygen atoms while the carbon atom is fixed. The characteristic vibrational frequency associated with this vibration occurs at 1340 cm-1. For pure CO2, this vibrational mode is inactive in the IR energy region because the molecular dipole moment is not changed by the vibration.

Figure 3-6: CO2 First Fundamental Vibrational Mode, k1 = 1340 cm-1

Figure 3-7 shows the CO2 second fundamental vibrational mode, k2, which results when the carbon atom oscillates perpendicular to the oxygen atoms, with the two vibrational modes related by a rotation of 90° [36]. Because the two vibrational modes are just rotations of the same molecular motion, they have the same fundamental vibrational frequency of 667 cm-1.

Figure 3-7: CO2 Second Fundamental Vibrational Frequency, k2 = 667 cm-1

The third fundamental vibrational mode for CO2, k3, which is shown in Figure 3-8, results when the carbon atom moves relative to the center of mass of the oxygen atoms. The characteristic vibrational frequency associated with this motion is 2350 cm-1.

Figure 3-8: CO2 Third Fundamental Vibrational Frequency, k3 = 2350 cm-1

In regard to FTIR measurements, CO2 has IR bands that absorb so strongly that it is possible to reach a concentration level where the energy transmitted to the detector will not produce a spectrum with features that are distinguishable from the noise level of the instrument.
At this point, above 3 absorbance units (< 0.1% transmission), the Beer-Lambert law and the typical methods for concentration calculations are no longer applicable. The CO2 spectrum from the QASoft database is shown in Figure 3-9; in this 100 ppm spectrum, the maximum absorbance is 0.65 absorbance units at the k2 (667 cm-1) characteristic frequency and 0.40 absorbance units at the k3 (2350 cm-1) characteristic frequency. Because of this strong absorption, the Beer-Lambert law is not applicable for monitoring CO2 concentrations above 460 ppm near the k2 (667 cm-1) characteristic frequency and above 750 ppm near the k3 (2350 cm-1) characteristic frequency. It is standard industry practice to monitor high CO2 concentrations within a working wavenumber window of 2390-2379 cm-1 to avoid significant absorption of the IR source.

Figure 3-9: IR Absorbance Spectra for CO2 from QASoft Database

Water (H2O)

H2O is a heteronuclear, bent, tri-atomic molecule that has three characteristic modes of vibration (3N-6 = 3x3-6 = 3) [31]. Figure 3-10 shows the H2O second fundamental vibrational mode, k2, which results when the hydrogen atoms bend their O-H bonds. The characteristic vibrational frequency associated with this motion is 1595 cm-1. The first and third fundamental vibrational modes are outside the spectral window used for analysis in this research. This second fundamental vibrational mode, along with its associated rotational modes, is a major source of interference for a number of characteristic IR spectra of interest, spanning from approximately 2000 cm-1 to 1300 cm-1. The H2O spectrum from the QASoft database is shown in Figure 3-11.

Figure 3-10: H2O Second Fundamental Vibrational Frequency, k2 = 1595 cm-1

Figure 3-11: IR Absorbance Spectra for H2O from QASoft
Database

Chapter 4: Principal Component Analysis (PCA)

PCA Theoretical Background

PCA is a technique that has been exploited extensively and successfully in applied chemistry for a wide variety of applications, ranging from surface enhanced Raman scattering [37, 38] to X-ray photoelectron spectroscopy (XPS) [39], liquid chromatography [40-42], and FTIR spectroscopy [43-45]. This versatile technique allows a large number of variables in a data set, such as the absorbance values at given wavenumbers in the case of IR spectra, to be reduced to a small set of primary, i.e. principal, components. These principal components are orthogonal and have been shown to retain a significant amount of the original data set variation [46, 47]. A further principal component reduction process allows only the first few uncorrelated and ordered principal components to be used for determining the simplified internal structure of the original data [43].

Typically, the data matrix used in PCA, $[X]_{(n \times p)}$, is a data set consisting of n samples taken at p measurement points. Using the singular value decomposition (SVD) theorem of matrix algebra, $[X]$ can be written as a product of three terms as shown in Equation 4.1, where the product $[U][L]$ is most commonly referred to as the scores matrix, $[S]_{(n \times p)}$, and $[V]_{(p \times p)}$ as the loadings matrix.

$$[X] = [U][L][V]^{T} \tag{4.1}$$

The first step in solving for $[S]$ and $[V]$ of the original data set is to calculate the mean adjusted data matrix, $[X_M]_{(n \times p)}$, by subtracting the column means from each column component in $[X]$. Mean subtraction is necessary to ensure that the first principal component describes the direction of maximum variance instead of corresponding to the mean of the data [48]. From $[X_M]$, a variance-covariance matrix, $[Z]_{(p \times p)}$, can be constructed in which the diagonal elements, $Z_{ij}$ ($i = j$), represent the variance, Var, of the data points at a given wavenumber and the off-diagonal elements, $Z_{ij}$ ($i \neq j$), represent the covariance, Cov, of particular wavenumbers among the samples, as shown in Figure 4-1.

$$[Z] = \begin{bmatrix}
\mathrm{Var}(X_{M11}) & \mathrm{Cov}(X_{M12}) & \mathrm{Cov}(X_{M13}) & \cdots & \mathrm{Cov}(X_{M1p}) \\
\mathrm{Cov}(X_{M21}) & \mathrm{Var}(X_{M22}) & \mathrm{Cov}(X_{M23}) & \cdots & \mathrm{Cov}(X_{M2p}) \\
\mathrm{Cov}(X_{M31}) & \mathrm{Cov}(X_{M32}) & \mathrm{Var}(X_{M33}) & \cdots & \mathrm{Cov}(X_{M3p}) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(X_{Mp1}) & \mathrm{Cov}(X_{Mp2}) & \mathrm{Cov}(X_{Mp3}) & \cdots & \mathrm{Var}(X_{Mpp})
\end{bmatrix}$$

Figure 4-1: Variance-Covariance Matrix, [Z](p x p), Calculated from Mean Centered Data Matrix, [XM](n x p)

The magnitude of each eigenvalue of $[Z]$, $\lambda$, in Equation 4.2 indicates the relative contribution of the corresponding eigenvector to the variance of the original data. In Equation 4.2, $[I]$ is the identity matrix. The eigenvalues are arranged in order from largest to smallest, and the measure of reconstruction accuracy is provided by the relative contribution of the retained eigenvalues to the sum of squares of the eigenvalues, as shown in Equation 4.3, where p* is the number of retained eigenvalues.

$$\left| [Z] - \lambda [I] \right| = 0 \tag{4.2}$$

$$\text{reconstruction accuracy} = \frac{\sum_{k=1}^{p^*} \lambda_k^2}{\sum_{k=1}^{p} \lambda_k^2} \tag{4.3}$$

For each of the ordered eigenvalues, the corresponding eigenvector, which forms a column of the loadings matrix, $[V]_{(p \times p)}$, can be found by solving Equation 4.4 for each of the p eigenvalues from Equation 4.2. In Equation 4.4, $V_{ij}$ is the jth element of the eigenvector associated with the ith eigenvalue. These eigenvectors are the coefficients required to transform the original variables into the principal component variable space.

$$\begin{bmatrix}
Z_{11} - \lambda_i & Z_{12} & \cdots & Z_{1p} \\
Z_{21} & Z_{22} - \lambda_i & \cdots & Z_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
Z_{p1} & Z_{p2} & \cdots & Z_{pp} - \lambda_i
\end{bmatrix}
\begin{bmatrix} V_{i1} \\ V_{i2} \\ \vdots \\ V_{ip} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{4.4}$$

From these eigenvectors, the individual elements of the new variable of $[S]$ are calculated from Equation 4.5.
$$[S]_{(n \times p)} = [X_M]_{(n \times p)} [V]_{(p \times p)} \tag{4.5}$$

The columns of $[V]$ are then arranged in order of their eigenvalues, from largest to smallest, and a cut-off for the number of retained or significant eigenvalues, p*, is made when the cumulative variance explained by the retained eigenvalues accounts for the original data to within the experimental error associated with the measurement process, which is typically 5% for FTIR experimental data. The first p* columns of $[V]$ and $[S]$ are used as a reduced loadings matrix, $[V^*]_{(p \times p^*)}$, and a reduced scores matrix, $[S^*]_{(n \times p^*)}$, respectively, for further calculations. The error matrix, $[E]$, associated with the use of the reduced loadings and reduced scores matrices can be calculated from Equation 4.6, where $[X^*]$ is the reconstructed data matrix defined in Equation 4.7. The last term in Equation 4.7, $[\bar{X}_M]$, restores the column means that were subtracted during mean adjustment.

$$[X]_{(n \times p)} = [X^*]_{(n \times p)} + [E]_{(n \times p)} \tag{4.6}$$

$$[X^*]_{(n \times p)} = [S^*]_{(n \times p^*)} [V^*]^{T}_{(p^* \times p)} + [\bar{X}_M]_{(n \times p)} \tag{4.7}$$

This residual error is typically very small when the reconstruction accuracy of the original data is high (Equation 4.3).

Principal Component Regression (PCR)

PCR is a mathematical technique that determines component concentrations of a prediction data set based on multivariate regression of a calibration data set. In most multivariate regression techniques, though, there are correlations within the set of variables on which the measured response is dependent, and these correlations add redundancy to the regression model that can cause numerical instability in estimating regression coefficients [48]. PCR is employed when the responses of one variable are dependent on a set of other variables as shown in Equation 4.8 for the mean adjusted values, where $[b]$ is the vector of estimates of regression coefficients to be determined [49]. The advantage of PCR over other multivariate regression models is that, through PCA, the number of significant components has been determined and the analyzed variables are orthogonal and therefore uncorrelated.
$$
[Y_M] = a + [X_M][b] \tag{4.8}
$$

The vector [b] is defined as the product of the reduced loadings matrix, [V*], and the y-loadings term, [q], as shown in Equation 4.9. The [q] term is defined in Equation 4.10, where [D](p x p) is a diagonal matrix in which each diagonal element (i = j) is equal to the inverse of the ith eigenvalue.

$$
[b] = [V^*][q] \tag{4.9}
$$

$$
[q] = [D][S^*]^{T}[Y] \tag{4.10}
$$

With the values for [b] calculated, the constant a from Equation 4.8 can be found using Equation 4.11, where YM is the mean value of the dependent variable and XM is the mean value of the independent variable at a given measurement.

$$
a = Y_M - X_M [b] \tag{4.11}
$$

Once the regression analysis has been completed with a calibration data set, its results can be applied to a prediction data set. In each case, root mean square error (RMSE) values can be computed to provide a measure of how well the PCR technique performs. RMSE is defined in Equation 4.12.

$$
\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( \mathrm{predicted}_i - \mathrm{actual}_i \right)^2}{n}} \tag{4.12}
$$

Proportionality Constant Calculation (PCC)

For PCC, the amount of a component in a mixture, Xi, is assumed to depend linearly on the total integrated intensity of its IR absorbance band, Ii, through a proportionality constant, ki, as shown in Equation 4.13.

$$
X_i = k_i I_i \tag{4.13}
$$

A mixture of r components with their respective concentrations is defined in Equation 4.14, where XT is the total amount (i.e. total volume) of the mixture analyzed.

$$
X_T = X_1 + X_2 + \cdots + X_r \tag{4.14}
$$

Based on the linearity assumption, one can determine the proportionality constant for each of the individual components in a calibration data set in which the amount is known and the peaks associated with the particular component can be isolated from the peaks associated with the mixture. This isolation of peaks associated with a particular component is an ideal task for PCA, which reduces mixtures down to principal components.
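The calibration step for PCC (Equation 4.13) and the error measure of Equation 4.12 can be sketched as follows. This is a minimal illustration, not the dissertation's MATLAB program: the calibration amounts and band areas are hypothetical, and fitting the band area against the known amount with a line through the origin (so that the constant is the inverse of the slope) is an assumption about how the line of best fit is drawn after baseline correction.

```python
import math


def fit_proportionality_constant(amounts, intensities):
    """Fit I = slope * X through the origin; return k = 1 / slope (X = k * I, Eq. 4.13)."""
    slope = sum(x * i for x, i in zip(amounts, intensities)) / sum(x * x for x in amounts)
    return 1.0 / slope


def rmse(predicted, actual):
    """Root mean square error as defined in Equation 4.12."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Hypothetical calibration data (illustrative only, not taken from the dissertation).
known_amounts = [1.0, 2.0, 3.0, 4.0]   # known component amounts
band_areas = [2.0, 4.1, 5.9, 8.0]      # integrated intensities of the isolated band

k = fit_proportionality_constant(known_amounts, band_areas)
predicted_amounts = [k * area for area in band_areas]
rmse_calibration = rmse(predicted_amounts, known_amounts)

print(f"k = {k:.4f}, RMSE calibration = {rmse_calibration:.4f}")
```

The same `rmse` helper applies unchanged to a prediction data set once its amounts have been estimated as `k * area`.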
As with PCR, calculation of RMSE using Equation 4.12 for both the calibration and prediction data sets can provide a measure of how well the PCC technique performs. With the proportionality constants known for each component, a relationship to calculate the component concentrations of the prediction data set can be derived. In cases where the overall volume is not constant, the PCC technique can be expanded to perform the analysis on volume fractions and integrated area fractions instead of absolute component amounts, as shown in Equation 4.15 for a 2-component system.

$$
\% X_1 = \frac{X_1}{X_1 + X_2} = \frac{k_1 I_1}{k_1 I_1 + k_2 I_2} \tag{4.15}
$$

Dividing the numerator and denominator by k1 simplifies Equation 4.15 to Equation 4.16, in which only the ratio of the proportionality constants for the two components, calculated from a calibration data set, is needed to determine the volume fraction of a component in prediction data set mixtures where the total volume of the system cannot be kept constant.

$$
\% X_1 = \frac{X_1}{X_1 + X_2} = \frac{I_1}{I_1 + \left( \dfrac{k_2}{k_1} \right) I_2} \tag{4.16}
$$

Application to FTIR Spectroscopy Data

In the case of FTIR spectroscopy data for CH2O and C3H4O, a simplified data set that has overlap between the two components can be constructed to highlight the mathematics involved within PCA, PCR, and PCC. Figure 4-2 shows the complete spectra of CH2O and C3H4O. The simplified spectra, which consist of absorbance values at seven wavenumbers, are listed in Table 4-1 and displayed graphically in Figure 4-3.
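The PCA steps of Equations 4.2 through 4.4 can be sketched for this simplified data set using the pure spectra of Table 4-1 and the calibration compositions of Table 4-2. The sketch rests on several assumptions that are not stated in this form in the text: each simulated spectrum is taken to be a plain linear combination of the two pure spectra, the covariance is normalized by n - 1, the eigenvalues are found by power iteration with deflation rather than by the dissertation's MATLAB program, and the variance explained is taken as each eigenvalue divided by the sum of all eigenvalues. All function names are illustrative.

```python
import math

# Simplified pure spectra from Table 4-1 (absorbance at seven wavenumbers).
CH2O = [0.551, 0.448, 0.277, 0.437, 0.533, 0.144, 0.369]
C3H4O = [0.000, 0.009, 0.078, 0.079, 0.042, 0.072, 0.055]

# Calibration compositions from Table 4-2 (x100 ppm): (CH2O, C3H4O).
COMPOSITIONS = [
    (0.00, 10.00), (0.33, 6.67), (0.45, 4.75), (0.50, 7.75), (0.50, 8.00),
    (0.67, 3.33), (0.75, 5.00), (0.80, 0.50), (0.95, 9.50), (1.00, 0.00),
]


def simulate(compositions, pure_a, pure_b):
    """Assumed mixing model: spectrum = c_a * pure_a + c_b * pure_b."""
    return [[ca * a + cb * b for a, b in zip(pure_a, pure_b)] for ca, cb in compositions]


def mean_center(X):
    """Subtract the column (wavenumber) mean from every sample."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def covariance(XM):
    """Variance-covariance matrix [Z](p x p) of mean centered data, using n - 1."""
    n, p = len(XM), len(XM[0])
    return [[sum(r[i] * r[j] for r in XM) / (n - 1) for j in range(p)] for i in range(p)]


def leading_eigenpair(Z, iterations=500):
    """Power iteration for the largest eigenvalue and its eigenvector."""
    p = len(Z)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(Z[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(Z[i][j] * v[j] for j in range(p)) for i in range(p))
    return lam, v


def top_eigenvalues(Z, k):
    """Return the k largest eigenvalues, deflating Z after each one is found."""
    Z = [row[:] for row in Z]
    p = len(Z)
    found = []
    for _ in range(k):
        lam, v = leading_eigenpair(Z)
        found.append(lam)
        for i in range(p):
            for j in range(p):
                Z[i][j] -= lam * v[i] * v[j]
    return found


Z = covariance(mean_center(simulate(COMPOSITIONS, CH2O, C3H4O)))
total_variance = sum(Z[i][i] for i in range(len(Z)))  # trace = sum of all eigenvalues
lam1, lam2 = top_eigenvalues(Z, 2)
frac1, frac2 = lam1 / total_variance, lam2 / total_variance

print(f"PC1 explains {100 * frac1:.1f}%, PC2 explains {100 * frac2:.1f}% of the variance")
```

Because every simulated spectrum is built from only two pure spectra, the mean centered data have rank two, so under these assumptions the two leading principal components account for essentially all of the variance.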
Figure 4-2: Formaldehyde (CH2O) and Acrolein (C3H4O) Complete Pure Spectra

Table 4-1: CH2O/C3H4O Simplified Pure Spectra for Illustration of PCA Application to FTIR Spectroscopy Data

| Wavenumber (cm-1) | 2897 | 2863 | 2813 | 2810 | 2802 | 2791 | 2778 |
|---|---|---|---|---|---|---|---|
| CH2O Absorbance | 0.551 | 0.448 | 0.277 | 0.437 | 0.533 | 0.144 | 0.369 |
| C3H4O Absorbance | 0.000 | 0.009 | 0.078 | 0.079 | 0.042 | 0.072 | 0.055 |

Figure 4-3: CH2O/C3H4O Simplified Pure Spectra for Illustration of PCA Application to FTIR Spectroscopy Data

From these pure spectra, a set of 10 samples (n = 10) at 7 different wavenumbers (p = 7) was simulated with the concentrations given in Table 4-2. Five of the ten spectra used to produce the calibration data set [XC](10 x 7) are shown in Figure 4-4. A set of five samples, with concentrations given in Table 4-3, was simulated as shown in Figure 4-5 to produce a prediction data set [XP](5 x 7) for PCR. In the case of the PCC, the calibration and prediction data sets were combined into a single data set [X](15 x 7), and PCA was performed simultaneously on all 15 samples to extract the pure components in each simulated spectrum. The mean centered data matrix is shown in Figure 4-6.

Table 4-2: CH2O/C3H4O Spectra Compositions for Calibration Data Set, [XC](10 x 7), Ordered from Lowest to Highest Amount of CH2O in the Gas Mixture

| Sample # | Amount of CH2O (x100 ppm) | Amount of C3H4O (x100 ppm) |
|---|---|---|
| 1 | 0.00 | 10.00 |
| 2 | 0.33 | 6.67 |
| 3 | 0.45 | 4.75 |
| 4 | 0.50 | 7.75 |
| 5 | 0.50 | 8.00 |
| 6 | 0.67 | 3.33 |
| 7 | 0.75 | 5.00 |
| 8 | 0.80 | 0.50 |
| 9 | 0.95 | 9.50 |
| 10 | 1.00 | 0.00 |

Figure 4-4: CH2O/C3H4O Calibration Data Set, [XC](10 x 7); Note -
Only 5 of 10 Calibration Spectra Shown

Table 4-3: CH2O/C3H4O Spectra Compositions for Prediction Data Set, [XP](5 x 7), Ordered from Lowest to Highest Amount of CH2O in the Gas Mixture

| Sample # | Amount of CH2O (x100 ppm) | Amount of C3H4O (x100 ppm) |
|---|---|---|
| 11 | 0.17 | 4.25 |
| 12 | 0.25 | 8.75 |
| 13 | 0.40 | 9.75 |
| 14 | 0.60 | 6.00 |
| 15 | 0.70 | 1.25 |

Figure 4-5: CH2O/C3H4O Prediction Data Set, [XP](5 x 7)

Figure 4-6: CH2O/C3H4O Calibration Data Set, Mean Centered, [XM](10 x 7); Note - Only 5 of 10 Calibration Spectra Shown

From the variance-covariance matrix shown in Figure 4-7, solving Equation 4.2 yields the eigenvalues, ordered from largest to smallest, shown in Figure 4-8. As shown in the plot of eigenvalues as a function of principal components, Figure 4-9, the number of significant eigenvalues obtained from PCA is two. The first principal component explains 77.1% of the total calibration data set variance and the second principal component explains 22.9% of the total calibration data set variance. With these two principal components, 100.0% of the original data variance can be explained. Figure 4-9 is commonly referred to as a scree plot, which shows the eigenvalue associated with each principal component.

Figure 4-7: CH2O/C3H4O Variance-Covariance Matrix, [Z](7 x 7)
$$
\begin{bmatrix}
1.91 \times 10^{-1} \\
5.67 \times 10^{-2} \\
1.31 \times 10^{-9} \\
6.36 \times 10^{-10} \\
6.70 \times 10^{-11} \\
2.72 \times 10^{-11} \\
0
\end{bmatrix}
$$

Figure 4-8: CH2O/C3H4O Eigenvalues from Solving Equation 4.2

Figure 4-9: SCREE Plot Indicating Eigenvalues for Each Principal Component; Principal Components 1 and 2 Explain 77.1% and 22.9% of the Total Calibration Data Set Variance

Next, Equation 4.4 can be solved to find the eigenvectors corresponding to the calculated eigenvalues. This yields the loadings matrix, [V], shown in Figure 4-10. With [V], Equation 4.5 can be solved to determine the scores matrix, [S], which is shown in Figure 4-11. Since the number of significant principal components was calculated to be two, only the first two columns of [V] and [S] are needed, as [V*] and [S*], to represent 100% of the variance found in the calibration data set. The residual error from using two principal components is found by solving Equation 4.6 for [E], which for this simulated data set is, for all practical purposes, zero.

Figure 4-10: CH2O/C3H4O Loadings Matrix, [V](7 x 7)

$$
\begin{bmatrix}
-0.2720 & -0.4462 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.6159 & 0.0894 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.2646 & -0.0440 & 0 & 0.0001 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.0068 & -0.2146 & -0.0001 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.8184 & 0.3871 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.2135 & -0.1432 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.2806 & 0.0259 & 0.0001 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.0197 & 0.1274 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.3017 & -0.0394 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.5595 & 0.2576 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$

Figure 4-11: CH2O/C3H4O Scores Matrix, [S](10 x 10)

With [V*] and [S*], PCR can be conducted and the vector of estimates of regression coefficients, [b], shown in Figure 4-12, can be found by solving Equation 4.9.
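How the regression vector is used for prediction (Equation 4.8 applied to a raw spectrum) can be sketched with the values reported for this example. The coefficients are those of Figure 4-12 and the intercept is the value reported in the text; the assumption here is that the coefficients are listed in the wavenumber order of Table 4-1 and that a simulated spectrum is a linear combination of the two pure spectra. The check that [b] responds with roughly one to the pure CH2O spectrum and roughly zero to the pure C3H4O spectrum supports that ordering. Function names are illustrative.

```python
# Regression coefficients from Figure 4-12 (assumed to follow the Table 4-1 wavenumber order).
B = [0.8606, 0.6291, -0.1839, 0.0588, 0.5008, -0.3443, 0.1418]
INTERCEPT = -4.2e-4  # intercept a reported for this example

# Simplified pure spectra from Table 4-1.
CH2O = [0.551, 0.448, 0.277, 0.437, 0.533, 0.144, 0.369]
C3H4O = [0.000, 0.009, 0.078, 0.079, 0.042, 0.072, 0.055]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def mixture(c_ch2o, c_c3h4o):
    """Assumed mixing model: linear combination of the two pure spectra."""
    return [c_ch2o * a + c_c3h4o * b for a, b in zip(CH2O, C3H4O)]


def predict_ch2o(spectrum):
    """Predicted CH2O amount (x100 ppm): Y = a + X[b]."""
    return INTERCEPT + dot(spectrum, B)


response_ch2o = dot(CH2O, B)    # change in prediction per unit of CH2O
response_c3h4o = dot(C3H4O, B)  # change in prediction per unit of C3H4O

# Prediction sample 14 in Table 4-3: 0.60 CH2O and 6.00 C3H4O (x100 ppm).
predicted_sample_14 = predict_ch2o(mixture(0.60, 6.00))

print(f"response to CH2O = {response_ch2o:.4f}, response to C3H4O = {response_c3h4o:.5f}")
print(f"predicted CH2O for sample 14 = {predicted_sample_14:.4f} (x100 ppm)")
```

The near-zero response to C3H4O is what allows the CH2O concentration to be read out despite the band overlap between the two components.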
The intercept, a, can be found by solving Equation 4.11 and for this example is calculated to be -4.2 x 10^-4. Applying this analysis, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figure 4-13. With the PCR technique, the RMSE (Equation 4.12) for the calibration data set (RMSEC) and for the prediction data set (RMSEP) are both found to be 1.1 x 10^-4 ppm.

$$
[b] = \begin{bmatrix} 0.8606 \\ 0.6291 \\ -0.1839 \\ 0.0588 \\ 0.5008 \\ -0.3443 \\ 0.1418 \end{bmatrix}
$$

Figure 4-12: CH2O/C3H4O Estimates of Regression Coefficients from Calibration Data Set, [b](7 x 1)

Figure 4-13: Principal Component Regression (PCR) for CH2O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 1.1 x 10^-4 ppm, RMSE Prediction = 1.1 x 10^-4 ppm

PCA on the entire set of collected spectra, both calibration and prediction sets, is capable of isolating the peaks associated with each particular component, as shown in Figures 4-14 and 4-15. Since the principal component loadings, V-1 and V-2, are abstract representations of information within the original data set, it is acceptable, and in most cases unavoidable, for them to have negative elements because of the orthogonality requirement of PCA. The benefit of Figures 4-14 and 4-15 is that V-1 and V-2 are easily identifiable as corresponding to the pure spectra of C3H4O and CH2O, respectively.

Figure 4-14: PCA Separated CH2O Spectra in CH2O/C3H4O Gas Mixtures

Figure 4-15: PCA Separated C3H4O Spectra in CH2O/C3H4O Gas Mixtures

With the components separated, the contribution of each component to the sample mixtures can be determined, as shown in Figures 4-16 and 4-17 for the C3H4O calibration and prediction data sets, respectively, and in Figures 4-18 and 4-19 for the CH2O calibration and prediction data sets, respectively.

Figure 4-16: PCC C3H4O Calibration Data Set; Note -
Only 5 of 10 Calibration Spectra Shown

Figure 4-17: PCC C3H4O Prediction Data Set

Figure 4-18: PCC CH2O Calibration Data Set; Note - Only 5 of 10 Calibration Spectra Shown

Figure 4-19: PCC CH2O Prediction Data Set

As shown in Figure 4-20 for CH2O, a baseline correction is necessary to remove the noise introduced into the data set by the mathematics of the PCA technique. With the baseline correction made (Figure 4-21), and based on the linearity assumption, the proportionality constant can be determined from Equation 4.13 for each of the individual components in the calibration data set from the isolated peaks. This calculated proportionality constant, equal to the inverse of the slope of the line of best fit shown in Figure 4-21, can then be used to calculate the concentrations of the mixtures in the prediction data set, which are shown in Figure 4-22.

Figure 4-20: PCC CH2O Calibration Data Set with No Baseline Correction

Figure 4-21: PCC Calibration Data Set with Baseline Correction

Figure 4-22: PCC CH2O Prediction Data Set; RMSE Calibration = 2.6 x 10^-2 ppm, RMSE Prediction = 2.6 x 10^-2 ppm

The comparison of errors associated with the PCR and PCC techniques shown in Table 4-4 indicates that the PCC technique could be a viable alternative quantitative analysis method to the PCR technique. The PCC technique in most cases provides a more tangible solution method, one that relates the total area under the absorbance-versus-wavenumber curve proportionally to the volume of the component in the mixture, while the PCR technique is solely a mathematical multivariate regression technique. In addition, the PCC technique can be further expanded, as will be shown in Chapter 5, to analyze data sets that do not contain a constant volume of the mixtures across the entire sample space, where it can outperform the PCR technique in terms of RMSEC and RMSEP.
Table 4-4: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques

| Analysis Technique | RMSE Calibration (ppm) | RMSE Prediction (ppm) |
|---|---|---|
| PCR | 1.1 x 10^-4 | 1.1 x 10^-4 |
| PCC | 2.6 x 10^-2 | 2.6 x 10^-2 |

Chapter 5: PCA Application to FTIR Spectroscopy Data of Vapor Phase Hydrogen Peroxide (VPHP) Aircraft Cabin Decontamination Events

Discussion of Analyzed Data Sets

Data set 1 comprised experimentally obtained IR spectra from H2O2 aqueous solution mixtures. Eight concentrations for the calibration data set were prepared by diluting a 35 wt.% H2O2 solution with de-ionized water at approximately 5 wt.% intervals. The H2O2 aqueous solution mixture samples for calibration in data set 1 were made from the solution in the Steris 1000ED Bio-decontamination Unit (Mentor, OH, USA), which uses VAPROX® (35 wt.% H2O2) as the sterilant. A single sample, for the prediction data set, was taken from an experimental VPHP run where conditions were known to produce significant and observable condensation of the H2O2 within a sample chamber. The IR spectra for data set 1 were obtained using the FTIR spectrometer over a wavenumber scan range from 2000 cm-1 to 1200 cm-1 with a spectral resolution of 4 cm-1.

Data set 2 comprised a calibration data set with 15 spectra of H2O2 concentrations in aqueous solutions, prepared by diluting a 70 wt.% H2O2 solution offered by Armeka Canada Inc. with de-ionized water at approximately 5 wt.% intervals, thus giving a calibration set that spans 0-70 wt.% H2O2 in solution. The prediction data set for the second set of H2O2 experiments consisted of 5 samples taken from different experimental VPHP runs using the Steris 1000ED Bio-decontamination Unit that produced condensation of the H2O2 within a sample chamber. The IR spectra for data set 2 were obtained using the FTIR spectrometer over a wavenumber scan range from 1800 cm-1 to 1200 cm-1 with a spectral resolution of 4 cm-1.
The purpose of data set 2 was to confirm the findings from data set 1. Data set 1 consisted of a restricted range for the calibration data set because the Steris 1000ED Bio-decontamination Unit operates with 35 wt.% H2O2 in aqueous solution. Use of this 35 wt.% H2O2 solution, however, is capable of producing condensates that are much higher in H2O2 concentration, depending on the particular operating conditions, as discussed in [24]. Data set 2 contains a more complete calibration data set of a kind that may not always be available in engineering applications, and shows that the performance of the PCA technique with data set 1 is sufficient when a complete calibration data set is not feasible or possible to obtain. Sample collection for prediction data sets 1 and 2 was performed by Mobbassar Hassan Sk (Ph.D. Graduate Student, Materials Engineering, Auburn University).

To confirm the PCR and PCC techniques for prediction in each of the data sets, a titration process was performed (by Mobbassar Hassan Sk) on the prediction data sets for the H2O2 aqueous solutions [50]. First, a 5 N aqueous H2SO4 solution, where N denotes normality (the molarity multiplied by the number of protons (H+) in a molecule of the acid), was made using 36 N H2SO4 (Fisher Scientific, Pittsburgh, PA, USA, lot no. 094134), and a 0.05 M (moles/L) KMnO4 solution was made from solid KMnO4 (Fisher Scientific). Next, 50 mL of the 5 N aqueous H2SO4 solution was placed in a flask, and an exactly weighed sample of liquid H2O2 was added to it, followed by thorough mixing. Then, the 0.05 M KMnO4 solution was placed in a burette and added to the solution mixture dropwise with constant stirring. The end of the titration process was easily identifiable by the permanent change of color of the solution mixture to pale pink. Once this point was reached, the volume (measured in mL) of KMnO4 solution consumed in the titration process was noted. The weight percentage of H2O2 in the solution was calculated using Equation 5.1.
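The arithmetic behind the titration calculation of Equation 5.1 can be sketched as follows. The sketch assumes the usual acidic permanganate stoichiometry (2 mol KMnO4 oxidize 5 mol H2O2), a molar mass of 34.014 g/mol for H2O2, and a sample weight expressed in grams; none of these is stated explicitly in the text, but together they reproduce the numerical factor 0.425175 for a 0.05 M KMnO4 solution. The example volume and sample weight are hypothetical.

```python
MOLAR_MASS_H2O2 = 34.014   # g/mol (assumed value)
H2O2_PER_KMNO4 = 5.0 / 2.0  # mol H2O2 oxidized per mol KMnO4 in acidic solution


def wt_percent_h2o2(ml_consumed, sample_weight_g, kmno4_molarity=0.05):
    """Weight percent of H2O2 in the weighed sample from the KMnO4 volume consumed."""
    moles_kmno4 = kmno4_molarity * ml_consumed / 1000.0
    grams_h2o2 = moles_kmno4 * H2O2_PER_KMNO4 * MOLAR_MASS_H2O2
    return 100.0 * grams_h2o2 / sample_weight_g


# Factor multiplying (mL consumed / sample weight) for 0.05 M KMnO4.
factor = wt_percent_h2o2(ml_consumed=1.0, sample_weight_g=1.0)

# Hypothetical titration: 30.0 mL consumed for a 0.200 g sample.
example_wt_percent = wt_percent_h2o2(ml_consumed=30.0, sample_weight_g=0.200)

print(f"factor = {factor:.6f}, example = {example_wt_percent:.2f} wt.% H2O2")
```

Keeping the molarity as a parameter makes explicit that the constant in Equation 5.1 is specific to the 0.05 M titrant.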
$$
\text{wt.\% H}_2\text{O}_2 \;(\text{KMnO}_4\ \text{strength} = 0.05\ \text{M}) = \frac{[\text{mL consumed}] \times 0.425175}{\text{sample H}_2\text{O}_2\ \text{solution wt}} \tag{5.1}
$$

PCA Results and Discussions

Data Set 1

From the experimental FTIR spectral analysis of the H2O2 in aqueous solution mixture, a set of 8 samples (n = 8) at 801 different wavenumbers (p = 801) was collected, as shown in Figure 5-1, to produce a calibration data set [XC](8 x 801). A single sample was collected, as shown in Figure 5-2, to produce a prediction data set [XP](1 x 801) for PCR. In the case of the PCC, the calibration and prediction data sets were combined into a single data set [X](9 x 801), and PCA was performed simultaneously on all 9 samples to extract the pure components in each spectrum.

Figure 5-1: H2O2 in Aqueous Solution Calibration Data Set 1, [XC](8 x 801); Note - Only 4 of 8 Calibration Spectra Shown, 0%, 10%, 20%, 30% H2O2

Figure 5-2: H2O2 in Aqueous Solution Prediction Data Set 1, [XP](1 x 801); 63.7% H2O2

As shown in the plot of eigenvalues as a function of principal components (Figure 5-3), the number of significant eigenvalues obtained from PCA is two. The first principal component explains 67.5% of the total calibration data set variance and the second principal component explains 28.5% of the total calibration data set variance. With these two principal components, 96.0% of the original data variance can be explained.

Figure 5-3: SCREE Plot Indicating Eigenvalues for Each Principal Component for H2O2 in Aqueous Solution Data Set 1; Principal Components 1 and 2 Explain 67.5% and 28.5% of the Total Calibration Data Set Variance

Next, Equation 4.4 can be solved to find the eigenvectors corresponding to the calculated eigenvalues, which yields the loadings matrix, [V]. With [V], Equation 4.5 can be solved to determine the scores matrix, [S].
Since the number of significant principal components was calculated to be two, only the first two columns of [V] and [S] are needed, as [V*] and [S*], to represent 96.0% of the variance found in the calibration data set. Figure 5-4 shows the plot of the reduced loadings matrix [V*] for the first two principal components. V-1 represents the variable in the original data set contributing the most variance within the spectra, the H2O component, and V-2 represents the variable contributing the second most variance, the H2O2 component. V-2 produces two partitions, with the positive loadings representing the bands for H2O2 and the negative loadings representing the bands for H2O.

Figure 5-4: H2O2 in Aqueous Solution Mixtures - Reduced Principal Component Loadings, [V*]; V-1 represents the variable in the original data set contributing the most variance within the spectra, the H2O component; V-2 represents the variable in the original data set contributing the second most variance, the H2O2 component

With [V*] and [S*], PCR can be conducted and the vector of estimates of regression coefficients for wt.% H2O2, [bH2O2], shown graphically by wavenumber in Figure 5-5, can be found by solving Equation 4.9. The intercept, aH2O2, can be found by solving Equation 4.11 and for H2O2 calibration data set 1 is calculated to be 12.7. Applying this analysis, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figure 5-6. With the PCR technique, the RMSE (Equation 4.12) for calibration data set 1 is found to be 2.1 wt.% of H2O2, while the RMSE for prediction data set 1 is found to be 12.0 wt.% of H2O2.

Figure 5-5: H2O2 in Aqueous Solution Mixtures - Estimates of Regression Coefficients for wt.% of H2O2 from Calibration Data Set 1 as Function of Wavenumber, [bH2O2](801 x 1)

Figure 5-6: Principal Component Regression (PCR) -
H2O2 Concentrations for H2O2 in Aqueous Solution Mixtures; RMSE Calibration = 2.1 wt.%, RMSE Prediction = 12.0 wt.%

In some cases, the PCR analysis for H2O2 can be improved by utilizing more of the identified principal components. This is not always possible, because the representation of redundant data can cause numerical instability within the regression analysis. To show that for this experimental data the PCR technique is not dramatically improved, and can actually become worse, with the use of more principal components, Figure 5-7 plots both RMSEC and RMSEP as a function of the number of principal components used to represent the original data set. This analysis indicates that no significant improvement in RMSEC can be made for the H2O2 data set by using more principal components than were calculated to be necessary to explain 96.0% of the original data variance.

Figure 5-7: H2O2 RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set for H2O2 in Aqueous Solution Mixtures Data Set 1

PCA on the entire set of collected spectra, both calibration and prediction sets, is capable of isolating the peaks associated with each particular component. Since the principal component loadings, V-1 and V-2, are abstract representations of information within the original data set, it is acceptable, and in most cases unavoidable, for them to have negative elements because of the orthogonality requirement of PCA. With the components separated, the contribution of each component to the sample mixtures can be determined, as shown in Figures 5-8 and 5-9 for the H2O2 calibration and prediction data sets, respectively, and in Figures 5-10 and 5-11 for the H2O calibration and prediction data sets, respectively.

Figure 5-8: PCC H2O2 Calibration Data Set 1; Note -
Only 3 of 8 Calibration Spectra Shown, 0%, 20%, 30% H2O2

Figure 5-9: PCC H2O2 Prediction Data Set 1; 63.7% H2O2

Figure 5-10: PCC H2O Calibration Data Set 1; Note - Only 2 of 8 Calibration Spectra Shown, 70%, 100% H2O

Figure 5-11: PCC H2O Prediction Data Set 1; 36.3% H2O

As is evident in Figures 5-12 and 5-13 for the H2O2 and H2O calibration data sets, a baseline correction is necessary to remove the noise introduced into the data set by the mathematics of the PCA technique, as well as the noise from the experiment. In addition, it is clear that the samples contain variable total amounts of solution, and it will be necessary to correct for this by using the relationship in Equation 4.14 for the prediction data set.

Figure 5-12: PCC H2O2 Calibration Data Set 1 with No Baseline Correction

Figure 5-13: PCC H2O Calibration Data Set 1 with No Baseline Correction

With the baseline correction made (Figures 5-14 and 5-15), and based on the linearity assumption, the proportionality constant can be determined from Equation 4.13 for each of the individual components in the calibration data set from the isolated peaks. For the baseline correction, it was found that subtraction of the calculated baseline value from the fourth data point (15 wt.% H2O2) in the calibration data set produces a negative value. Because of this, the proportionality constant calculation uses a calibration data set with seven samples. This may indicate that some error occurred during the FTIR sampling procedure for the 15 wt.% H2O2 sample. The calculated proportionality constants (kH2O2 = 0.19, kH2O = 0.67), equal to the inverse of the slope of the lines of best fit shown in Figures 5-14 and 5-15, can then be used to calculate the concentration of the mixtures in the PCC model, as shown in Figure 5-16.
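How two proportionality constants turn a pair of integrated band intensities into a fraction when the total amount varies (Equations 4.15 and 4.16) can be sketched as follows. The constants are the values reported above for data set 1; the integrated intensities are hypothetical, and treating the resulting fraction as the H2O2 weight fraction of the aqueous solution is an assumption made for illustration.

```python
K_H2O2 = 0.19  # proportionality constant reported for H2O2 in data set 1
K_H2O = 0.67   # proportionality constant reported for H2O in data set 1


def fraction_from_constants(i1, i2, k1, k2):
    """Equation 4.15: fraction of component 1 from both proportionality constants."""
    return (k1 * i1) / (k1 * i1 + k2 * i2)


def fraction_from_ratio(i1, i2, k2_over_k1):
    """Equation 4.16: same fraction using only the ratio k2 / k1."""
    return i1 / (i1 + k2_over_k1 * i2)


# Hypothetical baseline-corrected integrated intensities (illustrative only).
area_h2o2 = 2.0
area_h2o = 1.0

frac_eq_4_15 = fraction_from_constants(area_h2o2, area_h2o, K_H2O2, K_H2O)
frac_eq_4_16 = fraction_from_ratio(area_h2o2, area_h2o, K_H2O / K_H2O2)

print(f"H2O2 fraction: {100 * frac_eq_4_15:.1f}% (Eq. 4.15), {100 * frac_eq_4_16:.1f}% (Eq. 4.16)")
```

Both forms give the same fraction; the second shows that only the ratio of the two constants has to be known when the total amount cannot be held constant.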
This model requires the relationship presented in Equation 4.14 to correct for the total amount of solution varying throughout both the calibration and prediction sample spaces. With the PCC technique, the RMSE (Equation 4.12) for calibration data set 1 is found to be 4.1 wt.%, while the RMSE for prediction data set 1 is found to be 5.2 wt.%.

Figure 5-14: PCC H2O2 Calibration Data Set 1 with Baseline Correction

Figure 5-15: PCC H2O Calibration Data Set 1 with Baseline Correction

Figure 5-16: PCC Model for wt.% of H2O2 in Aqueous Solution with Variable Total Amount in Data Set 1; kH2O2 = 0.19, kH2O = 0.67; RMSE Calibration = 4.1 wt.%, RMSE Prediction = 5.2 wt.%

As shown in Table 5-1, comparison of the PCR and PCC techniques for analysis of data sets that do not contain a constant volume of the mixtures across the entire sample space indicates that the PCC technique performs significantly better in terms of prediction of the unknown concentration. This finding will be further explored with an additional data set, presented next, that contains 15 spectra of H2O2 concentrations in aqueous solutions for the calibration data set and 5 spectra for the prediction data set.

Table 5-1: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; H2O2 in Aqueous Solution Data Set 1

| Analysis Technique | RMSE Calibration (wt.%) | RMSE Prediction (wt.%) |
|---|---|---|
| PCR | 2.1 | 12.0 |
| PCC | 4.1 | 5.2 |

Data Set 2

From the experimental FTIR spectral analysis of the H2O2 in aqueous solution mixture, a set of 15 samples (n = 15) at 601 different wavenumbers (p = 601) was collected to produce a calibration data set [XC](15 x 601) of wt.% H2O2 in aqueous solution from 0-70% at 5% intervals. A set of 5 samples, shown in Table 5-2, was collected to produce a prediction data set [XP](5 x 601) for PCR.
In the case of the PCC, the calibration and prediction data sets were combined into a single data set [X](20 x 601), and PCA was performed simultaneously on all 20 samples to extract the pure components in each spectrum.

Table 5-2: H2O2 in Aqueous Solution Spectra Compositions for Prediction Data Set 2, [XP](5 x 601)

| Sample # | H2O2 Concentration (wt.%) from Titration |
|---|---|
| 16 | 38.4 |
| 17 | 35.9 |
| 18 | 26.0 |
| 19 | 38.7 |
| 20 | 45.2 |

The number of significant eigenvalues obtained from PCA is two (Figure 5-17). The first principal component explains 61% of the total calibration data set variance and the second principal component explains 34% of the total calibration data set variance. With these two principal components, 95% of the original data variance can be explained.

Figure 5-17: SCREE Plot Indicating Eigenvalues for Each Principal Component for H2O2 in Aqueous Solution Data Set 2; Principal Components 1 and 2 Explain 61% and 34% of the Total Calibration Data Set Variance

The plot of the calibration data set used to determine the concentrations in the prediction data set is shown in Figure 5-18. The RMSE for calibration data set 2 is found to be 10.7 wt.%, while the RMSE for prediction data set 2 is found to be 3.9 wt.%.

Figure 5-18: Principal Component Regression (PCR) - H2O2 Concentrations for H2O2 in Aqueous Solution with Variable Total Amount in Data Set 2; RMSE Calibration = 10.7 wt.%, RMSE Prediction = 3.9 wt.%

With the components separated from PCA on the entire set of collected spectra, the contribution of each component to the sample mixtures can be determined. Baseline corrections were made to the calibration data set spectra to account for non-zero values for the 0% H2O2 concentration spectra as well as from extrapolation to 0% H2O concentration from the spectra values shown. The calculated proportionality constants (kH2O2 = 0.65, kH2O = 0.40) are used to calculate the concentration of the mixtures in the PCC model, as shown in Figure 5-19.
With the PCC technique, the RMSE for calibration data set 2 is found to be 5.9 wt.%, while the RMSE for prediction data set 2 is found to be 6.0 wt.%.

Figure 5-19: PCC Model for wt.% of H2O2 in Aqueous Solution with Variable Total Amount in Data Set 2; kH2O2 = 0.65, kH2O = 0.40; RMSE Calibration = 5.9 wt.%, RMSE Prediction = 6.0 wt.%

As shown in Table 5-3, comparison of the PCR and PCC techniques for analysis of data sets that do not contain a constant volume of the mixtures across the entire sample space indicates that the PCC technique performs significantly better overall for the calibration data set when compared on an RMSE basis. PCC compares reasonably well to the PCR technique in terms of RMSE for the prediction data set. For the PCC technique, the RMSE values were 5.9 wt.% and 6.0 wt.% for the calibration and prediction data sets, respectively, compared to the PCR technique, which had RMSE values of 10.7 wt.% and 3.9 wt.% for the calibration and prediction data sets, respectively. The PCC technique performs similarly in the low (0-20 wt.%), medium (25-45 wt.%), and high (50-70 wt.%) ranges of H2O2 concentration, with the RMSE of calibration for these segments being 6.0 wt.%, 6.0 wt.%, and 5.7 wt.%, respectively. In contrast, the performance of the PCR technique at low H2O2 concentrations (15.7 wt.%) is worse than that for the medium (6.8 wt.%) and high (7.0 wt.%) H2O2 concentration ranges.

Table 5-3: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; H2O2 in Aqueous Solution Data Set 2

| Analysis Technique | RMSE Calibration (wt.%) | RMSE Prediction (wt.%) |
|---|---|---|
| PCR - Overall | 10.7 | 3.9 |
| PCR - Low (0-20 wt.% H2O2) | 15.7 | - |
| PCR - Med (25-45 wt.% H2O2) | 6.8 | - |
| PCR - High (50-70 wt.% H2O2) | 7.0 | - |
| PCC - Overall | 5.9 | 6.0 |
| PCC - Low (0-20 wt.% H2O2) | 6.0 | - |
| PCC - Med (25-45 wt.% H2O2) | 6.0 | - |
| PCC - High (50-70 wt.% H2O2) | 5.7 | - |

This study indicates that FTIR spectroscopy used in conjunction with the discussed chemometric techniques has the potential to be utilized in determining the H2O2 concentrations in aqueous solutions from condensation events that may occur during a VPHP decontamination event, even when a complete calibration data set is not utilized, as was shown with data set 1.

Chapter 6: PCA Application to FTIR Spectroscopy Data of Potential Environment Air Contaminants within the Aircraft Cabin

Discussion of Analyzed Data Sets

In the simulation data sets, pure spectra from the QASoft® database are used. To form a simulated mixture, pure spectra are added together and different multiplication factors are applied to achieve a range of component concentrations. The simulated data sets consist of various spectra of targeted two- and three-component systems and are built from the entire pure spectrum of each component. To make the data sets more manageable, the analyzed data sets consist of the raw data averaged over ten wavenumber data points, thus reducing the size by an order of magnitude. This reduced data set is then used as input to a MATLAB® program to determine the PCA, PCR, and PCC characteristics of the data.

The first two-component simulation data set was a gas mixture of CH2O and C3H4O. This system is of interest because of the strong overlap that the two components have in their respective IR spectra. The second two-component simulation data set was of gas mixtures consisting of CO and CO2. The experimental two-component data set consisted of mixtures of CO and CO2 from gas cylinders.
This system is of interest in that it is expected to be a combination of potential environmental air contaminants most commonly found in the aircraft cabin, as well as being a data set that can be tested within both the Commercial Sensor Module and the FTIR Gas Analysis Module in conjunction with simultaneous testing of commercial sensors.

For the three-component systems, a simulation data set consisting of a gas mixture of CH2O, C3H4O, and H2O was analyzed; as in the two-component system, all three components have significant overlap within their respective spectra. The second three-component simulation data set was of gas mixtures consisting of CO, CO2, and H2O. The experimental three-component data set consisted of mixtures of CO and CO2 from gas cylinders, with H2O introduced into the system by evaporation in a heated crucible linked to the FTIR Gas Analysis Module. The introduced H2O was not directly controlled and was allowed to enter the system as it evaporated, in order to present variable water vapor levels when detecting known concentrations of CO/CO2 gas mixtures. Even though the H2O spectrum does not overlap with the CO or CO2 spectra, H2O is expected to be present at significantly higher concentrations than the target components that could be potential environmental air contaminants within the aircraft cabin. The pathlength of the gas cell for the experimental gas mixtures was kept constant at 2.24 cm.

PCA Results and Discussions: 2-Component Systems

CH2O/C3H4O Simulation Data Set

From the simulated FTIR spectral analysis of the CH2O/C3H4O gas mixtures, a set of 8 samples (n = 8) at 2,655 different wavenumbers (p = 2,655), with concentrations given in Table 6-1, was compiled as shown in Figure 6-1 to produce a calibration data set [XC](8 x 2655). A set of 3 samples, with concentrations given in Table 6-2, was compiled as shown in Figure 6-2 to produce a prediction data set [XP](3 x 2655) for PCR.
For the PCC, the calibration and prediction data sets were combined into a single data set [X](11 x 2655), and PCA was performed simultaneously on all 11 samples to extract the pure components in each spectrum.

Table 6-1: CH2O/C3H4O Spectra Compositions for Calibration Data Set, [XC](8 x 2655), Ordered from Low to High Concentration of CH2O

Sample #    Amount of CH2O (x100 ppm)    Amount of C3H4O (x100 ppm)
1           0.00                         2.50
2           0.50                         1.00
3           1.00                         0.50
4           1.50                         2.00
5           2.00                         1.50
6           3.00                         4.75
7           4.75                         3.00
8           10.00                        10.00

Figure 6-1: CH2O/C3H4O Gas Mixtures Calibration Data Set, [XC](8 x 2655); Note - Only 3 of 8 Calibration Spectra Shown

Table 6-2: CH2O/C3H4O Spectra Compositions for Prediction Data Set, [XP](3 x 2655), Ordered from Low to High Concentration of CH2O

Sample #    Amount of CH2O (x100 ppm)    Amount of C3H4O (x100 ppm)
9           1.33                         8.00
10          2.50                         0.00
11          8.00                         1.33

Figure 6-2: CH2O/C3H4O Gas Mixtures Prediction Data Set, [XP](3 x 2655)

As shown in the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is two (Figure 6-3). The first principal component explains 98.4% of the total calibration data set variance and the second principal component explains 1.6%. With these two principal components, 100.0% of the original data variance can be explained.

Figure 6-3: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH2O/C3H4O Gas Mixtures Data Set; Principal Components 1 and 2 Explain 98.4% and 1.6% of the Total Calibration Data Set Variance

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figures 6-4 and 6-5 for CH2O and C3H4O, respectively. With the PCR technique, the RMSE values for the calibration and prediction data sets for the concentrations of CH2O and C3H4O are found to be 0 ppm.

Figure 6-4: Principal Component Regression (PCR) -
CH2O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figure 6-5: Principal Component Regression (PCR) - C3H4O Concentrations in CH2O/C3H4O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

PCA on the entire set of collected spectra, both calibration and prediction sets, isolates the peaks associated with each particular component, as shown in Figures 6-6 and 6-7. These figures show that V-1 and V-2 are easily identifiable as corresponding to the pure spectra of CH2O and C3H4O, respectively. With the components separated, the contribution of each component to the sample mixtures can be determined.

Figure 6-6: PCA Separated CH2O Spectra in CH2O/C3H4O Gas Mixtures

Figure 6-7: PCA Separated C3H4O Spectra in CH2O/C3H4O Gas Mixtures

With the baseline correction made, and based on the linearity assumption, the proportionality constant is determined from the isolated peaks for each of the individual components in the calibration data set. The calculated proportionality constants (kCH2O = 0.08, kC3H4O = 0.32), equal to the inverse of the slope of the lines of best fit shown, can then be used to calculate the concentrations of the simulated gas mixtures for CH2O and C3H4O shown in Figures 6-8 and 6-9, respectively. With the PCC method, the RMSE values for the CH2O and C3H4O calibration data sets are found to be 1 ppm and 0 ppm, respectively, while the RMSE values of the prediction data sets for CH2O and C3H4O are both found to be 2 ppm, consistent with the values reported in Figures 6-8 and 6-9 and in Table 6-3.
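The proportionality constant calculation and the RMSE figure of merit can be sketched in a few lines. This is a minimal illustration rather than the program used in this work: it assumes the measured response for each sample is the integrated area under the PCA-isolated band, that the line of best fit is response versus concentration and passes through the origin, and that the proportionality constant is the inverse of that slope. The function names and all numbers below are hypothetical.

```python
from math import sqrt

def fit_slope_through_origin(x, y):
    """Least-squares slope of y = slope * x (no intercept term)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def rmse(actual, predicted):
    """Root mean square error between known and calculated concentrations."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical calibration set: known concentrations (ppm) and band areas.
cal_conc = [50.0, 100.0, 150.0, 200.0]
cal_area = [100.0, 200.0, 300.0, 400.0]

slope = fit_slope_through_origin(cal_conc, cal_area)  # area per ppm
k = 1.0 / slope                                       # proportionality constant (assumed definition)

# Apply k to calibration and (hypothetical) prediction band areas.
cal_calc = [k * a for a in cal_area]
pred_area = [250.0, 500.0]
pred_calc = [k * a for a in pred_area]

rmse_cal = rmse(cal_conc, cal_calc)
print(k, pred_calc, rmse_cal)
```

Because the hypothetical areas are exactly proportional to concentration, the calibration RMSE here is zero; residual spectral overlap in the isolated band would appear as a nonzero RMSE.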
Figure 6-8: PCC CH2O Calibration and Prediction Data Sets in CH2O/C3H4O Gas Mixtures; kCH2O = 0.08; RMSE Calibration = 1 ppm, RMSE Prediction = 2 ppm

Figure 6-9: PCC C3H4O Calibration and Prediction Data Sets in CH2O/C3H4O Gas Mixtures; kC3H4O = 0.32; RMSE Calibration = 0 ppm, RMSE Prediction = 2 ppm

As shown in Table 6-3, the comparison of the PCR and PCC techniques indicates that the PCC technique could be a viable alternative quantitative analysis method to the PCR technique. The PCC technique for this example does not perform as well as PCR because the strong overlap of the two components makes the separation of the two spectra more difficult for PCA. The PCA representation for CH2O still contains small spectral contributions from C3H4O; thus, the calculation of the area under the curve for each concentration introduces error within the PCC technique. In addition, the PCA representation for C3H4O does not contain the entire spectrum for the wavenumber range of 2900-2700 cm-1, where there is a strong overlap with the CH2O spectrum.

Table 6-3: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CH2O/C3H4O Gas Mixtures

Analysis Technique    RMSE Calibration (ppm)    RMSE Prediction (ppm)
CH2O: PCR             0                         0
CH2O: PCC             1                         2
C3H4O: PCR            0                         0
C3H4O: PCC            0                         2

CO/CO2 Simulation Data Set

From the simulated FTIR spectral analysis of the CO/CO2 gas mixtures, a set of 10 samples (n = 10) at 2,636 different wavenumbers (p = 2,636) with concentrations given in Table 6-4 was compiled as shown in Figure 6-10 to produce a calibration data set [XC](10 x 2636). A set of 5 samples with concentrations given in Table 6-5 was compiled as shown in Figure 6-11 to produce a prediction data set [XP](5 x 2636) for PCR.
For the PCC, the calibration and prediction data sets were combined into a single data set [X](15 x 2636), and PCA was performed simultaneously on all 15 samples to extract the pure components in each spectrum.

Table 6-4: CO/CO2 Spectra Compositions for Calibration Data Set, [XC](10 x 2636)

Sample #    Amount of CO (x100 ppm)    Amount of CO2 (x100 ppm)
1           0.00                       5.00
2           8.00                       3.00
3           6.00                       1.00
4           4.00                       0.50
5           2.00                       1.50
6           1.00                       3.50
7           3.00                       0.25
8           5.00                       0.00
9           7.00                       2.00
10          9.00                       4.00

Figure 6-10: CO/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 2636); Note - Only 4 of 10 Calibration Spectra Shown

Table 6-5: CO/CO2 Spectra Compositions for Prediction Data Set, [XP](5 x 2636)

Sample #    Amount of CO (x100 ppm)    Amount of CO2 (x100 ppm)
11          1.25                       0.40
12          5.67                       0.80
13          2.25                       2.20
14          4.33                       1.20
15          8.75                       0.60

Figure 6-11: CO/CO2 Gas Mixtures Prediction Data Set, [XP](5 x 2636)

As shown in the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is two (Figure 6-12). The first principal component explains 99.1% of the total calibration data set variance and the second principal component explains 0.9%. With these two principal components, 100.0% of the original data variance can be explained.

Figure 6-12: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2 Gas Mixtures Data Set; Principal Components 1 and 2 Explain 99.1% and 0.9% of the Total Calibration Data Set Variance

Applying PCR, a plot is created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figures 6-13 and 6-14 for CO and CO2, respectively. With the PCR technique, the RMSE values for the calibration and prediction data sets for the concentrations of CO and CO2 are found to be 0 ppm.

Figure 6-13: Principal Component Regression (PCR) -
CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figure 6-14: Principal Component Regression (PCR) - CO2 Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figures 6-15 and 6-16, produced from PCA on the entire set of compiled spectra, show that V-2 and V-1 are easily identifiable as corresponding to the pure spectra of CO and CO2, respectively. With the components separated, the contribution of each component to the sample mixtures can be determined for both the CO and CO2 calibration and prediction data sets.

Figure 6-15: PCA Separated CO Spectra in CO/CO2 Gas Mixtures

Figure 6-16: PCA Separated CO2 Spectra in CO/CO2 Gas Mixtures

The calculated proportionality constants (kCO = 0.75, kCO2 = 0.06), equal to the inverse of the slope of the lines of best fit shown, are then used to calculate the concentrations of the mixtures for CO and CO2 shown in Figures 6-17 and 6-18, respectively. With the PCC method, the RMSE values for the CO and CO2 calibration and prediction data sets are found to be 0 ppm to the nearest ppm.

Figure 6-17: PCC CO Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO = 0.75; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figure 6-18: PCC CO2 Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO2 = 0.06; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

The values shown in Table 6-6 highlight that the PCC technique compares very well with the PCR technique. Although the PCC errors are not zero for CO, they are still low enough to indicate that the PCC technique could be a viable alternative quantitative analysis method to the PCR technique. The PCC technique for CO in this example performs essentially as well as the PCR technique, even with the strong absorption of CO2 that lowers the impact the CO spectra have on the overall data set variation.
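The effect of a strongly absorbing species on the distribution of variance among the principal components can be illustrated with a small synthetic example. The sketch below uses the calibration concentrations of Table 6-4, but the band positions, widths, and relative heights are hypothetical Gaussian stand-ins (not QASoft spectra), and mean-centering before the decomposition is an assumption; the resulting percentages are therefore illustrative and are not expected to reproduce Figure 6-12.

```python
import numpy as np

def gaussian_band(wn, center, width, height):
    """Hypothetical absorption band with a Gaussian profile."""
    return height * np.exp(-0.5 * ((wn - center) / width) ** 2)

wn = np.linspace(2000.0, 2500.0, 501)              # wavenumber axis, cm-1
strong = gaussian_band(wn, 2350.0, 15.0, 1.0)      # stand-in for the strong absorber (CO2)
weak = gaussian_band(wn, 2150.0, 15.0, 0.1)        # stand-in for the weak absorber (CO)

# Calibration concentrations from Table 6-4 (x100 ppm).
co = np.array([0.0, 8.0, 6.0, 4.0, 2.0, 1.0, 3.0, 5.0, 7.0, 9.0])
co2 = np.array([5.0, 3.0, 1.0, 0.5, 1.5, 3.5, 0.25, 0.0, 2.0, 4.0])

# Linear (Beer-Lambert-style) mixing of the two pure bands.
X = np.column_stack([co2, co]) @ np.vstack([strong, weak])

# PCA via singular value decomposition of the mean-centered data.
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
explained = s**2 / np.sum(s**2)

print(np.round(100.0 * explained[:3], 2))
```

Even though the weak band spans a wider concentration range, nearly all of the variance is carried by the first principal component, which is why a weakly absorbing species is harder to isolate when a strong absorber is present.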
Table 6-6: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CO/CO2 Gas Mixtures

Analysis Technique    RMSE Calibration (ppm)    RMSE Prediction (ppm)
CO: PCR               0.0                       0.0
CO: PCC               0.3                       0.2
CO2: PCR              0.0                       0.0
CO2: PCC              0.0                       0.0

CO/CO2 Experimental Data Set

From the experimental FTIR spectral analysis of the CO/CO2 gas mixtures, a set of 8 samples (n = 8) at 5,001 different wavenumbers (p = 5,001) with concentrations given in Table 6-7 was collected as shown in Figure 6-19 to produce a calibration data set [XC](8 x 5001). A set of 4 samples with concentrations given in Table 6-8 was collected as shown in Figure 6-20 to produce a prediction data set [XP](4 x 5001) for PCR. For the PCC, the calibration and prediction data sets were combined into a single data set [X](12 x 5001), and PCA was performed simultaneously on all 12 samples to extract the pure components in each spectrum.

Table 6-7: CO/CO2 Spectra Compositions for Experimental Calibration Data Set, [XC](8 x 5001)

Sample #    Amount of CO (ppm)    Amount of CO2 (ppm)
1           258.0                 3.1
3           653.8                 3.3
4           737.8                 2.7
5           756.5                 6.6
6           735.9                 13.1
8           741.2                 103.1
10          750.5                 178.8
12          777.5                 210.6

Figure 6-19: CO/CO2 Gas Mixtures Experimental Calibration Data Set, [XC](8 x 5001); Note - Only 4 of 8 Calibration Spectra Shown

Table 6-8: CO/CO2 Spectra Compositions for Experimental Prediction Data Set, [XP](4 x 5001)

Sample #    Amount of CO (ppm)    Amount of CO2 (ppm)
2           441.2                 1.4
7           737.8                 55.5
9           754.8                 127.8
11          759.5                 195.2

Figure 6-20: CO/CO2 Gas Mixtures Prediction Data Set, [XP](4 x 5001)

As shown in the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is two (Figure 6-21). The first principal component explains 97.9% of the total calibration data set variance and the second principal component explains 0.9% of the total calibration data set variance.
With these two principal components, 98.8% of the original data variance can be explained. This compares well with the explanation of variation found with the CO/CO2 simulation data set, where the principal components corresponding to CO2 and CO explained 99.1% and 0.9% of the data set variance, respectively. In an unknown mixture, the second component would typically not be analyzed, since the first component alone explains at least 95% of the total data set variance, the level required for an experimental data set. This 95% threshold is associated with the typical error of the FTIR sampling procedure, which is 5%.

Figure 6-21: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2 Gas Mixtures Data Set; Principal Components 1 and 2 Explain 97.9% and 0.9% of the Total Calibration Data Set Variance

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figures 6-22 and 6-23 for CO and CO2, respectively. With the PCR technique, the RMSEC for CO is found to be 92 ppm, while the RMSEP for CO is found to be 49 ppm. The RMSEC and RMSEP values for CO2 are both found to be 1 ppm.

Figure 6-22: Principal Component Regression (PCR) - CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 92 ppm, RMSE Prediction = 49 ppm

Figure 6-23: Principal Component Regression (PCR) - CO2 Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 1 ppm, RMSE Prediction = 1 ppm

To improve the PCR analysis for CO, more principal components can be utilized. This is not always possible because, in some cases, the representation of redundant data can cause numerical instability within the regression analysis.
To show that the PCR technique is improved for this experimental data with the use of more principal components, Figure 6-24 shows the RMSE for both calibration and prediction as a function of the number of principal components used to represent the original data set. This analysis indicates that with three principal components, which explain 99.6% of the original data variance, the RMSEC is reduced to 16 ppm and the RMSEP is reduced to 14 ppm, as shown in Figure 6-25. With four principal components, which cumulatively explain 100.0% of the original data variance, the RMSEC is found to be 8 ppm and the RMSEP is found to be 14 ppm, as shown in Figure 6-26. The addition of the remaining principal components will not provide additional explanation of the data set variance.

Figure 6-24: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2 Gas Mixtures

Figure 6-25: Principal Component Regression (PCR) with 3 Principal Components - CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 16 ppm, RMSE Prediction = 14 ppm

Figure 6-26: Principal Component Regression (PCR) with 4 Principal Components - CO Concentrations in CO/CO2 Gas Mixtures; RMSE Calibration = 8 ppm, RMSE Prediction = 14 ppm

This analysis indicates that the spectra may contain beneficial data beyond the two principal components that explain 98.8% of the original calibration data set variation. This may be an indication that the variance for the IR spectra of the CO component in the CO/CO2 mixed gas environment is partitioned among three principal components instead of just one.
By utilizing more principal components, the quantification of the CO concentration is improved, but there is no evidence to support four distinct components in the experimentally collected IR spectra, since the system was purged with IR-inactive N2 and only CO and CO2 gas from certified gas cylinders were allowed into the FTIR Gas Analysis Module. The improvement from using more principal components to calculate the regression coefficients for CO is highlighted in Figures 6-27 through 6-29, where the contribution of the CO2 peak region (2400 cm-1 to 2300 cm-1) is diminished.

Figure 6-27: CO/CO2 Gas Mixtures - Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 2 Principal Components Used

Figure 6-28: CO/CO2 Gas Mixtures - Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 3 Principal Components Used

Figure 6-29: CO/CO2 Gas Mixtures - Estimates of Regression Coefficients for Concentrations of CO from Calibration Data Set as Function of Wavenumber, [bCO](5000 x 1); 4 Principal Components Used

Figures 6-30 and 6-31, derived from PCA on the entire set of collected experimental spectra, show that V-2 and V-1 can be identified as corresponding to the pure spectra of CO and CO2, respectively. With the components separated, the contribution of each component to the sample mixtures can be determined for both the CO and CO2 calibration and prediction data sets.

Figure 6-30: PCA Separated CO Spectra in CO/CO2 Gas Mixtures Represented by V-2

Figure 6-31: PCA Separated CO2 Spectra in CO/CO2 Gas Mixtures Represented by V-1

The calculated proportionality constants (kCO = 20, kCO2 = 0.25), equal to the inverse of the slope of the lines of best fit shown, are used to calculate the concentrations of the mixtures for CO and CO2 shown in Figures 6-32 and 6-33, respectively.
With the PCC method, the RMSE values for the CO and CO2 calibration data sets are found to be 204 ppm and 1 ppm, respectively, while the RMSE values of the prediction data sets for CO and CO2 are found to be 194 ppm and 1 ppm, respectively.

Figure 6-32: PCC CO Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO = 20; RMSE Calibration = 204 ppm, RMSE Prediction = 194 ppm

Figure 6-33: PCC CO2 Calibration and Prediction Data Sets in CO/CO2 Gas Mixtures; kCO2 = 0.25; RMSE Calibration = 1 ppm, RMSE Prediction = 1 ppm

As shown in Table 6-9, the values for the CO2 concentration calculations compare very well between the PCR and PCC techniques. The PCR analysis is improved in quantitatively determining the CO concentration in the CO/CO2 experimental gas mixtures by utilizing additional principal components. PCC for CO in this example performs poorly because the strong absorption of CO2 lowers the impact the CO spectra have on the overall data set variation and thus makes it more difficult for the PCA process to extract the pure CO spectrum using only 2 principal components. Because of this, a significant amount of error is introduced into both the regression calculations for PCR and the area-under-the-curve calculation for PCC when determining the CO concentration.
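The dependence of the PCR errors on the number of principal components retained can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the MATLAB program used in this work: the spectra are synthetic, the third factor (a sloping baseline) is a hypothetical stand-in for any variance source beyond the two gases, the data are mean-centered, and the names `pcr_fit` and `pcr_predict` are introduced here for illustration only.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression using the first k principal components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    q = (U[:, :k].T @ (y - y_mean)) / s[:k]   # regression on the PC scores
    b = Vt[:k].T @ q                          # coefficients per wavenumber
    return x_mean, y_mean, b

def pcr_predict(model, X):
    x_mean, y_mean, b = model
    return (X - x_mean) @ b + y_mean

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Synthetic pure "spectra": weak band, strong band, and a sloping baseline.
wn = np.linspace(2000.0, 2500.0, 251)
pure = np.vstack([
    0.1 * np.exp(-0.5 * ((wn - 2150.0) / 15.0) ** 2),   # weak absorber (CO stand-in)
    1.0 * np.exp(-0.5 * ((wn - 2350.0) / 15.0) ** 2),   # strong absorber (CO2 stand-in)
    (wn - wn[0]) / (wn[-1] - wn[0]),                     # hypothetical baseline drift
])

rng = np.random.default_rng(0)
scale = np.array([10.0, 10.0, 1.0])
C_cal = rng.uniform(0.0, 1.0, size=(8, 3)) * scale       # 8 calibration samples
C_pred = rng.uniform(0.0, 1.0, size=(4, 3)) * scale      # 4 prediction samples
X_cal, X_pred = C_cal @ pure, C_pred @ pure
y_cal, y_pred = C_cal[:, 0], C_pred[:, 0]                # target: weak absorber

rmsec, rmsep = {}, {}
for k in (1, 2, 3):
    model = pcr_fit(X_cal, y_cal, k)
    rmsec[k] = rmse(y_cal, pcr_predict(model, X_cal))
    rmsep[k] = rmse(y_pred, pcr_predict(model, X_pred))
    print(k, rmsec[k], rmsep[k])
```

In this noise-free sketch the data contain exactly three independent sources of variation, so the weak absorber is only recovered once the third principal component is included. Requesting components beyond the true rank would divide by near-zero singular values, which is one way the numerical instability mentioned earlier can arise.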
Table 6-9: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CO/CO2 Experimental Gas Mixtures

Analysis Technique    # of Principal Components Used in Analysis    RMSE Calibration (ppm)    RMSE Prediction (ppm)
CO: PCR               2                                             92                        49
CO: PCR               3                                             16                        14
CO: PCR               4                                             8                         14
CO: PCC               2                                             204                       194
CO2: PCR              2                                             1                         1
CO2: PCC              2                                             1                         1

PCA Results and Discussions: 3-Component Systems

CH2O/C3H4O/H2O Simulation Data Set

From the simulated FTIR spectral analysis of the pure CH2O/C3H4O/H2O gas mixtures shown in Figure 6-34, a set of 8 samples (n = 8) at 2,655 different wavenumbers (p = 2,655) with concentrations given in Table 6-10 was combined as shown in Figure 6-35 to produce a calibration data set [XC](8 x 2655). A set of 3 samples with concentrations given in Table 6-11 was combined as shown in Figure 6-36 to produce a prediction data set [XP](3 x 2655) for PCR. For the PCC, the calibration and prediction data sets were combined into a single data set [X](11 x 2655), and PCA was performed simultaneously on all 11 samples to extract the pure components in each spectrum.

Figure 6-34: Formaldehyde (CH2O), Acrolein (C3H4O), and Water (H2O) Pure Spectra for Simulated Data Sets Illustrating Spectral Overlap Between All Three Components

Table 6-10: CH2O/C3H4O/H2O Spectra Compositions for Calibration Data Set, [XC](8 x 2655), Ordered from Low to High Concentration of CH2O

Sample #    Amount of CH2O (x100 ppm)    Amount of C3H4O (x100 ppm)    Amount of H2O (x100 ppm)
1           0.00                         2.50                          0.00
2           0.50                         1.00                          50.00
3           1.00                         0.50                          1.00
4           1.50                         2.00                          25.00
5           2.00                         1.50                          40.00
6           3.00                         4.75                          33.30
7           4.75                         3.00                          20.00
8           10.00                        10.00                         5.00

Figure 6-35: CH2O/C3H4O/H2O Gas Mixtures Calibration Data Set, [XC](8 x 2655); Note -
Only 3 of 8 Calibration Spectra Shown

Table 6-11: CH2O/C3H4O/H2O Spectra Compositions for Prediction Data Set, [XP](3 x 2655), Ordered from Low to High Concentration of CH2O

Sample #    Amount of CH2O (x100 ppm)    Amount of C3H4O (x100 ppm)    Amount of H2O (x100 ppm)
9           1.33                         8.00                          30.00
10          2.50                         0.00                          10.00
11          8.00                         1.33                          6.67

Figure 6-36: CH2O/C3H4O/H2O Gas Mixtures Prediction Data Set, [XP](3 x 2655)

As shown in the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is three (Figure 6-37). The first principal component explains 71.9% of the total calibration data set variance, the second principal component explains 26.9%, and the third principal component explains 1.2%. With these three principal components, 100.0% of the original data variance can be explained.

Figure 6-37: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH2O/C3H4O/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 71.9%, 26.9%, and 1.2%, respectively, of the Total Calibration Data Set Variance

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figures 6-38, 6-39, and 6-40 for CH2O, H2O, and C3H4O, respectively. With the PCR technique, the RMSE values for the calibration and prediction data sets for the concentrations of CH2O, H2O, and C3H4O are found to be 0 ppm.

Figure 6-38: Principal Component Regression (PCR) - CH2O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figure 6-39: Principal Component Regression (PCR) - H2O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

Figure 6-40: Principal Component Regression (PCR) -
C3H4O Concentrations in CH2O/C3H4O/H2O Gas Mixtures; RMSE Calibration = 0 ppm, RMSE Prediction = 0 ppm

PCA on the entire set of collected spectra isolates the peaks associated with each particular component, as shown in Figures 6-41, 6-42, and 6-43. The variables V-1, V-2, and V-3 are easily identifiable as corresponding to the pure spectra of CH2O, H2O, and C3H4O, respectively. With the components separated, the contribution of each component to the sample mixtures can be determined for the CH2O, H2O, and C3H4O calibration and prediction data sets.

Figure 6-41: PCA Separated CH2O Spectra in CH2O/C3H4O/H2O Gas Mixtures

Figure 6-42: PCA Separated H2O Spectra in CH2O/C3H4O/H2O Gas Mixtures

Figure 6-43: PCA Separated C3H4O Spectra in CH2O/C3H4O/H2O Gas Mixtures

The calculated proportionality constants (kCH2O = 0.08, kH2O = 1.25, kC3H4O = 0.39) are used to calculate the concentrations of the mixtures for CH2O, H2O, and C3H4O shown in Figures 6-44, 6-45, and 6-46, respectively. With the PCC method, the RMSE values for the CH2O, H2O, and C3H4O calibration data sets are found to be 1 ppm, 9 ppm, and 1 ppm, respectively, while the RMSE values of the prediction data sets for CH2O, H2O, and C3H4O are found to be 2 ppm, 6 ppm, and 1 ppm, respectively.

Figure 6-44: PCC CH2O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kCH2O = 0.08; RMSE Calibration = 1 ppm, RMSE Prediction = 2 ppm

Figure 6-45: PCC H2O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kH2O = 1.25; RMSE Calibration = 9 ppm, RMSE Prediction = 6 ppm

Figure 6-46: PCC C3H4O Calibration and Prediction Data Sets in CH2O/C3H4O/H2O Gas Mixtures; kC3H4O = 0.39; RMSE Calibration = 1 ppm, RMSE Prediction = 1 ppm

As shown in Table 6-12, the comparison of the PCR and PCC techniques indicates that the PCC technique could be a viable alternative quantitative analysis method to the PCR technique.
The PCC technique for this example does not perform as well as PCR because the strong overlap of the three components makes the separation of the three spectra more difficult for PCA. The PCA representation for H2O still contains small spectral contributions from CH2O and C3H4O, and thus the calculation of the area under the curve for each concentration introduces error within the PCC technique. In addition, similar to the two-component CH2O/C3H4O gas mixtures, the PCA representation for C3H4O does not contain the entire spectrum for the wavenumber range of 2900-2700 cm-1, where there is a strong overlap with the CH2O spectrum.

Table 6-12: Comparison of Errors Associated with the Principal Component Regression (PCR) and Proportionality Constant Calculation (PCC) Analysis Techniques; CH2O/C3H4O/H2O Gas Mixtures

Analysis Technique    RMSE Calibration (ppm)    RMSE Prediction (ppm)
CH2O: PCR             0                         0
CH2O: PCC             1                         2
H2O: PCR              0                         0
H2O: PCC              9                         6
C3H4O: PCR            0                         0
C3H4O: PCC            1                         1

CO/CO2/H2O Experimental Data Set

From the experimental FTIR spectral analysis of the CO/CO2/H2O gas mixtures, a set of 8 samples (n = 8) at 1,001 different wavenumbers (p = 1,001) with concentrations given in Table 6-13 was collected as shown in Figure 6-47 to produce a calibration data set [XC](8 x 1001). A set of 4 samples with concentrations given in Table 6-14 was collected as shown in Figure 6-48 to produce a prediction data set [XP](4 x 1001) for PCR. For the PCC, the calibration and prediction data sets were combined into a single data set [X](12 x 1001), and PCA was performed simultaneously on all 12 samples to extract the pure components in each spectrum.
Table 6-13: CO/CO2/H2O Spectra Compositions for Calibration Data Set, [XC](8 x 1001)

Sample #    Amount of CO (ppm)    Amount of CO2 (ppm)    Amount of H2O (ppm)
1           91                    1                      1376
2           113                   1                      2454
3           96                    2                      4366
4           90                    3                      5385
9           71                    15                     6771
10          25                    18                     6675
11          34                    23                     6704
12          8                     28                     8231

Figure 6-47: CO/CO2/H2O Gas Mixtures Calibration Data Set, [XC](8 x 1001); Note - Only 2 of 8 Calibration Spectra Shown

Table 6-14: CO/CO2/H2O Spectra Compositions for Prediction Data Set, [XP](4 x 1001)

Sample #    Amount of CO (ppm)    Amount of CO2 (ppm)    Amount of H2O (ppm)
5           108                   5                      5059
6           80                    5                      5792
7           58                    7                      5734
8           46                    10                     6157

Figure 6-48: CO/CO2/H2O Gas Mixtures Prediction Data Set, [XP](4 x 1001); Note - Only 2 of 4 Prediction Spectra Shown

From the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is determined to be five (Figure 6-49). The first principal component explains 69.7% of the total calibration data set variance, the second explains 12.1%, the third explains 7.7%, the fourth explains 4.4%, and the fifth explains 3.0%. With these five principal components, 96.9% of the original data variance can be explained. The analysis indicates that the spectra may contain beneficial data beyond the three principal components expected in the system, which together explain 89.5% of the original calibration data set variation. This may be an indication that the variance for the IR spectra of individual components in the CO/CO2/H2O mixed gas environment is partitioned among multiple principal components instead of just one. By utilizing more principal components, the quantification of the gas concentrations for each component may be improved, but there is no evidence to support five distinct components in the experimentally collected IR spectra.
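The selection of the number of significant principal components from the explained variance can be checked with a short calculation. The sketch below uses the percentages quoted above and the 95% level used in this chapter for experimental data; the function name `components_needed` is introduced here for illustration only.

```python
def components_needed(explained_pct, threshold_pct):
    """Return (number of components, cumulative %) at which the threshold is first reached."""
    total = 0.0
    for count, pct in enumerate(explained_pct, start=1):
        total += pct
        if total >= threshold_pct:
            return count, total
    return None, total  # threshold not reached with the listed components

# Explained variance (%) of principal components 1-5, CO/CO2/H2O experimental data set.
explained = [69.7, 12.1, 7.7, 4.4, 3.0]

n_components, cumulative = components_needed(explained, 95.0)
first_three = sum(explained[:3])

print(n_components, round(cumulative, 1), round(first_three, 1))
```

The first three components together fall short of the 95% level, which is why five components are flagged as significant even though only three species are known to be present.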
Figure 6-49: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 69.7%, 12.1%, and 7.7% of the Total Calibration Data Set Variance

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data set, as shown in Figures 6-50, 6-51, and 6-52 for CO, CO2, and H2O, respectively. With the PCR technique, the RMSEC for CO is found to be 14 ppm, while the RMSEP for CO is found to be 18 ppm. The RMSEC for CO2 is found to be 2 ppm, while the RMSEP for CO2 is found to be 4 ppm. The RMSEC for H2O is found to be 157 ppm, while the RMSEP for H2O is found to be 519 ppm.

Figure 6-50: Principal Component Regression (PCR) - CO Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 14 ppm, RMSE Prediction = 18 ppm

Figure 6-51: Principal Component Regression (PCR) - CO2 Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 2 ppm, RMSE Prediction = 4 ppm

Figure 6-52: Principal Component Regression (PCR) - H2O Concentrations in CO/CO2/H2O Gas Mixtures; RMSE Calibration = 157 ppm, RMSE Prediction = 519 ppm

As shown in Figure 6-53, including more than three principal components does not significantly improve the RMSE for calibration and prediction. In the case of the RMSE for prediction, the inclusion of more than three principal components actually makes the results worse, most likely due to the inclusion of repetitive data that causes numerical instability in the regression model. This indicates that, even though five principal components are necessary to represent at least 95% of the original calibration data set variance, no quantitative benefit is gained by including more than the three known components within the CO/CO2/H2O mixed gas system.
In an unknown system, the number of components determined to be necessary for the analysis would be five, and thus the analysis of the actual system would not perform as well in quantifying the CO, CO2, and H2O concentrations.

Figure 6-53: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2/H2O Gas Mixtures

PCA on the entire set of collected experimental spectra was not capable of isolating the peaks associated with each particular component. This failure was due to the strong absorbance of H2O in comparison to the CO and CO2 gas species. The strong contrast between the components is highlighted in the dynamic range values found in the previous PCR regression coefficient analysis, where CO had values of -4.6 to 6.4 (dynamic range of 11.0), CO2 had values of -1.9 to 1.4 (dynamic range of 3.4), and H2O had values of -10.4 to 204.7 (dynamic range of 215.1). With this large contrast between the three species, PCA can only identify that there are three main components in the mixed gas environment but cannot quantify the amount of each. This data set highlights a shortcoming of the PCC analysis technique in multi-component analysis of mixtures that contain both strongly IR absorbing species and relatively weakly IR absorbing species. The PCR technique, however, was able to quantitatively determine, with relatively low RMSE values, the amounts of CO, CO2, and H2O in the three-component gas mixture, as shown in Table 6-15.
Table 6-15: Errors Associated with the Principal Component Regression (PCR) Analysis Technique; CO/CO2/H2O Experimental Gas Mixtures

Analysis Technique    RMSE Calibration (ppm)    RMSE Prediction (ppm)
CO: PCR               14                        18
CO2: PCR              2                         4
H2O: PCR              157                       519

In an attempt to overcome this shortcoming of the PCC analysis technique with mixtures that contain both strongly and relatively weakly IR absorbing species, PCA was conducted on the experimentally collected IR spectra from 2500 to 2075 cm-1, which removes the IR spectrum of the H2O component. This truncated the analysis to a set of 8 samples (n = 8) at 425 different wavenumbers (p = 425), with concentrations given in Table 6-13, for the calibration data set [XC](8 x 425). A set of 4 samples with concentrations given in Table 6-14 was used for the prediction data set [XP](4 x 425) for PCR. For the PCC, the calibration and prediction data sets were combined into a single data set [X](12 x 425), and PCA was performed simultaneously on all 12 samples.

From the plot of eigenvalues as a function of principal components, the number of significant eigenvalues obtained from PCA is determined to be four (Figure 6-54). The first principal component explains 73.4% of the total calibration data set variance, the second explains 12.6%, the third explains 6.2%, and the fourth explains 4.5%. With these four principal components, 96.7% of the original data variance can be explained.

Figure 6-54: SCREE Plot Indicating Eigenvalues for Each Principal Component for CO/CO2/H2O Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 73.4%, 12.6%, and 6.2% of the Total Calibration Data Set Variance

As shown in Figure 6-55, including more than two principal components does not significantly improve the RMSE for calibration and prediction.
In the case of RMSE for prediction, the inclusion of more than the expected number of principal components actually makes the results worse as more principal components are used in the PCR analysis. This indicates that even though four principal components are necessary to represent at least 95% of the original calibration data set variance, no additional quantitative benefit is provided by including more than the two known components within the CO/CO2 mixed gas system. In addition, the RMSE for calibration and prediction are significantly worse when PCA is performed on the three-component system where the spectral data is truncated to include only contributions from two components.

Figure 6-55: CO RMSE Calibration and RMSE Prediction as a Function of the Number of Principal Components Used to Represent the Original Data Set in CO/CO2/H2O Gas Mixtures with the IR Spectral Data Truncated to Include Only Contributions from CO and CO2 (2500 cm-1 to 2075 cm-1) for PCA

PCA on the entire set of truncated experimental spectra was not capable of isolating the peaks associated with each particular component, as was the case with PCA on the entire set of collected experimental data. This failure indicates that, due to the strong absorbance of the H2O in comparison to the CO and CO2 gas species within each FTIR scan, the data analysis for wavenumbers containing information for only CO and CO2 is still adversely affected. Because of this, to determine quantitatively the amount of CO and CO2 gas species within the system, FTIR scans avoiding the H2O absorbance range are necessary, as was shown in the CO/CO2 Experimental Data Set where the wavenumber range was 2500-2000 cm-1. The analysis covering the wavenumber range of 2500-1500 cm-1 is still beneficial, though, as it identifies that only three components are present at significant amounts within the spectra, as was discussed in relation to Figure 6-62.
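The dependence of the calibration and prediction RMSE on the number of retained principal components, discussed above, can be illustrated with a minimal sketch. It assumes mean-centered spectra, PCA by singular value decomposition, and ordinary least squares on the scores. The function names and the synthetic, noise-free two-band spectra are hypothetical illustrations and are not the software or data used in this work.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                    # mean-center each wavenumber
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values
    return s**2 / np.sum(s**2)                 # normalized eigenvalues

def pcr_predict(X_cal, y_cal, X_new, k):
    """Principal component regression using the first k components."""
    x_mean, y_mean = X_cal.mean(axis=0), y_cal.mean()
    _, _, Vt = np.linalg.svd(X_cal - x_mean, full_matrices=False)
    V = Vt[:k].T                               # loadings, shape (p, k)
    T = (X_cal - x_mean) @ V                   # calibration scores, (n, k)
    b, *_ = np.linalg.lstsq(T, y_cal - y_mean, rcond=None)
    return ((X_new - x_mean) @ V) @ b + y_mean

def rmse(y_true, y_pred):
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

def rmse_vs_components(X_cal, y_cal, X_pred, y_pred, k_max):
    """(k, RMSE calibration, RMSE prediction) for k = 1..k_max."""
    return [(k,
             rmse(y_cal, pcr_predict(X_cal, y_cal, X_cal, k)),
             rmse(y_pred, pcr_predict(X_cal, y_cal, X_pred, k)))
            for k in range(1, k_max + 1)]

# Hypothetical noise-free two-component mixture (illustration only).
rng = np.random.default_rng(0)
axis = np.linspace(2500.0, 2000.0, 501)
pure = np.vstack([np.exp(-((axis - 2150.0) / 30.0) ** 2),
                  np.exp(-((axis - 2350.0) / 20.0) ** 2)])
C_cal = rng.uniform(0, 100, (8, 2))            # 8 calibration samples
C_pred = rng.uniform(0, 100, (4, 2))           # 4 prediction samples
X_cal, X_pred = C_cal @ pure, C_pred @ pure
table = rmse_vs_components(X_cal, C_cal[:, 0], X_pred, C_pred[:, 0], 3)
```

For such ideal data, two components reproduce the concentrations essentially exactly, and additional components add nothing. With measured spectra, the extra components mainly fit noise or interfering absorbers, which is one way the prediction RMSE can grow while the calibration RMSE keeps shrinking.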
Chapter 7: PCA Application to FTIR Spectroscopy Data of Potential Bleed Air Contaminants within the Aircraft Cabin

Discussion of Analyzed Data Sets

Data set 1 was comprised of experimentally obtained IR spectra from four different engine oils used in commercial aircraft engines. The oils analyzed were BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560. These engine oil samples were heated in a thermogravimetric analysis (TGA) chamber (Thermo-Microbalance TG 209 F1 Iris®) with a programmed heat treatment of 20 °C per minute ramp rate until evaporation of the oil was completed. Coupled with the TGA via heated transfer lines were an FTIR spectrometer (Bruker FTIR Tensor™ 27) and a mass spectrometry (MS) analyzer (QMS 403 C Aëolos®). The FTIR scans were recorded at the temperature of greatest mass loss, which for the BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 306 °C (583 °F), 301 °C (574 °F), 306 °C, and 326 °C (619 °F), respectively. The IR spectra for data set 1 were obtained using the FTIR spectrometer over a wavenumber scan range from 4000 cm-1 to 600 cm-1 with a spectral resolution of 2 cm-1. PCA was performed on data set 1 without PCR or PCC to determine the number of components most likely present in the evolved gas species from the heated engine oil samples. This result was then compared to the MS data for verification, and the MS data was used to determine the identity of the components.

Data set 2 was comprised of a simulated calibration data set with 10 spectra containing various concentrations of methanol (CH4O), formaldehyde (CH2O), and CO2. These gas components were identified in the MS analysis of data set 1, in addition to PCA identifying three significant components present in data set 1. The prediction data set for data set 2 consisted of the experimental engine oil samples from data set 1.
The purpose of this was to determine not only the components present but also to quantify the concentrations of each of the identified gases through PCR. With the concentrations of the individual gases determined from PCR for the engine oil samples, predicted spectra were calculated and compared to the original FTIR scan data over the wavenumber range of 4000 cm-1 to 600 cm-1 at a resolution of 2 cm-1. The RMSE between the predicted spectra and actual spectra of the engine oils after completion of PCA with PCR was computed based on absorbance values at each wavenumber. The experimental data used in data sets 1 and 2 were collected by Netzsch Instruments, Inc. (Burlington, MA, USA).

Data set 3 consisted of time-evolved room temperature (23-24 °C / 74-75 °F) gas analysis of heated BP Turbo Oil 2380. The 1 mL sample of engine oil was heated in a cylindrical furnace with a programmed heat treatment of 10 °C per minute ramp rate until the hold temperature was reached, which in this case was an oil temperature of approximately 275 °C (527 °F). The engine oil heating system was operated by Amanda Neer (M.S. Graduate Student, Materials Engineering, Auburn University). The specific details of the system can be found in her Master's thesis. The gas was monitored with a temperature sensor and allowed to cool to room temperature as it was transferred, with the use of a vacuum pump, to the Spectrum GX FTIR with the variable pathlength long path gas cell discussed in Chapter 2. The FTIR scans were recorded every 5 minutes while the engine oil was held at 275 °C for 1 hour before the heating furnace was shut off and the sample allowed to return to room temperature. The IR spectra for data set 3 were obtained over a wavenumber scan range from 4000 cm-1 to 600 cm-1 with a spectral resolution of 2 cm-1. The pathlength of the gas cell was 2.24 m.

PCA Results and Discussions

Data Set 1 -
Engine Oil Samples at Temperatures of Greatest Mass Loss

From the experimental FTIR spectral analysis of the engine oils at the temperature of greatest mass loss, a set of 4 samples (n = 4) at 1763 different wavenumbers (p = 1763) was collected, as shown in Figure 7-1, for PCA. As shown in Figure 7-2, the number of significant eigenvalues obtained from PCA is three. The first principal component explains 76.1% of the total data set variance, the second explains 21.6%, and the third explains 2.3%. With these three principal components, 100.0% of the original data variance can be explained.

Figure 7-1: FTIR Scans of Engine Oil Samples at Temperatures of Greatest Mass Loss used for PCA; BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 at 306 °C (583 °F), 301 °C (574 °F), 306 °C, and 326 °C (619 °F), respectively

Figure 7-2: SCREE Plot Indicating Eigenvalues for Each Principal Component for Engine Oil Data Set 1; Principal Components 1, 2, and 3 Explain 76.1%, 21.6%, and 2.3% of the Total Engine Oil Data Set 1 Variance

Figure 7-3 highlights the MS data for the BP Turbo Oil 274 at 301 °C (574 °F) and Mobile Jet Oil II at 306 °C (583 °F). This MS data corresponds to characteristic MS data for CH4O, CH2O, and CO2, whose database spectra are shown in Figure 7-4. As is evident from the relative intensities of the two oils analyzed in Figure 7-3, the amount of CH2O is significantly higher than that of CO2. In addition, the peak in Figure 7-3 that corresponds to CH4O at atomic mass unit (amu) 32 is not included because its intensity is so high that the relative intensities for the other components would become indistinguishable. This indicates that a majority of the evolved gas at the temperature of highest mass loss for the engine oils is due to CH4O.
The analysis could have been possible with the MS data, but a comparison of the FTIR plot would have to have been made to numerous alcohol and aldehyde based molecules to determine the exact gas species present.

Figure 7-3: Mass Spectrometry (MS) Data of Engine Oil Samples at Temperatures of Greatest Mass Loss; BP Turbo Oil 274 and Mobile Jet Oil II at 301 °C (574 °F) and 306 °C (583 °F), respectively

Figure 7-4: Mass Spectrometry (MS) Data Database Files for Formaldehyde (CH2O), Methanol (CH4O), and Carbon Dioxide (CO2)

Data Set 2 - Simulated CH4O/CH2O/CO2 Gas Mixtures & Engine Oil Samples at Temperatures of Greatest Mass Loss

From the simulated FTIR spectral analysis of the pure CH4O/CH2O/CO2 gas mixtures shown in Figure 7-5, a set of 10 samples (n = 10) at 1,763 different wavenumbers (p = 1,763) with concentrations given in Table 7-1 was calculated, as shown in Figure 7-6, to produce a calibration data set [XC](10 x 1763). The experimental data used in data set 1 (Figure 7-1) was used as the prediction data set [XP](4 x 1763) for PCR.

Figure 7-5: Formaldehyde (CH2O), Methanol (CH4O), and Carbon Dioxide (CO2) Pure Spectra for Simulated Data Set Illustrating Spectral Overlap Between CH2O and CH4O

Table 7-1: Calibration Data Set 2 Composed of Simulated Spectra of Various Concentrations of Methanol (CH4O), Formaldehyde (CH2O), and Carbon Dioxide (CO2)

Sample  CH4O Concentration (x100 ppm)  CH2O Concentration (x100 ppm)  CO2 Concentration (x100 ppm)
1       0.0                            3.0                            2.0
2       0.5                            2.0                            4.0
3       1.0                            0.0                            3.5
4       2.0                            6.0                            5.5
5       3.5                            7.0                            0.0
6       5.0                            1.5                            0.5
7       6.5                            3.5                            4.5
8       8.0                            4.5                            2.5
9       9.0                            0.5                            3.0
10      10.0                           5.0                            1.0

Figure 7-6: CH4O/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 1763); Note - Only 3 of 10 Calibration Spectra Shown

As shown in Figure 7-7, the number of significant eigenvalues obtained from PCA is three.
The first principal component explains 86.8% of the total calibration data set variance, the second explains 8.6%, and the third explains 4.6%. With these three principal components, 100.0% of the original data variance can be explained.

Figure 7-7: SCREE Plot Indicating Eigenvalues for Each Principal Component for CH4O/CH2O/CO2 Gas Mixtures Data Set; Principal Components 1, 2, and 3 Explain 86.8%, 8.6%, and 4.6%, respectively, of the Total Calibration Data Set Variance

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data sets, as shown in Figures 7-8, 7-9, and 7-10 for CH4O, CH2O, and CO2, respectively. The RMSE values for the calibration data set with regard to the concentrations of CH4O, CH2O, and CO2 are found to be 0 ppm (simulated data sets). The predicted CH4O concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 206, 221, 214, and 292 ppm, respectively. The predicted CH2O concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 145, 190, 142, and 263 ppm, respectively. The predicted CO2 concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 30, 49, 20, and 63 ppm, respectively.

Figure 7-8: Principal Component Regression (PCR) - CH4O Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 206, 221, 214, and 292 ppm, respectively

Figure 7-9: Principal Component Regression (PCR) -
CH2O Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 145, 190, 142, and 263 ppm, respectively

Figure 7-10: Principal Component Regression (PCR) - CO2 Concentrations in CH4O/CH2O/CO2 Gas Mixtures; RMSE Calibration = 0 ppm; Predicted concentrations for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560 were 30, 49, 20, and 63 ppm, respectively

After using PCR to determine concentrations of CH4O, CH2O, and CO2 for the four engine oil samples (Table 7-2), a reconstructed prediction spectrum can be calculated and compared to the experimentally obtained FTIR data as shown in Figures 7-11 through 7-14 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560, respectively. The RMSE between the predicted and actual spectra for each of the engine oils is shown in Table 7-3. The calculation of RMSE with this method is necessary because the actual amounts of the gas components are unknown.
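The comparison between a reconstructed spectrum and a measured FTIR scan described above can be sketched as follows. The sketch assumes that the reconstructed spectrum is a concentration-weighted sum of pure-component spectra and that the percentage RMSE is normalized by the peak measured absorbance. The text does not state the normalization used, so that choice, the function names, and the toy arrays are all hypothetical.

```python
import numpy as np

def reconstruct_spectrum(concentrations_ppm, pure_spectra_per_ppm):
    """Concentration-weighted sum of pure-component spectra (linear mixing).

    concentrations_ppm: shape (m,); pure_spectra_per_ppm: shape (m, p).
    """
    return np.asarray(concentrations_ppm) @ np.asarray(pure_spectra_per_ppm)

def spectral_rmse_percent(measured, predicted):
    """RMSE over all wavenumbers as a percentage of the peak measured absorbance.

    The normalization by the peak absorbance is an assumption of this sketch.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = np.sqrt(np.mean((measured - predicted) ** 2))
    return float(100.0 * err / np.max(np.abs(measured)))

# Toy pure spectra on a 4-point axis (hypothetical values, absorbance per ppm).
pure = np.array([[1e-3, 0.0, 0.0, 0.0],
                 [0.0, 1e-3, 0.0, 0.0],
                 [0.0, 0.0, 1e-3, 0.0]])
# PCR concentrations for BP Turbo Oil 2380 from Table 7-2 (CH4O, CH2O, CO2).
predicted = reconstruct_spectrum([206, 145, 30], pure)

# Deterministic check of the error metric: one point off by 2 out of 4 points.
toy_error = spectral_rmse_percent([0.0, 2.0, 0.0, 0.0], np.zeros(4))
print(toy_error)  # 50.0
```

Because the true gas concentrations in the engine oil samples are unknown, an error measured in absorbance space like this one is the only available check on the PCR result.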
Table 7-2: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560

Sample                     CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
BP Turbo Oil 2380          206                       145                       30
BP Turbo Oil 274           221                       190                       49
Mobile Jet Oil II          214                       142                       20
Aeroshell Turbine Oil 560  292                       263                       63

Figure 7-11: Predicted Spectra for BP Turbo Oil 2380 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.4%

Figure 7-12: Predicted Spectra for BP Turbo Oil 274 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.1%

Figure 7-13: Predicted Spectra for Mobile Jet Oil II based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7%

Figure 7-14: Predicted Spectra for Aeroshell Turbine Oil 560 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.9%

Table 7-3: RMSE for Predicted Spectra based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560

Sample                     RMSE (%)
BP Turbo Oil 2380          2.4
BP Turbo Oil 274           3.1
Mobile Jet Oil II          2.7
Aeroshell Turbine Oil 560  3.9

From the predicted spectra, it was noted that the peaks associated with methanol did not line up properly at the expected wavenumbers, particularly the wavenumbers associated with the C-O stretching (1150-1050 cm-1) and the O-H stretching (3700-3500 cm-1). This suggests that there are disturbances in the hydrogen bonding present within CH4O due to the clustering of CH4O molecules. This is similar to observed and theoretically predicted shifts reported in literature, where the peaks associated with the C-O stretching shift towards higher wavenumbers (blue shifts, i.e. shifts to higher frequency) and the peaks associated with the O-H stretching shift towards lower wavenumbers (red shifts, i.e. shifts to lower frequency) [51-54].
The blue shift in the C-O stretching peak is due to a shortening of the bond, while the red shift in the O-H stretching is due to the elongation of the bond; both occur due to the mutual interactions within the hydroxyl group via the hydrogen bonded CH4O clusters [55]. This mutual interaction within the hydroxyl group is illustrated in Figure 7-15. Wavenumber shifts are not observed in the CH2O or CO2 spectrum components and are not to be expected since these molecules do not have hydrogen bonding present.

Figure 7-15: Illustration of Mutual Interactions within the Hydroxyl Group via Hydrogen Bonded Methanol (CH4O) Clusters that Leads to O-H Bond Lengthening and C-O Bond Shortening, Explaining the Observed Red and Blue Wavenumber Shifts, Respectively, within the CH4O IR Spectra Component of the Engine Oil at High Temperature

An additional method to describe the observed shifts in characteristic wavenumber for the C-O and O-H stretching modes is to recognize that molecular vibrations can be treated utilizing Newtonian mechanics. In this case, each vibration or stretching mode corresponds to a spring with a spring or force constant, k, as defined in Equation 7.1 (Hooke's Law), which can then be related to the effective (reduced) mass of the bonded atoms, \mu (defined in Equation 7.2), and the acceleration using Newton's 2nd Law of Motion (Equation 7.3). In Equation 7.2, m_1 and m_2 correspond to the masses of the bonded atoms.

\[ F = -kx \quad \text{(Hooke's Law)} \tag{7.1} \]

\[ \mu = \frac{m_1 m_2}{m_1 + m_2} \tag{7.2} \]

\[ \mu \frac{d^2 x}{dt^2} = -kx \tag{7.3} \]

Rearranging Equation 7.3 yields the homogeneous linear differential equation of 2nd order shown in Equation 7.4, which has a standard general solution in which the force constant can be related to the vibrational mode characteristic wavenumber, v (cm-1), through Equation 7.5, where c is the speed of light in cm/s.

\[ \mu \frac{d^2 x}{dt^2} + kx = 0 \tag{7.4} \]

\[ v = \frac{1}{2 \pi c} \sqrt{\frac{k}{\mu}} \tag{7.5} \]

Solving Equation 7.5 for the force constant gives $k = \mu (2 \pi c v)^2$, which is the form used for the values below. With the observed shifts, a new calibration data set for the PCA can be created that has the CH4O modified as shown in Figure 7-16, with the C-O stretch blue shifted by approximately 121 cm-1 and the O-H stretch red shifted by approximately 121 cm-1. Using Equation 7.5, the room temperature C-O stretch with a peak located at ~1033 cm-1 corresponds to kC-O = 4.3 N/cm, while the shifted peak located at ~1154 cm-1 corresponds to kC-O = 5.4 N/cm. Again using Equation 7.5, the room temperature O-H stretch with a peak located at ~3682 cm-1 corresponds to kO-H = 7.6 N/cm, while the shifted peak located at ~3561 cm-1 corresponds to kO-H = 7.1 N/cm. These force constant values and their respective changes are consistent with those reported in literature for hydrogen bonding associated with methanol [51-55]. The actual wavenumber shifts as a function of temperature can only be estimated through inspection at this time, but as detailed in the future work section found in Chapter 9, more experiments could be undertaken to better quantify these shifts.

Figure 7-16: Comparison of Actual Methanol (CH4O) Spectra to Modified CH4O Spectra with the O-H Stretching Bands Red Shifted and the C-O Stretching Bands Blue Shifted Due to High Temperature Disturbance of Hydrogen Bonds

From the simulated FTIR spectral analysis of the pure CH4O (modified)/CH2O/CO2 gas mixtures, a set of 10 samples (n = 10) at 1,763 different wavenumbers (p = 1,763) with the same concentrations given in Table 7-1 was calculated, as shown in Figure 7-17, to produce a calibration data set [XC](10 x 1763). The experimental data used in data set 1 (Figure 7-1) was used as the prediction data set [XP](4 x 1763) for PCR.

Figure 7-17: CH4O (Modified)/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 1763); Note -
Only 3 of 10 Calibration Spectra Shown

After using PCA with PCR to determine concentrations of CH4O, CH2O, and CO2 with the modified methanol calibration spectra for the four engine oil samples (Table 7-4), a reconstructed prediction spectrum can be calculated. This predicted spectrum is then compared to the experimentally obtained FTIR data as shown in Figures 7-18 through 7-21 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560, respectively. The RMSE between the predicted and actual spectra for the standard CH4O and modified CH4O spectra for each of the engine oils is shown in Table 7-5.

Table 7-4: PCR Calculated Concentrations of CH4O (Modified), CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560

Sample                     CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
BP Turbo Oil 2380          350                       127                       30
BP Turbo Oil 274           417                       166                       49
Mobile Jet Oil II          407                       118                       20
Aeroshell Turbine Oil 560  511                       235                       63

Figure 7-18: Modified Predicted Spectra for BP Turbo Oil 2380 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.1%

Figure 7-19: Modified Predicted Spectra for BP Turbo Oil 274 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7%

Figure 7-20: Modified Predicted Spectra for Mobile Jet Oil II based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.2%

Figure 7-21: Modified Predicted Spectra for Aeroshell Turbine Oil 560 based on PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.4%

Table 7-5: Comparison of RMSE using the Standard CH4O Spectra and the Modified CH4O Spectra for Predicted Spectra based on PCR Calculated Concentrations

Sample                     RMSE (%) Std. CH4O  RMSE (%) Mod. CH4O
BP Turbo Oil 2380          2.4                 2.1
BP Turbo Oil 274           3.1                 2.7
Mobile Jet Oil II          2.7                 2.2
Aeroshell Turbine Oil 560  3.9                 3.4

In an attempt to reduce further the RMSE values of the predicted spectra, PCA along with PCR was conducted on truncated simulated and experimentally collected IR spectra data from 3200 to 1600 cm-1. This removes the IR spectra due to the shifted C-O and O-H stretching bands of the CH4O component. Within the 3200 to 1600 cm-1 wavenumber range, IR spectra for all three components are still present. The truncated analysis consisted of a set of 10 samples (n = 10) at 830 different wavenumbers (p = 830) with the same concentrations given in Table 7-1, shown in Figure 7-22, to produce a calibration data set [XC](10 x 830). The experimental data used in data set 1 (Figure 7-1) was truncated and used as the prediction data set [XP](4 x 830) for PCR.

Figure 7-22: Truncated (3200-1600 cm-1) CH4O/CH2O/CO2 Gas Mixtures Calibration Data Set, [XC](10 x 830); Note - Only 3 of 10 Calibration Spectra Shown

After using PCA with PCR to determine concentrations of CH4O, CH2O, and CO2 with the truncated calibration and prediction data sets, shown in Table 7-6, a reconstructed prediction spectrum is calculated. This predicted spectrum is then compared to the experimentally obtained FTIR data as shown in Figures 7-23 through 7-26 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560, respectively.

Table 7-6: Truncated (3200-1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380, BP Turbo Oil 274, Mobile Jet Oil II, and Aeroshell Turbine Oil 560

Sample                     CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
BP Turbo Oil 2380          437                       111                       31
BP Turbo Oil 274           378                       159                       50
Mobile Jet Oil II          468                       106                       21
Aeroshell Turbine Oil 560  567                       221                       65

Figure 7-23: Modified Predicted Spectra for BP Turbo Oil 2380 based on Truncated (3200 -
1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.1%

Figure 7-24: Modified Predicted Spectra for BP Turbo Oil 274 based on Truncated (3200-1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.7%

Figure 7-25: Modified Predicted Spectra for Mobile Jet Oil II based on Truncated (3200-1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 2.3%

Figure 7-26: Modified Predicted Spectra for Aeroshell Turbine Oil 560 based on Truncated (3200-1600 cm-1) PCR Calculated Concentrations of CH4O, CH2O, and CO2 Gas Mixtures; RMSE = 3.4%

A comparison of the RMSE between the predicted and actual spectra using each analysis method for each of the engine oils is shown in Table 7-7. A summary of the calculated concentrations of each of the three components is shown in Tables 7-8 through 7-11 for each engine oil sample and each method of analysis.

Table 7-7: Comparison of RMSE using the Standard CH4O Spectra, the Modified CH4O Spectra, and the Truncated Spectra for Predicted Spectra based on PCR Calculated Concentrations

Sample                     RMSE (%) Std. CH4O  RMSE (%) Mod. CH4O  RMSE (%) Truncated
BP Turbo Oil 2380          2.4                 2.1                 2.1
BP Turbo Oil 274           3.1                 2.7                 2.7
Mobile Jet Oil II          2.7                 2.2                 2.3
Aeroshell Turbine Oil 560  3.9                 3.4                 3.4

Table 7-8: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 2380

Sample                 CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
Standard CH4O Spectra  206                       145                       30
Modified CH4O Spectra  350                       127                       30
Truncated Spectra      437                       111                       31

Table 7-9: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for BP Turbo Oil 274

Sample                 CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
Standard CH4O Spectra  221                       190                       49
Modified CH4O Spectra  417                       166                       49
Truncated Spectra      378                       159                       50

Table 7-10: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for Mobile Jet Oil II

Sample                 CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
Standard CH4O Spectra  214                       142                       20
Modified CH4O Spectra  407                       118                       20
Truncated Spectra      468                       106                       21

Table 7-11: PCR Calculated Concentrations of CH4O, CH2O, and CO2 for Aeroshell Turbine Oil 560

Sample                 CH4O Concentration (ppm)  CH2O Concentration (ppm)  CO2 Concentration (ppm)
Standard CH4O Spectra  292                       263                       63
Modified CH4O Spectra  511                       235                       63
Truncated Spectra      567                       221                       65

Performing PCA with the modified CH4O spectra in the calibration data set improves the RMSE for the predicted engine oil samples, in addition to calculating a higher concentration of CH4O in each of the engine oil samples. The higher concentration is due to a better alignment of the peaks that shift in the CH4O component. With more of the spectra of the engine oil samples being attributed to the CH4O component, the calculated concentrations for the CH2O component slightly decrease. When the analysis is performed with a truncated calibration data set, the CH4O concentrations increase further for three of the four oils (BP Turbo Oil 274 is the exception, decreasing from 417 to 378 ppm) and the CH2O concentrations are further reduced for all four.
The concentration values calculated with the truncated method have the highest probability of best representing the actual values since they do not analyze portions of the spectra that contain peak-shifted areas. An additional source of error for concentration calculations in the prediction data sets can be attributed to the broadening of the peaks due to the increase in the gas temperature. The calculated amount of CO2 for each of the three cases, standard CH4O, modified CH4O, and truncated spectrum, is essentially unaffected (changing by at most 2 ppm) because the CO2 IR spectra do not overlap with either the CH4O or the CH2O spectra. This analysis indicates that PCA on simulated calibration data sets is capable of identifying and quantifying gas species within experimental and unknown prediction data sets.

Data Set 3 - Simulated CH4O/CH2O/CO2/CO/H2O Gas Mixtures & BP Turbo Oil 2380 Engine Oil Time-Evolved Samples

From the simulated FTIR spectral analysis of the pure CH4O/CH2O/CO2/CO/H2O gas mixtures shown in Figure 7-27, a set of 10 samples (n = 10) at 1,763 different wavenumbers (p = 1,763) with concentrations given in Table 7-12 was calculated to produce a calibration data set [XC](10 x 1763). The experimental data consisting of time-evolved IR spectra of the heated BP Turbo Oil 2380 engine oil (Figure 7-28) was used as the prediction data set [XP](20 x 1763) for PCR.
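The construction of a simulated calibration data set such as [XC](10 x 1763) can be sketched as follows, assuming that mixture absorbance is a linear, additive function of concentration (Beer-Lambert behavior). The concentrations are those of Table 7-12. The single Gaussian bands standing in for the pure spectra, their heights and widths, and the evenly spaced wavenumber axis are hypothetical placeholders rather than the database spectra used in this work.

```python
import numpy as np

# Table 7-12; columns: CH4O, CH2O, CO2, CO (x100 ppm) and H2O (x1000 ppm).
table_7_12 = np.array([
    [0.0, 3.0, 2.0, 4.5, 5.0],
    [0.5, 2.0, 4.0, 6.0, 6.5],
    [1.0, 0.0, 3.5, 0.5, 1.0],
    [2.0, 6.0, 5.5, 0.0, 3.5],
    [3.5, 7.0, 0.0, 1.5, 2.0],
    [5.0, 1.5, 0.5, 0.6, 3.0],
    [6.5, 3.5, 4.5, 2.0, 0.0],
    [8.0, 4.5, 2.5, 3.0, 0.5],
    [9.0, 0.5, 3.0, 0.2, 4.5],
    [10.0, 5.0, 1.0, 5.0, 6.0],
])
unit_scale = np.array([100.0, 100.0, 100.0, 100.0, 1000.0])
C_ppm = table_7_12 * unit_scale              # concentrations in ppm, (10, 5)

# Assumed axis: 1763 evenly spaced points from 4000 to 600 cm-1.
wavenumber = np.linspace(4000.0, 600.0, 1763)

def band(center, width, height):
    """Placeholder Gaussian band standing in for a pure spectrum (per ppm)."""
    return height * np.exp(-((wavenumber - center) / width) ** 2)

# One illustrative band per species; real pure spectra contain many bands.
pure_per_ppm = np.vstack([
    band(1033.0, 40.0, 1e-4),    # CH4O
    band(1746.0, 40.0, 1e-4),    # CH2O
    band(2349.0, 30.0, 1e-4),    # CO2
    band(2143.0, 40.0, 1e-5),    # CO
    band(1595.0, 150.0, 1e-5),   # H2O
])

# Linear mixing: each calibration spectrum is a weighted sum of pure spectra.
X_cal = C_ppm @ pure_per_ppm                 # shape (10, 1763)
```

Because noise-free spectra built this way lie exactly in the space spanned by the pure spectra, a regression on them can reproduce the calibration concentrations exactly, which is consistent with the calibration RMSE of 0 ppm reported for the simulated data sets.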
Figure 7-27: Methanol (CH4O), Formaldehyde (CH2O), Carbon Dioxide (CO2), Carbon Monoxide (CO), and Water (H2O) Pure Spectra for Simulated Data Set Illustrating Spectral Overlap Between CH4O, CH2O, and H2O

Table 7-12: Calibration Data Set 3 Composed of Simulated Spectra of Various Concentrations of Methanol (CH4O), Formaldehyde (CH2O), Carbon Dioxide (CO2), Carbon Monoxide (CO), and Water (H2O)

Sample  CH4O Concentration (x100 ppm)  CH2O Concentration (x100 ppm)  CO2 Concentration (x100 ppm)  CO Concentration (x100 ppm)  H2O Concentration (x1000 ppm)
1       0.0                            3.0                            2.0                           4.5                          5.0
2       0.5                            2.0                            4.0                           6.0                          6.5
3       1.0                            0.0                            3.5                           0.5                          1.0
4       2.0                            6.0                            5.5                           0.0                          3.5
5       3.5                            7.0                            0.0                           1.5                          2.0
6       5.0                            1.5                            0.5                           0.6                          3.0
7       6.5                            3.5                            4.5                           2.0                          0.0
8       8.0                            4.5                            2.5                           3.0                          0.5
9       9.0                            0.5                            3.0                           0.2                          4.5
10      10.0                           5.0                            1.0                           5.0                          6.0

Figure 7-28: BP Turbo Oil 2380 Time Evolved Spectra Used for Prediction Data Set, [XP](20 x 1763); Note - Only 2 of 20 Prediction Spectra Shown, Time 5 minutes and 90 minutes

Applying PCR, a plot can be created that shows the calibration data set used to determine the concentrations in the prediction data sets. The RMSE values for the calibration data set with regard to the concentrations of CH4O, CH2O, CO2, CO, and H2O are found to be 0 ppm (simulated data sets). Figure 7-29 shows the predicted CH4O, CH2O, and CO concentrations for the time evolved BP Turbo Oil 2380, while Figures 7-30 and 7-31 show the predicted CO2 and H2O time evolved concentrations, respectively. Noted on the figures are the time at which the heater reached the set point (Time = 25 min.) and the time at which the heater was turned off (Time = 90 min.).

Figure 7-29: Principal Component Regression (PCR) Calculated Gas Concentrations for CH4O, CH2O, and CO of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off

Figure 7-30: Principal Component Regression (PCR) Calculated Gas Concentrations for CO2 of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min.
Heater Reached Set Point, Time = 90 min. Heater Turned Off

Figure 7-31: Principal Component Regression (PCR) Calculated Gas Concentrations for H2O of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off

After using PCR to determine concentrations for CH4O, CH2O, CO2, CO, and H2O of the BP Turbo Oil 2380 time evolved samples, reconstructed prediction spectra are calculated. These reconstructed spectra are then compared to the experimentally obtained FTIR data as shown in Figures 7-32 through 7-35 for BP Turbo Oil 2380 at Time = 10 min., 30 min., 60 min., and 90 min. The RMSE values between the predicted and actual spectra for each of the time-evolved spectra are calculated to be 2.7%, 8.4%, 9.2%, and 9.0% for 10 min., 30 min., 60 min., and 90 min., respectively. The average RMSE between the predicted and actual spectra for all 20 of the time-evolved spectra is found to be 7.5%.

Figure 7-32: Predicted Spectra for BP Turbo Oil 2380 at Time = 10 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 2.7%

Figure 7-33: Predicted Spectra for BP Turbo Oil 2380 at Time = 30 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 8.4%

Figure 7-34: Predicted Spectra for BP Turbo Oil 2380 at Time = 60 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 9.2%

Figure 7-35: Predicted Spectra for BP Turbo Oil 2380 at Time = 90 min. based on PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 9.0%

In an attempt to reduce the RMSE values of the predicted spectra, PCA along with PCR was conducted on truncated simulated and experimentally collected IR spectra data from 4000 to 2000 cm-1. This removes the region in the IR spectra that has considerable overlap between the CH2O and H2O components.
Within the 4000 to 2000 cm-1 wavenumber range, characteristics of IR spectra for all five components are still present. The truncated analysis consisted of a set of 10 samples (n = 10) at 1037 different wavenumbers (p = 1037) with the same concentrations given in Table 7-12, to produce a calibration data set [XC](10 x 1037). The BP Turbo Oil 2380 time-evolved spectra data was truncated and used as the prediction data set [XP](20 x 1037) for PCR. Figure 7-36 shows the predicted CH4O, CH2O, and CO concentrations from the truncated analysis for the time evolved BP Turbo Oil 2380, while Figure 7-37 shows the predicted H2O time evolved concentrations from the truncated analysis. The predicted CO2 time evolved concentrations from the truncated analysis do not change significantly from Figure 7-30.

Figure 7-36: Principal Component Regression (PCR) Calculated Gas Concentrations for CH4O, CH2O, and CO of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off

Figure 7-37: Principal Component Regression (PCR) Calculated Gas Concentrations for H2O of BP Turbo Oil 2380 Time Evolved Spectra; Time = 25 min. Heater Reached Set Point, Time = 90 min. Heater Turned Off

After using PCR to determine concentrations for CH4O, CH2O, CO2, CO, and H2O of the BP Turbo Oil 2380 time evolved samples, reconstructed prediction spectra are calculated. These reconstructed spectra are then compared to the experimentally obtained FTIR data as shown in Figures 7-38 through 7-41 for BP Turbo Oil 2380 at Time = 10 min., 30 min., 60 min., and 90 min. The RMSE values between the predicted and actual spectra for each of the time-evolved spectra are calculated to be 2.2%, 7.2%, 7.8%, and 7.5% for 10 min., 30 min., 60 min., and 90 min., respectively. The average RMSE between the predicted and actual spectra for all 20 of the time-evolved spectra is found to be 6.4%.
A comparison of the RMSE between the predicted and actual spectra using the full and truncated spectral analysis methods for each of the time-evolved samples is shown in Table 7-13.

Figure 7-38: Predicted Spectra for BP Turbo Oil 2380 at Time = 10 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 2.2%

Figure 7-39: Predicted Spectra for BP Turbo Oil 2380 at Time = 30 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.2%

Figure 7-40: Predicted Spectra for BP Turbo Oil 2380 at Time = 60 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.8%

Figure 7-41: Predicted Spectra for BP Turbo Oil 2380 at Time = 90 min. based on Truncated (4000-2000 cm-1) PCR Calculated Concentrations of CH4O, CH2O, CO2, CO, and H2O Gas Mixtures; RMSE = 7.5%

Table 7-13: Comparison of RMSE using the Full and the Truncated Spectra, for the PCR Calculated Concentrations

Sample                     RMSE (%) Full Spectra    RMSE (%) Truncated
Time = 10 min.             2.7                      2.2
Time = 30 min.             8.4                      7.2
Time = 60 min.             9.2                      7.8
Time = 90 min.             9.0                      7.5
Average over 20 Samples    7.5                      6.4

Performing PCA with the truncated spectra in the calibration data set improves the RMSE (reduced from an average of 7.5% to 6.4%) for the predicted time-evolved BP Turbo Oil 2380 samples and also yields a higher calculated concentration of CH4O in each of the engine oil samples. The higher concentration is due to a more accurate calculated concentration of the CH2O component when the region of significant overlap between H2O and CH2O is not analyzed. With more of the spectra of the engine oil samples being attributed to the H2O component, the calculated concentrations for the CH2O component decrease significantly.
As was the case with data set 2, the concentration values calculated with the truncated method are the most likely to represent the actual values, since the truncated method does not analyze portions of the spectra that overlap significantly with H2O, which is a strong IR-absorbing gas. In addition to improving the calculated concentration of CH2O, removal of the wavenumbers below 2000 cm-1 better quantifies the CO concentration in the time-evolved gas mixture. This is due to the relatively small but critical overlap between the CO and H2O IR spectra. The calculated amount of CO2 is the same for the full and truncated spectra because the CO2 IR spectrum does not overlap with the CH4O, CH2O, CO, or H2O spectra. This analysis indicates that PCA on simulated calibration data sets is capable of identifying and quantifying gas species within experimental and unknown prediction data sets.

Chapter 8: Conclusions

The specific purpose of this research was to utilize a mathematical technique called principal component analysis (PCA) in conjunction with principal component regression (PCR) and proportionality constant calculations (PCC) to simplify complex, multi-component infrared (IR) spectra data sets into a reduced data set used for determination of the concentrations of the individual components. The application of this analytical numerical technique to IR spectrum analysis could play an important role in improving the performance of commercial sensors that airlines and aircraft manufacturers could potentially use in an aircraft cabin environment for multi-gas component monitoring. PCA along with PCR and PCC was successfully applied to the monitoring of H2O2 concentration in an aqueous solution, in which analysis was performed on variable volumes of solutions in both the calibration and prediction data sets.
The analysis was then applied to both simulated and experimental two- and three-component gas systems that could be potential environmental air contaminants within the aircraft cabin. These analyzed systems consisted of mixtures of CH2O/C3H4O, CO/CO2, CH2O/C3H4O/H2O, and CO/CO2/H2O gas spectra. After the PCA application to two- and three-component systems, the technique was further expanded to include the monitoring of potential bleed air contaminants from engine oil combustion, in which a simulation data set was utilized to predict gas components and concentrations in unknown engine oil samples at high temperatures as well as in time-evolved gas from the heating of engine oil.

Chapter 9: Future Work

Based on the success of the PCA application to the H2O2 solutions, it is recommended that the as-received engine oils be analyzed in liquid form. This analysis should indicate the major components of the engine oil, which can then be compared to the gas species that evolve when the liquid engine oil is heated up to the point of combustion.

Another segment of this future work should be the development of a model that accurately simulates the observed shifts in the methanol IR spectrum at higher temperatures. Within this research, the shift was estimated for the particular temperature at which the engine oils experienced the greatest mass loss. With further experiments monitoring a much wider temperature profile, the movement in the characteristic IR spectra of methanol should be readily observed.

Further experiments should be performed in which the gas-phase products released at various temperatures are monitored with the FTIR at or near room temperature. This experimental setup would best simulate the monitoring of bleed air contaminants as they would be measured in the aircraft cabin.
Additional holding temperatures of the heating furnace used in the time-evolved engine oil study should be investigated to determine whether other gas components are released and whether the concentrations of the gas species change significantly as a function of temperature. In addition, other aircraft engine oils should be investigated to determine the characteristics of each.

As discussed in Chapter 2, engine oils are known to contain a toxic chemical called tricresyl phosphate (TCP). To understand more thoroughly its IR characteristics within the aircraft cabin environment, experiments similar to those performed with the engine oil should be carried out. TCP in the liquid phase, diluted with methanol, would be an ideal solution to monitor in the gas phase with FTIR to determine the characteristic temperatures at which the TCP solution could be released within an aircraft cabin.

Appendix A: Principal Component Analysis MATLAB Source Code

% data.txt is the location of the file with absorbance values for each wavenumber and sample
fid=fopen('data.txt');

% [fileData_x fileData_y] = [wavenumbers+1 samples]
% extract data from data.txt file into a fileData matrix
[fileData]=fscanf(fid,'%f %f',[1764 17]);
X=fileData';
fclose(fid);

% data matrix [X]
% n = number of rows (samples)
% p = number of columns (column 1 holds y, the remaining p-1 hold absorbances)
[n,p]=size(X);

% split [X] into the absorbance matrix [Xdata] and the response vector [y]
Xdata=zeros(n,p-1);
y=zeros(n,1);
for i=1:n
    for j=1:p-1
        Xdata(i,j)=X(i,j+1);
    end
end
for i=1:n
    y(i,1)=X(i,1);
end
[n,p]=size(Xdata)

% mean-center the absorbances and the response
Xdata_mean=mean(Xdata);
Xdata_meanAdj=Xdata-repmat(Xdata_mean,[n 1]);
y_mean=mean(y);
y_meanAdj=y-repmat(y_mean,[n 1]);

% eigenvalues in decreasing order and cumulative fraction of variance explained
eigValues=flipud(eig(Xdata_meanAdj'*Xdata_meanAdj));
cumVar_explained=cumsum(eigValues./sum(eigValues));
num_sig_eigValues=0;

% determine number of significant eigValues until cumulative variance explained >= XX%
% (the loop counts the components whose cumulative variance is still below 95%)
i=1;
while cumVar_explained(i)<0.95
    num_sig_eigValues=num_sig_eigValues+1;
    i=i+1;
end
% the count from the loop is overridden here with a fixed value
num_sig_eigValues=6

% [loadings] loadings matrix
% [scores] score matrix
% columns are in order of decreasing component variance
% princomp automatically subtracts off column means (use raw data)
% (newer MATLAB releases provide pca in place of princomp)
[loadings,scores,latent]=princomp(Xdata);

% reduced loadings & scores based on num_sig_eigValues
reduced_loadings=zeros(p,num_sig_eigValues);
reduced_scores=zeros(n,num_sig_eigValues);
for i=1:p
    for j=1:num_sig_eigValues
        reduced_loadings(i,j)=loadings(i,j);
    end
end
for i=1:n
    for j=1:num_sig_eigValues
        reduced_scores(i,j)=scores(i,j);
    end
end

% ***** PCR VARIABLES *****
% vector of estimates of the regression coefficients
% [b] is product of eigenvectors [loadings] and y-loadings [q]
% y-loadings [q] determined by regression of [y] on [scores]
% [D] is diagonal matrix with each diagonal element equal to 1/tk
% tk is eigenvalue of factor k
% [q] = [D][scores]'[y]
D=zeros(num_sig_eigValues,num_sig_eigValues);
for i=1:num_sig_eigValues
    for j=1:num_sig_eigValues
        if(i==j)
            D(i,j)=1/eigValues(i);
        end
    end
end
q=D*reduced_scores'*y;
b=reduced_loadings*q;
% intercept, so that a prediction is a + x*b for a raw (uncentered) spectrum x
a=y_mean-Xdata_mean*b;

% output key program variables to .txt files for further analysis in Excel
dlmwrite('output1.txt',X,'delimiter', '\t', 'precision', 4);
dlmwrite('output2.txt',reduced_loadings,'delimiter', '\t', 'precision', 4);
dlmwrite('output4.txt',latent,'delimiter', '\t', 'precision', 4);
dlmwrite('output5.txt',q,'delimiter', '\t', 'precision', 4);
dlmwrite('output6.txt',b,'delimiter', '\t', 'precision', 4);
dlmwrite('output7.txt',a,'delimiter', '\t', 'precision', 4);
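
The listing above stops once the regression coefficients [b] and the intercept [a] are written to file. The following self-contained sketch shows one way such coefficients could be applied to a prediction data set [XP]. It is an illustration under stated assumptions rather than the code used in this work: the data are synthetic, the names s, XC, XP, y_true, and y_pred are hypothetical, a single retained component (k = 1) is assumed because the synthetic mixture has one component, and svd is used in place of princomp so that no toolbox is needed.

```matlab
% Synthetic calibration data: n samples, p wavenumbers, one component
% whose concentration y scales a fixed placeholder spectrum s.
n = 10; p = 50; k = 1;                 % k = number of retained components
s = rand(1, p);                        % hypothetical pure-component spectrum
y = linspace(0.1, 1.0, n)';            % hypothetical known concentrations
XC = y * s + 0.001 * randn(n, p);      % calibration spectra with small noise

% Mean-center, then obtain loadings and scores from an SVD
% (used here instead of princomp so no toolbox is required).
xc_mean = mean(XC, 1);
y_mean = mean(y);
Xc = XC - repmat(xc_mean, n, 1);
[~, ~, V] = svd(Xc, 'econ');
loadings = V(:, 1:k);
scores = Xc * loadings;

% Regress y on the scores, then map back to wavenumber space.
q = (scores' * scores) \ (scores' * (y - y_mean));
b = loadings * q;                      % regression coefficients (p x 1)
a = y_mean - xc_mean * b;              % intercept

% Apply the calibration to new spectra: y_pred = a + XP*b.
y_true = [0.25; 0.75];                 % hypothetical unknown concentrations
XP = y_true * s;                       % hypothetical prediction spectra
y_pred = a + XP * b;
disp([y_true y_pred]);
```

Because the scores are mean-centered, regressing on y or on y minus its mean gives the same [q], which is why the appendix listing can use the uncentered [y] in its expression for [q].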