Development of a Quantitative Structure-Activity Relationship (QSAR) Model relating Solvent Structure to Ibuprofen Crystal Morphology using 2D and 3D Molecular Descriptors

by

John Colin Haser

A thesis submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Master of Science

Auburn, Alabama
August 3, 2013

Keywords: crystallization, ibuprofen, descriptors, CAMD, QSAR, PCA

Copyright 2013 by John Colin Haser

Approved by:
Mario R. Eden, Department Chair, Joe T. and Billie Carole McMillan Professor
Allan E. David, John W. Brown Assistant Professor
Marko Hakovirta, Professor & AC-PABE Director

Abstract

The objective of this thesis is to develop a quantitative structure-activity relationship (QSAR) that relates solvent structure to the morphology of ibuprofen crystals grown within that solvent. Morphology can be quantified by aspect ratio, and ibuprofen aspect ratio data were obtained for crystals grown in 16 different organic solvents. Developing this QSAR requires accurate geometry optimization using empirical force fields to estimate the three-dimensional structure of the solvent molecules. Three different force fields are implemented and their effect on the developed models is analyzed. Next, a combination of 2D and 3D molecular descriptors is calculated using those structures to provide a quantitative representation of the geometry-optimized solvent molecules. The descriptor data matrix is then reduced in size for regression into linear models. This stage is executed using Bayesian Information Criterion (BIC) methods and also Principal Component Analysis (PCA) with Principal Component Regression. The final step in the development is to evaluate the predictive capabilities of the resulting models. The QSAR models developed with either technique were all able to fit the training set data, and PCA models generally had better predictive capabilities than the models developed using BIC.
However, it was also shown that the applicability domain for the models is very small and the predictive capabilities were less than expected. The principal conclusion from this work is that both methods produce models that can fit the training set data, but that additional experimental data should be obtained to produce better predictive models that could be used for crystallization solvent design for pharmaceutical or other industrial applications.

Acknowledgments

First and foremost, I would like to thank my advisor, Dr. Mario Richard Eden, for all his help and support in completing my classes and research at Auburn University. I would also like to thank the entire faculty and staff within the Samuel Ginn College of Engineering and the Department of Chemical Engineering, especially Dr. Allan David and Dr. Marko Hakovirta for devoting their time to serve on my thesis committee. In addition, I would like to thank Subin Hada and Robert Herring for their support as I began my research work and for helping me complete my thesis. I would like to thank Dr. Charles Acquah of the University of Connecticut and Dr. Arunprakash Karunanithi of the University of Colorado Denver for their willingness to supply me with the data used within this work. I would like to thank the Auburn Department of Chemical Engineering, the Department of Energy National Energy Technology Laboratory (DOE-NETL), the Department of Agriculture (USDA-NIFA-AFRI) and the Walt Woltosz Fellowship Program for the financial support of my degree work. I also want to thank my parents, Ed and Peggy Haser, for their help and support throughout my entire academic career. Finally, I would like to thank my fiancée Cortney Crouse for her love and support as I completed my degree.

Table of Contents

Abstract
Acknowledgments
List of Figures
List of Tables
List of Abbreviations
Chapter 1: Introduction
  1.1 Building a CAMD Framework for Crystallization Solvent Design
  1.2 Descriptors
    1.2.1 2D Descriptors
    1.2.2 3D Descriptors
  1.3 Motivation for Using 2D and 3D Descriptors
  1.4 Thesis Outline
Chapter 2: Theoretical Background
  2.1 Molecular Mechanics
    2.1.1 GAFF
    2.1.2 Ghemical
    2.1.3 MMFF94s
  2.2 2D and 3D Descriptors
  2.3 QSAR Development
    2.3.1 BIC
    2.3.2 PCA
    2.3.3 Validation Methods
  2.4 Crystal Morphology Prediction
  2.5 Crystallization Solvent Design Framework
Chapter 3: Methodology
  3.1 Estimating Solvent Geometry
    3.1.1 Optimizing Structures with Avogadro
  3.2 Calculating Descriptors
    3.2.1 E-DRAGON
    3.2.2 Eliminating Zero Descriptors
    3.2.3 Normalizing Descriptors
  3.3 Regression and Analysis Methods
    3.3.1 BIC in JMP®
    3.3.2 PCA
    3.3.3 Validation
  3.4 Data Set Expansion
  3.5 Summary
Chapter 4: Results and Discussion
  4.1 Training Set Aspect Ratio Data
  4.2 BIC in JMP®
    4.2.1 External Validation
  4.3 PCA Using 16 Solvents
    4.3.1 Internal Validation
    4.3.2 External Validation
  4.4 Expanded Training Set Aspect Ratio Data
  4.5 PCA Using 51 Solvents
    4.5.1 Internal Validation
    4.5.2 External Validation
  4.6 Summary
Chapter 5: Conclusions
  5.1 BIC in JMP®
  5.2 PCA
    5.2.1 16 Solvents
    5.2.2 51 Solvents
  5.3 Impact of Geometry Optimization Force Fields
  5.4 Summary
Chapter 6: Future Work
  6.1 Acquiring Additional Experimental Data
  6.2 Genetic Algorithm QSAR Models
  6.3 Expanding from Single Solute to Multiple Solute
  6.4 Summary
References
Appendices
  A. BIC Method
  B. PCA Method

List of Figures

Figure 1.1: Molecular graph of propylene glycol
Figure 1.2: 3D representation of propylene glycol
Figure 1.3: Computational expense vs. degeneracy using 0-4D descriptors
Figure 2.1: Key contributions to a molecular mechanics force field (Adapted from [5])
Figure 2.2: 2D structure (a) and 3D structure (b) of ibuprofen
Figure 2.3: Three step ibuprofen production mechanism (Adapted from [19])
Figure 2.4: Ibuprofen crystal shape when grown in n-hexane (a) and methanol (b)
Figure 2.5: Results from CAMD framework for crystallization solvent design
Figure 3.1: n-Hexane geometry optimized with Avogadro
Figure 3.2: Scree plot
Figure 3.3: PCA matrix decomposition
Figure 3.4: Original solvents and expansion solvents
Figure 3.5: Iterative process to add expansion solvents to training set
Figure 4.1: BIC experimental comparison with 2-ethoxyethyl acetate
Figure 4.2: BIC experimental comparison with chloroform
Figure 4.3: BIC experimental comparison with decanol
Figure 4.4: 2D BIC Q² external validation
Figure 4.5: 3D GAFF BIC Q² external validation
Figure 4.6: 2D & 3D GAFF BIC Q² external validation
Figure 4.7: 3D Ghemical BIC Q² external validation
Figure 4.8: 2D & 3D Ghemical BIC Q² external validation
Figure 4.9: 3D MMFF94s BIC Q² external validation
Figure 4.10: 2D & 3D MMFF94s BIC Q² external validation
Figure 4.11: 2D PCA Q² internal validation
Figure 4.12: 3D GAFF PCA Q² internal validation
Figure 4.13: 2D & 3D GAFF PCA Q² internal validation
Figure 4.14: 3D Ghemical PCA Q² internal validation
Figure 4.15: 2D & 3D Ghemical PCA Q² internal validation
Figure 4.16: 3D MMFF94s PCA Q² internal validation
Figure 4.17: 2D & 3D MMFF94s PCA Q² internal validation
Figure 4.18: 2D PCA Q² external validation
Figure 4.19: 3D GAFF PCA Q² external validation
Figure 4.20: 2D & 3D GAFF PCA Q² external validation
Figure 4.21: 3D Ghemical PCA Q² external validation
Figure 4.22: 2D & 3D Ghemical PCA Q² external validation
Figure 4.23: 3D MMFF94s PCA Q² external validation
Figure 4.24: 2D & 3D MMFF94s PCA Q² external validation
Figure 4.25: 2D PCA Q² internal validation with 51 solvents
Figure 4.26: 3D GAFF PCA Q² internal validation with 51 solvents
Figure 4.27: 2D & 3D GAFF PCA Q² internal validation with 51 solvents
Figure 4.28: 3D Ghemical PCA Q² internal validation with 51 solvents
Figure 4.29: 2D & 3D Ghemical PCA Q² internal validation with 51 solvents
Figure 4.30: 3D MMFF94s PCA Q² internal validation with 51 solvents
Figure 4.31: 2D & 3D MMFF94s PCA Q² internal validation with 51 solvents
Figure 4.32: 2D PCA Q² external validation with 51 solvents
Figure 4.33: 3D GAFF PCA Q² external validation with 51 solvents
Figure 4.34: 2D & 3D GAFF PCA Q² external validation with 51 solvents
Figure 4.35: 3D Ghemical PCA Q² external validation with 51 solvents
Figure 4.36: 2D & 3D Ghemical PCA Q² external validation with 51 solvents
Figure 4.37: 3D MMFF94s PCA Q² external validation with 51 solvents
Figure 4.38: 2D & 3D MMFF94s PCA Q² external validation with 51 solvents
Figure 6.1: NSAIDs within the propionic acid derivative family

List of Tables

Table 2.1: Coefficients of determination for relating AR to hydrogen bonding properties
Table 3.1: Original 16 solvents and their two-dimensional structures
Table 3.2: 2D and 3D descriptor classes
Table 3.3: Data matrix sizes
Table 4.1: 16 solvent training set aspect ratio data
Table 4.2: Descriptors selected for BIC models
Table 4.3: External validation of BIC models
Table 4.4: Expansion solvent training set estimated aspect ratio data
Table A.1: Descriptors used in BIC method regression
Table A.2: Equations developed with BIC method regression
Table B.1: 2D descriptor matrix
Table B.2: Eigenvector matrix from 2D descriptors
Table B.3: 2D PCA factor matrix
Table B.4: Equations developed with PCA method regression
Table B.5: Expansion solvents and their two-dimensional structures

List of Abbreviations

AIC       Akaike Information Criterion
AMBER     Assisted Model Building with Energy Refinement
AR        Aspect Ratio
BIC       Bayesian Information Criterion
CAMD      Computer-Aided Molecular Design
FDA       Food and Drug Administration
GAFF      Generalized AMBER Force Field
GA        Genetic Algorithm
GC        Group Contribution
MMFF94    Merck Molecular Force Field
MMFF94s   Merck Molecular Force Field for static molecules
NSAID     Non-Steroidal Anti-Inflammatory Drug
PC        Principal Component
PCA       Principal Component Analysis
PCR       Principal Component Regression
PRESS     Sum of squares of the prediction errors
R²        Coefficient of determination
RMSE      Root Mean-Squared Error
RSS       Residual sum of squares
Q²        Predictive squared correlation coefficient
QM        Quantum Mechanics
QSAR      Quantitative Structure-Activity Relationship
SMILES    Simplified Molecular-Input Line-Entry System
TSS       Total Sum of Squares

Chapter 1: Introduction

The purpose of this thesis work is to develop a quantitative structure-activity relationship (QSAR) that relates ibuprofen crystal aspect ratio (AR) to the structure of the solvents in which it is crystallized. A valid QSAR with strong predictive capabilities can be used to design or select crystallization solvents while minimizing the time, cost and environmental impacts associated with experimental work.
Crystallization is far less studied as a separation unit operation than vapor-liquid distillation and liquid-liquid extraction [1]. In addition to this lack of study, more variables can affect the quality and usefulness of the end product of this unit operation. The important output variables from distillation columns and liquid-liquid extraction columns are mainly product flow rate and product composition. In crystallization, crystal clarity and morphology are also crucial to the final product quality, in addition to flow rate and composition. Along with the increase in product variables, the driving forces behind how crystals grow in solution are not nearly as well understood. In distillation, the difference in boiling points between two compounds drives the separation for most mixtures. For crystallization, the interactions between solute and solvent are more difficult to quantify and can vary with every solute-solvent combination.

Crystallization is often used within the pharmaceutical industry, where the quality of the end product is very important because any defects within the crystalline drug product could lead to injury or death of patients. Crystal morphology can have a significant impact on downstream processing of pharmaceutical products and also on how the drug is metabolized within the human body [2]. For example, needle-like ibuprofen crystals (high AR) tend to stick to tablet presses and dies much more than plate-like crystals (low AR) [3].

Because crystallization tends to be used at much smaller scales than the more traditional separation unit operations, the products are typically value-added specialty chemicals and pharmaceutical products. Therefore, any deviation from the desired product specifications, even for a small quantity of product, can result in a significant loss of revenue due to off-specification product.
Issues such as crystal clarity and morphology that are not considered in other separation processes become crucial to meeting specifications for crystallization products.

1.1 Building a CAMD Framework for Crystallization Solvent Design

An application of the QSAR developed within this work would be to use it within a computer-aided molecular design (CAMD) framework to design solvents that can be used to grow crystals of a specific morphology. A pharmaceutical company could utilize such a framework to develop solvents that crystallize a novel drug with a desired morphology. Using a CAMD framework to design an ideal solvent (or solvents) can reduce the need for costly and time-consuming experimental work. As computers continue to gain processor speed and more data becomes available, higher quality models can be developed and implemented.

If a CAMD framework can be utilized to develop crystallization solvents that will yield the desired product, thousands of dollars of experimental work can be saved. If a CAMD framework produces five candidate solvents for a given crystallization process, then experiments can be performed using just those solvents. Otherwise, dozens or hundreds of solvents would need to be evaluated experimentally to find an optimal solvent.

Saving time is crucial as well, especially within the pharmaceutical industry. In the United States, approval of a drug from the Food and Drug Administration (FDA) can take up to 15 years [4]. If a CAMD framework can reduce the amount of time needed to develop the drug, then pharmaceutical companies can begin to produce the drug and gain FDA approval much more quickly.

There is currently a push within industry and the government for corporations to become more sustainable and produce less waste. Companies can reduce their environmental footprint and improve sustainability by replacing experimental design work with CAMD frameworks.
Many of the solvents utilized within the bulk and specialty chemicals industry can have negative environmental impacts, so any simulation work that can reduce the amount of solvent waste produced is beneficial environmentally. Finally, it has become easier and cheaper to acquire vast amounts of chemical data, and CAMD frameworks can utilize that data to save time and money and reduce environmental impact.

1.2 Descriptors

In order to relate the structure of solvent molecules to the aspect ratio of the ibuprofen crystals grown within them, solvent structure must be quantified. This can be achieved through the calculation of molecular descriptors. These descriptor values can be arranged into a data matrix and then regressed to fit aspect ratio data. Descriptors range in dimensionality from 0D through 4D. As the dimensionality of the descriptors increases, the calculations behind them become more complex. Many 0D and 1D descriptors, such as atom counts and structural fragment lists, can be determined by looking at a text string of a molecule, but 2D, 3D and 4D descriptors typically require significant computational software. This work utilizes 2D and 3D descriptors to quantify solvent molecules.

1.2.1 2D Descriptors

2D molecular descriptors are calculated from the bonds and identities of the atoms within a molecule. Descriptor calculation software can calculate 2D descriptors from simplified molecular-input line-entry system (SMILES) notation, which for propylene glycol is CC(O)CO. The calculations can also be made from simple 2D molecular graphs of molecules; an example is shown in Figure 1.1.

Figure 1.1: Molecular graph of propylene glycol

2D descriptors have a lower computational expense but do not provide as much information as higher dimensionality descriptors.

1.2.2 3D Descriptors

As the name suggests, 3D molecular descriptors account for the position of atoms within a molecule in the x, y, and z directions.
Bond lengths, angles and rotations, as well as non-bonded interactions, are used in the calculation of 3D molecular descriptors. The calculation of 3D descriptors requires geometry optimization utilizing molecular force fields, which are further described in Chapter 2, to estimate the location of the atoms within three-dimensional space. Geometrically optimized propylene glycol is shown in Figure 1.2.

Figure 1.2: 3D representation of propylene glycol

3D descriptors carry a higher computational expense but can provide more information than lower-dimensional descriptors.

1.3 Motivation for Using 2D and 3D Descriptors

In this work, a combination of 2D and 3D molecular descriptors is calculated for each solvent molecule. Each class of descriptors has its advantages and disadvantages according to the information provided within that descriptor value. When referring to molecular descriptor values, degeneracy describes a lack of uniqueness: a degenerate descriptor value is shared by more than one molecule. A representation of these advantages and disadvantages is shown in Figure 1.3.

Figure 1.3: Computational expense vs. degeneracy using 0-4D descriptors

0D descriptors, such as atom counts, are very inexpensive to calculate, but the resulting descriptor value can be shared with thousands of other molecules. 1D descriptors, such as lists of structural fragments, are more expensive to calculate but the degeneracy of that descriptor value is reduced. The pattern continues for 2D descriptors, like the Wiener Index, with the descriptor being more expensive to calculate but having lower degeneracy. For 3D descriptors such as the 3D-Balaban Index, the calculations become more complex but the degeneracy is further decreased. Finally, for 4D descriptors, which include conformers with 3D coordinates, the computational expense is great but the degeneracy of the resulting descriptor value is very low or zero.
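The trade-off between a 0D and a 2D descriptor can be illustrated with a minimal Python sketch. It is not the descriptor software used in this work; the function names and the hand-entered, hydrogen-suppressed graphs are assumptions made for illustration only. It compares propylene glycol (SMILES CC(O)CO) with its isomer 1,3-propanediol (SMILES OCCCO, chosen here as a hypothetical comparison molecule): the two share the same heavy-atom counts, so the 0D descriptor is degenerate, while the Wiener index, taken as the sum of shortest-path bond counts over all atom pairs, tells them apart.

```python
from collections import Counter, deque


def atom_counts(atoms):
    """0D descriptor: number of heavy atoms of each element."""
    return dict(Counter(atoms))


def wiener_index(n_atoms, bonds):
    """2D descriptor: sum of shortest-path bond counts over all atom pairs."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives topological distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


# Hydrogen-suppressed graphs entered by hand (illustrative assumption).
# Propylene glycol, CC(O)CO: C0-C1, C1-O2, C1-C3, C3-O4
propylene_glycol = (["C", "C", "O", "C", "O"], [(0, 1), (1, 2), (1, 3), (3, 4)])
# 1,3-Propanediol, OCCCO: O0-C1-C2-C3-O4
propanediol_13 = (["O", "C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3), (3, 4)])

for name, (atoms, bonds) in [("propylene glycol", propylene_glycol),
                             ("1,3-propanediol", propanediol_13)]:
    print(name, atom_counts(atoms), wiener_index(len(atoms), bonds))
```

Both molecules give the same atom counts, whereas the Wiener index differs because it depends on how the atoms are connected. Neither descriptor uses 3D coordinates, so different conformers of the same molecule would still be indistinguishable at this level.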
The rationale for using a combination of 2D and 3D descriptors is that they provide a solid middle ground between the 0D and 1D descriptors, which do not provide much useful information for this analysis, and 4D descriptors, which are very difficult to calculate, require specialized software, and could be subject to statistical noise.

1.4 Thesis Outline

Chapter 2 of this thesis details the background work completed in the fields of molecular descriptors, crystallization solvent research and QSAR development. Chapter 3 explains the methodology utilized to develop QSAR models. Chapter 4 presents the results of this methodology in relating solvent structure to crystal aspect ratio. Chapter 5 contains the conclusions that can be drawn from the presented results. Finally, Chapter 6 explores potential future work that could be performed based on the results presented in this thesis.

Chapter 2: Theoretical Background

In this chapter, previous work in the fields of molecular descriptors, crystallization solvent design and QSAR development is presented. With the increase in computational work in many industrial fields, the development of accurate predictive models has become increasingly important. Ever greater amounts of data are becoming available, and this data can be harvested and transformed into useful predictive models. This chapter details the developments in each of these areas and how they relate to the work within this thesis.

2.1 Molecular Mechanics

Utilizing 3D descriptors requires geometry optimization of molecular structures. This can be done using software packages that employ molecular force fields. These force field methods, also known as molecular mechanics, calculate the energy of a system using nuclear positions and can be effectively used on systems with high numbers of particles.
This differs from quantum mechanics (QM) in that QM uses the positions of electrons and is more applicable to smaller systems with fewer particles. There are four key elements of a molecular mechanics force field that determine the 3D structure of a molecule: bond stretching, angle bending, bond torsion, and non-bonded interactions [5]. A functional form for an energy minimization force field is shown in Equation 2.1. The optimized structure will have the lowest potential energy according to this equation.

\[
V(\mathbf{r}^N) = \sum_{\text{bonds}} \frac{k_i}{2}\,(l_i - l_{i,0})^2
+ \sum_{\text{angles}} \frac{k_i}{2}\,(\theta_i - \theta_{i,0})^2
+ \sum_{\text{torsions}} \frac{V_n}{2}\,\bigl(1 + \cos(n\omega - \gamma)\bigr)
+ \sum_{i=1}^{N} \sum_{j=i+1}^{N} \left( 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \right)
\tag{2.1}
\]

[5]

$V(\mathbf{r}^N)$ is the potential energy as a function of the positions $\mathbf{r}$ of $N$ atoms. $l_i$ is the bond length, $\theta_i$ is the bond angle, $\omega$ is the torsion angle, $V_n$ is referred to as the barrier height, $n$ is the multiplicity and $\gamma$ is the phase factor. The final term of the equation represents the non-bonded interactions, including the electrostatic interactions between point charges (atoms within the molecule) [5]. These four contributions are shown in Figure 2.1.

Figure 2.1: Key contributions to a molecular mechanics force field (Adapted from [5])

In order to calculate values for 3D descriptors, geometry-optimized structures are needed. Three different force fields were chosen to estimate the three-dimensional structures of solvents. As further described in Chapters 3 and 4, QSAR models were developed using descriptors calculated from structures estimated with each of these force fields and then compared.

2.1.1 GAFF

The Generalized AMBER Force Field (GAFF) was developed by Wang et al. in 2003 to extend the Assisted Model Building with Energy Refinement (AMBER) force field to most organic molecules composed of H, C, N, O, S, P and halogen atoms [6]. This force field was chosen because it was developed with the aim of being applicable to drug-like molecules and the molecules with which they interact.

2.1.2 Ghemical

The Ghemical force field was developed in the computational chemistry software package of the same name.
The Ghemical force field was chosen because it is applicable to simple organic molecules, a category that includes all of the solvents used in the experimental work.

2.1.3 MMFF94s

The Merck Molecular Force Field (MMFF94) was originally introduced in 1996 by Halgren to reproduce a number of molecular properties, including molecular geometries, intermolecular interaction energies and vibrational frequencies [7]. It was developed to handle the large quantity of molecules within the Merck Index. However, the MMFF94 force field was not optimal for energy-minimization studies. Therefore, a static variant, the Merck Molecular Force Field for static molecules (MMFF94s), was developed by Halgren in 1999. The two force fields share many parameters, so most molecules that can be handled by MMFF94 can also be handled by MMFF94s [8]. This force field was chosen because the compound classes used in its core parameterization are expansive and each of the solvents used in this work falls within that space.

2.2 2D and 3D Descriptors

Utilizing a combination of 2D and 3D molecular descriptors has been shown to be promising in predicting the biological targets of ligand probes [9]. In this study by Nettles et al., 2D and 3D molecular descriptors are used to bridge the gap between chemical and biological space: the molecular target of a single chemical entity is identified by evaluating the new compound's activity against structures whose activities are already well known. The application of this work is in the pharmaceutical industry, where any viable method needs to work quickly and be able to handle millions of calculations. The speed with which 2D descriptors can be calculated can be combined with the low degeneracy of 3D descriptors to meet these requirements. The results of these studies show that the slower 3D descriptors can be successfully utilized in conjunction with 2D descriptors.
The goals of the work in this thesis are different, but the requirements for useful application are the same, which is the reason both 2D and 3D molecular descriptors were utilized to construct the QSAR models.

Other studies have used a combination of 2D and 3D methods to develop QSAR models [10]. This work by Gavernet et al. was intended to select new anticonvulsant candidate molecules from a natural product library. The approach was to first select candidates using solely the quickly computed 2D descriptors to save on computational expense. Next, the candidates that passed the first filters were subjected to the more complex 3D pharmacophore superposition process. While this methodology differs significantly from the work presented in this thesis, it does show that a combination of 2D and 3D methods can yield better results in QSAR development than single-dimensionality methods.

Regardless of the dimensionality of the descriptors used, combinations of descriptor dimensionalities have been shown to be consistently more effective than a single dimensionality. Helguera et al. used a combination of 0D, 1D and 2D descriptors to construct QSAR models for predicting the carcinogenic potency of nitroso-compounds [11]. QSAR models using 0D descriptors unsurprisingly had low coefficient of determination (R²) and predictive squared correlation coefficient (Q²) values, and models with 2D descriptors had significantly higher R² and Q² values. However, the highest R² and Q² values were obtained by models using a combination of 0D, 1D and 2D descriptors.

These previous works have shown that employing combinations of 2D and 3D descriptors is more effective than using just one dimensionality of descriptors for the development of QSAR models. The quick calculation of 2D descriptors combined with the low degeneracy and vast amount of information provided by 3D descriptors is ideal for the construction of QSAR models.
2.3 QSAR Development

The development of QSAR models consists of four basic steps [12]:

1. Calculation of molecular descriptors
2. Descriptor selection for model building
3. Finding an optimal relationship between selected descriptors and target activity
4. Validation of that relationship's predictive capabilities and applicable domain

There are many different methods of analyzing large amounts of molecular descriptor data and constructing QSARs from that data. Two of these methods are explored within this thesis: Bayesian Information Criterion (BIC) and Principal Component Analysis (PCA). The main difference between them is that BIC selects the descriptors that best model the desired activity and eliminates the remainder, whereas PCA calculates principal components (PCs) that are linear combinations of all of the original descriptor values.

2.3.1 BIC

Bayesian information criterion (BIC) is a method of selecting an optimal model from a finite set of models. It was developed by Schwarz in 1978 and is similar to the Akaike Information Criterion (AIC), which was developed in 1974 [13] [14]. BIC recommends selecting a model that maximizes the equation [15]:

$$S = \ln L(D \mid \theta) - \frac{k}{2}\ln n \qquad (2.2)$$

where $L(D \mid \theta)$ is the likelihood of the data, $\theta$ is the vector of model parameters, $k$ is the number of parameters and $n$ is the sample size. When used in linear regression, maximizing $S$ is essentially the same as minimizing BIC in this equation [15]:

$$BIC = n \ln\left(\frac{RSS}{n}\right) + k \ln n \qquad (2.3)$$

where RSS is the residual sum of squares, calculated from regression. The second term in the equation is a penalty term that increases each time a parameter is added to the model. This reduces overfitting by encouraging models to be constructed using fewer parameters.

2.3.2 PCA

Principal component analysis (PCA) is a method that is used to find systematic patterns in data and to visualize multivariate data using as few variables as possible. It maps a large multidimensional matrix of data onto lower dimensions with a minimal loss of information.
PCA converts a correlated data matrix into a new set of uncorrelated factors that are linear combinations of the original variables. PCA extracts the most important information from the large correlated matrix and creates a smaller matrix of uncorrelated principal component factors, each of which represents a percentage of the variance in the original data. The first factor is the linear combination of the original variables that has the greatest possible variance, and each following factor is another linear combination of the original variables that has the greatest possible variance while having zero correlation with the previous factors.

In the past, PCA has been used for the prediction of the mechanism of action of anti-cancer drugs by Lauria et al. [16]. PCA was used to reduce a large data matrix containing over 600 descriptor values for the 60 compounds within the training set down to five PC factors. These five PC factors cover over 84% of the total variance within the original data matrix. All of the QSAR models developed attain high R² and Q² values. In this study, a large matrix of descriptor data was reduced to far fewer uncorrelated factors that contain most of the information from the original matrix and relate it to the desired target activities. The algorithm developed by Lauria et al. closely parallels the approach presented in this thesis, although the application is very different.

2.3.3 Validation Methods

The methods that can be used to validate the predictive ability of a model are numerous. The two used within this thesis are commonly used to evaluate the predictive capabilities of QSAR models. A very common method is to use the predictive squared correlation coefficient (Q²) for leave-one-out cross validation. Q² is defined as [17]:

$$Q^2 = 1 - \frac{PRESS}{TSS} \qquad (2.4)$$

where PRESS is the sum of squares of the prediction errors for the test set, and TSS is the total sum of squares, which is the sum of squared deviations from the training set mean.
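The definition above, Q² = 1 − PRESS/TSS, can be illustrated with a minimal sketch. The function name `q_squared` and the numerical values are hypothetical and chosen only for illustration; they are not data from this work.

```python
def q_squared(y_obs, y_pred, train_mean):
    """Return Q^2 = 1 - PRESS/TSS for a set of held-out observations.

    y_obs      -- experimental values of the held-out (test) samples
    y_pred     -- model predictions for the same samples
    train_mean -- mean of the response over the training set
    """
    if len(y_obs) != len(y_pred):
        raise ValueError("y_obs and y_pred must have the same length")
    # PRESS: sum of squared prediction errors
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    # TSS: sum of squared deviations from the training set mean
    tss = sum((o - train_mean) ** 2 for o in y_obs)
    if tss == 0:
        raise ValueError("TSS is zero; Q^2 is undefined")
    return 1.0 - press / tss


# Hypothetical aspect ratios, for illustration only
observed = [3.0, 5.0]
predicted = [3.5, 4.0]
q2 = q_squared(observed, predicted, train_mean=2.0)
print(round(q2, 3))
```

Note that Q² computed this way becomes negative whenever the model predicts the held-out values worse than simply using the training set mean.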
The highest possible value for Q² is one. In order to evaluate a model's ability to match training set data, the coefficient of determination (R²) is used. R² is defined as [17]:

$$R^2 = 1 - \frac{RSS}{TSS} \qquad (2.5)$$

where RSS is the residual sum of squares from the training set data and TSS is the total sum of squares for the training set data. Similar to Q², the highest possible value of R² is one.

2.4 Crystal Morphology Prediction

Ibuprofen is a common non-steroidal anti-inflammatory drug (NSAID) that is typically used for relief of muscular and skeletal pain. In addition to its anti-inflammatory effects, it has analgesic (pain relieving) and antipyretic (fever reducing) effects. Ibuprofen is sold over-the-counter under trade names such as Advil and Motrin. Its risks are assumed to be less severe than those of aspirin, and therefore it is a very common and widely applicable drug [18]. Ibuprofen's structure consists of an isobutyl chain and a propionic acid group attached at opposite (1,4) positions of a phenyl ring. Its chemical name, isobutylphenylpropionic acid (systematically, 2-(4-isobutylphenyl)propanoic acid), is the origin of the common name ibuprofen. A molecular graph of ibuprofen and a geometry optimized structure of ibuprofen are shown in Figure 2.2:

Figure 2.2: 2D structure (a) and 3D structure (b) of ibuprofen

The Boots Company, which discovered ibuprofen in the 1960s, developed a method for synthesizing ibuprofen that was used for many years [19]. This process, known as the brown synthesis method, produced millions of pounds of the desired ibuprofen product, but also millions of pounds of undesired byproducts that had to be disposed of. The percentage atom economy, which is the ratio of the molecular weight of ibuprofen to the molecular weight of all reactants, was about 40 percent for the brown method, meaning that more waste was produced from the process than ibuprofen product [19].
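As a worked illustration of the atom economy figure quoted above, the following sketch assumes a molecular weight of about 206.3 g/mol for ibuprofen (C13H18O2); this value is an assumption added here and is not stated in the source. Under that assumption, a 40 percent atom economy implies roughly the following balance per mole of product:

```latex
\[
\text{atom economy (\%)} = \frac{M_{\text{ibuprofen}}}{\sum M_{\text{reactants}}} \times 100
\]
\[
\sum M_{\text{reactants}} \approx \frac{206.3}{0.40} \approx 516\ \text{g/mol},
\qquad
M_{\text{waste}} \approx 516 - 206.3 \approx 310\ \text{g/mol}
\]
```

On this estimate, about 1.5 g of reactant atoms end up in byproducts for every gram of ibuprofen, which is consistent with the statement that the process generated more waste than product.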
More recently, the Hoechst Celanese Corporation and the Boots Company agreed to a joint venture known as the BHC Company in order to develop a greener ibuprofen synthesis process [19]. This process has a percentage atom economy of 77 percent, meaning that the amount of unwanted byproducts is significantly reduced with this new method. This greener synthesis method consists of three steps [19]:

1. Acylation of isobutylbenzene using hydrogen fluoride as a catalyst
2. Hydrogenation using Raney nickel catalyst
3. Carbonylation using palladium as a catalyst

Each of the catalysts used within the process is recycled and reused to reduce the amount of waste produced during production. Figure 2.3 shows this three-step process for producing ibuprofen.

Figure 2.3: Three step ibuprofen production mechanism (Adapted from [19])

Winn emphasized the importance of crystal morphology prediction in relation to industrial and pharmaceutical processes [20]. The morphology of crystals can affect the efficiency of downstream processes such as filtering, washing and drying. It can also influence material properties such as bulk density and mechanical strength [20], as well as particle flowability, agglomeration, mixing characteristics and redissolution properties [20]. This is especially important for pharmaceutical products because the crystallization step is one of the last in production and the characteristics of the crystal can have an effect on a drug's usefulness.

Because of the impact that crystal morphology can have on downstream processing and the utility of industrial and pharmaceutical products, it is important to be able to predict and control crystal morphology. Aspect ratio is commonly used to quantify crystal morphology, and is simply the ratio of the longest to the shortest crystal dimension [21].
Lower aspect ratio ibuprofen crystals are preferred because of the easier downstream processing and higher product quality mentioned earlier.

It has been established that for many organic compounds, the solvent utilized in crystallization processes has a strong influence on the resulting crystal morphology [22]. There has been significant study of how crystals of carboxylic acids, and specifically of ibuprofen, are grown within solvents and how those solvents affect the resulting morphology. Before packing into a crystal, ibuprofen molecules form hydrogen-bonded dimers with other ibuprofen molecules through their carboxylic acid groups. It is believed that the growth unit of ibuprofen crystals is the non-polar entity of the dimer (isobutyl) [21]. Due to this, ibuprofen crystallizes very differently in polar solvents than in non-polar solvents. Winn and Doherty predict needle-like ibuprofen crystals using non-polar n-hexane as the solvent and plate-like ibuprofen crystals using polar methanol as the solvent [21]. Their predictions accurately match previous experimental data [23], and a representation of those predictions is shown in Figure 2.4:

Figure 2.4: Ibuprofen crystal shape when grown in n-hexane (a) and methanol (b)

The experimental data used in this thesis was previously presented in 2009 by Acquah et al. [24]. In that work, linear models were constructed relating crystal morphology to the hydrogen bonding propensities of 16 different solvents. These 16 solvents and their structures were used in the analysis within this thesis and are presented in Chapter 3. Aspect ratio was used to quantify crystal morphology. The solvents selected were not necessarily industrially important or common pharmaceutical solvents. Cooling crystallization was utilized to grow the ibuprofen crystals and the aspect ratio was measured using optical microscopy images [24]. The aspect ratio data was then regressed against nine different solvent properties.
The coefficient of determination for each regression, with the property as the independent variable and aspect ratio as the dependent variable, is shown in Table 2.1 [24]:

Table 2.1: Coefficients of determination for relating AR to hydrogen bonding properties

| Property | Symbol | R² |
|---|---|---|
| Hansen's dispersion parameter (MPa^1/2) | δD | 0.000 |
| Dielectric constant (dimensionless) | ε | 0.510 |
| Kamlet-Taft hydrogen bond acceptor (dimensionless) | β | 0.599 |
| Hansen's polar parameter (MPa^1/2) | δP | 0.671 |
| Hildebrand's total solubility parameter (MPa^1/2) | δ | 0.739 |
| Kamlet-Taft hydrogen bond donor (dimensionless) | α | 0.751 |
| Hansen's hydrogen bonding solubility parameter (MPa^1/2) | δH | 0.815 |
| Kosower's parameter (kcal/mol) | Z | 0.833 |
| Acceptance number (dimensionless) | AN | 0.925 |

The highest R² value was achieved with acceptance number as the independent variable. Acceptance number is defined as the ability of the solvent to form a hydrogen bond by accepting an electron pair of a donor atom from a solute molecule [25]. Because acceptance number depicts the underlying solute-solvent interaction, it is not surprising that it has the best correlation with the aspect ratio of the ibuprofen crystals.

The predictive power of the acceptance number model was evaluated by comparing model predictions to previous experimental work. Low root mean-squared error (RMSE) values were observed, which indicates that the models predict well. The hydrogen bonding solubility parameter model was used to predict the aspect ratio of ibuprofen in 2-ethoxyethyl acetate, which was not in the original training set. The predicted value (3.4) falls within the interval of the experimental value (3.1 ± 0.5) [24].

The data from this particular study was acquired for the analysis performed in this thesis. The previous analysis related solvent hydrogen bonding properties to ibuprofen aspect ratio, whereas the work in this thesis relates solvent structure to ibuprofen aspect ratio.
2.5 Crystallization Solvent Design Framework

A CAMD framework has been developed for ibuprofen crystallization solvent design by Karunanithi et al. [1]. In this work, a mixed-integer nonlinear programming problem was solved to design an ideal solvent for the crystallization of ibuprofen. Their framework incorporated seven solvent properties in order to design a solvent that is safe for pharmaceutical use and also provides the desired crystal morphology. The seven properties studied within the work are solubility, potential recovery, crystal morphology (estimated using the hydrogen bonding solubility parameter), flammability limit, toxicity, viscosity and liquid state of the solvent [1]. Potential recovery was the property that was maximized within the framework. Within the CAMD framework, group contribution (GC) was used to design optimal solvents for ibuprofen crystallization. The result from this CAMD framework is an overall optimal solvent, methoxymethyl ethoxyacetate, and an optimal solvent among readily available compounds, 2-ethoxyethyl acetate [1]. Both of these molecules are shown in Figure 2.5:

Figure 2.5: Results from CAMD framework for crystallization solvent design (methoxymethyl ethoxyacetate and 2-ethoxyethyl acetate)

The results of this framework were then experimentally verified [3]. Ibuprofen was crystallized using the cooling crystallization method in 2-ethoxyethyl acetate and also in n-hexane for comparison purposes. It has been predicted that ibuprofen crystals grown in n-hexane will have a high aspect ratio [20]. The experimental verification validated the CAMD results in that crystals from 2-ethoxyethyl acetate were considerably larger in size and lower in aspect ratio than those from n-hexane [3]. 2-ethoxyethyl acetate also maximizes potential recovery of ibuprofen among known solvents.
2-ethoxyethyl acetate was selected for use in the test set within this analysis because it was shown to be the optimal solvent for ibuprofen crystallization. The other two solvents chosen for use in the test set were chloroform and decanol. These were chosen because data was available and they were not included in the training set used by Acquah et al. in their study of linear models using hydrogen bonding properties as independent variables [24]. The experimental data for each of the test set solvents was obtained using cooling crystallization by the same research group [26].

Chapter 3: Methodology

As described in Chapter 1, the intent of this work is to develop a QSAR that relates solvent structure to the aspect ratio of ibuprofen crystals grown within that solvent. The original training set included 16 solvents for which experimental aspect ratio data was obtained [24]. The training set was later expanded with 35 additional solvents, for a total of 51 solvents, to widen the applicability domain of the developed QSAR models. The test set contained three solvents for which experimental aspect ratio data was also obtained.

The first step in the method developed in this thesis was to estimate the 3D structure of each molecule within both the training and test sets. Multiple force fields were used to geometry optimize each solvent structure in order to determine the effect of different geometry estimations on the resulting QSAR. Once the structure geometries had been estimated, their descriptor values were calculated. Finally, two different data regression methods were applied to the descriptor matrices to determine a linear relationship between the descriptor values and the aspect ratio of the ibuprofen crystals. Linear regression was utilized in developing the QSAR models because it has been shown that ibuprofen crystal aspect ratio has a linear relationship with solvent properties [24].
Once models were developed, internal and external validation was performed on each to check its fit and its ability to predict ibuprofen crystal aspect ratio.

3.1 Estimating Solvent Geometry

The first step in developing this relationship is to estimate the 3D geometry of the solvent molecules. Estimations were made using the Avogadro molecular modeling software with several force fields, each of which produced a different estimate of the structure of each solvent [27] [28]. The three force fields used for structure estimation were GAFF, Ghemical and MMFF94s, described in Chapter 2. Three force fields were chosen in order to examine how different estimated solvent geometries change the resulting QSAR. In Figure 3.1, n-hexane is optimized using the different force fields. In general, the structures estimated using the Ghemical force field and MMFF94s were similar in geometry, while the structures estimated with GAFF tended to have very distinct geometries. The criterion for choosing force fields was that they be applicable to the solvent molecules in the experimental data matrix as well as to the ibuprofen molecules. While not important within this study, future work includes expanding the models to handle multiple solutes, which would require accurate 3D geometries for those molecules as well.

Figure 3.1: n-Hexane geometry optimized with Avogadro

3.1.1 Optimizing Structures with Avogadro

Each of the sixteen solvent structures was drawn and then geometrically optimized using the three different force fields. Each optimization used 10,000 steps of the steepest descent algorithm with a convergence criterion of 1 × 10^-6. The energetically minimized molecules were saved as *.mol files for input into the E-DRAGON web applet [29].
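The steepest descent procedure described above can be illustrated with a toy sketch. This is not Avogadro's implementation: it minimizes a single harmonic bond-stretch term in one dimension, and the force constant, reference length, step size and starting length are hypothetical values. It also assumes that the 10,000 steps act as an upper limit and that the convergence criterion applies to the change in energy between steps.

```python
def bond_energy(l, k=300.0, l0=1.53):
    """Harmonic bond-stretch energy, E = (k/2)(l - l0)^2 (hypothetical k, l0)."""
    return 0.5 * k * (l - l0) ** 2


def bond_gradient(l, k=300.0, l0=1.53):
    """Derivative of the harmonic bond energy with respect to bond length."""
    return k * (l - l0)


def steepest_descent(l_start, step=1e-3, max_steps=10000, tol=1e-6):
    """Move downhill along the negative gradient until the energy change
    between consecutive steps drops below tol or max_steps is reached."""
    l = l_start
    energy = bond_energy(l)
    for n in range(1, max_steps + 1):
        l_new = l - step * bond_gradient(l)   # step against the gradient
        energy_new = bond_energy(l_new)
        if abs(energy - energy_new) < tol:    # assumed convergence test
            return l_new, energy_new, n
        l, energy = l_new, energy_new
    return l, energy, max_steps


# Start from a deliberately stretched bond (hypothetical value)
l_opt, e_opt, n_steps = steepest_descent(l_start=1.80)
print(l_opt, e_opt, n_steps)
```

A real force field optimization works on all atomic coordinates at once and sums every term of the energy function, but the loop structure (gradient step, energy comparison, step limit) is the same idea.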
The 16 solvents and their two-dimensional structures are shown below in Table 3.1:

Table 3.1: Original 16 solvents and their two-dimensional structures (structure drawings not reproduced)

Acetone
Acetonitrile
t-Amyl alcohol
Benzyl alcohol
Carbon tetrachloride
Cyclohexane
Ethanol
Ethyl acetate
Ethylene glycol
n-Hexane
Isopropanol
Methanol
Methylene dichloride
Propylene glycol
Sulfolane
Toluene

3.2 Calculating Descriptors

The next step in relating solvent structure to crystal aspect ratio was to calculate the descriptor values for each solvent within the data set. E-DRAGON was chosen to calculate these values because it is provided as a free Java web applet and can easily handle the files produced within Avogadro. The applet calculated over 1600 descriptors for each molecule in less than 10 seconds. This was convenient for calculating descriptors for data regression, and will also be advantageous in any CAMD algorithm because candidate molecules' descriptor values can be calculated at a small computational expense relative to geometry optimization.

3.2.1 E-DRAGON

E-DRAGON was chosen to calculate descriptors [29]. Within E-DRAGON, all available descriptors were calculated for each solvent. 1666 descriptors were calculated and then classified according to dimensionality from 0D to 3D and then by descriptor class.
Only 2D and 3D descriptors were kept for this analysis and the classes analyzed are shown in Table 3.2:

Table 3.2: 2D and 3D descriptor classes

| 2D Descriptors | 3D Descriptors |
|---|---|
| Topological descriptors | Randic molecular profiles |
| Walk and path counts | Geometrical descriptors |
| Connectivity indices | RDF descriptors |
| Information indices | 3D-MoRSE descriptors |
| 2D autocorrelations | WHIM descriptors |
| Edge adjacency indices | GETAWAY descriptors |
| Burden eigenvalue indices | |
| Eigenvalue-based indices | |

3.2.2 Eliminating Zero Descriptors

This left 578 2D descriptors and 721 3D descriptors for each solvent. Seven different descriptor lists were then composed. One list contained only 2D descriptors. Three lists (one for each optimization force field implemented) contained only 3D descriptors. The final three lists added the 2D descriptors to each list of 3D descriptors. The next step was to remove from each matrix the descriptors that had values of zero for multiple solvents within the data set, as well as the descriptors that were constant across all solvents. If a descriptor had values of zero for more than eight solvents, that descriptor was removed from the data matrix. Table 3.3 shows the size of each of the data matrices used in the regression stage of the QSAR development.

Table 3.3: Data matrix sizes

| Descriptor Set | Matrix Size (solvents × descriptors) |
|---|---|
| 2D | 16 × 341 |
| 3D GAFF | 16 × 481 |
| 2D & 3D GAFF | 16 × 822 |
| 3D Ghemical | 16 × 490 |
| 2D & 3D Ghemical | 16 × 831 |
| 3D MMFF94s | 16 × 490 |
| 2D & 3D MMFF94s | 16 × 831 |

3.2.3 Normalizing Descriptors

The final step in preparing the data was normalizing it so that no descriptor's importance was artificially inflated or deflated by having a large or small value. The equation for normalization is shown below as Equation 3.1:

$$x_{\text{norm}} = \frac{x - \bar{x}}{s} \qquad (3.1)$$

where $\bar{x}$ and $s$ are the mean and standard deviation of that descriptor across the solvents in the data set. With this normalization procedure, descriptor values that were the same as the mean have a normalized value of zero.
Values that were one standard deviation above the mean have a normalized value of positive one and, conversely, values one standard deviation below the mean are normalized to negative one. The ibuprofen aspect ratio for each solvent was normalized in the same manner. All regression operations and predictions were carried out with the normalized values. Predicted aspect ratios from the QSAR models were therefore also normalized, so Equation 3.1 was applied in reverse to determine the actual predicted value. Validation methods produced the same results whether applied to the normalized values or the actual values.

3.3 Regression and Analysis Methods

Due to the large number of descriptors calculated for each solvent and the small size of the training set, traditional linear regression was not an option for building the QSARs. The data matrix needed to be reduced in size while still capturing the important information within it. The first method used was the Bayesian Information Criterion (BIC) method, which chooses a certain number of descriptors to be used and discards the others such that the BIC value is minimized. Linear regression was then performed using the remaining descriptors to determine the model equation. The second method used was Principal Component Analysis (PCA), which calculates a number of factors that cover a certain percentage of the variance within the original data set. Linear regression was performed using these factors to determine the model equation. Both of the regression methods were applied to the normalized descriptor values.

3.3.1 BIC in JMP®

The BIC algorithm selects variables that increase the likelihood of model fit without overfitting the model. BIC introduces a penalty for the addition of variables into the model. In this way, BIC selects the number of variables that gives the best balance between model fit and model size [30].
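The trade-off between fit and the penalty term can be sketched with a small comparison of candidate models. The sketch assumes the commonly used regression form BIC = n ln(RSS/n) + k ln(n); JMP's internal calculation may differ by constant terms. The candidate models, their parameter counts and their RSS values are hypothetical.

```python
import math


def bic_linear(rss, n, k):
    """BIC for a linear regression model: n*ln(RSS/n) + k*ln(n).

    rss -- residual sum of squares of the fitted model
    n   -- number of observations (solvents)
    k   -- number of fitted parameters
    """
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical candidates: name -> (number of parameters, RSS)
candidates = {"A": (2, 4.0), "B": (5, 3.5), "C": (3, 1.0)}
n_solvents = 16

scores = {name: bic_linear(rss, n_solvents, k)
          for name, (k, rss) in candidates.items()}
best = min(scores, key=scores.get)  # lowest BIC is preferred
print(scores, best)
```

In this hypothetical case, model B fits slightly better than model A but uses three more parameters, so its BIC is higher; model C achieves a much lower RSS with only one extra parameter and is selected.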
The models produced using BIC are in the form of Equation 3.2:

$$\widehat{AR} = b_1 D_1 + b_2 D_2 + \dots + b_n D_n \qquad (3.2)$$

where $b_i$ is a regression coefficient and $D_i$ is a descriptor selected by JMP®.

3.3.1.1 Variable Selection

One of the benefits of the BIC method relative to PCA is that it selects certain descriptors to be used in linear regression, while PCA calculates factors that are linear combinations of all the descriptors within the training set matrix. Therefore a BIC model will require far less computational expense than a PCA model when used in CAMD frameworks. PCA models require each of several hundred descriptors to be calculated, while BIC models only need the descriptor values that appear in the model (14 or fewer in this analysis).

3.3.2 PCA

PCA was also used to create a QSAR relating solvent structure to crystal aspect ratio. PCA is very different from BIC methods because it does not eliminate descriptors from the analysis, but instead transforms the large, highly correlated data matrix into a smaller matrix of uncorrelated factors that are linear combinations of the original data set. Each factor covers a certain percentage of the variance within the original data matrix. The total number of factors is equal to one less than the number of solvents included in each analysis. XLSTAT®, an add-in within Microsoft Excel®, was utilized to carry out all the PCA and linear regression calculations [31].

When PCA is applied to one of the descriptor matrices, it calculates an eigenvalue for each factor that shows how much of the variance in the data that factor covers. These eigenvalues can be graphed on a scree plot, shown below in Figure 3.2:

Figure 3.2: Scree plot

Typically, there is an elbow in the cumulative variability covered, beyond which adding additional factors does not cover much more of the variance in the data set.
In the above plot, which shows the eigenvalue and the cumulative variability (%) for factors F1 through F15, that point appears to be at eight or nine factors. One method of selecting how many factors to use in linear regression is to choose a minimum variability coverage percentage (e.g., 85%).

In order to calculate the factor scores for each solvent within the data set, PCA calculates an eigenvector (loadings) matrix and a scores matrix to decompose the original data matrix. These two matrices are calculated so that the error between the decomposition matrices and the original matrix is minimized. Figure 3.3 below shows how the original data matrix is decomposed into a scores (factors) matrix and a loadings matrix. PCA reduces the large matrix of descriptor variables to a smaller matrix containing far fewer factors than there were original variables. The scores matrix contains the principal component factors that represent the original data.

Figure 3.3: PCA matrix decomposition.

PCA was applied to each of the seven descriptor matrices and seven different factor matrices were produced. The descriptor matrices were different sizes depending on the types of descriptors they contain, but through PCA each scores matrix had the same dimension. The scores matrices contain the independent variables for linear regression. Linear regression was chosen to build the models over nonlinear, power or exponential regression because a strong linear relationship between ibuprofen aspect ratio and many solvent properties has been shown [24].

Principal component regression (PCR) was performed to determine the multivariable linear equations that make up the QSAR models to predict aspect ratio. For each of the seven factor matrices, models were regressed repeatedly, starting with just one factor and then adding factors until the maximum number was reached. XLSTAT's data input system allowed this to be done quickly and easily. This was necessary for the internal and external validation steps. Each of the models produced was in the form of Equation 3.3:

$$\widehat{AR} = b_1 PC_1 + b_2 PC_2 + \dots + b_n PC_n \qquad (3.3)$$

where $b_i$ is a regression coefficient and $PC_i$ is a PC from PCA. Using the eigenvector matrix from Figure 3.3, the factors for the test set solvents can be calculated and inserted into the model equation to determine the predicted aspect ratio for ibuprofen grown in that particular solvent.

3.3.3 Validation

The validation techniques used to determine the predictive ability of the QSAR models include percent error analysis and the coefficient of determination (R²) and predictive squared correlation coefficient (Q²) methods described in Chapter 2. External validation for the models produced using BIC was performed by analyzing the percent error between the experimental values and predicted values within the test set. The experimental aspect ratios and predicted values were also plotted to analyze any trends within the data. In addition to the percent error analysis, external validation was also performed using R² and Q² values for each model developed using the BIC method. The test set for this analysis included 2-ethoxyethyl acetate, chloroform and decanol.

For the PCA models, external and internal validation was performed using the same R² and Q² analysis as with BIC. The external test set used one solvent, 2-ethoxyethyl acetate. It was chosen as the sole solvent for the test set because it was the same solvent used by Acquah et al. for their external validation analysis. For internal validation, isopropanol was removed from the training set to serve as the test set. Isopropanol was chosen because its value was close to the mean value of the training set and, through inspection, it was structurally similar to many solvents within the training set. Cross validation was performed using Q² and R² values.
As shown in Chapter 2, Q² is defined as [17]:

$$Q^2 = 1 - \frac{PRESS}{TSS} \qquad (2.4)$$

where PRESS is the sum of squares of the prediction errors for the test set, and TSS is the total sum of squares, which is the sum of squared deviations from the training set mean. As shown in Chapter 2, R² is defined as [17]:

$$R^2 = 1 - \frac{RSS}{TSS} \qquad (2.5)$$

where RSS is the residual sum of squares from the training set data and TSS is the total sum of squares for the training set data. Just as R² values near one represent a strong correlation between the predicted and experimental values within the training set, a Q² value closer to one represents a stronger predictive model. The Q² equation takes into account the error between the experimental value and the predicted value, but also how far the experimental value is from the mean of the training set. If the values in the test set are closer to the training set mean, then the error between predicted and experimental values needs to be smaller in order to achieve a high Q² value. Conversely, if the experimental values within the test set are further from the training set mean, the error between predicted and experimental values can be larger and still achieve a high Q² value.

It is important for a model to have strong correlation with both the training set and the test set data. A model with a high R² value and a low Q² value fits the training set data well, but is not useful for predictive purposes. A model with a low R² value and a high Q² value does not fit the training set data well but does have some predictive capabilities. The strongest models will have R² values and Q² values very close to each other, and the higher those values are, the more powerful the model is overall.

3.4 Data Set Expansion

The original set of experimental data, with only 16 solvents, lacks the diversity and variability of molecular structures to build a QSAR with strong predictive capabilities.
In order to improve the predictive capabilities of the models developed, more solvents were added to the data set. Solvents were not randomly chosen to fill in the data set. The criteria for a solvent to be added to the data set were that it be a liquid at the temperatures used in the experimental work and that it be similar in structure to the solvents in the original data set. Solvents were also chosen to expand the chemical space covered by the model so that the solvents used for external validation would be within or very close to that chemical space. This is why several long chain alkanes and alcohols were included within the expansion set. A "bridge" was built from the chemical space covered by the original data set to those solvents utilized for external validation. Figure 3.4 shows the original solvents along with the expansion solvents. The lines show the connections between the original data solvents and the expansion solvents.

Figure 3.4: Original solvents and expansion solvents.

As shown in Figure 3.4, 35 new solvents were added to the data set. Since the solvents chosen were structurally very similar to at least one original solvent, an assumption was made that the predicted aspect ratio would be a good approximation of the actual experimental aspect ratio. The predicted aspect ratio for the expansion solvents was calculated using a model constructed using 2D descriptors with the PCA regression method. The maximum number of factors available was used for each linear regression. The model built with only 2D descriptors was chosen because it had a strong correlation to the training set data (R² > 0.95). Instead of arbitrarily choosing one of the force fields for the expansion solvents, 2D descriptors were chosen because their values do not change with geometry optimization.

Adding the additional solvents was an iterative process. One solvent would be added and its aspect ratio value was estimated.
Then that solvent was added to the training set and a prediction was made for the next solvent. This was done repeatedly until all the expansion solvents had been added. Figure 3.5 shows the iterative process repeated for each solvent added.

Figure 3.5: Iterative process to add expansion solvents to training set

Once the expanded data set of 51 solvents had been developed, PCA with linear regression was performed on the data set in a similar manner to that done on the original set of 16 solvents. Models were built using the same seven sets of descriptor values described earlier, and internal and external validation were performed in the same manner.

3.5 Summary

In this chapter, a method has been presented for constructing QSAR models that relate solvent structure to the aspect ratio of ibuprofen crystals grown within them. Three empirical force fields were used to estimate the three-dimensional structure of each solvent, and then a combination of 2D and 3D molecular descriptors was calculated to quantify the solvent structures. These descriptors were then arranged into matrices for regression with aspect ratio data. Two techniques, BIC and PCA, were used to linearly relate the information in the descriptors to the aspect ratio data. External and internal validation were performed to assess the models' ability to fit the training set and test set data. The original solvent data set included 16 solvents, but was expanded to 51 solvents in order to increase the chemical space covered by the models.

Chapter 4: Results and Discussion

In this chapter, the results from each analysis are presented and analyzed. The BIC method was first used to construct a QSAR model that contained a minimal number of descriptors. While the models built using this method were able to achieve a high correlation to the training set aspect ratio values, they were unable to make satisfactory predictions of aspect ratio for solvents in the test set.
In an attempt to develop a model with stronger predictive capabilities, PCA was applied to the same training set data. Most of the models built with this method had strong internal validation with isopropanol as the solvent transferred from the training set to the test set. However, most of the QSARs developed left much to be desired when they were externally validated with 2-ethoxyethyl acetate in the test set. It was hypothesized that expanding the data set to include many more solvents would increase the predictive capabilities of the models developed using PCA. With 51 solvents in the training set, the QSARs developed maintained their strong internal validation with isopropanol as the solvent shifted from the training set to the test set. As with the models constructed with the smaller training set, the external validation with 2-ethoxyethyl acetate in the test set did not show powerful predictive capabilities, although the predictions were improved over the smaller training set.

4.1 Training Set Aspect Ratio Data

The training set solvents displayed in Table 3.1 were utilized for the BIC analysis and the first PCA analysis. The aspect ratios for these 16 solvents were acquired from Acquah et al. and are shown in Table 4.1 [24]:

Table 4.1: 16 solvent training set aspect ratio data

Solvent                 Aspect Ratio
Acetone                 4.27
Acetonitrile            3.01
Benzyl alcohol          2.63
Carbon tetrachloride    4.81
Cyclohexane             5.64
Ethanol                 2.85
Ethyl acetate           4.65
Ethylene glycol         2.20
n-Hexane                7.23
Isopropanol             3.10
Methanol                1.85
Methylene dichloride    3.20
Propylene glycol        3.02
Sulfolane               4.05
t-Amyl alcohol          3.21
Toluene                 4.94

4.2 BIC in JMP®

The BIC models constructed using JMP® software were built using only the original 16 solvent data points. Each model had a very strong correlation with the training set data, with an R² value of 1.00 for each one. However, the predictive capabilities of these models were much poorer than expected.
As shown in this section, this regression method did not produce results reliable and consistent enough to merit using it on the expanded training set with additional solvent molecules added.

Seven different BIC models were constructed using the different sets of descriptors explained in Chapter 3. They all were in the form of Equation 3.2 and each contained a unique set of 14 different descriptors. The 14 descriptors chosen by the program covered all the degrees of freedom in the regression and allowed for the high R² values mentioned earlier. The descriptors selected by JMP® are shown in Table 4.2 below, with the meaning of each descriptor abbreviation shown in Table A.1.

Table 4.2: Descriptors selected for BIC models

2D        3D GAFF   2D & 3D GAFF   3D Ghemical   2D & 3D Ghemical   3D MMFF94s   2D & 3D MMFF94s
Ram       SP01      S2K            SP01          BAC                SP13         SPI
TI2       L/Bw      AAC            SP03          X0v                PJI3         MAXDP
Rww       DISPe     SIC1           MEcc          SIC2               RDF040e      ECC
Jhetv     RDF015m   MATS1v         RDF040e       MATS3p             Mor11m       DECC
Jhete     RDF025m   EPS0           RDF045p       GATS1v             Mor12e       IC2
S1K       Mor32u    ESpm04d        Mor27u        BEHe8              Mor05p       EEig01d
Lop       Mor13p    DISPv          Mor18m        MEcc               E1s          BEHm1
X0v       E1e       Mor03u         Mor01v        RDF035v            H3u          BEHv4
SIC1      L2s       Mor03e         Mor13e        Mor16v             HATS3u       BEHp8
CIC2      G2s       E1e            Mor28p        Mor13e             HATS3m       Mor06p
MATS2p    Ks        G3s            Dm            Gs                 R3v+         E2v
ESpm15u   R3m+      Kv             HATS5v        ISH                R1e+         E1e
ESpm04d   R5e       Kp             R5u           HATS3u             RTe+         R5u+
BEHm4     R3p       R2u            R5u+          R1u                R3p+         RTe+

There is very little overlap in the descriptors selected across the seven different descriptor matrices. This outcome was not expected, because the important interactions between the ibuprofen and the solvents were thought to be the same; the effects of certain descriptors would therefore be magnified over all sets of descriptors. It was expected that similar 2D descriptors would be selected by the program in the model built solely with those descriptors as well as in the models built with 2D descriptors and a set of 3D descriptors. The QSAR model equations constructed via the BIC method in JMP® are included in Table A.2.
4.2.1 External Validation

In order to determine the predictive capabilities of each QSAR model, the models were used to predict the aspect ratio of ibuprofen crystals grown in the test set solvents. The test set for these models included 2-ethoxyethyl acetate, chloroform and decanol. These solvents were chosen for the test set because experimental data was readily available and they were not used in the linear models constructed by Acquah [24]. Figure 4.1 through Figure 4.3 compare the predicted values for each solvent with their experimental values. The solid black lines represent the experimental value and the markers represent the predicted aspect ratio values from the models developed using the BIC method for each set of descriptors.

Figure 4.1: BIC experimental comparison with 2-ethoxyethyl acetate

Figure 4.2: BIC experimental comparison with chloroform

Figure 4.3: BIC experimental comparison with decanol

There is very little consistency between the experimental and predicted AR values. The model constructed with 2D descriptors provided the best predictions for the test set solvents, but it consistently predicted aspect ratios above what was observed in experimental work. The model built using the 3D Ghemical descriptor set makes an accurate prediction for the aspect ratio of ibuprofen grown in chloroform but inaccurate predictions for the other two test set solvents. The same is true of the 3D MMFF94s model, which accurately predicts the aspect ratio of ibuprofen grown in decanol but makes poor predictions for the other solvents.
Table 4.3 below shows the percent errors between the predicted and experimental values for each QSAR model and each solvent within the test set.

Table 4.3: External validation of BIC models

Experimental AR: 2-ethoxyethyl acetate 3.1; chloroform 2.3; decanol 1.98

                     2-Ethoxyethyl acetate      Chloroform                 Decanol
Model                Pred. AR   Percent error   Pred. AR   Percent error   Pred. AR   Percent error
2D                   4.17       34%             3.02       31%             3.69       86%
3D GAFF              4.51       45%             6.24       171%            8.20       314%
2D & 3D GAFF         1.00       68%             3.83       67%             0.98       51%
3D Ghemical          12.68      309%            2.65       15%             5.13       159%
2D & 3D Ghemical     6.22       101%            3.25       41%             5.61       183%
3D MMFF94s           4.17       34%             4.66       103%            1.98       0%
2D & 3D MMFF94s      5.56       79%             3.57       55%             5.71       188%

As shown in Figure 4.1 through Figure 4.3 and Table 4.3, there are no easily seen patterns between model predictions across the different solvents or between the models across the same solvent.

The models described above each used the maximum number of descriptors chosen by the BIC algorithm. The results above are consistent with a model that has been overfitted to the training set data, sacrificing its predictive capabilities. In order to improve the predictive power of models developed with this method, some of the descriptors were removed from the model equations to reduce overfitting. The descriptors were arbitrarily removed from the end of the BIC equation. These models were analyzed using R² and Q² values. The test set for this analysis contained the same three solvents as in the previous analysis. The results of this analysis are shown below in Figure 4.4 through Figure 4.10. The vertical axis scale is fixed from zero to one; therefore any Q² value less than zero is not shown.
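As a worked check on the numbers above, the percent errors and an external Q² for the full 14-descriptor models can be recomputed directly from the tabulated values. This is a sketch under stated assumptions: the values are copied from Table 4.1 and Table 4.3 as printed (so recomputed percentages can differ slightly from the tabulated ones through rounding), and Q² is taken as described in Section 3.3.3, with PRESS over the three test solvents and the total sum of squares measured from the training set mean. The function names are illustrative.

```python
# Training-set aspect ratios from Table 4.1 (16 solvents).
train_ar = [4.27, 3.01, 2.63, 4.81, 5.64, 2.85, 4.65, 2.20,
            7.23, 3.10, 1.85, 3.20, 3.02, 4.05, 3.21, 4.94]

# Experimental test-set values from Table 4.3, in the order
# 2-ethoxyethyl acetate, chloroform, decanol.
experimental = [3.1, 2.3, 1.98]

# Predicted aspect ratios from Table 4.3, same solvent order.
predicted = {
    "2D": [4.17, 3.02, 3.69],
    "3D GAFF": [4.51, 6.24, 8.20],
    "2D & 3D GAFF": [1.00, 3.83, 0.98],
    "3D Ghemical": [12.68, 2.65, 5.13],
    "2D & 3D Ghemical": [6.22, 3.25, 5.61],
    "3D MMFF94s": [4.17, 4.66, 1.98],
    "2D & 3D MMFF94s": [5.56, 3.57, 5.71],
}


def percent_error(pred, exp):
    """Absolute percent error of a prediction relative to the experimental value."""
    return abs(pred - exp) / exp * 100.0


def q2_external(pred, exp, train_mean):
    """Q2 = 1 - PRESS / TSS, with TSS taken about the training set mean."""
    press = sum((p - e) ** 2 for p, e in zip(pred, exp))
    tss = sum((e - train_mean) ** 2 for e in exp)
    return 1.0 - press / tss


train_mean = sum(train_ar) / len(train_ar)

errors = {name: [percent_error(p, e) for p, e in zip(values, experimental)]
          for name, values in predicted.items()}
q2 = {name: q2_external(values, experimental, train_mean)
      for name, values in predicted.items()}

for name in predicted:
    formatted = ", ".join(f"{err:.0f}%" for err in errors[name])
    print(f"{name:18s} errors: {formatted:18s} Q2 = {q2[name]:.2f}")
```

Under these assumptions only the 2D model gives a positive Q² for the three-solvent test set; every other descriptor set gives a negative value.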
While the Q² value was less than zero for most QSAR models constructed using the BIC method, significant improvements were seen when PCA methods were used.

Figure 4.4: 2D BIC Q² external validation

Figure 4.5: 3D GAFF BIC Q² external validation

Figure 4.6: 2D & 3D GAFF BIC Q² external validation

Figure 4.7: 3D Ghemical BIC Q² external validation

Figure 4.8: 2D & 3D Ghemical BIC Q² external validation

Figure 4.9: 3D MMFF94s BIC Q² external validation

Figure 4.10: 2D & 3D MMFF94s BIC Q² external validation

All of the models built using the BIC method were able to achieve high correlation with the training set data. For all the models except those with 3D MMFF94s descriptors, an R² value above 0.9 could be achieved using 10 descriptors in the training set. For the 3D MMFF94s and 2D & 3D MMFF94s descriptor sets, an R² value of 1.00 was achieved with 14 descriptors. The only QSAR that had any positive Q² values was the one built with only 2D descriptors. In that model, the highest Q² value attained was around 0.7 when 13 descriptors were used. In the remainder of the models, the Q² values were negative, which represents very poor predictive capabilities.

It had been hypothesized that eliminating some of the descriptors would remove the overfitting within the models and improve predictive capabilities. Through this analysis, it can be determined that the BIC method of regressing QSARs does not provide adequate predictive capabilities for ibuprofen crystal aspect ratio. Therefore, a different method of analysis is needed to create the desired model.

4.3 PCA Using 16 Solvents

PCA was selected to build the QSARs after the BIC method was unable to produce models that both adequately fit the training set and had strong predictive capabilities. PCA was applied to the training set containing the same 16 solvents as used in the BIC method. The models were in the form of Equation 3.3.
Unlike the BIC method, PCA used all the descriptor values within the training and test sets. In the analyses presented from this point on, chloroform and decanol were removed from the external validation test set. These two solvents were removed because they were possibly far outside the applicability domain of the model. In addition, the study that provided this data only included 2-ethoxyethyl acetate in the test set, so that same test set was used to analyze the PCA models [24]. A strong predictive model will have high R² and Q² values for both internal and external validation at the same number of factors.

4.3.1 Internal Validation

Internal validation was performed on the models constructed with PCA using isopropanol as the solvent to be left out and then used within the test set. As isopropanol was left out of the training set, the models were built using the remaining 15 solvents. The results of this analysis are shown in Figure 4.11 through Figure 4.17. In each of the seven descriptor sets, the R² value increases as more factors are added to the model.

Figure 4.11: 2D PCA Q² internal validation

In the 2D model, the Q² value fluctuates as more factors are added. At the highest R² values, the Q² value is very low. The optimal model using only 2D descriptors contains 11 PC factors.

Figure 4.12: 3D GAFF PCA Q² internal validation

In the 3D GAFF model, the Q² value reaches a plateau at 3 factors and then decreases after 9 factors. At the highest R² values, the Q² value is very low. The optimal model for 3D GAFF descriptors utilizes 9 factors.

Figure 4.13: 2D & 3D GAFF PCA Q² internal validation

In the 2D & 3D GAFF model, the Q² value again plateaus at 3 factors but then decreases once more than 10 factors are added. As would be expected in a model combining the descriptors from the previous two, at the highest R² values the Q² value is very low. The optimal model for 2D & 3D GAFF descriptors is one containing 10 factors.
Figure 4.14: 3D Ghemical PCA Q² internal validation

In the 3D Ghemical model, the Q² value plateaus from 3 factors until 9 factors and then decreases before rising again at 13 factors. The optimal model for 3D Ghemical descriptors is one using 9 factors.

Figure 4.15: 2D & 3D Ghemical PCA Q² internal validation

In the model with 2D & 3D Ghemical descriptors, the Q² value has a similar profile to the model built with only 3D Ghemical descriptors, but the decrease after 10 factors is much smaller. The optimal model for 2D & 3D Ghemical descriptors utilizes 10 factors.

Figure 4.16: 3D MMFF94s PCA Q² internal validation

The model with 3D MMFF94s descriptors has high Q² and R² values as more factors are added. There are no significant decreases with more factors. Optimal models are observed using between 8 and 13 factors.

Figure 4.17: 2D & 3D MMFF94s PCA Q² internal validation

With 2D descriptors added to the 3D MMFF94s model, the results are very similar, with high R² and Q² values as more factors are added and no significant decreases. As with the models using only 3D MMFF94s descriptors, optimal models for 2D & 3D MMFF94s are observed with 7 to 13 factors.

Overall, the 2D, 3D GAFF and 2D & 3D GAFF models validated adequately internally. The 3D Ghemical, 2D & 3D Ghemical, 3D MMFF94s and 2D & 3D MMFF94s models internally validated very strongly, with the MMFF94s models validating especially strongly with isopropanol. Since these models are able to internally validate strongly, the next step in the analysis is to insert isopropanol back into the training set and use 2-ethoxyethyl acetate for external validation.

4.3.2 External Validation

The results of the external validation are not as strong as those from the internal validation. They are presented as Figure 4.18 through Figure 4.24. The vertical axis scale is fixed from zero to one; therefore any Q² value less than zero is not shown.
Figure 4.18: 2D PCA Q² external validation

The R² value of the 2D model increases as the number of factors increases, but the Q² value is only positive for models including 5, 8, and 9 factors.

Figure 4.19: 3D GAFF PCA Q² external validation

The 3D GAFF model has a high R² value with 13 and 14 factors, but the Q² value is very low regardless of the number of factors.

Figure 4.20: 2D & 3D GAFF PCA Q² external validation

The 2D & 3D GAFF model has high R² values at 13 and 14 factors, and also has a high Q² value with 14 factors included. Overall, however, the models that include the GAFF descriptor sets do not internally validate strongly.

Figure 4.21: 3D Ghemical PCA Q² external validation

The 3D Ghemical model once again has high R² values as more factors are added, but only has a positive Q² value with 14 factors, and that value is relatively low.

Figure 4.22: 2D & 3D Ghemical PCA Q² external validation

With the 2D descriptors added to the 3D Ghemical model, the same high R² values are seen. In addition, a high Q² value is seen with 13 and 14 factors. This model also internally validates strongly and is the best QSAR model built using the 16 solvent training set.

Figure 4.23: 3D MMFF94s PCA Q² external validation

The 3D MMFF94s model has a high R² value with 8 through 14 factors, but the Q² value is very low regardless of the number of factors.

Figure 4.24: 2D & 3D MMFF94s PCA Q² external validation

The 2D & 3D MMFF94s model has a high R² value with 7 through 14 factors, but the Q² value is very low regardless of the number of factors.

Only one model is able to validate strongly both internally and externally: the model constructed using 2D & 3D Ghemical descriptors with a high number of factors included (Figure 4.15 and Figure 4.22). This model is able to fit the training set data and also make an accurate prediction for the aspect ratio of ibuprofen grown in 2-ethoxyethyl acetate.
With only one model able to meet both criteria for a strong predictive model, it was hypothesized that increasing the size of the training set would improve both the internal and external validation of models.

4.4 Expanded Training Set Aspect Ratio Data

The expansion solvents displayed in Figure 3.4 along with the 16 solvents shown in Table 3.1 were utilized in the second PCA analysis. The expansion solvents and their corresponding ibuprofen crystal aspect ratios are shown in Table 4.4. The aspect ratios for these solvents were estimated using the method outlined in Section 3.4. The order of the solvents within the table corresponds to the order in which they were estimated. For example, the aspect ratio of ibuprofen grown in butanone was estimated using the original 16 solvents in the training set. This step was then iterated for each expansion solvent as shown in Figure 3.5. In the final iteration, the aspect ratio of ibuprofen grown in 1-nonanol was estimated using the original 16 solvents and all of the other expansion solvents within the training set.

Table 4.4: Expansion solvent training set estimated aspect ratio data

Solvent                    Aspect Ratio
Butanone                   3.97
Diacetone alcohol          3.70
Phenethyl alcohol          1.97
2-Phenylpropanol           1.36
Cycloheptane               4.85
Cyclopentane               5.13
Propyl acetate             4.64
Methyl acetate             4.21
1,3-Butanediol             3.75
2,3-Butanediol             2.88
3-Methylpentane            6.30
2-Methylpentane            6.16
Propanol                   2.81
t-Butanol                  2.97
Methanediol                3.00
1,2-Dichloroethane         3.27
1,1,1-Trichloroethane      3.85
Glycerol                   3.45
1,3-Dihydroxypropane       3.61
1-Pentanol                 4.85
2-Pentanol                 4.09
Ethylbenzene               4.32
Cumene                     3.50
2,2-Dimethyl-1-butanol     3.82
1-Hexanol                  4.84
2-Methyl-1-pentanol        4.25
n-Heptane                  6.89
n-Octane                   5.98
n-Nonane                   5.68
n-Decane                   5.02
1-Ethoxyethyl acetate      4.31
2-Methoxyethyl acetate     4.44
1-Heptanol                 4.03
1-Octanol                  3.77
1-Nonanol                  3.14

4.5 PCA Using 51 Solvents

Using the method outlined in Chapter 3, the training set was expanded with 35 additional solvents to bring the total to 51 solvents.
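The iterative expansion of Section 3.4 and Figure 3.5 can be sketched as a simple loop: fit a model on the current training set, estimate the aspect ratio of the next expansion solvent, append that estimate to the training set, and repeat. The sketch below is an illustration only. It assumes a single hypothetical descriptor and an ordinary least-squares line as a stand-in for the 2D-descriptor PCR model used in the thesis, and the solvent labels and numbers are made up rather than taken from the data set.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_line(model, x):
    """Apply a fitted (intercept, slope) pair to one descriptor value."""
    intercept, slope = model
    return intercept + slope * x


def expand_training_set(training, candidates, fit=fit_line, predict=predict_line):
    """Add candidates one at a time, each with an estimated response.

    training:   list of (name, descriptor, aspect_ratio) tuples
    candidates: list of (name, descriptor) tuples, in the order they are added
    """
    expanded = list(training)
    estimates = []
    for name, x in candidates:
        # Refit on everything gathered so far, including earlier estimates.
        model = fit([row[1] for row in expanded], [row[2] for row in expanded])
        estimate = predict(model, x)
        estimates.append((name, estimate))
        expanded.append((name, x, estimate))
    return expanded, estimates


# Hypothetical values for illustration only (NOT thesis data).
training = [("solvent_A", 0.5, 2.1), ("solvent_B", 1.0, 3.0),
            ("solvent_C", 1.8, 4.2), ("solvent_D", 2.5, 5.9)]
candidates = [("solvent_E", 1.4), ("solvent_F", 3.0), ("solvent_G", 3.6)]

expanded, estimates = expand_training_set(training, candidates)
for name, value in estimates:
    print(f"{name}: estimated aspect ratio {value:.2f}")
```

With this least-squares stand-in, each added point sits exactly on the current fitted line, so refitting returns the same coefficients. That is a property of the simplified sketch; the PCR workflow recomputes the factor scores as rows are added, so it should not be read as a statement about the thesis models.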
The same analysis described in the previous section was performed again with the expanded training set. Isopropanol was removed from the training set and moved to the test set for internal validation. It was then reinserted into the training set for external validation with 2-ethoxyethyl acetate in the test set. Just as with the smaller training set, the ideal model will have high R² and Q² values with the same number of factors within the model equation.

4.5.1 Internal Validation

The results of internal validation are shown in Figure 4.25 through Figure 4.31. The vertical axis scale is fixed from zero to one; therefore any Q² value less than zero is not shown.

Figure 4.25: 2D PCA Q² internal validation with 51 solvents

The 2D QSAR has consistently high R² and Q² values from about 12 factors and higher. However, there is a sharp decline in Q² around 6 factors. Optimal models are seen with 12+ factors.

Figure 4.26: 3D GAFF PCA Q² internal validation with 51 solvents

The 3D GAFF model requires a large number of factors to reach a high R² value and about 12 factors to reach a high Q² value. Similar to the 2D model, there is a decline in Q² around 6 factors. The optimal model using 3D GAFF descriptors is seen with 46 factors, and adequate models are observed with 30+ factors.

Figure 4.27: 2D & 3D GAFF PCA Q² internal validation with 51 solvents

The 2D & 3D GAFF model achieves high R² and Q² values from about 20 factors and higher, making these the optimal models for this descriptor set. There is the same sharp decline around 6 factors as seen in the two previous models.

Figure 4.28: 3D Ghemical PCA Q² internal validation with 51 solvents

The 3D Ghemical model achieves high R² and Q² values from about 16 factors and higher. Following the trend of the previous models, there is a decline in Q² value around 6 factors. Optimal models are seen with 16+ PC factors used for PCR.
Figure 4.29: 2D & 3D Ghemical PCA Q² internal validation with 51 solvents

The 2D & 3D Ghemical model reaches and maintains high R² and Q² values from about 18 factors. Once again, the same sharp decline in Q² value is seen around 6 factors. Similar to the 3D Ghemical models, optimal models include 18+ factors.

Figure 4.30: 3D MMFF94s PCA Q² internal validation with 51 solvents

The 3D MMFF94s model Q² value fluctuates up to a plateau from about 15 to 30 factors and then decreases. The R² value steadily increases as more factors are added. The top models are seen with about 32 factors included.

Figure 4.31: 2D & 3D MMFF94s PCA Q² internal validation with 51 solvents

The 2D & 3D MMFF94s model achieves high R² and Q² values around 20 factors and higher. Similar to most of the previous models, there is a sharp decline in Q² around 6 factors. Optimal models are seen when 24+ factors are used.

Overall, most of the models internally validated very strongly. There was a significant improvement in internal validation when the expansion solvents were added to the training set. The models were able to consistently reach high R² and Q² values as more factors were added. This was not seen in the models built using the original 16 solvents. The models using 3D descriptors only were improved with the addition of 2D descriptors to their data matrix. The next step was to move isopropanol back into the training set and then use 2-ethoxyethyl acetate in the test set for external validation.

4.5.2 External Validation

The results of the external validation analysis with 2-ethoxyethyl acetate in the test set are shown in Figure 4.32 through Figure 4.38. The vertical axis scale is fixed from zero to one; therefore any Q² value less than zero is not shown.

Figure 4.32: 2D PCA Q² external validation with 51 solvents

Similar to the internal validation, the R² value reaches and maintains a very high level with about 14 factors.
The Q² value is very high from 3 to 9 factors and then decreases significantly as more factors are added.

Figure 4.33: 3D GAFF PCA Q² external validation with 51 solvents

In the 3D GAFF model, many factors are needed to reach a high R² value. The Q² value fluctuates at a fairly high level from 14 to 44 factors. Optimal models are observed when 33 to 45 PC factors are used.

Figure 4.34: 2D & 3D GAFF PCA Q² external validation with 51 solvents

In the 2D & 3D GAFF model, the Q² value is initially very high, but has a sharp decline as soon as the R² value reaches very high levels.

Figure 4.35: 3D Ghemical PCA Q² external validation with 51 solvents

In the 3D Ghemical model, the R² value reaches a high level at about 15 factors but the Q² value never reaches a high level.

Figure 4.36: 2D & 3D Ghemical PCA Q² external validation with 51 solvents

In the 2D & 3D Ghemical model, the Q² value is high from 5 to 8 factors, but decreases sharply with more factors. The R² value reaches a high level with about 14 factors and higher.

Figure 4.37: 3D MMFF94s PCA Q² external validation with 51 solvents

In the 3D MMFF94s model, there is a sharp spike in Q² with 5 factors and smaller spikes as more factors are added. However, the R² value only reaches high levels after many factors are added.

Figure 4.38: 2D & 3D MMFF94s PCA Q² external validation with 51 solvents

In the 2D & 3D MMFF94s model, a high Q² level is reached with 5 to 7 factors. However, the R² value only reaches high values after many factors are added.

With internal validation having improved greatly with the addition of the 35 expansion solvents, it was expected that the external validation would show a similar improvement. While the models made better predictions using the expanded training set than with the original training set, they were still not strong predictive models.
While there are certain factor numbers from certain models that do have high R² and Q² values in both internal and external validation, it is desired that the models consistently have those high values. Across many of the models, the Q² value in internal validation is relatively low with fewer factors and then reaches a high plateau as more factors are added. However, the Q² value in external validation is high with fewer factors and then tends to decrease as more factors are added.

4.6 Summary

Ultimately, there is no model type that consistently produces strong internal and external validation results. The QSARs produced using the BIC method appear to be overfitted in that they fit the training set data very well but provide poor, inconsistent predictions for the test set solvents. The results obtained using PCA show improvement over BIC, especially in the internal validation analysis. Almost all of the models were able to produce the desired results with isopropanol with both the original and expanded training sets. Therefore, it can be concluded that isopropanol is well within the descriptor space of both the original and expanded training sets. 2-Ethoxyethyl acetate should produce similar results if it is within the chemical space of the training set. However, it has been shown that 2-ethoxyethyl acetate is not within the chemical space of the original or expanded training sets based upon ibuprofen crystal aspect ratio.

Chapter 5: Conclusions

The final results of the QSAR building technique developed in this project were not as powerful as expected but do show significant promise. The models constructed using BIC methods did not have very strong predictive capabilities with 16 solvents included in the training set. The QSARs constructed with PCA using the 16 solvent training set showed an improvement over the BIC method models, but did not consistently produce strong predictive models.
When the expansion solvents were added to the training set to bring the total to 51 solvents, the QSARs showed significant increases in predictive capabilities but still left much to be desired. However, the improvement seen with a larger training set, especially with internal validation, shows that acquiring more experimental data could provide substantial improvements in predictive capabilities.

5.1 BIC in JMP®

As shown in Section 4.2, the models built with BIC methods were able to fit the training set data very well but did not fit the test set data. Across all of the descriptor sets, the coefficient of determination was 1.00. However, the models were unable to accurately predict the aspect ratio of ibuprofen crystals grown in the solvents within the test set. There did not appear to be any systematic error in the predictions. When descriptors were left out of the QSAR regression in an attempt to increase the predictive power of the model by sacrificing the fit to the training set data, no significant improvements were observed. No particular model produced consistent results for each test set solvent, nor was there any consistency in error for each particular solvent across all descriptor sets.

The BIC method selects a small number of descriptors from the data matrix, and it is possible that the descriptors selected do not provide enough information to make accurate predictions of the aspect ratio of ibuprofen grown in the test set solvents. The information in the descriptors excluded from the regression could be necessary in order to produce QSARs that can accurately predict crystal aspect ratios. It would be preferred that as few descriptors as possible be included in the model, because any CAMD framework would then be less computationally expensive: it would only need to calculate those few descriptors chosen and not the entire set.
While using all of the descriptors in the data matrix adds to the computational expense of any predictive model, it will be necessary in order to capture enough information to create an accurate predictive model. For this reason, PCA was then used to build models: every descriptor is used to calculate the factors, so all of the information in the data matrix is included.

5.2 PCA

Overall, the models built with PCA performed better than the models built using BIC. Because PCA methods use all of the descriptors, they capture all of the relevant information within the data set. They also capture data that may be extraneous and not relevant to the resulting crystal aspect ratio, which can make any resulting CAMD framework more computationally expensive.

5.2.1 16 Solvents

The results of PCA with 16 solvents show much more consistency than the previous models. Internal validation using isopropanol in the test set showed very good results, especially for the models using Ghemical and MMFF94s descriptors. The similarity between these two is not unexpected because the geometry optimized molecules from these two force fields were very similar in shape. Internal validation using GAFF descriptors did not produce the same results, which can likely be attributed to the very different shape of the molecules that were geometry optimized using the GAFF force field. Internal validation of the models that used only 2D descriptors was also not very good, but adding 2D descriptors to the models that used 3D descriptors produced better QSARs.

When isopropanol was returned to the training set and 2-ethoxyethyl acetate was used as the test set for external validation, the results were significantly worse. The models were unable to consistently and correctly predict the aspect ratio of ibuprofen crystals grown in 2-ethoxyethyl acetate.
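The PCA models discussed in this section follow the sequence described in Appendix B: normalize the descriptor matrix, compute eigenvectors, project onto principal component factors, and regress the factors linearly. A minimal sketch of that sequence is given below. It assumes z-score normalization (the exact form of Equation 3.1 is not reproduced in this section), uses NumPy rather than the XLSTAT software cited in the thesis, and runs on random stand-in data. The function names `pcr_fit` and `pcr_predict` are hypothetical.

```python
import numpy as np


def pcr_fit(X, y, n_factors):
    """Principal component regression using the first n_factors factors."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma                          # normalized descriptor matrix (z-scores assumed)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_factors].T                          # eigenvectors of the correlation matrix
    F = Z @ V                                     # principal component factors
    A = np.column_stack([np.ones(len(F)), F])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # linear regression on the factors
    return {"mu": mu, "sigma": sigma, "V": V, "beta": beta}


def pcr_predict(model, X):
    Z = (X - model["mu"]) / model["sigma"]
    return model["beta"][0] + (Z @ model["V"]) @ model["beta"][1:]


def r2(y_obs, y_hat):
    return 1.0 - np.sum((y_obs - y_hat) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)


# Random stand-in data: 16 "solvents", 60 "descriptors" (not thesis data).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(16, 60))
y_train = X_train[:, :3].sum(axis=1) + 0.1 * rng.normal(size=16)
x_test = rng.normal(size=(1, 60))

# Training R2 as a function of the number of factors retained.
r2_by_factors = []
for k in range(1, 16):
    model = pcr_fit(X_train, y_train, k)
    r2_by_factors.append(r2(y_train, pcr_predict(model, X_train)))

y_test_pred = pcr_predict(pcr_fit(X_train, y_train, 5), x_test)
```

Because the factors are mutually orthogonal, the training R² can only rise as factors are added, and with 16 solvents it reaches 1 at 15 factors. The number of factors therefore has to be chosen from validation results rather than from the training fit.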
In external validation, only the 2D & 3D GAFF and the 2D & 3D Ghemical models with a high number of factors made accurate predictions for 2-ethoxyethyl acetate. When the results of internal and external validation are considered together, only one model accurately predicted the crystal aspect ratio in both: the model constructed using 2D & 3D Ghemical optimized descriptors. It can be concluded from the internal validation analysis that isopropanol is firmly within the chemical space outlined by the remainder of the training set solvents. The external validation analysis shows that 2-ethoxyethyl acetate is not within that same chemical space. Therefore, the data set was further populated in an effort to increase the size of the chemical space.

5.2.2 51 Solvents

The expansion solvents were added to the 16 original solvents, giving a training set of 51 solvents. It was hypothesized that populating the data set in this manner would enlarge the chemical space to better include 2-ethoxyethyl acetate and produce better predictive models. The results for internal validation with isopropanol in the test set were as expected. Isopropanol is certainly within the chemical space where these models are applicable. As with the original solvents, the best results were seen with the 2D & 3D Ghemical and 2D & 3D MMFF94s models. Marked improvements were seen with the 2D and 2D & 3D GAFF models as well.

When isopropanol was returned to the training set and 2-ethoxyethyl acetate was added to the test set, an improvement similar to that seen with internal validation was expected, because the chemical space would be expanded to contain the new external test set solvent. The results for external validation were once again disappointing: while improvements were made, they did not match the results seen during internal validation.
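Whether a test solvent lies inside the chemical space of a training set can also be checked numerically. One common approach in QSAR practice, which is not necessarily the approach used in this work, is the leverage of the query point in the space of the model variables, compared against a warning threshold of 3(p + 1)/n for p variables and n training compounds. The sketch below applies that convention to random stand-in factor scores. The function names and data are hypothetical.

```python
import numpy as np


def leverage(F_train, f_query):
    """Leverage of a query point relative to the training factors.
    An intercept column is included, so the training centroid has leverage 1/n."""
    A = np.column_stack([np.ones(len(F_train)), F_train])
    a = np.concatenate(([1.0], f_query))
    return float(a @ np.linalg.solve(A.T @ A, a))


def warning_leverage(F_train):
    """Common rule-of-thumb threshold h* = 3(p + 1)/n (an assumed convention)."""
    n, p = F_train.shape
    return 3.0 * (p + 1) / n


# Random stand-in factor scores for 16 training solvents and 2 factors.
rng = np.random.default_rng(2)
F_train = rng.normal(size=(16, 2))

centroid = F_train.mean(axis=0)                     # a point well inside the domain
far_point = centroid + 10.0 * F_train.std(axis=0)   # a point far outside the domain

h_star = warning_leverage(F_train)
h_centroid = leverage(F_train, centroid)
h_far = leverage(F_train, far_point)

print(f"h* = {h_star:.4f}, centroid = {h_centroid:.4f}, far point = {h_far:.2f}")
```

A query solvent with leverage above the threshold is being extrapolated, so a poor prediction for it would be consistent with it lying outside the applicability domain.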
For the expanded training set, the 3D GAFF model was unexpectedly the only one to achieve high R² and Q² values in both internal and external validation. From the analysis of the expanded data set, it can be concluded that isopropanol is still well within the chemical space of the rest of the training set. However, while 2-ethoxyethyl acetate is closer to being within the chemical space of the training set with the expansion solvents added, it still falls outside that chemical space.

5.3 Impact of Geometry Optimization Force Fields

The shapes of the molecules produced through geometry optimization showed a sharp contrast between the predictions of the GAFF force field and those of the Ghemical and MMFF94s force fields. This can be seen very clearly in Figure 3.1 in the different shapes of the n-hexane molecule. Because of this, any difference that appeared in the GAFF models versus the Ghemical and MMFF94s models was not unexpected. With the three-dimensional geometries being so different, the descriptor values would necessarily be different, and that difference would propagate through the QSAR development process. What was unexpected was the difference in descriptors chosen using BIC methods between the Ghemical and MMFF94s models. There was no overlap between the descriptors selected by JMP® to minimize the BIC value.

In the PCA models, there was still a difference between the GAFF models and the Ghemical and MMFF94s models, as seen in Figure 4.12 through Figure 4.38. There is, however, a similarity between the shapes of the internal and external validation curves for the Ghemical and MMFF94s models. None of the three force fields distinguishes itself from the others in leading to more accurate QSAR models for ibuprofen crystallization.
However, the variation in results observed with the three different force fields highlights the importance of selecting a force field that can provide accurate models for the molecules of interest.

5.4 Summary

Overall, the methodology presented in this work shows potential for constructing QSAR models that can relate solvent structure to crystal aspect ratio. The QSAR models developed show a good ability to match training set data, but that same ability is not seen when the predictive capabilities are measured using the test set solvents. When the methodology is applied to this particular set of experimental data, the chemical space of the resulting models is relatively small and does not appear to include any of the solvents within the test set. For this methodology to produce strong predictive models, the applicability domain will need to be expanded; this need is discussed further in Chapter 6. While not able to indicate an optimal molecular mechanics force field, the results do underline the significance of choosing a force field that can accurately and consistently estimate the three-dimensional shapes of the molecules studied.

Chapter 6: Future Work

In this chapter, future work that could be executed using this thesis as a basis is presented. There are many different directions that this research could take. The first possibility would be to expand the training set used to build the QSARs by including more experimental work. Improvements were observed with the addition of the expansion solvents, even though their aspect ratios were only estimates. Expanding the chemical space within the training set would lead to a stronger predictive model. The second possibility for future work would be to implement a new strategy for regressing the current data. There are other data regression techniques, such as the genetic algorithm procedure, that could be used to create a better predictive model from the existing ibuprofen crystal aspect ratio data.
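As a rough illustration of what genetic algorithm variable selection involves, the sketch below evolves small subsets of descriptors and scores each subset by the leave-one-out Q² of an ordinary least-squares model. Every design choice here is an assumption made for illustration: the fixed subset size, the crossover and mutation operators, the fitness function, and the random stand-in data. It is not the procedure of Gramatica and Papa [32], and the function names are hypothetical.

```python
import random

import numpy as np


def loo_q2(X, y, cols):
    """Leave-one-out cross-validated Q2 of an OLS model on the chosen columns."""
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ beta) ** 2
    return float(1.0 - press / np.sum((y - y.mean()) ** 2))


def ga_select(X, y, k=3, pop_size=30, generations=25, p_mut=0.3, seed=0):
    """Evolve k-descriptor subsets; returns the best subset and the best score per generation."""
    rng = random.Random(seed)
    n_desc = X.shape[1]
    cache = {}

    def fitness(cols):
        if cols not in cache:
            cache[cols] = loo_q2(X, y, cols)
        return cache[cols]

    pop = [tuple(sorted(rng.sample(range(n_desc), k))) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(set(pop), key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: max(1, pop_size // 2)]
        children = [ranked[0]]                                     # elitism: keep the best subset
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            child = set(rng.sample(sorted(set(a) | set(b)), k))   # crossover: draw from both parents
            if rng.random() < p_mut:                               # mutation: swap one descriptor
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([j for j in range(n_desc) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = children
    best = max(set(pop), key=fitness)
    history.append(fitness(best))
    return best, history


# Random stand-in data: 16 "solvents", 40 "descriptors", 3 of which matter.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(16, 40))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + X[:, 29] + 0.05 * data_rng.normal(size=16)

best, history = ga_select(X, y)
print("selected descriptor columns:", best, "LOO Q2:", round(history[-1], 3))
```

Scoring subsets by a cross-validated statistic rather than by the training fit is one way to discourage the overfitting seen with the BIC models, although with only 16 solvents the risk of chance correlations remains.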
A third possibility is to expand the QSAR model to include multiple solute molecules, which would increase its usefulness in crystallization solvent design for novel drug molecules.

6.1 Acquiring Additional Experimental Data

The work presented in this thesis was inspired by a study of the relationship between ibuprofen crystal aspect ratio and hydrogen bonding properties of the solvents in which ibuprofen is crystallized [24]. In that study, ibuprofen was crystallized in 16 different organic solvents, resulting in a relatively small training set for this analysis. The results of the analysis using that small training set were not as good as desired, and an improvement was seen when the 35 expansion solvents were added to it. The aspect ratios of the expansion solvents were estimated using the 16 solvent 2D model. With 51 solvents in the training set, the predictive capabilities of the developed QSARs increased, but the results were once again not as good as desired. However, the progress seen with a larger training set shows that models with better predictive power can be developed if more experimental data can be acquired.

The cooling crystallization experimental methods used by Acquah et al. are relatively simple and could be replicated in most chemical labs [24]. It is recommended that the solvents used to populate the data set be included in the additional experimental work. First, these solvents were selected with the intent of increasing the applicable chemical space of the developed QSARs, and that would be better achieved with actual laboratory experiments than with the assumptions made in this work. Second, the method of populating the original data set could be validated if these solvents are used in ibuprofen crystallization experiments. Having adequate experimental data is crucial to constructing reliable predictive QSAR models.
The results presented in this thesis would be significantly improved if more data were available for constructing the models.

6.2 Genetic Algorithm QSAR Models

There are many different techniques available to construct QSAR models from molecular descriptor data. One method that may be able to create better models is the Genetic Algorithm (GA) procedure for variable selection. Like BIC methods, GA can select the most relevant variables that describe a data set, and those variables can then be regressed into a QSAR model. This method was successfully implemented by Gramatica and Papa to create QSAR models of bioconcentration factors using theoretical molecular descriptors [32]. The GA procedure can take thousands of molecular descriptors and select a very small number of them that best model the training set data. Gramatica and Papa used the GA procedure to select the five most important descriptors from an input of 1150. The QSAR models they developed achieved R² and Q² values over 0.80 in external and internal validation [32]. The disadvantage of using PCA for QSAR construction is that every descriptor remains in the model, so the calculations are much more complex than when variable selection methods are employed. The GA procedure could also provide clues about the solute-solvent interactions between ibuprofen and the solvents in which it is crystallized.

6.3 Expanding from Single Solute to Multiple Solute

Once a QSAR has been developed that can accurately predict the aspect ratio of ibuprofen crystals, the next step in making the model useful for designing solvents for novel drug crystallization is expanding the model to handle additional solute molecules. This would require additional experimental work using the same solvents with various solute molecules. The method presented in Chapters 3-5 is better suited to a multiple solute model because it takes into account more than one solvent property.
It has been shown that models built using only the hydrogen bonding solubility parameter as the independent variable do not make accurate predictions for the aspect ratios of carboxylic acid crystals other than ibuprofen [22]. Because the method presented in this thesis incorporates the entire solvent structure into the model, it may be better suited for predicting aspect ratio for many different solute molecules. Because crystal aspect ratio is determined by the nature of the solute-solvent interaction, it is hypothesized that other NSAID molecules within the same propionic acid derivative family could interact with organic solvents in the same manner as ibuprofen. Fenoprofen, flurbiprofen, ketoprofen, and naproxen are all within this class of drug molecules. The structures of these molecules, and of ibuprofen for comparison, are shown in Figure 6.1.

Figure 6.1: NSAIDs within the propionic acid derivative family (structures of fenoprofen, flurbiprofen, ketoprofen, naproxen, and ibuprofen)

Each of these NSAID molecules has the same propionic acid group attached to an aromatic ring system. This structural similarity could mean that their interactions with crystallization solvents are quite similar to those of ibuprofen, and therefore that the aspect ratio of their crystals could be accurately predicted with the same QSAR. This hypothesis would need to be evaluated using the same crystallization process used for ibuprofen. Having a QSAR model that can accommodate multiple solutes is crucial for using it within a CAMD framework, especially for pharmaceuticals. Expanding the chemical space for solutes to accommodate any aromatic organic compound with a propionic acid group attached would allow the model to be applied to any drug molecule that contains that structure.
6.4 Summary

The results of this QSAR development algorithm show that while the method is promising, additional work will be needed to construct a model that can accurately model ibuprofen crystallization characteristics and also have strong predictive capabilities. Expanding the chemical applicability domain through additional experimental data would be one step to improve future models. Different regression techniques could also be employed in order to produce more accurate and predictive models. In any case, once an effective QSAR model is developed, it has been shown that it can be used within a CAMD framework to design optimal solvents for many crystallization processes.

References

[1] Karunanithi, A., Achenie, L., and Gani, R. (2006). A computer-aided molecular design framework for crystallization solvent design. Chemical Engineering Science, 61(4), 1247–1260. doi:10.1016/j.ces.2005.08.031
[2] Rasenack, N., and Müller, B. (2002). Ibuprofen crystals with optimized properties. International journal of pharmaceutics, 245, 9–24. Retrieved from http://www.sciencedirect.com/science/article/pii/S0378517302002946
[3] Karunanithi, A. T., Acquah, C., Achenie, L. E. K., Sithambaram, S., Suib, S. L., and Gani, R. (2007). An experimental verification of morphology of ibuprofen crystals from CAMD designed solvent. Chemical Engineering Science, 62(12), 3276–3281. doi:10.1016/j.ces.2007.02.017
[4] Shuler, M. L., and Kargi, F. (2002). Bioprocess engineering, basic concepts. (2nd ed., pp. 8–9). New Jersey: Pearson College Div.
[5] Leach, A. (2001). Molecular modelling: Principles and applications. (2nd ed., pp. 165–167). Harlow, England: Pearson Education Limited.
[6] Wang, J., Wolf, R., Caldwell, J., Kollman, P., and Case, D. (2004). Development and testing of a general amber force field. Journal of Computational Chemistry, 25(9), 1157–1174. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/jcc.20035/full
[7] Halgren, T. (1996). Merck molecular force field. I.
Basis, form, scope, parameterization, and performance of MMFF94. Journal of computational chemistry, 17(5-6), 490–519. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1096-987X(199604)17:5/6%3C490::AID-JCC1%3E3.0.CO;2-P/full
[8] Halgren, T. (1999). MMFF VI. MMFF94s option for energy minimization studies. Journal of Computational Chemistry, 20(7), 720–729. Retrieved from http://doi.wiley.com/10.1002/%28SICI%291096-987X%28199905%2920%3A7%3C720%3A%3AAID-JCC7%3E3.0.CO%3B2-X
[9] Nettles, J., Jenkins, J., and Bender, A. (2006). Bridging chemical and biological space: "target fishing" using 2D and 3D molecular descriptors. Journal of Medicinal Chemistry, 49(23), 6802–6810. Retrieved from http://pubs.acs.org/doi/abs/10.1021/jm060902w
[10] Gavernet, L., Talevi, A., Castro, E. A., and Bruno-Blanch, L. E. (2008). A Combined Virtual Screening 2D and 3D QSAR Methodology for the Selection of New Anticonvulsant Candidates from a Natural Product Library. QSAR and Combinatorial Science, 27(9), 1120–1129. doi:10.1002/qsar.200730055
[11] Helguera, A., Cordeiro, M., Gonzalez, M., Perez, M., Ruiz, R., and Castillo, Y. (2007). QSAR modeling for predicting carcinogenic potency of nitroso-compounds using 0D-2D molecular descriptors. 11th International Electronic Conference on Synthetic Organic Chemistry. Retrieved from https://usc.es/congresos/ecsoc/11/hall_gCC/g003/g003.pdf
[12] Burden, F. R., and Winkler, D. A. (2009). Optimal Sparse Descriptor Selection for QSAR Using Bayesian Methods. QSAR and Combinatorial Science, 28(6-7), 645–653. doi:10.1002/qsar.200810173
[13] Schwarz, G. (1978). Estimating the dimension of a model. The annals of statistics, 6(2), 461–464. Retrieved from http://projecteuclid.org/euclid.aos/1176344136
[14] Akaike, H. (1974). A new look at the statistical model identification. Automatic Control, IEEE Transactions on, 19(6), 716–723. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1100705
[15] Bogdan, M., Ghosh, J.
K., and Doerge, R. W. (2004). Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci. Genetics, 167(2), 989–99. doi:10.1534/genetics.103.021683
[16] Lauria, A., Ippolito, M., and Almerico, A. M. (2009). Combined Use of PCA and QSAR/QSPR to Predict the Drugs Mechanism of Action. An Application to the NCI ACAM Database. QSAR and Combinatorial Science, 28(4), 387–395. doi:10.1002/qsar.200810062
[17] Consonni, V., Ballabio, D., and Todeschini, R. (2010). Evaluation of model predictive ability by external validation techniques. Journal of Chemometrics, 24(3-4), 194–201. doi:10.1002/cem.1290
[18] University of Oxford Department of Chemistry. (2002, December). Ibuprofen. Retrieved May 23, 2013, from http://www.chem.ox.ac.uk/mom/ibuprofen/ibuprofen.html
[19] Cann, M. C., and Connelly, M. E. (2000). Real World Cases in Green Chemistry. Washington, DC: American Chemical Society.
[20] Winn, D., and Doherty, M. (1998). A new technique for predicting the shape of solution-grown organic crystals. AIChE journal, 44(11), 2501–2514. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/aic.690441117/abstract
[21] Winn, D., and Doherty, M. (2000). Modeling crystal shapes of organic materials grown from solution. AIChE journal, 46(7), 1348–1367. Retrieved from http://dx.doi.org/10.1002/aic.690460709
[22] Karunanithi, A. T., Acquah, C., Achenie, L. E. K., Sithambaram, S., and Suib, S. L. (2009). Solvent design for crystallization of carboxylic acids. Computers and Chemical Engineering, 33(5), 1014–1021. doi:10.1016/j.compchemeng.2008.11.003
[23] Storey, R. A. (1997). The Nucleation, Growth and Solid-State Properties of Particulate Pharmaceuticals. Ph.D. Thesis, University of Bradford.
[24] Acquah, C., Karunanithi, A., Cagnetta, M., Achenie, L., and Suib, S. (2009). Linear models for prediction of ibuprofen crystal morphology based on hydrogen bonding propensities. Fluid Phase Equilibria, 277(1), 73–80.
doi:10.1016/j.fluid.2008.11
[25] Marcus, Y. (1998). The Properties of Solvents. Chichester, New York: Wiley.
[26] Acquah, C. (2008). Quantitative indices for tuning the crystal morphology of carboxylic acids. Ph.D. Thesis, University of Connecticut.
[27] Avogadro: an open-source molecular builder and visualization tool. Version 1.1.0. http://avogadro.openmolecules.net/
[28] Hanwell, M., Curtis, D., Lonie, D., Vandermeersch, T., Zurek, E., and Hutchinson, G. (2012). Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. Journal of cheminformatics, 4(1), 17. doi:10.1186/1758-2946-4-17
[29] Tetko, I. V., Gasteiger, J., Todeschini, R., Mauri, A., Livingstone, D., Ertl, P., Palyulin, V. A., et al. (2005). Virtual computational chemistry laboratory: design and description. Journal of computer-aided molecular design, 19(6), 453–63. doi:10.1007/s10822-005-8694-y
[30] JMP, Version 9.0. SAS Institute Inc., Cary, NC, 1989-2012.
[31] XLSTAT, Version 2013.2.04. Addinsoft, New York, 1995-2013.
[32] Gramatica, P., and Papa, E. (2003). QSAR modeling of bioconcentration factor by theoretical molecular descriptors. QSAR and Combinatorial Science, 22(3), 374–385. doi:10.1002/qsar.200390027

Appendices

A. BIC Method

Table A.1 shows the descriptors chosen by JMP® that result in models with minimum BIC values [30]. The class, symbol, and name of each descriptor are shown below for the seven descriptor sets employed in the construction of the QSAR models.
Table A.1: Descriptors used in BIC method regression

2D
Class | Symbol | Name
Topographical | Ram | ramification index
Topographical | TI2 | second Mohar index TI2
Topographical | Rww | reciprocal hyper-detour index
Topographical | Jhetv | Balaban-type index from van der Waals weighted distance matrix
Topographical | Jhete | Balaban-type index from electronegativity weighted distance matrix
Topographical | S1K | 1-path Kier alpha-modified shape index
Topographical | Lop | Lopping centric index
Connectivity | X0v | valence connectivity index chi-0
Information | SIC1 | structural information content (neighborhood symmetry of 1-order)
Information | CIC2 | complementary information content (neighborhood symmetry of 2-order)
2D Autocorrelations | MATS2p | Moran autocorrelation - lag 2 / weighted by atomic polarizabilities
Edge Adjacency | ESpm15u | Spectral moment 15 from edge adj. matrix
Edge Adjacency | ESpm04d | Spectral moment 04 from edge adj. matrix weighted by dipole moments
Burden Eigenvalue | BEHm4 | highest eigenvalue n. 4 of Burden matrix / weighted by atomic masses

3D GAFF
Class | Symbol | Name
Randic Molecular Profiles | SP01 | shape profile no. 01
Geometrical | L/Bw | length-to-breadth ratio by WHIM
Geometrical | DISPe | d COMMA2 value / weighted by atomic Sanderson electronegativities
RDF | RDF015m | Radial Distribution Function - 1.5 / weighted by atomic masses
RDF | RDF025m | Radial Distribution Function - 2.5 / weighted by atomic masses
3D-MoRSE | Mor32u | 3D-MoRSE - signal 32 / unweighted
3D-MoRSE | Mor13p | 3D-MoRSE - signal 13 / weighted by atomic polarizabilities
WHIM | E1e | 1st component accessibility directional WHIM index / weighted by atomic Sanderson electronegativities
WHIM | L2s | 2nd component size directional WHIM index / weighted by atomic electrotopological states
WHIM | G2s | 2nd component symmetry directional WHIM index / weighted by atomic electrotopological states
WHIM | Ks | K global shape index / weighted by atomic electrotopological states
GETAWAY | R3m+ | R maximal autocorrelation of lag 3 / weighted by atomic masses
GETAWAY | R5e | R autocorrelation of lag 5 / weighted by atomic Sanderson electronegativities
GETAWAY | R3p | R autocorrelation of lag 3 / weighted by atomic polarizabilities

2D & 3D GAFF
Class | Symbol | Name
Topographical | S2K | 2-path Kier alpha-modified shape index
Information | AAC | mean information index on atomic composition
Information | SIC1 | structural information content (neighborhood symmetry of 1-order)
2D Autocorrelations | MATS1v | Moran autocorrelation - lag 1 / weighted by atomic van der Waals volumes
Edge Adjacency | EPS0 | edge connectivity index of order 0
Edge Adjacency | ESpm04d | Spectral moment 04 from edge adj. matrix weighted by dipole moments
Geometrical | DISPv | d COMMA2 value / weighted by atomic van der Waals volumes
3D-MoRSE | Mor03u | 3D-MoRSE - signal 03 / unweighted
3D-MoRSE | Mor03e | 3D-MoRSE - signal 03 / weighted by atomic Sanderson electronegativities
WHIM | E1e | 1st component accessibility directional WHIM index / weighted by atomic Sanderson electronegativities
WHIM | G3s | 3rd component symmetry directional WHIM index / weighted by atomic electrotopological states
WHIM | Kv | K global shape index / weighted by atomic van der Waals volumes
WHIM | Kp | K global shape index / weighted by atomic polarizabilities
GETAWAY | R2u | R autocorrelation of lag 2 / unweighted

3D Ghemical
Class | Symbol | Name
Randic Molecular Profiles | SP01 | shape profile no. 01
Randic Molecular Profiles | SP03 | shape profile no. 03
Geometrical | MEcc | molecular eccentricity
RDF | RDF040e | Radial Distribution Function - 4.0 / weighted by atomic Sanderson electronegativities
RDF | RDF045p | Radial Distribution Function - 4.5 / weighted by atomic polarizabilities
3D-MoRSE | Mor27u | 3D-MoRSE - signal 27 / unweighted
3D-MoRSE | Mor18m | 3D-MoRSE - signal 18 / weighted by atomic masses
3D-MoRSE | Mor01v | 3D-MoRSE - signal 01 / weighted by atomic van der Waals volumes
3D-MoRSE | Mor13e | 3D-MoRSE - signal 13 / weighted by atomic Sanderson electronegativities
3D-MoRSE | Mor28p | 3D-MoRSE - signal 28 / weighted by atomic polarizabilities
WHIM | Dm | D total accessibility index / weighted by atomic masses
GETAWAY | HATS5v | leverage-weighted autocorrelation of lag 5 / weighted by atomic van der Waals volumes
GETAWAY | R5u | R autocorrelation of lag 5 / unweighted
GETAWAY | R5u+ | R maximal autocorrelation of lag 5 / unweighted

2D & 3D Ghemical
Class | Symbol | Name
Topographical | BAC | Balaban centric index
Connectivity | X0v | valence connectivity index chi-0
Information | SIC2 | structural information content (neighborhood symmetry of 2-order)
2D Autocorrelations | MATS3p | Moran autocorrelation - lag 3 / weighted by atomic polarizabilities
2D Autocorrelations | GATS1v | Geary autocorrelation - lag 1 / weighted by atomic van der Waals volumes
Burden Eigenvalue | BEHe8 | highest eigenvalue n. 8 of Burden matrix / weighted by atomic Sanderson electronegativities
Geometrical | MEcc | molecular eccentricity
RDF | RDF035v | Radial Distribution Function - 3.5 / weighted by atomic van der Waals volumes
3D-MoRSE | Mor16v | 3D-MoRSE - signal 16 / weighted by atomic van der Waals volumes
3D-MoRSE | Mor13e | 3D-MoRSE - signal 13 / weighted by atomic Sanderson electronegativities
WHIM | Gs | G total symmetry index / weighted by atomic electrotopological states
GETAWAY | ISH | standardized information content on the leverage equality
GETAWAY | HATS3u | leverage-weighted autocorrelation of lag 3 / unweighted
GETAWAY | R1u | R autocorrelation of lag 1 / unweighted

3D MMFF94s
Class | Symbol | Name
Randic Molecular Profiles | SP13 | shape profile no. 13
Geometrical | PJI3 | 3D Petitjean shape index
RDF | RDF040e | Radial Distribution Function - 4.0 / weighted by atomic Sanderson electronegativities
3D-MoRSE | Mor11m | 3D-MoRSE - signal 11 / weighted by atomic masses
3D-MoRSE | Mor12e | 3D-MoRSE - signal 12 / weighted by atomic Sanderson electronegativities
3D-MoRSE | Mor05p | 3D-MoRSE - signal 05 / weighted by atomic polarizabilities
WHIM | E1s | 1st component accessibility directional WHIM index / weighted by atomic electrotopological states
GETAWAY | H3u | H autocorrelation of lag 3 / unweighted
GETAWAY | HATS3u | leverage-weighted autocorrelation of lag 3 / unweighted
GETAWAY | HATS3m | leverage-weighted autocorrelation of lag 3 / weighted by atomic masses
GETAWAY | R3v+ | R maximal autocorrelation of lag 3 / weighted by atomic van der Waals volumes
GETAWAY | R1e+ | R maximal autocorrelation of lag 1 / weighted by atomic Sanderson electronegativities
GETAWAY | RTe+ | R maximal index / weighted by atomic Sanderson electronegativities
GETAWAY | R3p+ | R maximal autocorrelation of lag 3 / weighted by atomic polarizabilities

2D & 3D MMFF94s
Class | Symbol | Name
Topographical | SPI | superpendentic index
Topographical | MAXDP | maximal electrotopological positive variation
Topographical | ECC | eccentricity
Topographical | DECC | eccentric
Information | IC2 | information content index (neighborhood symmetry of 2-order)
Edge Adjacency | EEig01d | Eigenvalue 01 from edge adj. matrix weighted by dipole moments
Burden Eigenvalue | BEHm1 | highest eigenvalue n. 1 of Burden matrix / weighted by atomic masses
Burden Eigenvalue | BEHv4 | highest eigenvalue n. 4 of Burden matrix / weighted by atomic van der Waals volumes
Burden Eigenvalue | BEHp8 | highest eigenvalue n. 8 of Burden matrix / weighted by atomic polarizabilities
3D-MoRSE | Mor06p | 3D-MoRSE - signal 06 / weighted by atomic polarizabilities
WHIM | E2v | 2nd component accessibility directional WHIM index / weighted by atomic van der Waals volumes
WHIM | E1e | 1st component accessibility directional WHIM index / weighted by atomic Sanderson electronegativities
GETAWAY | R5u+ | R maximal autocorrelation of lag 5 / unweighted
GETAWAY | RTe+ | R maximal index / weighted by atomic Sanderson electronegativities

Table A.2 shows the equations for each of the seven descriptor sets. They are linear equations in the form of Equation 3.2. These models minimize the BIC value calculated using Equation 2.2.

Table A.2: Equations developed with BIC method regression
[Columns: Descriptor Class; Equation to calculate normalized AR. Rows: 2D; 3D GAFF; 2D & 3D GAFF; 3D Ghemical; 2D & 3D Ghemical; 3D MMFF94s; 2D & 3D MMFF94s. The equation bodies did not survive text extraction.]

B. PCA Method

In this section, the matrices used in PCA are shown. The matrices shown in Table B.1 through Table B.3 are from the PCA analysis using 16 solvents and the 2D descriptor set.
The same procedure is used for the other six descriptor sets and also when the training set is expanded to 51 solvents. Table B.1 contains the normalized (Equation 3.1) descriptor values for each solvent. Table B.2 contains the eigenvectors that are used in the calculation of the principal component factors shown in Table B.3. These factors are then linearly regressed to produce the QSAR model.

Table B.1: 2D descriptor matrix
Columns, in order: Acetone, Acetonitrile, Benzyl Alcohol, Carbon Tetrachloride, Cyclohexane, Ethanol, Ethyl Acetate, Ethylene Glycol, Hexane, Isopropanol, Methanol, Methylene Dichloride, Propylene Glycol, Sulfolane, t-Amyl Alcohol, Toluene
ZM1 -0.504 -1.094 1.659 0.283 0.676 -1.094 0.283 -0.701 0.086 -0.504 -1.487 -1.094 -0.111 1.659 0.676 1.266
ZM1V 0.242 -0.196 1.555 -1.057 -0.853 -0.634 1.701 0.387 -1.072 -0.415 -0.780 -1.539 0.606 1.498 0.023 0.533
ZM2 -0.592 -1.030 1.776 0.022 0.723 -1.030 0.197 -0.680 0.022 -0.592 -1.293 -1.030 -0.153 1.776 0.548 1.337
ZM2V 0.102 -0.262 2.193 -0.787 -0.262 -0.808 1.647 -0.262 -0.626 -0.398 -1.126 -1.212 0.193 -0.323 0.374 1.556
Qindex -0.364 -0.894 1.225 0.695 0.695 -0.894 -0.364 -0.894 -0.894 -0.364 -0.894 -0.894 -0.364 2.285 0.695 1.225
SNar -0.652 -0.906 1.941 -0.473 1.256 -0.906 0.212 -0.473 0.392 -0.652 -1.338 -0.906 -0.220 1.256 -0.041 1.509
HNar -0.666 -0.666 1.569 -0.749 2.102 -0.666 -0.026 -0.206 0.372 -0.666 -1.358 -0.666 -0.306 0.880 -0.448 1.500
GNar -0.583 -0.776 1.539 -0.569 1.784 -0.776 0.099 -0.244 0.355 -0.583 -1.676 -0.776 -0.185 1.130 -0.244 1.504
Xt 0.664 1.415 -0.910 0.345 -0.746 1.415 -0.308 0.345 -0.409 0.664 -2.235 1.415 0.035 -0.746 -0.129 -0.813
Dz -0.541 -1.251 1.731 0.121 0.311 -1.109 0.879 -0.257 0.311 -0.541 -1.677 -1.204 0.311 1.447 0.595 0.879
Ram 0.323 -0.968 0.323 1.614 -0.968 -0.968 0.323 -0.968 -0.968 0.323 -0.968 -0.968 0.323 1.614 1.614 0.323
Pol -0.891 -0.891 2.328 -0.891 0.489 -0.891 0.489 -0.431 0.489 -0.891 -0.891 -0.891 0.029 0.948 0.489 1.408
LPRS -0.614 -1.094 1.932 -0.103 0.531 -1.094 0.671 -0.547 0.755 -0.614 -1.550 -1.094 -0.010 1.099 0.544 1.187
VDA -0.647 -1.075 2.034 -0.204 0.402 -1.075 0.791 -0.531 1.024 -0.647 -1.463 -1.075 -0.018 0.902 0.480 1.102
MSD -0.006 1.073 -1.049 -0.737 -0.897 1.073 -0.220 0.627 0.235 -0.006 2.232 1.073 -0.184 -1.334 -0.817 -1.066
SMTI -0.680 -0.935 2.426 -0.324 0.542 -0.935 0.364 -0.655 0.491 -0.680 -1.088 -0.935 -0.273 1.153 0.211 1.318
SMTIV -0.416 -0.697 2.607 -0.785 -0.223 -0.816 0.962 -0.208 -0.253 -0.608 -0.964 -1.098 0.162 1.301 0.058 0.977
GMTI -0.688 -0.822 2.569 -0.495 0.695 -0.822 0.175 -0.628 0.353 -0.688 -0.896 -0.822 -0.376 1.082 -0.063 1.424
GMTIV -0.526 -0.657 2.957 -0.735 -0.254 -0.722 0.641 -0.085 -0.379 -0.608 -0.815 -0.819 0.183 1.142 -0.204 0.880
Xu -0.538 -1.103 1.702 -0.046 0.597 -1.103 0.742 -0.416 0.865 -0.538 -1.857 -1.103 0.104 1.008 0.563 1.120
SPI -0.039 -0.804 0.545 0.711 -1.848 -0.804 1.265 -0.168 0.898 -0.039 -1.848 -0.804 0.640 0.748 1.415 0.130
W -0.697 -0.979 2.402 -0.303 0.317 -0.979 0.599 -0.641 0.768 -0.697 -1.148 -0.979 -0.190 0.993 0.373 1.162
WA -0.561 -1.010 1.551 -0.293 0.245 -1.010 1.140 -0.113 1.677 -0.561 -1.905 -1.010 0.245 0.398 0.425 0.782
Har -0.580 -1.067 1.860 -0.011 0.748 -1.067 0.382 -0.625 0.314 -0.580 -1.474 -1.067 -0.101 1.434 0.504 1.328
Har2 -0.607 -1.062 1.957 -0.039 0.643 -1.062 0.416 -0.645 0.348 -0.607 -1.402 -1.062 -0.115 1.400 0.529 1.306
QW -0.682 -1.043 2.149 -0.176 -0.068 -1.043 0.980 -0.610 1.197 -0.682 -1.260 -1.043 -0.032 0.836 0.691 0.787
TI1 -0.086 0.372 1.139 -0.550 1.139 0.372 -1.621 -0.222 -1.880 -0.086 0.811 0.372 -0.761 1.139 -1.276 1.139
TI2 -0.660 0.040 0.351 -1.080 -1.360 0.040 1.551 0.826 2.468 -0.660 -0.660 0.040 0.479 -0.788 0.122 -0.712
HyDp -0.708 -0.917 2.564 -0.411 0.184 -0.917 0.660 -0.619 1.017 -0.708 -1.036 -0.917 -0.232 0.749 0.244 1.047
RHyDp -0.586 -1.067 1.880 -0.008 0.715 -1.067 0.387 -0.634 0.310 -0.586 -1.453 -1.067 -0.104 1.437 0.522 1.321
w -0.648 -0.785 2.505 -0.456 0.833 -0.785 -0.017 -0.620 0.065 -0.648 -0.867 -0.785 -0.401 1.162 -0.127 1.573
ww -0.611 -0.676 2.681 -0.518 0.832 -0.676 -0.186 -0.583 -0.075 -0.611 -0.713 -0.676 -0.463 0.989 -0.315 1.599
Rww -0.084 -0.872 0.102 0.861 -1.266 -0.872 1.507 -0.163 1.381 -0.084 -1.502 -0.872 0.703 0.073 1.727 -0.638
D/D -0.544 -1.142 1.499 0.253 -0.305 -1.142 1.248 -0.544 1.248 -0.544 -1.540 -1.142 0.253 0.687 1.248 0.465
Wap -0.612 -0.702 2.555 -0.485 0.854 -0.702 -0.196 -0.594 -0.141 -0.612 -0.757 -0.702 -0.449 1.216 -0.268 1.596
WhetZ -0.671 -0.996 2.195 -0.777 0.749 -0.930 0.579 -0.573 1.320 -0.591 -1.126 -1.079 -0.037 0.244 0.731 0.963
Whetm -0.669 -0.994 2.194 -0.791 0.749 -0.928 0.580 -0.571 1.320 -0.589 -1.124 -1.081 -0.036 0.244 0.731 0.963
Whetv -0.813 -1.173 2.011 -0.386 0.279 -0.995 1.553 -0.402 0.762 -0.636 -1.234 -1.111 0.196 0.861 0.627 0.460
Whete -0.756 -1.081 2.111 -0.363 0.663 -1.015 0.498 -0.657 1.234 -0.676 -1.212 -1.040 -0.121 0.892 0.646 0.877
Whetp -0.772 -1.136 2.038 -0.561 0.262 -0.942 1.775 -0.311 0.729 -0.580 -1.187 -1.127 0.297 0.408 0.671 0.437
J 0.264 -0.997 -0.099 1.542 -0.327 -0.997 0.817 -0.373 0.292 0.264 -2.152 -0.997 0.658 0.403 1.804 -0.103
JhetZ -0.147 -0.378 -0.179 3.172 -0.718 -0.796 0.139 -0.545 -0.517 -0.399 -1.113 0.838 -0.225 0.920 0.060 -0.113
Jhetm -0.160 -0.378 -0.190 3.198 -0.701 -0.776 0.112 -0.538 -0.510 -0.399 -1.077 0.886 -0.234 0.857 0.038 -0.126
Jhetv 0.430 0.254 0.769 1.405 -0.059 -1.331 -0.503 -1.066 0.426 -0.381 -2.187 -0.584 -0.297 0.729 0.995 1.401
Jhete 0.491 -0.112 0.410 1.831 -0.994 -1.204 1.225 -0.551 -0.469 -0.168 -2.034 -0.890 0.285 0.562 1.033 0.585
Jhetp 0.186 0.055 0.509 1.961 -0.109 -1.263 -0.664 -1.069 0.293 -0.494 -1.942 -0.082 -0.449 1.307 0.658 1.102
MAXDN 0.098 -0.731 -0.057 1.601 -1.559 -0.178 0.514 0.236 -1.399 0.098 -0.455 -0.117 0.533 2.267 0.328 -1.178
MAXDP 0.797 -0.252 0.885 -0.815 -1.481 -0.019 1.149 0.034 -1.265 0.435 -0.549 -0.873 0.487 1.717 1.153 -1.404
DELS 0.093 -0.564 0.147 0.336 -1.334 -0.418 0.840 0.563 -1.064 -0.134 -0.750 -0.572 1.003 2.671 0.316 -1.133
TIE -0.472 -1.095 0.753 -0.233 0.013 -0.990 0.369 -0.471 0.435 -0.417 -1.222 -1.122 0.526 2.359 1.631 -0.063
S0K -0.377 -0.594 2.064 -0.793 -1.423 -0.594 1.281 -0.725 0.235 -0.377 -1.074 -0.942 0.601 0.957 0.933 0.826
S1K -0.556 -1.480 0.731 1.374 -0.172 -1.115 0.977 -0.363 1.262 -0.333 -1.897 -0.638 0.419 0.562 1.232 -0.003
S2K -1.080 -0.623 0.388 -0.381 0.124 -0.144 0.770 0.842 2.974 -0.824 -1.130 0.482 0.077 -0.639 -0.509 -0.326
S3K -1.197 -0.130 -0.486 -1.197 -0.486 -0.130 1.479 0.941 1.649 -1.197 -0.423 -0.130 0.896 -0.576 1.630 -0.643
PHI -0.939 -0.676 -0.289 0.162 -0.390 -0.017 0.688 0.864 2.890 -0.634 -0.923 1.050 0.179 -0.831 -0.340 -0.794
BLI -0.555 -0.874 -1.189 0.782 0.241 0.335 -0.721 -0.748 0.911 0.007 -0.184 2.675 -0.646 1.086 -0.106 -1.014
PW2 0.321 -0.794 0.488 0.989 0.321 -0.794 0.341 -0.233 -0.053 0.321 -3.017 -0.794 0.435 0.956 0.936 0.575
PW3 -1.057 -1.057 1.380 -1.057 0.967 -1.057 0.578 0.295 0.675 -1.057 -1.057 -1.057 0.562 1.218 0.562 1.161
PJI2 0.939 0.939 0.001 0.939 -1.879 0.939 0.939 -0.470 0.001 0.939 -1.879 0.939 -0.470 -0.470 -0.470 -0.940
CSI -0.761 -0.938 2.240 -0.585 0.827 -0.938 0.533 -0.467 0.945 -0.761 -1.173 -0.938 -0.173 0.710 0.121 1.357
ECC -0.750 -0.988 2.103 -0.513 0.557 -0.988 0.795 -0.394 1.270 -0.750 -1.345 -0.988 -0.037 0.557 0.319 1.152
AECC -0.790 -0.884 1.622 -0.733 0.629 -0.884 1.007 0.062 1.764 -0.790 -1.641 -0.884 0.175 0.142 0.251 0.954
DECC -0.231 0.126 1.221 -0.515 -2.168 0.126 1.278 0.415 1.278 -0.231 -2.168 0.126 0.312 0.364 0.126 -0.060
MDDD -0.415 -0.854 1.379 -0.113 -1.491 -0.854 1.379 -0.056 1.697 -0.415 -1.491 -0.854 0.461 0.735 0.742 0.149
UNIP -0.734 -1.028 1.908 -0.440 1.028 -1.028 0.734 -0.440 1.028 -0.734 -1.321 -1.028 -0.147 0.734 0.147 1.321
CENT -0.467 -0.900 2.348 0.183 -1.117 -0.900 0.616 -0.683 0.616 -0.467 -1.117 -0.900 0.074 1.265 1.049 0.399
VAR -0.495 -0.855 2.385 -0.135 -1.215 -0.855 0.945 -0.495 0.945
-0.495 -1.215 -0.855 0.225 0.585 0.945 0.585 BAC 0.297 -0.582 -0.934 1.527 -1.461 -0.582 1.000 -0.055 0.648 0.297 -0.758 -0.582 0.824 -0.582 2.054 -1.110 Lop -0.137 0.124 0.630 -0.355 -2.120 0.124 1.447 0.325 1.755 -0.137 -2.120 0.124 0.254 0.122 0.124 -0.159 ICR -0.208 0.035 1.495 -0.410 -2.049 0.035 1.263 0.221 1.549 -0.208 -2.049 0.035 0.155 0.187 0.035 -0.090 MWC01 -0.573 -1.055 1.839 -0.090 0.874 -1.055 0.392 -0.573 0.392 -0.573 -1.538 -1.055 -0.090 1.357 0.392 1.357 84 MWC02 -0.204 -1.095 1.220 0.486 0.736 -1.095 0.486 -0.445 0.341 -0.204 -2.313 -1.095 0.181 1.220 0.736 1.046 MWC03 -0.257 -1.086 1.237 0.357 0.796 -1.086 0.484 -0.380 0.357 -0.257 -2.305 -1.086 0.213 1.237 0.701 1.074 MWC04 -0.145 -1.081 1.135 0.555 0.717 -1.081 0.462 -0.427 0.272 -0.145 -2.392 -1.081 0.209 1.233 0.770 0.998 MWC05 -0.192 -1.077 1.157 0.452 0.755 -1.077 0.463 -0.377 0.296 -0.192 -2.384 -1.077 0.237 1.244 0.747 1.022 MWC06 -0.115 -1.068 1.092 0.584 0.702 -1.068 0.451 -0.411 0.242 -0.115 -2.441 -1.068 0.227 1.232 0.787 0.971 MWC07 -0.154 -1.066 1.112 0.501 0.732 -1.066 0.452 -0.373 0.264 -0.154 -2.435 -1.066 0.247 1.243 0.768 0.991 MWC08 -0.097 -1.058 1.066 0.600 0.693 -1.058 0.443 -0.400 0.224 -0.097 -2.476 -1.058 0.239 1.230 0.796 0.953 MWC09 -0.130 -1.056 1.084 0.531 0.717 -1.056 0.445 -0.370 0.244 -0.130 -2.470 -1.056 0.254 1.241 0.780 0.972 MWC10 -0.085 -1.049 1.048 0.609 0.686 -1.049 0.438 -0.392 0.212 -0.085 -2.502 -1.049 0.246 1.227 0.801 0.942 TWC -0.232 -1.084 1.275 0.420 0.736 -1.084 0.468 -0.434 0.312 -0.232 -2.289 -1.084 0.180 1.262 0.723 1.062 SRW01 -0.542 -1.119 1.769 0.036 0.614 -1.119 0.614 -0.542 0.614 -0.542 -1.697 -1.119 0.036 1.192 0.614 1.192 SRW02 -0.573 -1.055 1.839 -0.090 0.874 -1.055 0.392 -0.573 0.392 -0.573 -1.538 -1.055 -0.090 1.357 0.392 1.357 SRW04 -0.483 -1.097 1.604 0.376 0.622 -1.097 0.253 -0.729 0.008 -0.483 -1.466 -1.097 -0.115 1.727 0.744 1.236 SRW06 -0.535 -1.050 1.552 0.467 0.522 -1.050 0.088 -0.779 -0.237 -0.535 -1.240 -1.050 -0.183 1.931 
0.901 1.199 SRW08 -0.593 -0.971 1.558 0.427 0.439 -0.971 -0.045 -0.791 -0.383 -0.593 -1.059 -0.971 -0.272 2.100 0.934 1.191 SRW10 -0.627 -0.889 1.568 0.345 0.347 -0.889 -0.152 -0.776 -0.469 -0.627 -0.928 -0.889 -0.352 2.264 0.906 1.169 MPC01 -0.573 -1.055 1.839 -0.090 0.874 -1.055 0.392 -0.573 0.392 -0.573 -1.538 -1.055 -0.090 1.357 0.392 1.357 MPC02 -0.447 -1.098 1.505 0.529 0.529 -1.098 0.203 -0.773 -0.122 -0.447 -1.423 -1.098 -0.122 1.830 0.854 1.179 piPC01 -0.130 -0.130 1.668 -0.130 0.561 -1.177 0.561 -0.588 0.245 -0.588 -2.011 -1.177 -0.130 1.294 0.245 1.489 piPC02 0.128 -0.365 1.496 0.315 0.315 -1.208 0.477 -0.714 -0.094 -0.365 -2.050 -1.208 -0.094 1.462 0.477 1.428 piPC03 -0.899 -0.899 1.884 -0.899 0.713 -0.899 0.434 -0.325 0.249 -0.899 -0.899 -0.899 0.011 1.287 0.249 1.792 TPC -0.626 -0.969 2.074 -0.318 0.921 -0.969 0.289 -0.556 0.377 -0.626 -1.331 -0.969 -0.187 1.261 0.149 1.481 piID -0.538 -0.716 2.414 -0.392 0.516 -0.869 0.173 -0.567 0.117 -0.618 -1.134 -0.869 -0.296 1.015 -0.050 1.813 PCR 0.101 1.295 2.202 -0.602 -0.602 -0.602 -0.005 -0.602 -0.602 -0.602 -0.602 -0.602 -0.602 0.244 -0.602 2.186 CID -0.579 -1.074 1.857 -0.099 0.802 -1.074 0.494 -0.542 0.537 -0.579 -1.594 -1.074 -0.045 1.225 0.436 1.309 BID -0.487 -1.107 1.767 0.136 0.624 -1.107 0.523 -0.614 0.473 -0.487 -1.722 -1.107 -0.002 1.279 0.630 1.202 X0 -0.426 -1.190 1.542 0.384 0.158 -1.190 0.815 -0.569 0.671 -0.426 -1.810 -1.190 0.194 1.110 1.004 0.921 X1 -0.661 -1.036 1.936 -0.344 0.836 -1.036 0.564 -0.446 0.734 -0.661 -1.525 -1.036 -0.026 1.080 0.318 1.301 X2 -0.102 -1.127 1.079 1.167 0.287 -1.127 0.350 -0.834 -0.127 -0.102 -1.834 -1.127 -0.032 1.538 1.081 0.910 X3 -0.912 -0.912 1.922 -0.912 0.935 -0.912 0.154 -0.297 0.266 -0.912 -0.912 -0.912 0.093 1.498 0.394 1.420 X0A 0.597 0.696 -1.476 0.671 -1.725 0.696 -0.173 0.100 -0.508 0.597 1.912 0.696 0.137 -1.054 0.274 -1.439 X1A -0.154 0.810 -0.791 -0.724 -0.724 0.810 -0.324 0.298 -0.109 -0.154 2.981 0.810 -0.220 -1.035 -0.635 -0.835 X2A 0.648 
1.369 -0.756 0.221 -0.589 1.369 -0.129 0.221 -0.184 0.648 -2.553 1.369 -0.051 -0.684 -0.245 -0.650 X0v -0.530 -1.346 0.856 1.276 0.603 -1.170 0.417 -1.039 1.099 -0.431 -1.770 -0.474 -0.300 1.135 0.951 0.725 X1v -0.728 -1.217 0.674 0.356 1.101 -0.912 -0.015 -0.801 1.014 -0.515 -1.499 -0.321 -0.365 2.356 0.372 0.501 X2v -0.453 -1.027 0.164 2.020 0.564 -0.949 -0.439 -0.840 0.217 -0.297 -1.214 -0.452 -0.349 2.277 0.602 0.174 X0Av -0.002 -0.626 -1.273 2.237 -0.162 -0.074 -0.450 -1.201 0.622 0.230 -0.026 2.117 -0.730 -0.250 0.390 -0.801 X1Av -0.557 -0.872 -1.186 0.781 0.241 0.338 -0.718 -0.751 0.910 0.007 -0.186 2.675 -0.646 1.087 -0.106 -1.017 85 X2Av -0.143 -0.523 -0.721 1.494 0.103 -0.080 -0.711 -0.523 0.454 0.156 -1.601 2.774 -0.360 0.401 -0.114 -0.605 X0sol -0.550 -1.244 1.240 1.783 -0.018 -1.244 0.579 -0.680 0.449 -0.550 -1.808 -0.446 0.014 1.047 0.751 0.676 X1sol -0.794 -1.145 1.633 0.605 0.605 -1.145 0.351 -0.593 0.510 -0.794 -1.602 -0.365 -0.201 1.775 0.120 1.039 X2sol -0.290 -0.916 0.431 2.774 -0.052 -0.916 -0.014 -0.737 -0.305 -0.290 -1.347 -0.376 -0.247 1.524 0.432 0.328 XMOD -0.840 -1.191 1.360 1.145 0.320 -1.133 0.338 -0.521 0.235 -0.840 -1.495 0.034 -0.190 2.099 -0.032 0.710 RDCHI -0.707 -0.941 1.843 -0.574 1.212 -0.941 0.468 -0.344 0.805 -0.707 -1.458 -0.941 -0.120 1.028 -0.023 1.402 RDSQ -0.617 -1.020 1.986 -0.068 0.694 -1.020 0.280 -0.676 0.181 -0.617 -1.276 -1.020 -0.186 1.547 0.449 1.363 ISIZ -0.463 -1.192 0.804 -1.352 1.260 -0.656 0.364 -0.463 1.728 -0.059 -1.192 -1.352 0.150 0.582 1.260 0.582 IAC -0.257 -1.000 1.050 -1.910 0.375 -0.600 0.865 -0.124 0.569 -0.026 -1.220 -1.203 0.520 1.802 1.064 0.096 AAC 0.287 0.954 0.193 -2.043 -1.246 -0.002 0.629 0.596 -1.397 -0.144 0.112 1.210 0.450 1.690 -0.364 -0.925 IDE -0.583 -0.733 1.353 -0.636 0.370 -0.733 1.077 0.255 1.516 -0.583 -2.410 -0.733 0.370 0.319 0.341 0.807 IDM -0.319 -1.100 1.346 0.262 0.677 -1.100 0.645 -0.364 0.616 -0.319 -2.268 -1.100 0.221 1.068 0.689 1.045 IDDE -0.580 -0.432 1.289 -0.704 
-1.708 -0.432 0.958 -0.318 0.496 -0.580 -1.708 -0.432 0.964 1.003 0.783 1.400 IDDM -0.387 -1.122 1.418 0.192 0.714 -1.122 0.658 -0.387 0.656 -0.387 -2.128 -1.122 0.187 1.079 0.658 1.092 IDET -0.733 -0.933 2.453 -0.505 0.305 -0.933 0.662 -0.563 0.885 -0.733 -1.103 -0.933 -0.165 0.831 0.290 1.177 IDMT -0.705 -0.904 2.600 -0.351 0.253 -0.904 0.465 -0.682 0.584 -0.705 -0.977 -0.904 -0.284 1.035 0.304 1.175 IVDE -0.254 0.000 0.340 -0.466 -2.184 0.000 1.287 0.195 0.000 -0.254 -2.184 0.000 1.078 1.097 0.795 0.550 IVDM -0.535 -1.064 1.566 -0.159 0.900 -1.064 0.649 -0.307 0.786 -0.535 -1.969 -1.064 0.124 1.043 0.424 1.204 HVcpx -0.834 -0.737 1.484 -1.072 0.787 -0.737 0.976 0.325 1.704 -0.834 -1.734 -0.737 0.276 0.119 0.042 0.973 HDcpx -0.089 -0.891 1.023 0.377 0.611 -0.891 0.602 -0.158 0.569 -0.089 -2.881 -0.891 0.326 0.879 0.656 0.849 Uindex -0.512 -0.985 1.300 0.161 0.188 -0.985 1.134 -0.436 1.182 -0.512 -2.203 -0.985 0.238 0.663 1.044 0.708 Vindex 0.716 1.462 -0.973 0.652 -0.789 1.462 -0.461 -0.118 -0.702 0.716 -1.924 1.462 -0.090 -0.629 0.071 -0.857 Xindex 0.619 1.085 -0.949 0.697 -0.694 1.085 -0.042 0.243 -0.304 0.619 -2.733 1.085 0.240 -0.580 0.426 -0.797 Yindex -0.959 -0.959 0.103 -0.959 0.266 -0.959 0.612 0.908 0.208 -0.959 -0.959 -0.959 1.375 0.864 2.131 0.246 IC0 0.287 0.954 0.193 -2.043 -1.246 -0.002 0.629 0.596 -1.397 -0.144 0.112 1.210 0.450 1.690 -0.364 -0.925 TIC0 -0.257 -1.000 1.050 -1.910 0.375 -0.600 0.865 -0.124 0.569 -0.026 -1.220 -1.203 0.520 1.802 1.064 0.096 SIC0 0.152 1.597 -0.446 -0.504 -1.260 0.119 -0.080 0.343 -1.393 -0.330 0.933 2.353 -0.089 0.401 -0.828 -0.969 CIC0 -0.263 -1.450 0.663 -0.825 1.354 -0.369 0.269 -0.362 1.603 0.225 -1.177 -1.880 0.185 0.055 1.068 0.903 BIC0 0.249 1.053 -0.313 0.070 -0.801 0.361 0.076 0.509 -0.881 -0.035 -2.160 2.542 0.138 0.317 -0.443 -0.684 IC1 -0.327 0.159 1.548 -2.192 -1.761 0.352 0.820 0.445 -1.236 0.135 0.159 -0.434 1.055 0.412 0.159 0.708 TIC1 -0.518 -1.016 1.802 -1.735 -0.436 -0.396 0.849 -0.165 0.229 0.051 
-1.016 -1.332 0.778 0.778 1.145 0.982 SIC1 -0.264 1.271 0.664 -1.394 -2.028 0.573 0.273 0.476 -1.693 -0.096 1.271 1.006 0.580 -0.159 -0.564 0.085 CIC1 0.029 -1.291 -0.212 -0.179 2.096 -0.606 -0.022 -0.454 1.976 0.102 -1.291 -1.280 -0.315 0.372 0.892 0.186 BIC1 -0.012 0.865 0.509 -0.607 -1.356 0.807 0.398 0.695 -1.080 0.212 -2.526 1.519 0.738 -0.044 -0.198 0.079 IC2 -0.496 -0.220 0.946 -2.456 -1.310 0.562 1.268 -0.058 0.097 0.197 -0.220 -1.208 1.321 0.588 0.492 0.496 TIC2 -0.653 -0.988 1.238 -1.716 -0.597 -0.242 1.113 -0.415 1.055 0.011 -0.988 -1.378 0.948 0.762 1.163 0.686 SIC2 -0.421 0.614 0.563 -2.648 -1.613 0.944 1.043 0.078 -0.417 0.144 0.614 -0.619 1.203 0.294 0.021 0.200 CIC2 0.387 -0.988 -0.298 1.205 2.538 -1.047 -0.945 -0.116 1.138 -0.032 -0.988 -0.228 -1.158 -0.020 0.467 0.086 86 BIC2 -0.195 0.462 0.478 -2.076 -1.201 1.130 1.043 0.334 -0.159 0.366 -2.076 -0.087 1.281 0.299 0.223 0.179 ATS1m -0.625 -1.242 0.886 1.510 0.320 -1.141 0.318 -0.479 0.017 -0.625 -1.845 0.293 -0.097 2.005 0.123 0.582 ATS2m -0.313 -1.162 0.566 2.380 0.136 -1.080 0.251 -0.580 -0.237 -0.313 -2.017 0.498 -0.036 1.081 0.413 0.413 ATS3m -1.050 -1.050 1.553 -1.050 0.621 -1.050 0.717 0.181 0.621 -1.050 -1.050 -1.050 0.654 1.174 0.717 1.110 ATS1v -0.402 -0.950 1.428 0.328 1.025 -1.095 0.127 -0.712 0.707 -0.402 -2.146 -0.727 -0.121 1.110 0.531 1.300 ATS2v -0.301 -1.116 1.316 0.883 0.883 -1.278 0.020 -0.867 0.408 -0.301 -1.861 -0.884 -0.080 1.154 0.786 1.237 ATS3v -0.928 -0.928 1.717 -0.928 0.953 -0.928 0.776 -0.612 0.953 -0.928 -0.928 -0.928 -0.151 0.584 0.776 1.504 ATS1e -0.476 -1.259 1.438 0.364 0.723 -1.132 0.715 -0.294 0.339 -0.476 -2.022 -0.982 0.189 1.351 0.471 1.054 ATS2e -0.124 -1.212 1.005 1.044 0.454 -1.105 0.597 -0.467 -0.023 -0.124 -2.302 -0.948 0.232 1.355 0.807 0.810 ATS3e -1.050 -1.050 1.554 -1.050 0.623 -1.050 0.718 0.176 0.623 -1.050 -1.050 -1.050 0.649 1.173 0.718 1.113 ATS1p -0.445 -0.985 1.310 0.622 0.939 -1.116 0.012 -0.783 0.637 -0.445 -2.142 -0.434 -0.204 1.387 0.449 
1.200 ATS2p -0.368 -1.151 1.217 1.320 0.813 -1.300 -0.092 -0.934 0.360 -0.368 -1.804 -0.553 -0.173 1.198 0.685 1.151 ATS3p -0.913 -0.913 1.724 -0.913 0.978 -0.913 0.778 -0.656 0.978 -0.913 -0.913 -0.913 -0.221 0.500 0.778 1.531 MATS1m 0.013 -0.201 0.185 -1.360 1.730 -0.201 -0.433 -0.330 1.730 0.013 -1.360 -1.360 -0.073 -0.205 0.123 1.730 MATS2m -0.663 -1.301 0.159 0.495 1.573 -1.301 -0.007 -1.301 1.573 -0.663 0.136 0.855 -0.701 -0.114 -0.315 1.573 MATS3m -0.523 -0.523 -0.921 -0.523 1.641 -0.523 -0.523 1.641 1.641 -0.523 -0.523 -0.523 0.018 -0.521 -0.956 1.641 MATS1v -0.387 -0.604 -0.214 1.345 1.345 -0.604 -0.838 -0.733 1.345 -0.387 -1.774 1.345 -0.475 -0.434 -0.277 1.345 MATS2v -0.715 -1.286 0.020 1.286 1.286 -1.286 -0.129 -1.286 1.286 -0.715 0.000 1.286 -0.750 0.121 -0.404 1.286 MATS3v -0.350 -0.350 -0.693 -0.350 1.514 -0.350 -0.350 1.514 1.514 -0.350 -0.350 -0.350 0.116 -1.958 -0.723 1.514 MATS1e -0.028 -0.242 0.142 -1.395 1.680 -0.242 -0.473 -0.370 1.680 -0.028 -1.395 -1.395 -0.114 0.421 0.081 1.680 MATS2e -0.676 -1.314 0.146 0.482 1.560 -1.314 -0.021 -1.314 1.560 -0.676 0.123 0.841 -0.715 0.088 -0.328 1.560 MATS3e -0.297 -0.297 -0.617 -0.297 1.443 -0.297 -0.297 1.443 1.443 -0.297 -0.297 -0.297 0.138 -2.269 -0.645 1.443 MATS1p 0.045 -0.166 0.214 -1.307 1.735 -0.166 -0.394 -0.292 1.735 0.045 -1.307 -1.307 -0.040 -0.686 0.153 1.735 MATS2p -0.690 -1.325 0.130 0.465 1.539 -1.325 -0.037 -1.325 1.539 -0.690 0.107 0.823 -0.728 0.323 -0.343 1.539 MATS3p -0.482 -0.482 -0.872 -0.482 1.633 -0.482 -0.482 1.633 1.633 -0.482 -0.482 -0.482 0.046 -0.942 -0.906 1.633 GATS1m -0.282 -0.156 -0.535 2.496 -1.293 -0.156 0.412 0.222 -1.293 -0.282 0.222 0.980 -0.031 1.373 -0.384 -1.293 GATS2m 0.877 1.130 -0.472 -1.145 -1.145 1.130 0.561 1.130 -1.145 0.877 -1.145 -1.145 0.751 0.082 0.806 -1.145 GATS1v 0.306 0.509 -0.104 -1.330 -1.330 0.509 1.429 1.123 -1.330 0.306 1.123 -1.330 0.713 0.593 0.142 -1.330 GATS2v 0.880 1.133 -0.469 -1.143 -1.143 1.133 0.564 1.133 -1.143 0.880 -1.143 -1.143 
0.754 0.044 0.808 -1.143 GATS1e -0.164 -0.030 -0.433 2.785 -1.237 -0.030 0.573 0.372 -1.237 -0.164 0.372 1.176 0.103 -0.577 -0.272 -1.237 GATS2e 0.865 1.118 -0.481 -1.154 -1.154 1.118 0.550 1.118 -1.154 0.865 -1.154 -1.154 0.739 0.238 0.794 -1.154 GATS1p -0.273 -0.144 -0.531 2.563 -1.305 -0.144 0.436 0.242 -1.305 -0.273 0.242 1.016 -0.016 1.172 -0.376 -1.305 GATS2p 0.894 1.147 -0.453 -1.126 -1.126 1.147 0.579 1.147 -1.126 0.894 -1.126 -1.126 0.768 -0.191 0.823 -1.126 EPS0 -0.641 -0.729 1.854 -0.505 0.899 -0.729 0.504 -0.216 0.810 -0.641 -2.180 -0.729 -0.009 0.923 0.165 1.223 EPS1 -0.486 -0.954 1.752 -0.019 0.916 -0.954 0.385 -0.567 0.368 -0.486 -1.889 -0.954 -0.118 1.332 0.323 1.351 EEig01x 0.295 0.295 0.797 0.917 -0.145 -1.207 0.433 -0.767 -0.430 -0.145 -2.269 -1.207 0.036 1.648 1.008 0.743 EEig02x -0.573 -0.573 1.585 -1.115 0.736 -1.115 0.838 -0.190 0.736 -1.115 -1.115 -1.115 0.098 1.178 0.207 1.534 EEig01d 0.694 1.149 0.013 0.996 -0.675 -1.114 0.810 -0.759 -0.920 -0.359 -1.716 -0.492 -0.194 2.184 0.533 -0.152 87 EEig02d -0.221 0.371 1.045 -1.106 0.273 -1.683 1.233 0.108 0.273 -1.479 -0.906 -0.670 0.337 1.890 -0.322 0.859 EEig03d -0.556 -0.113 1.271 -0.348 1.271 -0.113 0.594 -1.564 -0.113 -1.497 -0.113 -0.113 -0.916 1.793 -0.754 1.271 EEig04d 0.429 0.429 2.093 0.124 -1.365 0.429 0.140 0.429 -1.365 0.429 0.429 0.429 -1.871 0.413 -1.365 0.192 EEig01r 0.297 -0.933 0.760 0.659 0.220 -0.983 0.416 -0.530 -0.075 0.149 -2.197 -1.538 0.332 1.383 1.364 0.675 EEig02r -0.730 -0.889 1.343 -1.372 0.916 -0.937 0.863 -0.140 0.916 -0.844 -0.844 -1.372 0.172 1.238 0.403 1.280 EEig03r -0.518 -0.518 1.664 -1.173 1.664 -0.518 -0.072 -1.083 0.573 -0.667 -0.518 -0.518 -0.627 1.166 -0.518 1.664 ESpm02u -0.083 -1.088 1.102 0.652 0.652 -1.088 0.454 -0.483 0.215 -0.083 -2.393 -1.088 0.215 1.222 0.822 0.970 ESpm03u 0.471 -1.025 0.471 1.450 -1.025 -1.025 0.471 -1.025 -1.025 0.471 -1.025 -1.025 0.471 1.450 1.450 0.471 ESpm04u -0.005 -1.265 0.895 1.018 0.450 -1.265 0.412 -0.515 0.063 -0.005 
-2.015 -1.265 0.283 1.302 1.122 0.791 ESpm05u 0.422 -1.046 0.711 1.298 -1.046 -1.046 0.541 -1.046 -1.046 0.422 -1.046 -1.046 0.541 1.407 1.348 0.634 ESpm06u 0.061 -1.331 0.793 1.133 0.368 -1.331 0.401 -0.554 -0.011 0.061 -1.823 -1.331 0.311 1.332 1.215 0.703 ESpm07u 0.400 -1.051 0.788 1.252 -1.051 -1.051 0.571 -1.051 -1.051 0.400 -1.051 -1.051 0.552 1.383 1.310 0.705 ESpm08u 0.103 -1.364 0.751 1.166 0.330 -1.364 0.402 -0.575 -0.046 0.103 -1.725 -1.364 0.330 1.343 1.241 0.668 ESpm09u 0.390 -1.053 0.817 1.235 -1.053 -1.053 0.584 -1.053 -1.053 0.390 -1.053 -1.053 0.553 1.370 1.294 0.736 ESpm10u 0.127 -1.383 0.735 1.176 0.306 -1.383 0.406 -0.587 -0.065 0.127 -1.668 -1.383 0.343 1.347 1.249 0.653 ESpm11u 0.386 -1.054 0.830 1.228 -1.054 -1.054 0.589 -1.054 -1.054 0.386 -1.054 -1.054 0.553 1.364 1.287 0.752 ESpm12u 0.142 -1.396 0.728 1.179 0.290 -1.396 0.410 -0.595 -0.077 0.142 -1.630 -1.396 0.353 1.347 1.252 0.646 ESpm13u 0.384 -1.054 0.836 1.225 -1.054 -1.054 0.591 -1.054 -1.054 0.384 -1.054 -1.054 0.552 1.361 1.284 0.760 ESpm14u 0.153 -1.405 0.726 1.180 0.278 -1.405 0.414 -0.599 -0.086 0.153 -1.604 -1.405 0.360 1.346 1.252 0.642 ESpm15u 0.383 -1.054 0.839 1.223 -1.054 -1.054 0.592 -1.054 -1.054 0.383 -1.054 -1.054 0.552 1.359 1.282 0.765 ESpm01x -0.130 -0.130 1.668 -0.130 0.561 -1.177 0.561 -0.588 0.245 -0.588 -2.011 -1.177 -0.130 1.294 0.245 1.489 ESpm02x 0.015 0.015 1.286 0.364 0.508 -1.230 0.508 -0.618 0.111 -0.327 -2.423 -1.230 0.015 1.266 0.576 1.167 ESpm03x 0.207 0.207 1.070 0.633 0.251 -1.196 0.478 -0.663 -0.082 -0.146 -2.587 -1.196 0.060 1.230 0.728 1.005 ESpm04x 0.285 0.285 0.933 0.723 0.163 -1.170 0.471 -0.646 -0.145 -0.073 -2.662 -1.170 0.099 1.228 0.793 0.886 ESpm05x 0.326 0.326 0.856 0.765 0.095 -1.145 0.469 -0.626 -0.184 -0.033 -2.704 -1.145 0.127 1.231 0.827 0.815 ESpm06x 0.346 0.346 0.809 0.784 0.064 -1.127 0.469 -0.609 -0.203 -0.012 -2.732 -1.127 0.143 1.234 0.844 0.771 ESpm07x 0.357 0.357 0.780 0.793 0.044 -1.113 0.470 -0.596 -0.214 0.001 -2.753 
-1.113 0.154 1.235 0.852 0.743 ESpm08x 0.364 0.364 0.762 0.798 0.035 -1.102 0.472 -0.587 -0.221 0.009 -2.768 -1.102 0.160 1.236 0.857 0.724 ESpm09x 0.368 0.368 0.750 0.800 0.029 -1.094 0.472 -0.580 -0.224 0.014 -2.779 -1.094 0.165 1.235 0.858 0.713 ESpm10x 0.370 0.370 0.742 0.801 0.026 -1.088 0.473 -0.575 -0.226 0.017 -2.788 -1.088 0.168 1.234 0.859 0.705 ESpm11x 0.371 0.371 0.736 0.802 0.025 -1.083 0.473 -0.571 -0.227 0.020 -2.796 -1.083 0.169 1.233 0.860 0.699 ESpm12x 0.372 0.372 0.732 0.802 0.025 -1.078 0.474 -0.568 -0.227 0.021 -2.802 -1.078 0.171 1.232 0.859 0.696 ESpm13x 0.372 0.372 0.729 0.801 0.024 -1.075 0.474 -0.566 -0.227 0.022 -2.807 -1.075 0.172 1.231 0.859 0.693 ESpm14x 0.373 0.373 0.727 0.801 0.025 -1.072 0.474 -0.564 -0.227 0.023 -2.811 -1.072 0.172 1.229 0.858 0.691 ESpm15x 0.373 0.373 0.726 0.800 0.025 -1.070 0.474 -0.562 -0.227 0.024 -2.815 -1.070 0.173 1.228 0.858 0.690 ESpm01d 0.960 1.366 -0.079 0.800 -1.622 -0.594 1.288 0.035 -1.622 -0.594 -0.594 0.404 0.035 1.576 -0.594 -0.763 ESpm02d 0.310 0.688 0.831 0.498 0.242 -1.422 0.728 -0.688 -0.250 -0.451 -2.444 -0.817 -0.047 1.688 0.497 0.636 ESpm03d 0.789 1.059 0.146 0.962 -1.714 -0.762 0.922 -0.387 -1.714 -0.028 -1.389 -0.090 0.120 1.587 0.606 -0.105 88 ESpm04d 0.679 1.027 0.328 0.913 -0.237 -1.369 0.814 -0.783 -0.606 -0.325 -2.305 -0.507 -0.098 1.710 0.576 0.182 ESpm05d 0.774 1.035 0.244 0.951 -1.679 -0.820 0.855 -0.410 -1.679 -0.009 -1.528 -0.127 0.132 1.530 0.665 0.066 ESpm06d 0.736 1.070 0.215 0.963 -0.363 -1.325 0.832 -0.764 -0.717 -0.263 -2.267 -0.426 -0.070 1.701 0.610 0.068 ESpm07d 0.768 1.026 0.278 0.944 -1.663 -0.839 0.839 -0.409 -1.663 -0.008 -1.580 -0.132 0.135 1.512 0.667 0.125 ESpm08d 0.749 1.078 0.186 0.972 -0.415 -1.301 0.838 -0.743 -0.767 -0.237 -2.262 -0.396 -0.052 1.694 0.622 0.033 ESpm09d 0.765 1.022 0.291 0.940 -1.656 -0.843 0.834 -0.407 -1.656 -0.008 -1.607 -0.132 0.136 1.505 0.665 0.149 ESpm10d 0.754 1.079 0.178 0.975 -0.445 -1.285 0.841 -0.729 -0.796 -0.225 -2.262 -0.382 
-0.042 1.690 0.627 0.022 ESpm11d 0.764 1.020 0.297 0.938 -1.651 -0.844 0.832 -0.406 -1.651 -0.007 -1.621 -0.131 0.136 1.501 0.664 0.160 ESpm12d 0.756 1.080 0.177 0.976 -0.465 -1.275 0.842 -0.720 -0.815 -0.218 -2.262 -0.374 -0.036 1.687 0.630 0.017 ESpm13d 0.763 1.019 0.299 0.937 -1.649 -0.844 0.831 -0.405 -1.649 -0.007 -1.629 -0.131 0.136 1.499 0.664 0.165 ESpm14d 0.757 1.080 0.177 0.976 -0.480 -1.267 0.843 -0.714 -0.829 -0.213 -2.261 -0.369 -0.032 1.685 0.632 0.016 ESpm15d 0.763 1.019 0.300 0.936 -1.647 -0.843 0.831 -0.404 -1.647 -0.007 -1.635 -0.130 0.136 1.498 0.663 0.168 ESpm01r -0.043 -0.771 1.412 -0.985 0.961 -0.840 0.592 -0.346 0.659 -0.240 -1.709 -1.709 0.136 1.070 0.592 1.223 ESpm02r -0.004 -0.967 1.159 0.328 0.739 -1.012 0.504 -0.453 0.360 -0.104 -2.305 -1.428 0.197 1.175 0.781 1.030 ESpm03r 0.229 -0.878 0.945 0.475 0.546 -0.934 0.482 -0.446 0.231 0.117 -2.380 -1.639 0.294 1.126 0.968 0.865 ESpm04r 0.297 -0.862 0.851 0.565 0.460 -0.922 0.476 -0.420 0.169 0.181 -2.444 -1.612 0.337 1.115 1.027 0.784 ESpm05r 0.338 -0.836 0.792 0.605 0.398 -0.897 0.471 -0.392 0.135 0.221 -2.462 -1.643 0.369 1.108 1.061 0.731 ESpm06r 0.357 -0.823 0.762 0.624 0.368 -0.884 0.471 -0.376 0.115 0.240 -2.480 -1.643 0.385 1.106 1.077 0.702 ESpm07r 0.367 -0.812 0.744 0.634 0.349 -0.874 0.470 -0.365 0.104 0.251 -2.489 -1.649 0.395 1.105 1.085 0.684 ESpm08r 0.373 -0.805 0.733 0.639 0.338 -0.867 0.471 -0.358 0.096 0.257 -2.497 -1.649 0.401 1.105 1.089 0.674 ESpm09r 0.377 -0.800 0.727 0.642 0.331 -0.862 0.471 -0.353 0.091 0.261 -2.502 -1.650 0.404 1.105 1.091 0.668 ESpm10r 0.379 -0.797 0.723 0.644 0.328 -0.859 0.472 -0.351 0.087 0.263 -2.505 -1.650 0.406 1.104 1.092 0.664 ESpm11r 0.380 -0.795 0.720 0.645 0.326 -0.857 0.472 -0.348 0.085 0.264 -2.508 -1.649 0.407 1.104 1.093 0.661 ESpm12r 0.381 -0.793 0.718 0.645 0.324 -0.855 0.472 -0.347 0.084 0.266 -2.510 -1.649 0.408 1.104 1.093 0.660 ESpm13r 0.381 -0.792 0.717 0.646 0.324 -0.854 0.472 -0.346 0.083 0.266 -2.512 -1.649 0.408 1.104 1.093 
0.659 ESpm14r 0.381 -0.791 0.717 0.646 0.323 -0.853 0.472 -0.346 0.082 0.266 -2.513 -1.648 0.409 1.104 1.093 0.658 ESpm15r 0.382 -0.791 0.716 0.646 0.323 -0.853 0.472 -0.345 0.082 0.267 -2.514 -1.648 0.409 1.104 1.093 0.658 BEHm1 -0.436 -0.753 0.550 1.548 0.072 -0.924 -0.292 -0.698 -0.251 -0.522 -1.556 0.652 -0.387 2.550 -0.078 0.524 BEHm2 -0.610 -1.193 0.873 0.760 0.310 -1.051 0.802 -0.444 0.979 -0.610 -2.521 0.760 -0.055 1.036 0.374 0.588 BEHm3 -0.024 -2.111 0.805 1.189 0.955 -0.776 0.262 -0.473 0.674 0.152 -2.111 -0.661 0.216 0.621 0.476 0.805 BEHm4 -1.248 -1.248 1.133 1.799 0.209 -1.248 0.633 -0.158 0.697 -0.331 -1.248 -1.248 0.190 0.366 0.839 0.865 BEHm5 -0.861 -0.988 1.101 -0.988 1.225 -0.861 0.508 -0.861 1.190 -0.861 -0.988 -0.988 0.363 1.037 1.530 0.442 BEHm6 -0.625 -0.803 1.214 -0.803 1.461 -0.625 -0.625 -0.625 1.654 -0.625 -0.803 -0.803 -0.625 1.746 0.070 0.821 BEHm7 -0.131 -0.575 3.424 -0.575 -0.131 -0.575 -0.131 -0.575 -0.131 -0.131 -0.575 -0.575 -0.131 -0.131 -0.131 1.074 BEHm8 -0.791 -0.791 3.148 -0.791 0.299 -0.791 0.299 -0.791 0.299 0.299 -0.791 -0.791 0.299 0.299 0.299 0.299 BELm1 0.167 -0.169 0.819 -3.223 0.848 0.089 0.143 -0.013 0.649 0.314 -0.410 -1.136 0.254 0.262 0.567 0.838 BELm2 0.387 -0.997 0.589 -1.948 0.617 -0.247 0.864 -0.286 1.129 0.387 -1.161 -1.948 0.312 0.806 0.806 0.690 BELm3 -0.936 -1.252 0.901 -1.252 1.478 -0.476 0.491 -0.434 1.331 0.131 -1.252 -1.252 0.133 0.256 1.233 0.901 89 BELm4 -1.075 -1.075 1.220 -1.075 0.780 -1.075 0.056 -0.047 1.543 -0.248 -1.075 -1.075 0.045 0.585 1.279 1.240 BEHv1 -0.101 -0.552 1.465 -1.000 0.917 -0.643 -0.204 -0.549 0.504 -0.073 -1.749 -1.448 -0.023 1.447 0.551 1.459 BEHv2 0.148 -0.444 0.897 -1.861 0.692 -0.448 0.929 -0.312 1.121 0.148 -1.525 -1.861 0.235 0.705 0.726 0.851 BEHv3 -0.122 -1.651 1.080 -0.836 1.255 -0.658 0.379 -0.562 0.992 0.067 -1.651 -1.231 0.119 0.928 0.810 1.080 BEHv4 -1.226 -1.226 1.441 -0.251 0.543 -1.226 0.783 -0.130 1.153 -0.272 -1.226 -1.226 0.041 0.502 0.987 1.336 BEHv5 -0.619 
-1.101 0.849 -1.101 1.439 -0.619 -0.358 -0.619 1.412 -0.619 -1.101 -1.101 0.436 1.043 1.427 0.629 BEHv6 -0.363 -1.051 1.417 -1.051 1.729 -0.363 -0.363 -0.363 1.920 -0.363 -1.051 -1.051 -0.363 0.128 0.163 1.029 BEHv7 0.269 -0.984 2.767 -0.984 0.269 -0.984 0.269 -0.984 0.269 0.269 -0.984 -0.984 0.269 0.269 0.269 0.980 BEHv8 -1.067 -1.067 1.694 -1.067 0.722 -1.067 0.722 -1.067 0.722 0.722 -1.067 -1.067 0.722 0.722 0.722 0.722 BELv1 0.042 -0.541 1.070 -2.535 0.965 -0.151 0.158 -0.058 0.610 0.216 -0.820 -1.693 0.270 0.780 0.629 1.058 BELv2 0.190 -0.764 0.686 -2.007 0.495 -0.259 0.839 -0.083 1.046 0.190 -1.132 -2.007 0.310 1.229 0.692 0.577 BELv3 -0.442 -1.368 0.832 -1.368 1.441 -0.604 0.485 -0.418 1.236 0.342 -1.368 -1.368 0.412 0.255 1.099 0.832 BELv4 -1.064 -1.064 1.311 -1.064 0.465 -1.064 0.639 -0.185 1.362 -0.585 -1.064 -1.064 0.154 0.773 1.320 1.127 BEHe1 -0.089 -0.670 1.298 -1.381 0.852 -0.557 0.055 -0.345 0.414 -0.043 -1.434 -1.675 0.081 1.687 0.535 1.272 BEHe2 0.137 -0.441 0.833 -2.123 0.575 -0.242 0.884 0.010 1.041 0.137 -1.113 -2.123 0.337 0.717 0.679 0.690 BEHe3 -0.008 -1.649 0.898 -1.198 1.189 -0.498 0.459 -0.309 0.959 0.311 -1.649 -1.425 0.378 0.845 0.801 0.898 BEHe4 -1.216 -1.216 1.382 -0.676 0.482 -1.216 0.825 0.020 1.102 -0.219 -1.216 -1.216 0.318 0.533 1.137 1.176 BEHe5 -0.235 -1.500 0.950 -1.500 1.133 -0.235 0.148 -0.235 1.133 -0.235 -1.197 -1.500 0.525 0.885 1.327 0.536 BEHe6 0.082 -1.488 1.039 -1.488 1.323 0.082 0.082 0.082 1.430 0.082 -1.488 -1.488 0.082 0.724 0.117 0.829 BEHe7 0.615 -1.410 1.573 -1.410 0.615 -0.795 0.615 -0.702 0.615 0.615 -1.410 -1.410 0.615 0.615 0.615 0.647 BEHe8 -0.965 -1.195 0.874 -1.195 0.844 -1.195 0.844 -0.684 0.844 0.844 -1.195 -1.195 0.844 0.844 0.844 0.844 BELe1 0.005 -0.493 1.411 -1.912 1.066 -0.357 -0.077 -0.346 0.677 0.124 -1.319 -1.626 0.128 0.650 0.660 1.411 BELe2 0.073 -0.938 0.757 -1.491 0.603 -0.696 0.926 -0.609 1.208 0.073 -1.491 -1.491 0.110 1.493 0.733 0.741 BELe3 -0.867 -0.867 1.224 -0.867 1.799 -0.867 0.138 
-0.867 1.394 -0.398 -0.867 -0.867 -0.336 -0.103 1.124 1.224 BEHp1 -0.177 -0.648 1.358 -0.707 0.841 -0.685 -0.294 -0.614 0.441 -0.134 -1.768 -1.242 -0.097 1.903 0.472 1.352 BEHp2 0.128 -0.553 0.894 -1.715 0.692 -0.523 0.941 -0.402 1.151 0.128 -1.675 -1.715 0.201 0.859 0.734 0.856 BEHp3 -0.185 -1.684 1.083 -0.643 1.273 -0.703 0.357 -0.617 1.009 0.028 -1.684 -1.154 0.076 0.937 0.824 1.083 BEHp4 -1.241 -1.241 1.419 -0.006 0.537 -1.241 0.752 -0.161 1.157 -0.300 -1.241 -1.241 -0.007 0.506 0.977 1.331 BEHp5 -0.540 -1.148 0.825 -1.148 1.433 -0.540 -0.451 -0.540 1.413 -0.540 -1.148 -1.148 0.455 1.056 1.376 0.644 BEHp6 -0.269 -1.121 1.393 -1.121 1.710 -0.269 -0.269 -0.269 1.891 -0.269 -1.121 -1.121 -0.269 -0.103 0.178 1.028 BEHp7 0.367 -1.062 2.511 -1.062 0.367 -1.062 0.367 -1.062 0.367 0.367 -1.062 -1.062 0.367 0.367 0.367 0.924 BEHp8 -1.082 -1.082 1.455 -1.082 0.765 -1.082 0.765 -1.082 0.765 0.765 -1.082 -1.082 0.765 0.765 0.765 0.765 BELp1 0.112 -0.418 1.090 -2.655 0.949 -0.097 0.249 0.025 0.606 0.256 -0.718 -1.706 0.328 0.253 0.653 1.072 BELp2 0.149 -0.724 0.720 -1.990 0.486 -0.278 0.850 -0.059 1.051 0.149 -1.181 -1.990 0.311 1.249 0.682 0.576 BELp3 -0.393 -1.349 0.850 -1.349 1.465 -0.684 0.488 -0.461 1.235 0.361 -1.349 -1.349 0.441 0.163 1.083 0.850 BELp4 -1.023 -1.023 1.381 -1.023 0.382 -1.023 0.683 -0.293 1.352 -0.731 -1.023 -1.023 0.113 0.718 1.384 1.147 LP1 -0.136 -1.080 1.064 0.660 0.660 -1.080 0.369 -0.474 0.072 -0.136 -2.310 -1.080 0.209 1.421 0.880 0.960 Eig1Z -0.587 -1.112 1.698 -0.813 0.819 -0.933 0.731 -0.385 1.655 -0.437 -1.364 -1.194 0.208 0.052 0.898 0.764 90 Eig1m -0.584 -1.109 1.697 -0.831 0.819 -0.929 0.732 -0.383 1.654 -0.434 -1.360 -1.197 0.210 0.053 0.898 0.764 Eig1v -0.802 -1.358 1.529 -0.297 0.241 -0.958 1.847 -0.073 0.941 -0.466 -1.438 -1.169 0.527 0.549 0.732 0.196 Eig1e -0.704 -1.241 1.634 -0.254 0.733 -1.057 0.649 -0.496 1.589 -0.550 -1.499 -1.103 0.111 0.693 0.816 0.677 Eig1p -0.746 -1.290 1.535 -0.527 0.212 -0.866 2.051 0.050 0.871 -0.388 
-1.339 -1.195 0.636 0.068 0.760 0.169 SEigZ -0.390 -0.551 -0.390 3.126 -0.766 -0.390 -0.014 -0.014 -0.766 -0.390 -0.390 1.180 -0.014 0.926 -0.390 -0.766 SEigm -0.390 -0.546 -0.390 3.138 -0.757 -0.390 -0.022 -0.022 -0.757 -0.390 -0.390 1.190 -0.022 0.898 -0.390 -0.757 SEigv -0.131 0.567 -0.131 1.162 1.162 -0.131 -1.425 -1.425 1.162 -0.131 -0.131 1.162 -1.425 -1.315 -0.131 1.162 SEige -0.263 -0.735 -0.263 2.303 -1.333 -0.263 0.803 0.803 -1.333 -0.263 -0.263 0.487 0.803 1.111 -0.263 -1.333 SEigp -0.207 0.386 -0.207 1.742 0.979 -0.207 -1.393 -1.393 0.979 -0.207 -0.207 1.360 -1.393 -1.005 -0.207 0.979 AEigZ -0.542 -1.035 1.662 -1.063 0.846 -0.875 0.697 -0.380 1.653 -0.397 -1.292 -1.263 0.193 0.112 0.891 0.794 AEigm -0.538 -1.030 1.660 -1.086 0.846 -0.870 0.698 -0.376 1.651 -0.394 -1.286 -1.268 0.195 0.114 0.890 0.793 AEigv -0.769 -1.366 1.497 -0.383 0.141 -0.921 1.910 0.044 0.821 -0.443 -1.387 -1.229 0.627 0.640 0.722 0.096 AEige -0.691 -1.210 1.632 -0.325 0.770 -1.042 0.619 -0.518 1.620 -0.538 -1.481 -1.111 0.086 0.654 0.819 0.715 AEigp -0.693 -1.275 1.491 -0.684 0.103 -0.809 2.108 0.190 0.734 -0.350 -1.261 -1.284 0.752 0.169 0.749 0.061 VEA1 -0.490 -1.092 1.615 0.016 0.894 -1.092 0.530 -0.452 0.608 -0.490 -1.877 -1.092 0.043 1.141 0.455 1.283 VEA2 0.232 1.086 -1.189 -0.354 -0.513 1.086 -0.742 0.271 -0.692 0.232 2.456 1.086 -0.335 -0.960 -0.781 -0.881 VRA1 -0.587 -1.051 1.863 -0.117 0.876 -1.051 0.391 -0.575 0.401 -0.587 -1.504 -1.051 -0.101 1.351 0.374 1.366 VRA2 -0.404 -0.973 1.403 -0.047 1.313 -0.973 0.270 -0.371 0.289 -0.404 -2.042 -0.973 -0.009 1.332 0.231 1.358 VED1 -0.482 -1.126 1.595 0.102 0.719 -1.126 0.625 -0.477 0.625 -0.482 -1.887 -1.126 0.097 1.149 0.623 1.172 VED2 0.252 1.067 -1.224 -0.285 -0.626 1.067 -0.687 0.262 -0.687 0.252 2.460 1.067 -0.295 -0.976 -0.687 -0.956 VRD1 -0.568 -1.063 1.847 -0.069 0.835 -1.063 0.401 -0.582 0.392 -0.568 -1.536 -1.063 -0.085 1.361 0.415 1.345 VRD2 -0.331 -0.989 1.357 0.073 1.220 -0.989 0.275 -0.376 0.256 -0.331 -2.155 
-0.989 0.028 1.337 0.308 1.305 VEZ1 -0.467 -1.102 1.585 0.060 0.738 -1.107 0.628 -0.455 0.645 -0.460 -1.863 -1.245 0.120 1.125 0.640 1.158 VEZ2 0.270 1.096 -1.236 -0.316 -0.609 1.086 -0.682 0.280 -0.671 0.280 2.498 0.898 -0.274 -0.985 -0.671 -0.964 VRZ1 -0.572 -1.069 1.853 -0.051 0.835 -1.067 0.400 -0.586 0.390 -0.571 -1.543 -1.041 -0.088 1.343 0.414 1.353 VRZ2 -0.346 -1.005 1.367 0.115 1.222 -0.999 0.273 -0.393 0.247 -0.346 -2.185 -0.887 0.016 1.301 0.299 1.321 VEm1 -0.466 -1.101 1.585 0.059 0.738 -1.106 0.628 -0.454 0.646 -0.459 -1.860 -1.253 0.121 1.123 0.641 1.158 VEm2 0.271 1.098 -1.236 -0.325 -0.608 1.088 -0.681 0.282 -0.670 0.282 2.501 0.889 -0.273 -0.984 -0.670 -0.963 VRm1 -0.573 -1.069 1.853 -0.050 0.835 -1.067 0.400 -0.586 0.390 -0.572 -1.543 -1.040 -0.088 1.343 0.414 1.353 VRm2 -0.347 -1.007 1.367 0.121 1.222 -1.000 0.272 -0.393 0.246 -0.347 -2.186 -0.881 0.015 1.301 0.299 1.320 VEv1 -0.477 -1.134 1.526 0.106 0.727 -1.114 0.661 -0.465 0.634 -0.470 -1.923 -1.131 0.098 1.179 0.628 1.153 VEv2 0.265 1.072 -1.267 -0.284 -0.626 1.093 -0.667 0.275 -0.688 0.265 2.418 1.072 -0.294 -0.967 -0.688 -0.978 VRv1 -0.570 -1.064 1.855 -0.067 0.834 -1.065 0.392 -0.582 0.393 -0.570 -1.530 -1.060 -0.084 1.352 0.416 1.350 VRv2 -0.339 -1.001 1.376 0.077 1.220 -1.001 0.259 -0.384 0.259 -0.339 -2.131 -0.982 0.032 1.324 0.311 1.318 VEe1 -0.486 -1.124 1.578 0.100 0.726 -1.129 0.616 -0.473 0.633 -0.478 -1.886 -1.137 0.105 1.179 0.628 1.149 VEe2 0.252 1.072 -1.243 -0.288 -0.620 1.061 -0.693 0.262 -0.682 0.262 2.462 1.051 -0.288 -0.952 -0.682 -0.973 VRe1 -0.569 -1.064 1.852 -0.066 0.835 -1.062 0.401 -0.583 0.392 -0.568 -1.537 -1.060 -0.085 1.348 0.415 1.352 VRe2 -0.335 -0.989 1.365 0.077 1.222 -0.983 0.280 -0.381 0.253 -0.335 -2.160 -0.976 0.025 1.313 0.306 1.320 91 VEp1 -0.472 -1.131 1.521 0.101 0.730 -1.109 0.666 -0.462 0.636 -0.465 -1.925 -1.147 0.101 1.173 0.628 1.155 VEp2 0.267 1.075 -1.277 -0.293 -0.624 1.106 -0.666 0.277 -0.687 0.277 2.412 1.054 -0.293 -0.966 -0.687 -0.977 
VRp1 -0.571 -1.065 1.857 -0.064 0.835 -1.066 0.391 -0.583 0.393 -0.571 -1.530 -1.056 -0.085 1.347 0.416 1.351
VRp2 -0.346 -1.004 1.380 0.084 1.224 -1.010 0.260 -0.385 0.260 -0.346 -2.130 -0.965 0.032 1.315 0.312 1.321

Table B.2: Eigenvector matrix from 2D descriptors
Descriptor F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15
ZM1 0.067 0.009 -0.022 0.050 -0.004 -0.022 0.013 0.001 -0.008 0.012 -0.027 0.022 0.027 -0.023 0.021
ZM1V 0.042 0.023 0.122 0.081 0.015 -0.011 -0.003 0.116 0.009 -0.039 0.034 -0.107 0.015 -0.074 -0.025
ZM2 0.066 0.000 -0.018 0.071 -0.007 -0.016 0.027 0.001 -0.008 0.016 -0.035 0.027 0.027 -0.020 0.020
ZM2V 0.050 -0.009 0.074 0.053 0.009 0.039 -0.174 0.047 0.001 0.042 0.165 -0.050 0.054 -0.185 0.009
Qindex 0.054 0.035 -0.040 0.102 -0.048 -0.067 0.052 -0.015 -0.015 0.042 -0.064 0.048 0.053 -0.040 0.058
SNar 0.064 -0.034 -0.022 0.064 -0.022 0.032 0.016 0.032 -0.024 -0.011 0.006 0.002 0.025 0.006 -0.001
HNar 0.055 -0.057 -0.039 0.051 -0.066 0.058 0.041 0.076 -0.041 -0.018 0.065 0.022 0.049 0.010 -0.007
GNar 0.061 -0.039 -0.032 0.041 -0.059 0.051 0.044 0.068 -0.042 -0.006 0.042 0.000 0.038 0.011 -0.002
Xt -0.025 0.067 0.022 -0.104 -0.067 0.174 0.018 -0.061 -0.125 0.026 0.004 0.078 0.078 -0.035 -0.009
Dz 0.068 0.009 0.009 0.022 0.040 -0.013 0.011 0.048 -0.026 -0.013 0.001 -0.004 0.005 -0.002 -0.017
Ram 0.039 0.099 -0.001 0.013 0.017 -0.129 -0.022 -0.071 0.005 0.072 -0.046 -0.017 0.001 -0.050 0.026
Pol 0.062 -0.043 0.016 0.073 0.037 0.023 -0.017 -0.003 0.003 0.039 -0.004 0.066 -0.001 -0.039 -0.019
LPRS 0.068 -0.014 -0.006 0.025 0.033 0.012 -0.017 0.003 0.002 -0.017 -0.010 0.007 0.010 -0.010 -0.003
VDA 0.067 -0.022 -0.001 0.019 0.052 0.022 -0.030 -0.005 0.010 -0.036 -0.013 0.001 0.003 -0.002 -0.009
MSD -0.065 -0.037 0.029 0.006 0.050 0.028 -0.003 -0.014 0.054 -0.028 -0.043 0.001 -0.003 0.013 0.009
SMTI 0.064 -0.026 -0.007 0.077 0.026 0.025 -0.029 -0.017 -0.018 -0.033 -0.028 0.028 0.001 -0.006 0.001
SMTIV 0.058 -0.008 0.058 0.100 0.040 0.024 -0.030
0.030 -0.039 -0.039 0.007 -0.008 -0.022 -0.019 -0.024 GMTI 0.060 -0.036 -0.010 0.098 0.012 0.037 -0.032 -0.013 -0.033 -0.036 -0.026 0.030 -0.004 0.001 -0.003 GMTIV 0.053 -0.012 0.051 0.120 0.047 0.041 -0.042 0.032 -0.085 -0.042 0.000 0.031 -0.074 0.026 -0.051 Xu 0.069 -0.010 -0.006 -0.002 0.028 0.018 -0.011 0.018 -0.003 -0.014 0.001 -0.005 0.004 -0.004 -0.007 SPI 0.045 0.063 0.051 -0.080 0.114 -0.009 -0.039 -0.047 0.037 0.047 -0.115 -0.060 -0.040 -0.034 0.024 W 0.065 -0.025 0.000 0.056 0.050 0.025 -0.039 -0.025 -0.003 -0.042 -0.027 0.026 -0.001 -0.010 -0.003 WA 0.063 -0.026 0.011 -0.050 0.072 0.051 -0.028 0.024 0.018 -0.049 -0.007 -0.036 -0.020 0.019 -0.020 Har 0.068 -0.007 -0.016 0.050 0.003 -0.001 0.003 0.006 -0.010 0.002 -0.014 0.016 0.021 -0.016 0.009 Har2 0.068 -0.008 -0.013 0.053 0.012 -0.001 -0.003 -0.001 -0.007 -0.001 -0.018 0.022 0.019 -0.019 0.009 QW 0.064 -0.017 0.012 0.016 0.088 0.016 -0.041 -0.037 0.028 -0.059 -0.029 0.020 -0.001 -0.011 -0.005 TI1 -0.003 -0.021 -0.036 0.195 -0.135 0.024 0.042 0.015 -0.116 0.033 -0.040 0.042 -0.010 0.021 0.022 TI2 0.008 -0.041 0.076 -0.110 0.171 0.110 0.000 0.012 0.085 -0.069 -0.076 -0.020 -0.028 0.025 -0.015 HyDp 0.062 -0.034 0.005 0.054 0.067 0.039 -0.056 -0.035 -0.002 -0.069 -0.032 0.024 -0.016 0.003 -0.012 RHyDp 0.068 -0.007 -0.015 0.051 0.005 -0.002 0.002 0.004 -0.009 0.002 -0.016 0.018 0.021 -0.017 0.009 w 0.059 -0.033 -0.016 0.110 -0.010 0.034 -0.028 -0.005 -0.046 -0.014 -0.025 0.031 -0.001 -0.005 0.001 ww 0.054 -0.038 -0.017 0.125 -0.016 0.045 -0.045 -0.007 -0.066 -0.018 -0.024 0.041 -0.019 0.000 -0.006 Rww 0.036 0.047 0.036 -0.133 0.135 -0.044 -0.039 -0.039 0.074 -0.025 -0.013 0.011 -0.025 -0.030 -0.012 D/D 0.063 0.006 0.016 -0.039 0.101 -0.015 -0.040 -0.032 0.050 -0.032 -0.007 0.026 0.008 -0.035 -0.004 Wap 0.056 -0.033 -0.017 0.126 -0.021 0.037 -0.025 -0.004 -0.061 -0.012 -0.032 0.035 -0.011 0.001 0.001 WhetZ 0.062 -0.055 0.006 0.005 0.045 0.018 -0.053 -0.044 0.010 -0.041 0.029 0.076 -0.033 -0.019 -0.026 93 
Whetm 0.062 -0.055 0.006 0.005 0.044 0.018 -0.052 -0.044 0.010 -0.041 0.029 0.076 -0.033 -0.019 -0.026 Whetv 0.063 -0.018 0.029 0.014 0.090 -0.001 -0.010 0.019 0.006 -0.076 0.091 0.018 0.071 -0.016 0.019 Whete 0.065 -0.036 -0.009 0.020 0.051 0.010 -0.013 -0.038 0.014 -0.059 -0.018 0.064 -0.016 0.013 -0.019 Whetp 0.060 -0.025 0.042 0.002 0.095 0.002 -0.039 0.027 0.000 -0.068 0.140 0.020 0.077 -0.040 0.018 J 0.045 0.076 -0.005 -0.119 0.032 -0.072 -0.038 -0.012 0.003 0.033 0.042 0.032 -0.017 -0.047 -0.015 JhetZ 0.013 0.116 -0.099 0.004 0.071 0.012 -0.010 0.028 -0.004 0.008 -0.026 -0.011 0.043 0.015 -0.020 Jhetm 0.011 0.115 -0.102 0.003 0.072 0.013 -0.012 0.028 -0.007 0.008 -0.023 -0.008 0.042 0.015 -0.021 Jhetv 0.051 0.058 -0.055 -0.018 -0.053 0.045 -0.090 -0.077 0.102 0.062 -0.109 0.053 -0.108 -0.010 -0.073 Jhete 0.043 0.101 0.005 -0.039 0.024 -0.007 -0.112 0.029 0.069 0.027 0.034 -0.045 0.039 -0.075 -0.018 Jhetp 0.045 0.076 -0.085 0.004 -0.028 0.041 -0.028 -0.065 0.084 0.045 -0.141 0.035 -0.082 0.021 -0.056 MAXDN 0.009 0.118 0.026 0.040 0.074 -0.087 0.115 0.049 -0.084 -0.047 -0.094 -0.032 -0.003 0.034 -0.010 MAXDP 0.018 0.067 0.137 0.031 0.028 -0.078 0.080 -0.066 -0.051 -0.123 0.040 0.029 -0.083 -0.128 0.020 DELS 0.021 0.090 0.078 0.035 0.059 -0.070 0.173 0.085 -0.042 -0.051 -0.077 -0.042 -0.070 0.086 -0.079 TIE 0.058 0.025 0.029 -0.002 0.037 -0.087 0.143 -0.049 0.047 0.008 -0.072 0.093 -0.050 0.021 -0.026 S0K 0.053 0.011 0.089 0.043 0.090 0.014 -0.056 -0.086 0.016 0.057 -0.034 -0.029 0.034 -0.016 -0.065 S1K 0.054 0.037 -0.031 -0.089 0.112 -0.027 -0.013 -0.004 -0.003 -0.019 0.002 0.028 -0.006 0.007 -0.021 S2K 0.019 -0.062 -0.020 -0.114 0.144 0.135 0.047 0.058 0.013 -0.090 -0.077 -0.013 -0.043 0.077 -0.014 S3K 0.011 -0.040 0.072 -0.123 0.144 0.017 0.073 0.059 0.148 0.106 0.037 0.198 0.067 -0.128 -0.005 PHI 0.003 -0.036 -0.031 -0.141 0.172 0.121 0.058 0.032 0.007 -0.058 -0.078 -0.028 -0.070 0.058 -0.013 BLI -0.016 0.024 -0.108 -0.027 0.073 0.048 0.241 -0.169 -0.053 
0.018 0.002 -0.037 -0.004 -0.037 0.073 PW2 0.056 0.064 -0.011 -0.076 -0.046 0.011 0.002 0.015 -0.083 0.032 0.024 -0.008 -0.030 -0.010 -0.008 PW3 0.061 -0.048 0.018 0.025 0.029 -0.003 0.072 0.103 0.028 0.079 0.004 0.063 -0.028 0.001 -0.042 PJI2 -0.012 0.089 0.036 -0.070 0.025 0.152 -0.073 -0.132 -0.086 -0.145 0.001 -0.131 0.128 -0.028 0.083 CSI 0.063 -0.047 -0.008 0.045 0.030 0.042 -0.033 0.008 -0.003 -0.031 0.002 0.016 0.006 -0.003 -0.014 ECC 0.064 -0.042 0.001 0.014 0.055 0.042 -0.038 0.005 0.013 -0.037 -0.002 0.010 -0.001 -0.003 -0.017 AECC 0.059 -0.053 0.009 -0.039 0.064 0.070 -0.022 0.046 0.023 -0.031 0.008 0.000 -0.009 0.004 -0.022 DECC 0.030 0.029 0.089 -0.056 0.119 0.160 -0.015 -0.045 -0.027 -0.023 -0.154 -0.060 -0.028 -0.003 0.035 MDDD 0.051 0.012 0.057 -0.050 0.145 0.034 -0.026 -0.036 0.046 -0.045 -0.129 -0.069 -0.072 0.035 -0.007 UNIP 0.065 -0.044 -0.015 0.025 0.026 0.032 -0.020 0.030 0.012 -0.037 0.027 -0.012 0.036 -0.008 0.002 CENT 0.058 0.023 0.034 0.042 0.105 -0.005 -0.046 -0.096 -0.004 -0.041 -0.116 0.069 -0.077 -0.001 -0.016 VAR 0.056 0.005 0.047 0.017 0.120 0.025 -0.088 -0.079 -0.001 -0.016 -0.082 0.039 -0.075 -0.019 -0.026 BAC 0.007 0.073 0.023 -0.154 0.110 -0.102 -0.068 -0.043 0.070 0.030 0.011 0.060 -0.044 -0.060 -0.020 Lop 0.025 0.026 0.077 -0.095 0.125 0.154 -0.014 -0.052 0.015 -0.039 -0.154 -0.115 0.008 0.001 0.049 ICR 0.032 0.020 0.078 -0.054 0.128 0.158 -0.042 -0.065 -0.025 -0.070 -0.150 -0.064 -0.039 0.016 0.025 MWC01 0.068 -0.012 -0.017 0.045 -0.002 0.007 0.005 0.015 -0.014 -0.003 -0.005 0.006 0.020 -0.007 0.004 MWC02 0.068 0.021 -0.019 -0.020 -0.016 -0.001 0.004 0.024 -0.034 0.008 0.015 -0.008 0.000 -0.007 -0.001 MWC03 0.068 0.017 -0.016 -0.017 -0.017 0.003 0.011 0.032 -0.034 0.013 0.016 -0.004 -0.002 -0.005 -0.005 MWC04 0.067 0.027 -0.019 -0.024 -0.021 -0.003 0.006 0.024 -0.040 0.010 0.016 -0.008 -0.004 -0.008 0.000 94 MWC05 0.068 0.023 -0.017 -0.023 -0.021 0.000 0.011 0.030 -0.039 0.015 0.017 -0.004 -0.006 -0.006 -0.004 MWC06 0.067 
0.030 -0.018 -0.028 -0.024 -0.003 0.007 0.025 -0.043 0.012 0.017 -0.007 -0.007 -0.008 -0.001 MWC07 0.067 0.026 -0.017 -0.026 -0.023 0.000 0.011 0.029 -0.042 0.015 0.017 -0.004 -0.008 -0.006 -0.003 MWC08 0.067 0.031 -0.018 -0.030 -0.025 -0.002 0.008 0.025 -0.045 0.013 0.018 -0.006 -0.008 -0.008 -0.001 MWC09 0.067 0.029 -0.017 -0.028 -0.025 0.000 0.012 0.028 -0.044 0.016 0.017 -0.004 -0.009 -0.007 -0.003 MWC10 0.066 0.032 -0.018 -0.031 -0.026 -0.002 0.008 0.025 -0.047 0.014 0.018 -0.005 -0.009 -0.008 -0.001 TWC 0.068 0.020 -0.017 -0.014 -0.017 0.000 0.007 0.024 -0.036 0.010 0.013 -0.003 -0.002 -0.008 -0.001 SRW01 0.069 -0.006 -0.010 0.019 0.020 0.004 -0.009 0.009 -0.004 -0.006 -0.004 0.006 0.011 -0.012 -0.002 SRW02 0.068 -0.012 -0.017 0.045 -0.002 0.007 0.005 0.015 -0.014 -0.003 -0.005 0.006 0.020 -0.007 0.004 SRW04 0.067 0.014 -0.023 0.052 -0.005 -0.029 0.015 -0.003 -0.007 0.015 -0.032 0.026 0.029 -0.027 0.025 SRW06 0.064 0.022 -0.024 0.068 -0.006 -0.046 0.027 -0.013 0.000 0.032 -0.053 0.058 0.036 -0.045 0.040 SRW08 0.061 0.023 -0.024 0.085 -0.007 -0.049 0.039 -0.022 0.004 0.040 -0.075 0.079 0.041 -0.057 0.056 SRW10 0.059 0.024 -0.021 0.098 -0.008 -0.047 0.053 -0.030 0.005 0.042 -0.099 0.088 0.041 -0.063 0.071 MPC01 0.068 -0.012 -0.017 0.045 -0.002 0.007 0.005 0.015 -0.014 -0.003 -0.005 0.006 0.020 -0.007 0.004 MPC02 0.066 0.023 -0.024 0.053 -0.006 -0.041 0.018 -0.009 -0.005 0.021 -0.041 0.032 0.031 -0.033 0.032 piPC01 0.066 0.007 0.004 0.037 -0.040 0.056 -0.026 0.025 0.055 -0.017 -0.009 -0.014 0.013 0.015 -0.011 piPC02 0.066 0.032 -0.001 0.035 -0.044 0.023 -0.032 0.007 0.033 -0.002 -0.019 -0.038 -0.003 -0.020 -0.010 piPC03 0.061 -0.041 0.007 0.082 0.008 0.019 0.015 0.038 0.014 0.083 -0.019 0.002 0.040 -0.045 0.015 TPC 0.066 -0.025 -0.016 0.065 -0.004 0.022 -0.004 0.014 -0.022 -0.007 -0.010 0.005 0.014 -0.003 0.001 piID 0.061 -0.025 -0.007 0.095 -0.007 0.045 -0.056 0.002 -0.031 0.020 -0.038 -0.006 -0.003 -0.021 0.008 PCR 0.031 -0.005 0.033 0.138 -0.060 0.129 -0.171 
-0.016 0.082 0.052 -0.087 -0.018 -0.009 0.003 0.016 CID 0.068 -0.014 -0.014 0.034 0.008 0.011 -0.004 0.015 -0.010 -0.007 -0.002 0.006 0.015 -0.007 -0.001 BID 0.069 0.000 -0.013 0.024 0.012 0.000 -0.008 0.002 -0.011 -0.005 -0.006 0.008 0.012 -0.012 0.000 X0 0.068 0.014 -0.002 -0.010 0.044 -0.017 -0.023 -0.006 0.008 0.000 -0.011 0.007 0.002 -0.023 -0.001 X1 0.067 -0.027 -0.009 0.032 0.017 0.024 -0.004 0.021 -0.006 -0.014 0.002 0.005 0.010 -0.001 -0.009 X2 0.063 0.052 -0.031 0.006 -0.006 -0.053 -0.011 -0.019 -0.014 0.009 -0.021 0.005 0.030 -0.037 0.036 X3 0.062 -0.040 0.004 0.080 0.002 0.006 0.049 0.030 0.002 0.055 -0.022 0.067 -0.006 -0.008 -0.025 X0A -0.061 0.040 0.025 -0.025 0.056 -0.072 -0.051 -0.074 0.045 0.006 -0.038 -0.001 -0.037 -0.010 0.000 X1A -0.061 -0.044 0.020 0.055 0.054 -0.031 -0.009 -0.023 0.080 -0.009 -0.030 0.016 0.005 0.007 -0.003 X2A -0.017 0.061 0.021 -0.108 -0.071 0.194 0.020 -0.067 -0.123 0.007 0.017 0.047 0.104 -0.034 -0.001 X0v 0.061 0.025 -0.074 -0.032 0.049 -0.014 0.015 -0.055 0.027 -0.018 -0.012 0.002 0.050 -0.031 0.029 X1v 0.058 0.005 -0.063 0.013 0.011 -0.006 0.166 -0.039 0.018 -0.054 -0.071 -0.015 0.039 0.055 0.042 X2v 0.044 0.065 -0.094 0.012 0.007 -0.066 0.105 -0.016 0.012 -0.050 -0.130 0.033 0.045 0.092 0.026 X0Av -0.021 0.058 -0.132 -0.060 0.076 0.002 0.042 -0.173 0.001 0.014 0.060 -0.028 0.021 -0.083 0.045 X1Av -0.016 0.024 -0.108 -0.027 0.073 0.048 0.241 -0.170 -0.053 0.018 0.002 -0.036 -0.003 -0.036 0.073 X2Av -0.005 0.061 -0.113 -0.057 0.041 0.117 0.137 -0.123 -0.110 0.066 0.073 -0.006 -0.113 -0.025 0.022 X0sol 0.058 0.052 -0.059 -0.007 0.075 -0.005 -0.033 0.007 -0.014 0.002 -0.014 0.021 0.018 -0.005 -0.018 X1sol 0.064 0.012 -0.049 0.051 0.041 0.030 0.053 0.018 -0.014 -0.014 -0.032 -0.007 0.013 0.024 0.000 95 X2sol 0.039 0.092 -0.091 0.014 0.033 -0.046 -0.005 0.024 -0.022 -0.030 -0.087 0.013 0.082 0.046 0.000 XMOD 0.056 0.038 -0.062 0.060 0.063 0.026 0.086 0.033 -0.026 -0.016 -0.042 -0.018 0.013 0.036 0.001 RDCHI 0.065 -0.042 
-0.016 0.037 -0.002 0.042 0.013 0.042 -0.008 -0.023 0.013 -0.011 0.024 0.012 -0.007 RDSQ 0.066 -0.008 -0.015 0.070 0.003 -0.003 0.005 -0.002 -0.011 0.002 -0.027 0.025 0.022 -0.018 0.015 ISIZ 0.056 -0.066 0.015 -0.067 -0.005 -0.042 0.056 -0.071 0.035 -0.020 0.008 0.037 -0.012 -0.014 0.039 IAC 0.055 -0.028 0.084 -0.005 0.004 -0.040 0.142 -0.043 0.001 -0.025 0.026 -0.006 -0.052 -0.018 0.046 AAC -0.014 0.030 0.122 0.097 0.010 0.075 0.198 -0.011 0.012 0.029 0.097 -0.031 -0.187 -0.026 0.040 IDE 0.060 -0.026 0.024 -0.066 0.046 0.095 0.002 0.052 -0.011 -0.009 0.008 -0.007 -0.012 -0.002 -0.015 IDM 0.069 0.009 -0.011 -0.024 0.002 0.011 -0.002 0.028 -0.024 0.002 0.015 -0.009 -0.003 -0.004 -0.008 IDDE 0.050 0.020 0.078 0.008 0.083 0.069 -0.018 -0.045 0.022 0.209 -0.129 -0.073 0.038 0.006 -0.102 IDDM 0.069 0.004 -0.012 -0.016 0.006 0.009 -0.002 0.029 -0.016 0.000 0.014 -0.009 0.002 -0.005 -0.007 IDET 0.063 -0.034 0.006 0.054 0.056 0.037 -0.044 -0.021 -0.002 -0.041 -0.021 0.023 -0.006 -0.010 -0.007 IDMT 0.063 -0.026 0.002 0.078 0.048 0.027 -0.044 -0.036 -0.014 -0.045 -0.035 0.043 -0.008 -0.014 0.001 IVDE 0.033 0.063 0.103 -0.037 0.074 0.089 0.039 -0.014 -0.002 0.191 -0.101 -0.081 0.070 -0.031 -0.032 IVDM 0.069 -0.015 -0.010 -0.003 0.006 0.026 0.007 0.040 -0.012 -0.005 0.015 -0.009 0.009 0.001 -0.009 HVcpx 0.056 -0.062 0.017 -0.041 0.049 0.092 0.004 0.074 0.012 -0.014 0.020 0.000 -0.009 0.010 -0.028 HDcpx 0.065 0.025 -0.007 -0.061 -0.019 0.042 0.005 0.034 -0.057 0.006 0.024 -0.010 -0.013 -0.003 -0.009 Uindex 0.066 0.005 0.006 -0.056 0.058 0.025 -0.018 -0.007 0.023 -0.010 0.019 0.022 0.046 -0.040 -0.004 Vindex -0.028 0.079 0.010 -0.085 -0.073 0.146 0.011 -0.107 -0.111 0.030 0.012 0.097 0.125 -0.042 -0.024 Xindex -0.011 0.082 0.019 -0.133 -0.056 0.146 0.017 -0.055 -0.111 0.043 0.025 0.098 0.087 -0.052 -0.017 Yindex 0.043 -0.013 0.060 -0.055 0.051 -0.081 0.110 0.110 0.059 0.218 0.034 0.224 -0.098 -0.100 -0.045 IC0 -0.014 0.030 0.122 0.097 0.010 0.075 0.198 -0.011 0.012 0.029 0.097 
-0.031 -0.187 -0.026 0.040 TIC0 0.055 -0.028 0.084 -0.005 0.004 -0.040 0.142 -0.043 0.001 -0.025 0.026 -0.006 -0.052 -0.018 0.046 SIC0 -0.047 0.044 0.034 0.096 0.028 0.111 0.106 -0.020 0.048 0.056 0.103 0.030 -0.160 -0.001 0.027 CIC0 0.054 -0.065 0.002 -0.086 -0.027 -0.070 -0.018 -0.031 -0.016 -0.021 -0.030 -0.015 0.060 -0.021 0.017 BIC0 -0.014 0.077 0.014 -0.022 -0.006 0.213 0.120 -0.016 -0.104 0.091 0.128 0.047 -0.110 -0.069 0.015 IC1 0.011 -0.012 0.158 0.097 0.035 0.024 -0.032 -0.005 -0.077 0.176 -0.008 -0.032 -0.015 0.017 0.012 TIC1 0.054 -0.032 0.099 0.018 0.032 -0.026 -0.012 -0.050 -0.050 0.104 -0.008 0.006 -0.011 -0.020 0.033 SIC1 -0.033 0.010 0.111 0.128 0.043 0.071 -0.030 -0.004 -0.016 0.169 0.019 0.015 -0.037 0.046 0.012 CIC1 0.045 -0.050 -0.056 -0.118 -0.047 -0.063 0.069 -0.033 0.035 -0.123 0.006 -0.006 0.008 -0.041 0.023 BIC1 0.002 0.060 0.079 -0.020 -0.008 0.206 0.029 0.000 -0.176 0.176 0.065 0.044 -0.003 -0.030 -0.003 IC2 0.024 -0.036 0.160 0.017 0.044 -0.010 0.017 -0.046 -0.017 0.115 -0.043 -0.092 0.161 0.107 -0.074 TIC2 0.051 -0.038 0.098 -0.026 0.063 -0.028 0.016 -0.079 0.007 0.080 -0.044 -0.042 0.076 0.059 -0.027 SIC2 -0.004 -0.035 0.169 0.051 0.034 0.015 0.020 -0.043 0.011 0.122 -0.033 -0.072 0.172 0.128 -0.079 CIC2 0.026 -0.019 -0.139 -0.079 -0.077 -0.038 0.031 0.015 0.006 -0.145 0.051 0.079 -0.186 -0.154 0.118 BIC2 0.018 0.005 0.147 -0.047 -0.007 0.123 0.057 -0.035 -0.111 0.136 0.006 -0.037 0.170 0.063 -0.080 ATS1m 0.053 0.059 -0.072 0.032 0.050 0.024 0.094 0.035 -0.045 0.014 -0.005 -0.032 -0.001 0.007 0.012 ATS2m 0.042 0.085 -0.092 -0.010 0.049 0.016 0.011 0.021 -0.073 0.039 0.051 -0.002 -0.003 -0.027 -0.007 96 ATS3m 0.062 -0.041 0.031 0.028 0.047 -0.004 0.055 0.081 0.028 0.088 0.006 0.069 -0.026 -0.004 -0.058 ATS1v 0.067 0.001 -0.044 -0.002 -0.019 0.032 0.005 -0.014 -0.015 0.012 0.001 0.014 -0.008 0.008 -0.009 ATS2v 0.066 0.018 -0.056 0.004 -0.014 -0.015 -0.019 -0.019 -0.010 0.021 -0.003 0.022 -0.022 0.009 -0.014 ATS3v 0.062 -0.057 -0.001 
0.029 0.031 0.015 -0.014 -0.011 0.063 0.047 0.048 0.049 0.082 -0.105 0.032 ATS1e 0.068 0.012 -0.017 0.005 0.014 0.004 0.019 0.057 -0.036 0.001 0.025 -0.021 0.015 -0.012 -0.003 ATS2e 0.064 0.049 -0.025 -0.019 -0.002 -0.020 0.006 0.031 -0.050 0.011 0.027 -0.020 0.015 -0.023 0.004 ATS3e 0.062 -0.041 0.031 0.028 0.047 -0.004 0.055 0.081 0.028 0.087 0.006 0.069 -0.025 -0.004 -0.057 ATS1p 0.065 0.013 -0.059 0.007 -0.011 0.036 0.031 -0.022 -0.018 0.013 -0.014 0.008 -0.012 0.014 -0.004 ATS2p 0.062 0.030 -0.077 0.009 -0.001 -0.010 -0.013 -0.021 -0.015 0.024 -0.007 0.024 -0.022 0.013 -0.017 ATS3p 0.061 -0.058 -0.003 0.028 0.030 0.017 -0.022 -0.017 0.065 0.043 0.052 0.048 0.089 -0.116 0.042 MATS1m 0.041 -0.085 -0.014 -0.071 -0.116 0.038 -0.016 -0.016 0.053 0.035 -0.081 -0.068 0.042 0.044 -0.041 MATS2m 0.028 -0.056 -0.138 0.012 0.043 0.020 0.001 -0.060 0.099 0.075 0.115 -0.169 0.000 -0.049 0.008 MATS3m 0.014 -0.086 -0.059 -0.075 -0.052 0.050 0.031 0.243 0.037 0.107 -0.143 -0.132 -0.114 0.024 0.096 MATS1v 0.024 -0.005 -0.150 -0.064 -0.017 0.118 -0.004 -0.040 -0.019 0.120 0.021 -0.042 -0.019 -0.008 -0.056 MATS2v 0.023 -0.024 -0.160 0.018 0.062 0.027 0.015 -0.052 0.075 0.072 0.098 -0.149 0.002 -0.035 0.004 MATS3v 0.003 -0.091 -0.055 -0.101 -0.038 0.059 -0.088 0.216 0.011 0.121 -0.030 -0.072 -0.108 -0.028 0.048 MATS1e 0.045 -0.077 -0.012 -0.055 -0.118 0.032 0.031 -0.019 0.062 0.023 -0.119 -0.085 0.046 0.064 -0.027 MATS2e 0.029 -0.054 -0.138 0.017 0.043 0.017 0.016 -0.061 0.102 0.071 0.103 -0.175 0.001 -0.042 0.012 MATS3e 0.000 -0.090 -0.053 -0.105 -0.033 0.059 -0.115 0.203 0.004 0.121 -0.001 -0.055 -0.104 -0.041 0.035 MATS1p 0.038 -0.089 -0.015 -0.081 -0.112 0.043 -0.053 -0.014 0.046 0.043 -0.050 -0.053 0.039 0.027 -0.051 MATS2p 0.031 -0.051 -0.137 0.022 0.041 0.015 0.034 -0.062 0.105 0.066 0.088 -0.181 0.003 -0.035 0.018 MATS3p 0.011 -0.089 -0.059 -0.084 -0.049 0.053 -0.003 0.239 0.030 0.112 -0.113 -0.117 -0.114 0.009 0.084 GATS1m -0.017 0.116 -0.040 0.049 0.103 -0.031 0.079 
0.084 -0.035 -0.059 -0.031 -0.002 0.061 0.048 0.039 GATS2m -0.010 0.046 0.156 -0.071 -0.078 -0.004 0.018 0.037 -0.069 -0.039 -0.039 0.101 0.043 -0.041 0.061 GATS1v -0.018 0.016 0.161 0.034 0.026 -0.094 0.037 0.112 0.016 -0.105 0.019 -0.014 0.060 -0.004 0.085 GATS2v -0.010 0.045 0.156 -0.072 -0.078 -0.004 0.015 0.037 -0.070 -0.038 -0.037 0.102 0.043 -0.042 0.060 GATS1e -0.030 0.099 -0.048 0.003 0.119 -0.011 -0.069 0.097 -0.066 -0.023 0.092 0.054 0.051 -0.015 -0.004 GATS2e -0.009 0.048 0.156 -0.067 -0.079 -0.006 0.029 0.036 -0.067 -0.042 -0.049 0.096 0.044 -0.035 0.064 GATS1p -0.019 0.116 -0.042 0.045 0.106 -0.029 0.064 0.086 -0.039 -0.055 -0.018 0.004 0.061 0.041 0.035 GATS2p -0.012 0.043 0.155 -0.077 -0.077 -0.001 -0.003 0.038 -0.073 -0.033 -0.023 0.108 0.041 -0.049 0.055 EPS0 0.066 -0.024 -0.003 0.008 -0.002 0.082 0.013 0.028 -0.038 -0.009 0.005 0.031 0.024 -0.003 -0.013 EPS1 0.068 -0.006 -0.020 0.033 -0.015 0.027 0.008 0.014 -0.029 -0.006 0.001 0.000 0.035 -0.009 0.007 EEig01x 0.056 0.078 0.008 0.003 -0.055 0.004 -0.016 -0.011 0.082 -0.008 -0.029 0.043 0.012 0.046 0.019 EEig02x 0.062 -0.044 0.022 0.046 0.008 0.037 0.023 0.063 0.088 0.024 0.000 -0.029 0.007 -0.054 -0.051 EEig01d 0.031 0.112 0.023 0.029 -0.046 0.027 0.062 -0.006 0.167 -0.086 0.010 -0.004 0.025 0.023 0.010 EEig02d 0.049 -0.007 0.045 0.073 0.014 0.067 0.098 0.129 0.199 -0.024 0.052 -0.071 -0.105 0.037 -0.116 EEig03d 0.038 -0.025 -0.045 0.144 -0.020 0.053 0.081 -0.025 0.098 -0.081 0.045 -0.059 0.360 -0.127 -0.160 EEig04d -0.008 0.028 0.028 0.163 0.011 0.094 -0.107 -0.018 -0.140 -0.198 -0.108 -0.078 -0.074 -0.211 0.432 97 EEig01r 0.062 0.051 0.008 -0.038 -0.038 -0.061 -0.004 -0.011 -0.015 0.008 -0.023 0.008 -0.014 -0.041 -0.004 EEig02r 0.061 -0.057 0.029 0.029 0.012 -0.006 0.053 0.052 0.073 0.011 -0.005 -0.030 0.042 -0.027 0.004 EEig03r 0.049 -0.068 -0.033 0.097 -0.055 0.048 0.041 -0.044 0.026 -0.031 0.023 -0.057 0.100 0.000 -0.040 ESpm02u 0.067 0.032 -0.019 -0.026 -0.022 -0.009 0.000 0.017 -0.041 
0.011 0.018 -0.011 -0.006 -0.009 -0.001 ESpm03u 0.041 0.096 0.009 0.015 0.013 -0.124 -0.045 -0.071 -0.010 0.077 -0.014 -0.069 -0.015 -0.032 0.000 ESpm04u 0.064 0.045 -0.023 -0.026 -0.009 -0.060 -0.008 0.015 -0.024 0.015 0.005 -0.005 -0.035 -0.013 0.006 ESpm05u 0.044 0.090 0.015 0.025 0.016 -0.114 -0.057 -0.068 -0.017 0.087 -0.005 -0.080 -0.014 -0.023 -0.021 ESpm06u 0.062 0.049 -0.024 -0.026 -0.005 -0.082 -0.011 0.011 -0.017 0.014 0.004 -0.011 -0.048 -0.012 0.008 ESpm07u 0.045 0.088 0.017 0.029 0.017 -0.110 -0.062 -0.066 -0.019 0.091 -0.002 -0.084 -0.011 -0.023 -0.024 ESpm08u 0.061 0.051 -0.023 -0.025 -0.004 -0.092 -0.012 0.008 -0.014 0.013 0.005 -0.019 -0.056 -0.009 0.008 ESpm09u 0.045 0.087 0.017 0.030 0.017 -0.108 -0.064 -0.065 -0.020 0.093 -0.001 -0.086 -0.010 -0.024 -0.025 ESpm10u 0.060 0.051 -0.022 -0.024 -0.003 -0.097 -0.014 0.007 -0.013 0.012 0.006 -0.024 -0.061 -0.007 0.007 ESpm11u 0.045 0.086 0.017 0.031 0.018 -0.108 -0.065 -0.065 -0.020 0.094 0.000 -0.087 -0.009 -0.024 -0.025 ESpm12u 0.060 0.052 -0.022 -0.023 -0.003 -0.101 -0.015 0.007 -0.012 0.012 0.007 -0.028 -0.064 -0.006 0.006 ESpm13u 0.046 0.086 0.017 0.031 0.018 -0.107 -0.066 -0.065 -0.020 0.095 0.000 -0.087 -0.008 -0.025 -0.025 ESpm14u 0.060 0.052 -0.021 -0.022 -0.002 -0.103 -0.016 0.006 -0.012 0.012 0.008 -0.030 -0.066 -0.005 0.005 ESpm15u 0.046 0.086 0.017 0.031 0.018 -0.107 -0.066 -0.064 -0.020 0.095 0.000 -0.087 -0.008 -0.025 -0.025 ESpm01x 0.066 0.007 0.004 0.037 -0.040 0.056 -0.026 0.025 0.055 -0.017 -0.009 -0.014 0.013 0.015 -0.011 ESpm02x 0.065 0.034 0.000 0.002 -0.054 0.043 -0.027 0.019 0.053 -0.017 0.006 0.021 0.011 0.056 -0.002 ESpm03x 0.061 0.054 0.003 -0.011 -0.062 0.040 -0.037 0.006 0.054 -0.012 0.004 0.028 0.006 0.063 0.004 ESpm04x 0.059 0.062 0.006 -0.019 -0.065 0.038 -0.034 0.003 0.054 -0.012 0.004 0.032 0.002 0.067 0.005 ESpm05x 0.058 0.067 0.008 -0.024 -0.066 0.038 -0.032 0.001 0.053 -0.011 0.003 0.034 0.000 0.068 0.005 ESpm06x 0.057 0.070 0.009 -0.027 -0.067 0.038 -0.030 0.001 
0.051 -0.011 0.003 0.035 -0.001 0.069 0.006 ESpm07x 0.057 0.071 0.010 -0.029 -0.067 0.039 -0.028 0.001 0.050 -0.010 0.003 0.036 -0.002 0.069 0.006 ESpm08x 0.056 0.072 0.010 -0.030 -0.067 0.039 -0.027 0.001 0.049 -0.010 0.003 0.037 -0.002 0.068 0.006 ESpm09x 0.056 0.072 0.011 -0.031 -0.067 0.040 -0.026 0.001 0.048 -0.010 0.003 0.037 -0.001 0.068 0.006 ESpm10x 0.056 0.073 0.011 -0.032 -0.067 0.040 -0.025 0.001 0.047 -0.010 0.003 0.037 -0.001 0.068 0.006 ESpm11x 0.056 0.073 0.011 -0.032 -0.067 0.041 -0.025 0.001 0.047 -0.010 0.004 0.037 -0.001 0.068 0.006 ESpm12x 0.056 0.073 0.011 -0.032 -0.067 0.041 -0.025 0.001 0.046 -0.010 0.004 0.037 -0.001 0.067 0.006 ESpm13x 0.056 0.073 0.011 -0.033 -0.067 0.041 -0.024 0.001 0.046 -0.010 0.004 0.038 -0.001 0.067 0.006 ESpm14x 0.056 0.073 0.011 -0.033 -0.068 0.042 -0.024 0.001 0.045 -0.010 0.004 0.038 -0.001 0.067 0.006 ESpm15x 0.056 0.073 0.011 -0.033 -0.068 0.042 -0.024 0.002 0.045 -0.009 0.004 0.038 0.000 0.067 0.006 ESpm01d -0.004 0.116 0.058 0.073 0.015 0.064 0.049 0.077 0.116 -0.106 0.059 -0.100 -0.021 -0.051 -0.088 ESpm02d 0.055 0.066 0.009 0.010 -0.060 0.063 0.022 0.018 0.122 -0.053 0.053 0.022 -0.014 0.064 -0.009 ESpm03d 0.016 0.129 0.059 0.045 -0.015 0.027 0.000 -0.008 0.085 0.017 0.023 -0.023 -0.015 -0.049 0.050 ESpm04d 0.042 0.098 0.012 0.004 -0.058 0.056 0.026 -0.002 0.147 -0.067 0.068 0.007 -0.020 0.035 -0.010 ESpm05d 0.020 0.127 0.057 0.043 -0.021 0.031 -0.012 -0.011 0.081 0.034 0.019 -0.018 -0.026 -0.048 0.053 ESpm06d 0.039 0.105 0.015 0.002 -0.056 0.053 0.028 -0.006 0.145 -0.063 0.074 0.006 -0.023 0.030 -0.005 98 ESpm07d 0.021 0.126 0.056 0.042 -0.022 0.034 -0.015 -0.011 0.079 0.039 0.018 -0.019 -0.030 -0.047 0.054 ESpm08d 0.038 0.107 0.017 0.001 -0.055 0.053 0.029 -0.006 0.141 -0.060 0.075 0.006 -0.024 0.027 -0.002 ESpm09d 0.021 0.126 0.056 0.042 -0.023 0.036 -0.016 -0.011 0.078 0.041 0.018 -0.019 -0.031 -0.047 0.055 ESpm10d 0.037 0.108 0.018 0.001 -0.055 0.053 0.029 -0.006 0.139 -0.058 0.075 0.007 -0.024 0.025 
0.000 ESpm11d 0.022 0.126 0.056 0.041 -0.023 0.037 -0.017 -0.011 0.077 0.042 0.018 -0.019 -0.031 -0.047 0.055 ESpm12d 0.037 0.109 0.019 0.001 -0.055 0.053 0.029 -0.006 0.137 -0.056 0.075 0.007 -0.024 0.024 0.001 ESpm13d 0.022 0.126 0.055 0.041 -0.023 0.037 -0.017 -0.011 0.077 0.043 0.017 -0.019 -0.031 -0.048 0.055 ESpm14d 0.037 0.109 0.020 0.001 -0.054 0.053 0.028 -0.006 0.136 -0.055 0.075 0.007 -0.024 0.023 0.001 ESpm15d 0.022 0.126 0.055 0.041 -0.024 0.038 -0.017 -0.011 0.077 0.043 0.017 -0.019 -0.031 -0.048 0.055 ESpm01r 0.066 -0.031 0.031 -0.004 -0.048 -0.005 0.005 0.013 0.015 -0.030 -0.003 -0.033 0.005 -0.027 -0.014 ESpm02r 0.068 0.019 -0.006 -0.027 -0.035 -0.012 -0.008 0.021 -0.023 -0.014 0.000 -0.017 0.010 -0.007 -0.015 ESpm03r 0.065 0.032 0.004 -0.044 -0.047 -0.030 -0.020 0.014 -0.025 -0.020 -0.014 -0.017 0.008 -0.002 -0.020 ESpm04r 0.064 0.039 0.006 -0.051 -0.048 -0.033 -0.020 0.012 -0.029 -0.017 -0.014 -0.016 0.000 -0.003 -0.020 ESpm05r 0.063 0.042 0.008 -0.055 -0.049 -0.036 -0.022 0.013 -0.030 -0.017 -0.018 -0.015 -0.002 -0.001 -0.021 ESpm06r 0.062 0.044 0.009 -0.057 -0.049 -0.037 -0.021 0.013 -0.032 -0.016 -0.020 -0.014 -0.003 0.000 -0.021 ESpm07r 0.062 0.046 0.010 -0.058 -0.050 -0.037 -0.021 0.014 -0.032 -0.016 -0.021 -0.013 -0.003 0.000 -0.021 ESpm08r 0.062 0.046 0.011 -0.059 -0.050 -0.037 -0.021 0.014 -0.033 -0.016 -0.021 -0.012 -0.003 0.000 -0.021 ESpm09r 0.062 0.047 0.011 -0.059 -0.050 -0.037 -0.021 0.014 -0.033 -0.016 -0.022 -0.012 -0.002 0.001 -0.021 ESpm10r 0.062 0.047 0.011 -0.059 -0.050 -0.037 -0.021 0.014 -0.033 -0.016 -0.022 -0.012 -0.002 0.001 -0.021 ESpm11r 0.062 0.047 0.012 -0.059 -0.050 -0.037 -0.021 0.015 -0.034 -0.016 -0.022 -0.012 -0.002 0.001 -0.021 ESpm12r 0.062 0.047 0.012 -0.060 -0.050 -0.037 -0.021 0.015 -0.034 -0.016 -0.022 -0.012 -0.002 0.001 -0.021 ESpm13r 0.061 0.047 0.012 -0.060 -0.051 -0.037 -0.021 0.015 -0.034 -0.016 -0.022 -0.012 -0.002 0.001 -0.021 ESpm14r 0.061 0.047 0.012 -0.060 -0.051 -0.037 -0.021 0.015 -0.034 -0.016 
-0.022 -0.011 -0.002 0.001 -0.021 ESpm15r 0.061 0.047 0.012 -0.060 -0.051 -0.037 -0.021 0.015 -0.034 -0.016 -0.022 -0.011 -0.002 0.001 -0.021 BEHm1 0.041 0.074 -0.080 0.064 0.018 0.033 0.138 -0.015 -0.024 0.017 -0.089 -0.039 -0.016 0.039 0.028 BEHm2 0.055 0.030 -0.055 -0.039 0.076 0.097 0.074 -0.016 -0.034 0.037 0.077 -0.041 -0.024 -0.076 0.028 BEHm3 0.058 0.018 -0.067 -0.060 0.011 -0.033 -0.009 0.011 -0.138 -0.002 0.030 -0.101 -0.009 -0.057 -0.066 BEHm4 0.055 0.022 -0.055 -0.028 0.086 -0.051 -0.094 0.088 -0.004 0.027 -0.047 0.077 0.089 0.138 0.129 BEHm5 0.059 -0.046 0.005 -0.018 0.028 -0.042 0.084 -0.066 0.083 0.018 0.071 0.157 0.057 -0.025 -0.113 BEHm6 0.053 -0.060 -0.044 0.031 -0.025 0.012 0.120 -0.060 0.043 -0.101 -0.158 0.028 -0.038 0.068 -0.003 BEHm7 0.044 -0.031 0.023 0.109 0.022 0.054 -0.164 -0.075 -0.133 -0.046 0.016 0.073 -0.174 0.018 -0.086 BEHm8 0.052 -0.035 0.033 0.067 0.048 0.013 -0.080 -0.097 -0.123 -0.083 0.106 0.120 -0.105 0.245 0.006 BELm1 0.029 -0.094 0.099 -0.003 -0.085 0.003 0.052 -0.076 0.009 0.036 0.031 -0.027 -0.069 -0.062 0.044 BELm2 0.051 -0.053 0.078 -0.056 -0.047 -0.052 0.040 -0.050 0.017 -0.054 -0.019 -0.145 0.025 -0.071 0.016 BELm3 0.056 -0.070 0.008 -0.059 -0.009 -0.036 0.029 -0.048 -0.012 0.039 0.066 0.063 0.100 0.009 0.153 BELm4 0.059 -0.062 0.003 -0.029 0.024 -0.016 0.018 -0.031 0.040 0.092 -0.066 0.091 -0.099 0.027 0.197 BEHv1 0.064 -0.026 0.017 0.022 -0.082 0.010 0.028 -0.040 -0.009 0.012 -0.089 0.000 -0.022 0.032 0.005 BEHv2 0.055 -0.050 0.075 -0.047 -0.052 -0.003 0.018 -0.033 0.051 -0.053 0.003 -0.078 0.041 -0.013 0.033 99 BEHv3 0.064 -0.040 -0.002 -0.034 -0.028 -0.034 0.040 -0.043 -0.053 0.001 0.026 -0.071 0.001 -0.082 0.002 BEHv4 0.064 -0.036 -0.005 -0.017 0.057 -0.017 -0.037 0.026 0.026 0.065 -0.009 0.034 0.020 0.042 0.231 BEHv5 0.056 -0.057 -0.007 -0.036 -0.017 -0.048 0.099 -0.072 0.037 0.044 -0.038 0.150 -0.032 0.006 -0.161 BEHv6 0.051 -0.090 -0.033 -0.028 -0.030 0.025 0.003 -0.041 -0.023 -0.082 -0.063 0.027 -0.042 -0.015 
-0.061 BEHv7 0.056 -0.032 0.035 0.057 -0.002 0.005 -0.106 -0.111 -0.088 -0.048 0.094 -0.045 -0.192 0.054 -0.123 BEHv8 0.059 -0.039 0.030 0.004 0.023 -0.036 0.001 -0.095 -0.040 0.035 0.146 -0.017 0.024 0.342 0.120 BELv1 0.046 -0.076 0.081 0.001 -0.084 -0.024 0.045 -0.036 -0.006 0.013 -0.025 -0.028 -0.030 -0.028 0.034 BELv2 0.051 -0.049 0.087 -0.036 -0.042 -0.043 0.071 -0.023 0.033 -0.069 -0.063 -0.111 0.027 -0.027 0.031 BELv3 0.056 -0.065 0.019 -0.068 -0.026 -0.046 0.026 -0.046 -0.030 0.027 0.089 -0.006 0.015 0.020 0.062 BELv4 0.061 -0.050 0.021 -0.017 0.053 -0.013 0.024 -0.022 0.068 0.081 -0.029 0.073 -0.026 -0.034 0.132 BEHe1 0.062 -0.033 0.041 0.027 -0.076 -0.017 0.060 -0.018 -0.003 -0.003 -0.089 -0.033 -0.006 0.016 0.026 BEHe2 0.049 -0.059 0.094 -0.041 -0.050 -0.026 0.025 -0.006 0.045 -0.064 -0.038 -0.069 0.038 -0.012 0.028 BEHe3 0.061 -0.045 0.023 -0.052 -0.037 -0.045 0.051 -0.029 -0.072 0.001 0.030 -0.091 -0.014 -0.054 0.001 BEHe4 0.063 -0.041 0.017 -0.023 0.057 -0.023 -0.014 0.025 0.022 0.087 0.012 0.048 -0.019 0.055 0.211 BEHe5 0.056 -0.058 0.035 -0.048 -0.018 -0.065 0.079 -0.047 -0.037 0.011 -0.027 0.059 -0.011 -0.099 -0.118 BEHe6 0.053 -0.071 0.021 -0.047 -0.048 -0.006 0.058 -0.003 -0.101 -0.081 -0.087 -0.097 -0.006 -0.072 -0.064 BEHe7 0.058 -0.034 0.055 -0.021 -0.032 -0.042 -0.013 -0.088 -0.080 -0.049 0.102 -0.141 -0.150 0.004 -0.113 BEHe8 0.058 -0.039 0.033 -0.033 0.006 -0.056 0.041 -0.050 -0.010 0.080 0.136 -0.083 0.009 0.310 0.193 BELe1 0.055 -0.064 0.050 0.001 -0.091 0.002 -0.001 -0.050 -0.013 0.024 -0.036 0.004 -0.032 -0.008 0.007 BELe2 0.060 -0.033 0.050 -0.032 -0.023 -0.022 0.084 -0.063 0.055 -0.057 -0.014 -0.136 0.024 -0.025 0.045 BELe3 0.054 -0.077 -0.036 -0.024 -0.013 -0.009 -0.022 -0.071 0.058 0.030 0.095 0.122 0.050 -0.043 0.085 BEHp1 0.064 -0.012 0.003 0.033 -0.073 0.008 0.065 -0.043 -0.011 0.007 -0.116 -0.011 -0.019 0.045 0.012 BEHp2 0.058 -0.043 0.066 -0.048 -0.046 0.001 0.031 -0.041 0.047 -0.050 0.007 -0.089 0.038 -0.018 0.035 BEHp3 0.065 
-0.036 -0.012 -0.034 -0.022 -0.033 0.039 -0.044 -0.052 0.002 0.027 -0.063 0.007 -0.079 0.006
BEHp4 0.064 -0.031 -0.014 -0.018 0.060 -0.020 -0.044 0.030 0.027 0.058 -0.020 0.038 0.032 0.046 0.228
BEHp5 0.056 -0.058 -0.005 -0.039 -0.024 -0.049 0.101 -0.070 0.021 0.043 -0.057 0.138 -0.045 0.004 -0.165
BEHp6 0.050 -0.092 -0.025 -0.040 -0.032 0.023 -0.013 -0.034 -0.042 -0.078 -0.047 0.015 -0.038 -0.035 -0.071
BEHp7 0.058 -0.032 0.037 0.041 -0.009 -0.008 -0.087 -0.117 -0.074 -0.047 0.112 -0.074 -0.190 0.062 -0.130
BEHp8 0.059 -0.039 0.029 -0.005 0.019 -0.042 0.012 -0.093 -0.027 0.050 0.149 -0.036 0.042 0.348 0.133
BELp1 0.041 -0.084 0.088 -0.009 -0.084 -0.019 0.007 -0.031 -0.006 0.021 0.009 -0.017 -0.036 -0.046 0.026
BELp2 0.052 -0.048 0.087 -0.036 -0.040 -0.037 0.072 -0.018 0.034 -0.070 -0.065 -0.103 0.027 -0.021 0.032
BELp3 0.056 -0.065 0.017 -0.069 -0.027 -0.046 0.016 -0.048 -0.026 0.028 0.105 -0.011 -0.002 0.028 0.050
BELp4 0.061 -0.049 0.020 -0.012 0.059 -0.009 0.014 -0.033 0.079 0.082 -0.029 0.084 -0.010 -0.072 0.097
LP1 0.066 0.035 -0.020 -0.016 -0.025 -0.017 0.016 0.019 -0.038 0.021 0.007 0.005 -0.002 -0.013 0.004
Eig1Z 0.061 -0.056 0.011 -0.044 0.045 0.007 -0.038 -0.030 0.017 -0.036 0.040 0.053 -0.040 -0.005 -0.033
Eig1m 0.061 -0.057 0.011 -0.044 0.045 0.007 -0.038 -0.030 0.017 -0.036 0.040 0.052 -0.041 -0.005 -0.032
Eig1v 0.060 -0.015 0.037 -0.035 0.104 -0.013 -0.005 0.055 -0.007 -0.064 0.120 -0.007 0.076 -0.004 0.022
Eig1e 0.065 -0.034 -0.010 -0.031 0.057 0.001 -0.001 -0.023 0.019 -0.053 -0.005 0.043 -0.023 0.027 -0.028
Eig1p 0.056 -0.023 0.052 -0.046 0.106 -0.010 -0.035 0.064 -0.014 -0.054 0.168 -0.005 0.081 -0.028 0.023
SEigZ -0.006 0.110 -0.095 0.019 0.099 -0.006 0.039 0.077 -0.066 -0.014 -0.032 0.006 0.032 0.065 -0.044
SEigm -0.006 0.109 -0.097 0.019 0.099 -0.005 0.037 0.076 -0.065 -0.014 -0.031 0.007 0.032 0.065 -0.044
SEigv -0.005 -0.027 -0.149 -0.016 -0.054 0.092 -0.112 -0.163 0.077 0.038 -0.018 0.066 0.045 -0.024 0.049
SEige -0.006 0.112 -0.007 0.016 0.123 -0.050 0.061 0.164 -0.107 -0.027 0.009 -0.013 0.003 0.065 -0.078
SEigp -0.005 -0.004 -0.164 -0.010 -0.032 0.086 -0.093 -0.144 0.060 0.034 -0.029 0.060 0.050 -0.014 0.040
AEigZ 0.060 -0.062 0.019 -0.041 0.035 0.006 -0.029 -0.036 0.024 -0.036 0.033 0.046 -0.041 -0.006 -0.025
AEigm 0.060 -0.063 0.020 -0.041 0.034 0.006 -0.029 -0.036 0.024 -0.036 0.033 0.046 -0.041 -0.006 -0.024
AEigv 0.059 -0.012 0.048 -0.033 0.105 -0.020 0.005 0.067 -0.013 -0.065 0.118 -0.012 0.070 -0.002 0.017
AEige 0.065 -0.037 -0.010 -0.031 0.053 0.002 -0.003 -0.028 0.022 -0.052 -0.006 0.044 -0.023 0.025 -0.026
AEigp 0.054 -0.022 0.067 -0.043 0.105 -0.019 -0.024 0.076 -0.020 -0.055 0.163 -0.011 0.073 -0.025 0.018
VEA1 0.069 -0.009 -0.017 0.010 -0.001 0.014 -0.001 0.029 -0.015 -0.002 0.009 -0.009 0.013 -0.003 -0.004
VEA2 -0.068 -0.017 0.005 0.046 -0.008 -0.013 0.004 -0.029 0.027 -0.001 -0.017 0.014 0.015 0.002 0.012
VRA1 0.068 -0.014 -0.017 0.047 -0.001 0.008 0.004 0.015 -0.013 -0.004 -0.006 0.006 0.020 -0.007 0.004
VRA2 0.067 -0.008 -0.027 0.017 -0.043 0.023 0.032 0.049 -0.039 0.005 0.024 -0.011 0.022 0.004 0.002
VED1 0.069 -0.003 -0.013 0.004 0.011 0.005 -0.004 0.021 -0.009 -0.002 0.007 -0.001 0.011 -0.009 -0.004
VED2 -0.068 -0.013 0.008 0.041 -0.001 -0.019 0.000 -0.033 0.030 -0.001 -0.017 0.016 0.013 -0.001 0.011
VRD1 0.068 -0.011 -0.017 0.044 0.000 0.006 0.003 0.013 -0.013 -0.003 -0.007 0.007 0.018 -0.008 0.003
VRD2 0.067 0.000 -0.026 0.010 -0.043 0.020 0.028 0.045 -0.044 0.006 0.022 -0.013 0.014 0.004 0.002
VEZ1 0.070 -0.005 -0.009 0.001 0.008 0.000 -0.007 0.024 -0.006 -0.008 0.002 0.002 0.015 -0.001 -0.006
VEZ2 -0.068 -0.015 0.012 0.039 -0.006 -0.027 -0.004 -0.029 0.035 -0.010 -0.026 0.017 0.021 0.011 0.009
VRZ1 0.068 -0.011 -0.018 0.044 0.001 0.007 0.002 0.013 -0.014 -0.001 -0.005 0.007 0.018 -0.009 0.003
VRZ2 0.067 0.001 -0.029 0.010 -0.041 0.025 0.027 0.043 -0.048 0.012 0.030 -0.012 0.012 -0.003 0.002
VEm1 0.070 -0.005 -0.009 0.001 0.008 0.000 -0.008 0.024 -0.006 -0.008 0.002 0.002 0.016 -0.001 -0.006
VEm2 -0.068 -0.015 0.013 0.039 -0.006 -0.027 -0.004 -0.029 0.035 -0.010 -0.026 0.017 0.021 0.011 0.010
VRm1 0.068 -0.011 -0.018 0.044 0.001 0.007 0.002 0.013 -0.014 -0.001 -0.005 0.007 0.018 -0.009 0.003
VRm2 0.067 0.001 -0.030 0.010 -0.040 0.025 0.027 0.043 -0.048 0.013 0.030 -0.012 0.012 -0.003 0.002
VEv1 0.069 -0.002 -0.012 0.000 0.011 0.005 0.001 0.023 -0.007 -0.003 0.008 -0.007 0.019 -0.011 0.002
VEv2 -0.068 -0.012 0.009 0.038 -0.002 -0.018 0.004 -0.033 0.029 -0.002 -0.016 0.013 0.018 -0.004 0.013
VRv1 0.068 -0.011 -0.017 0.045 0.000 0.006 0.002 0.013 -0.013 -0.002 -0.007 0.008 0.017 -0.008 0.002
VRv2 0.067 -0.001 -0.027 0.011 -0.042 0.019 0.026 0.044 -0.044 0.008 0.022 -0.011 0.010 0.004 -0.001
VEe1 0.069 -0.003 -0.013 0.003 0.011 0.004 -0.001 0.021 -0.008 -0.004 0.005 0.000 0.010 -0.004 -0.004
VEe2 -0.068 -0.013 0.008 0.041 -0.002 -0.021 0.003 -0.033 0.032 -0.003 -0.018 0.016 0.012 0.005 0.012
VRe1 0.068 -0.011 -0.017 0.044 0.000 0.006 0.002 0.013 -0.014 -0.002 -0.006 0.007 0.019 -0.009 0.003
VRe2 0.067 0.000 -0.026 0.010 -0.043 0.021 0.026 0.044 -0.045 0.008 0.024 -0.012 0.016 0.001 0.001
VEp1 0.069 -0.002 -0.012 -0.001 0.010 0.004 0.001 0.023 -0.007 -0.003 0.008 -0.008 0.020 -0.011 0.002
VEp2 -0.068 -0.012 0.009 0.037 -0.003 -0.019 0.004 -0.032 0.029 -0.003 -0.018 0.012 0.021 -0.002 0.015
VRp1 0.068 -0.011 -0.017 0.045 0.001 0.006 0.002 0.013 -0.014 -0.002 -0.007 0.008 0.016 -0.008 0.002
VRp2 0.067 -0.001 -0.027 0.011 -0.041 0.020 0.026 0.044 -0.044 0.010 0.024 -0.010 0.009 0.004 -0.001

Table B.3: 2D PCA factor matrix
Solvent F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15
Acetone -5.354 5.241 2.383 -1.530 -5.179 -0.663 -1.398 -1.083 0.365 -1.767 0.135 -1.801 -1.449 -1.370 -1.019
Acetonitrile -14.975 4.300 4.375 1.291 -4.810 5.441 -1.343 0.084 4.023 -0.564 -0.118 1.642 0.297 0.985 0.140
Benzyl Alcohol 22.921 -3.603 2.826 7.480 2.902 2.516 -3.438 -1.138 -2.602 -1.730 0.210 1.468 -0.694 0.152 -0.209
Carbon Tetrachloride -1.022 15.459 -12.358 -1.313 2.572 -1.857 -3.055 1.635 -0.365 -0.784 -0.721 0.369 0.519 0.291 -0.146
Cyclohexane 9.083 -10.587 -8.743 -1.040 -6.235 -1.391 2.206 1.703 -0.372 -1.151 2.343 1.012 0.421 0.040 -0.141
Ethanol -16.434 -2.713 3.268 -1.235 -1.287 1.218 0.725 -1.154 -3.389 -0.349 -1.684 0.484 2.310 -0.678 -0.475
Ethyl Acetate 8.869 2.445 6.579 -1.443 5.117 0.694 -0.464 1.689 1.638 -1.145 2.537 -1.634 1.450 -0.548 0.447
Ethylene Glycol -8.069 -2.052 3.874 -2.484 0.818 1.006 0.898 5.698 -1.570 0.154 -1.264 0.734 -1.307 -0.538 0.796
Hexane 8.723 -11.232 -3.772 -7.768 4.433 2.341 0.560 -1.775 2.051 -1.413 -1.906 -0.707 -0.387 0.492 -0.158
Isopropanol -5.287 1.845 2.259 -2.928 -3.191 -2.414 -1.156 -2.463 -2.803 -0.199 0.542 -1.402 -0.354 1.331 1.447
Methanol -28.703 -9.061 -0.939 7.175 3.268 -6.569 -0.820 -0.147 1.894 -0.214 -0.168 -0.275 -0.143 0.055 0.012
Methylene Dichloride -18.282 2.691 -6.093 1.613 3.698 6.106 3.071 -2.132 -1.061 1.853 1.705 -0.114 -0.783 -0.464 0.105
Propylene Glycol 2.151 1.102 5.408 -2.650 1.458 -1.907 0.584 1.351 -0.954 2.405 0.813 -0.100 -0.263 1.454 -1.641
Sulfolane 18.294 7.869 1.121 5.876 -0.835 -1.598 6.776 -0.256 0.781 -0.672 -1.564 -0.661 0.083 0.286 0.149
t-Amyl Alcohol 11.388 3.182 2.942 -4.438 0.637 -4.712 -0.045 -2.711 1.411 1.861 0.227 2.734 -0.185 -1.053 0.421
Toluene 16.696 -4.885 -3.130 3.395 -3.366 1.789 -3.101 0.701 0.952 3.714 -1.086 -1.749 0.485 -0.436 0.270

Table B.4 below shows the QSAR models developed using 2D descriptors and containing all 14 factors. Because of the way PCA factors are calculated, F1 contains the most information from the descriptor matrix and F14 contains the least. In the Q2 analysis, where fewer factors are included, F14 is removed first, then F13, and so on.
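The factor-removal step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes NumPy, computes the factors once from the full standardized descriptor matrix, includes an intercept in the regression, and takes Q2 as one minus the leave-one-out PRESS divided by the total sum of squares about the overall mean. The function names `pca_scores` and `q2_loo` are hypothetical, and the data are random stand-ins, not the descriptor or aspect ratio data of this thesis.

```python
import numpy as np


def pca_scores(X):
    """Standardize the columns of X, then return (factor scores, loadings)."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0          # constant descriptors stay at zero
    Z = Xc / std
    # SVD orders the factors by decreasing explained variance (F1 first).
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U * s, Vt.T


def q2_loo(F, y, k):
    """Leave-one-out Q2 of a linear model built on the first k factors."""
    n = len(y)
    A = np.column_stack([np.ones(n), F[:, :k]])   # intercept + F1..Fk
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                  # hold out solvent i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in data (NOT the thesis data): 16 "solvents", 40 "descriptors".
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 40))
F, P = pca_scores(X)
y = 0.5 * F[:, 0] - 0.3 * F[:, 1] + rng.normal(scale=0.1, size=16)

# Start from 14 factors, then drop F14, F13, ... and track Q2.
for k in range(14, 0, -1):
    print(f"factors F1..F{k}: Q2 = {q2_loo(F, y, k):.3f}")
```

Because the factor scores are mutually orthogonal, removing a trailing factor leaves the fitted coefficients of the remaining factors unchanged on the full training set, which is one reason truncating from F14 downward is a natural order.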
Table B.4: Equations developed with PCA method regression
Descriptor Class / Equation to calculate normalized AR
2D
3D GAFF
2D & 3D GAFF
3D Ghemical
2D & 3D Ghemical
3D MMFF94s
2D & 3D MMFF94s
[The equation for each descriptor class consists of 14 parenthesized terms; the coefficients are not recoverable from the source text.]

This procedure is repeated for each of the other sets of descriptors. The matrices grow as more descriptors are added, but the general procedure remains the same. The same procedure is also repeated when the expansion solvents are added to the training set; again, the additional solvents increase the size of the matrices, but the process does not change. Table B.5 shows the 35 expansion solvents and their two-dimensional structures. As in Table 4.4, the order within the table is the order in which they were added to the training set.
Table B.5: Expansion solvents and their two-dimensional structures
[Structure drawings are not reproduced; solvent names only.]
butanone
diacetone alcohol
phenethyl alcohol
2-phenylpropanol
cycloheptane
cyclopentane
propyl acetate
methyl acetate
1,3-butanediol
2,3-butanediol
3-methylpentane
2-methylpentane
propanol
t-butanol
methanediol
1,2-dichloroethane
1,1,1-trichloroethane
glycerol
1,3-propanediol
1-pentanol
2-pentanol
ethylbenzene
cumene
2,2-dimethyl-1-butanol
1-hexanol
2-methyl-1-pentanol
n-heptane
n-octane
n-nonane
n-decane
1-ethoxyethyl acetate
2-methoxyethyl acetate
1-heptanol
1-octanol
1-nonanol