Statistical Human Body Form Classification: Methodology Development and Application

by

Frederick Steven Cottle

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
May 6, 2012

Keywords: body form, cluster analysis, somatotypes, statistical learning

Copyright 2012 by Frederick Steven Cottle

Approved by

Lenda Jo Connell, Co-chair, Professor of Consumer Affairs
Pamela V. Ulrich, Co-chair, Professor of Consumer Affairs
Karla P. Simmons, Assistant Professor of Consumer Affairs

Abstract

The focus of this exploratory study was statistical human body form classification. Prior studies have explored human body size and shape, but few have explored human body form. The actual human body is a three-dimensional (3D) object, and form is the construct that best represents the body. Body scanning can portray the 3D human form as a digital point cloud containing in excess of one million data points. This study was intended to develop a statistical human body form classification methodology and to apply that methodology to the body scan data of a sample of 117 male subjects. Four research questions guided the study. They were (1) Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data?, (2) What are the statistical characteristics of each cluster?, (3) What are the visual characteristics of each cluster?, and (4) Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated? The study structure consisted of a pretest (to test the statistical methodology), a clustering of male body form exercise (to answer research questions one and two), and an expert recognition of clusters (to answer research questions three and four).
To answer research questions one, two, and three, the methodology established in the pretest was applied to the 3D body scan data of a sample of 117 male subjects. An unsupervised hierarchical classification was performed, revealing seven defined clusters and answering research question one. Statistical characteristics such as the number of subjects included, average age, average height, average weight, and average BMI were reported for each cluster, answering research question two. Front and side view images generated by 3D body scanning were obtained for the two most extreme subject members and the median subject member in each identified cluster. These 21 images were used by a panel of experts to generate written visual characteristics for each cluster, thus answering research question three. The panel of experts used the answers to research questions one, two, and three to aid in their task of answering research question four. The panel did recognize the clusters generated, with two exceptions concerning clusters with fewer than 5 members that could possibly be merged into adjoining clusters.

The overall result of this exploratory study was that the methodology was successful at generating meaningful body form clusters utilizing 3D body scan data. This study is most significant because it provides a foundational work to reduce the processing time of body form classification studies using large amounts of data. Other significant contributions include the quantitative generation of meaningful body form categories from the 3D body scan data of specific samples, the application of a statistical data reduction technique to raw 3D body scan data, and the opportunity to collaborate with fields like kinesiology, psychology, nutrition, and statistics. Future study includes expanding the methodology to different data sets and strengthening the current analysis methodology.

Acknowledgments

I am eternally grateful to God for opening the doors for this journey.
I am most grateful and indebted to my wife for her support in this effort. She is my soul mate, best friend, and lifelong partner. Without her encouragement, help, and ideas, none of this would have been possible. My two children and son-in-law played a vital support role in the completion of this process as well. My mother and step-father often gave supporting gestures. Without these people I would not be here.

Special thanks go to my committee members at Auburn University. Value cannot be placed on their leadership, guidance, and expertise. Deep appreciation also goes to the leadership, faculty, staff, graduate students, and undergraduate students in the Department of Consumer Affairs at Auburn University.

Thanks also go to the leadership, faculty, staff, and students of the Consumer Apparel and Retail Studies department at the University of North Carolina at Greensboro for allowing time to complete this dissertation. Special thanks go to the Statistics Lab at UNCG for the support received.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
Introduction
Background
Somatological Constructs
Size
Build
Shape
Form
Problem and Purpose Statements
Research Questions
Literature Review
Biological Shape Classification Methodology
Physical Anthropology
Human Body Measurement Methods
Analytical Methods in Somatology
Early Figure Typing
Cartesian Grid Analysis
Anthropometric Database Studies
Male Body Shape Analysis
Body Build and Posture Analysis
Dressmaking Focused Size and Shape Analysis
Recurrent Body Shape Analyses
Historical Somatology Summary
Body Form Analysis
Summary
Methodology
Methodological Framework
Stage One - Form Preprocessing
Stage Two - Form Transformation
Stage Three - Form Classification
Pretest
Purpose
Sample Description
Procedure
Results
Clustering of Male Body Form
Purpose
Sample description
Procedure
Unsupervised Classification
Expert Recognition of Clusters
Purpose
Procedure
Background
Expected Results
Data Analysis and Findings
Pretest Analysis and Findings
Pretest Sample
Pretest Methodology
Data reduction
Cluster analysis
Clustering of Male Body Form Analysis and Findings
Sample Description
Findings
Research question one
Research question two
Expert Recognition of Clusters Analysis and Findings
Purpose
Method
Findings
Research question three
Research question four
Conclusions, Discussion, & Recommendations
Summary
Research Question One
Research Question Two
Research Question Three
Research Question Four
Discussion
Historical Comparison
Significance
Recommendations
Limitations
Sample demographics
Degree of subjectivity
Software programming time
Future Study
Improve current methodology
Expand current analysis
Apply methodology
Collaborate
References
Appendix A
Appendix B
Appendix C

List of Tables

Table 1 - Methods to Measure the Human Body
Table 2 - Historical Somatology Studies and Measurement Constructs
Table 3 - Data Map of Spreadsheet File Containing Normalized Data
Table 4 - Anticipated Demographic Data of Sample
Table 5 - Pretest Demographic Data
Table 6 - Demographic Data of Sample
Table 7 - Average Value of Variables by Cluster
Table 8 - Body Measurement Difference by Cluster in Inches
Table 9 - Somatological Constructs Dimensional Description

List of Figures

Figure 1 - Thompson's Cartesian Grid Transformation (1917)
Figure 2 - Sheldon's Somatotypes (1940)
Figure 3 - Douty's Body Build and Posture Scales (1968)
Figure 4 - Female Figure Identification Technique (FFIT©) shape categories (2004)
Figure 5 - Body Shape Assessment Scale (BSAS©) (2006)
Figure 6 - Body scan images normalized for height
Figure 7 - Computational shape analysis framework
Figure 8 - Point cloud visual representation
Figure 9 - X, Y, Z digital data point array
Figure 10 - Dendrogram of hierarchical cluster analysis
Figure 11 - Statistical and visual representation of individual clusters
Figure 12 - Frequencies of pretest males' and females' weights
Figure 13 - Frequencies of pretest males' and females' heights
Figure 14 - Frequencies of pretest males' and females' BMIs
Figure 15 - Graphical representation of elliptical slices of the human form
Figure 16 - Dendrogram depicting pretest cluster analysis (height and weight)
Figure 17 - Body scan images of male subjects classified as female
Figure 18 - Dendrogram depicting pretest cluster analysis (form data)
Figure 19 - Dendrogram depicting pretest cluster analysis (height, weight, and form data)
Figure 20 - Frequencies of subjects age
Figure 21 - Frequencies of subjects weight
Figure 22 - Frequencies of subjects height
Figure 23 - Frequencies of subjects BMI
Figure 24 - Dendrogram depicting cluster analysis results
Figure 25 - Typical body scan image
Figure 26 - Cluster one body scan images
Figure 27 - Cluster two body scan images
Figure 28 - Cluster three body scan images
Figure 29 - Cluster four body scan images
Figure 30 - Cluster five body scan images
Figure 31 - Cluster six body scan images
Figure 32 - Cluster seven body scan images
Figure 33 - Expert visual analysis summary

INTRODUCTION

To classify is human (Costa & Cesar, 2001). We identify and classify almost every object we encounter, and we are experts at doing it. For example, when we observe an automobile, we first place it in the general category of automobile. Once that general classification is complete, the mind continues to process the object into more specific classifications based on various constructs (e.g., car, truck, or sport utility vehicle). This mental processing takes place virtually instantaneously for every object we encounter. However, it can be difficult for us to classify multiple objects in a timely manner.

The human body is an object we classify by observation. Some of the constructs our minds use in classification include size, build, posture, shape, and form. While these constructs may be related to one another and some are used interchangeably, they are each distinct. It is important to classify these body constructs in order to better understand them because they have been linked to issues like human health and the development of products that interact with the body (Bye, LaBat, & DeLong, 2006; Flegal & Graubard, 2009).

The human mind focuses best on classifying one body at a time. To classify larger numbers of human bodies, emerging technology can be used to analyze sufficient numbers of subjects to arrive at meaningful classifications.
Academic studies have used data collected by 3D body scanning technology to analyze appropriate numbers of subjects to classify size, build, shape, and posture (Connell, Ulrich, Brannon, Alexander, & Presley, 2006; Simmons, Istook, & Devarajan, 2004a). However, there is an absence of analysis and classification of the construct termed body form. Therefore, this study expanded the body of knowledge about body form by exploring a methodology to classify the human form utilizing digital data from 3D whole body scans.

Background

Somatology is a branch of anthropology that uses measurement and observation to study the variation and classification of the human body (Brannon, 1971; Webster's, 1993). A search of the literature related to somatology reveals that differences among the meanings of constructs like size, build, shape, and form are sometimes confused or not revealed when reviewed across studies. A brief discussion of these constructs is necessary to give consistency to the proposed study. The discussion includes how these constructs are defined, how they are measured, and how they have been used in associated studies.

Somatological Constructs

Three categories of body measurement techniques and tools for the constructs of size, build, shape, and form have been identified. Tape measures, calipers, and anthropometers yield linear measurements. The multiple probe method (photography and somatography) and the body form method (draping and 3D body scanning) provide more complex views of the body (Bye et al., 2006). Researchers have used these measurement methods to investigate a single body construct or a combination of constructs in order to perform body classification. Different body measurement methods may be used for different body constructs. Since differences can be nuanced, research into a body construct must clarify the literature and carefully specify operational definitions.
Size

The size of an object is defined as its physical magnitude, extent, or bulk; the relative or proportionate dimensions of an object (Webster's, 1993). Examples of size are length, height, weight, and girth. Body size is operationalized in this study as the 1D measurement of component body parts. This construct has been used in research to build databases used in the design of products that interact with the human body. Studies include research related to the fit of apparel. Examples are the O'Brien and Shelton study (1941) of women's measurements, the Commercial Standard: Voluntary Product Standard CS 215-58 (U.S. Department of Commerce, 1958), and the U.S. Air Force anthropometry survey (Churchill & McConville, 1976). Size tends to consist of 1D data and can be measured by the linear, multiple probe, and body form methods of data collection (Bye et al., 2006).

Build

Build is defined as the mode of structure: the bodily conformation of a person (Webster's, 1993). Examples of describing body build with linear data include waist-to-hip ratio, thigh-to-calf ratio, and shoulder, waist, and hip proportions. Body build is operationalized in this study as the relationship between the linear size measurements of component body parts expressed as ratios or proportions. Graphic somatometry is a multiple-probe technique using photography to project shadows of the human body on a 2D grid (Douty, 1968). Douty (1968) and Brannon (1971) used face forward and side silhouette photographs to study female and male body build and posture. Their studies advanced somatographic measurement methods and focused on the constructs of body build and posture by visually classifying body types.

Shape

Shape is defined as the visible makeup characteristic of a particular item (Webster's, 1993). The operational definition of shape in this study is the external surface or outline that an object is perceived as occupying. Examples of shape include triangle, rectangle, pear, apple, and hourglass.
Within the creative sciences (art and design), shape is considered to be a 2D construct (Fiore, 2010). Shape can be studied using the multiple probe method or the body form method. It has been the focus of much recent somatological study (Connell et al., 2006; Simmons et al., 2004a). Using photography as the tool, Sheldon (1940) developed three human male shape categories: endomorph (soft and round), mesomorph (hard and muscular), and ectomorph (linear and skinny). Although this study is considered groundbreaking in the field, these categories blurred body constructs, particularly shape and form.

The Female Figure Identification Technique (FFIT©) was one of the first studies to utilize 3D body point cloud data to classify human female body shapes in relation to the fit of apparel (Simmons et al., 2004b). The development of the Body Shape Assessment Scale (BSAS©) was based on quantitative landmark coordinate data derived from body scanning. A panel of experts utilized 2D images to qualitatively categorize female bodies into nine classifications of whole and component body shapes and builds (Connell et al., 2006).

Form

Although size, build, and shape are vital constructs in the fit of apparel, body form could be considered the quintessential construct to measure, as it best represents all attributes of the actual human body. Form is defined as the structure of something: a human body as distinguished by external appearance (Webster's, 1993). Body form is operationally defined in this study as the volume that an object occupies in 3D space. Form is considered to be a 3D construct in the creative sciences (art and design) (Fiore, 2010). It is the general construct that the human mind utilizes in classification exercises. One approach to describing body form is suggested by the 3D point cloud data generated in body scans. Analyzing body form can be considered one of the highest order applications for body scanning, as the digital data is maintained in 3D format.
Recent studies have performed quantitative analysis of body form but have stopped short of classifying the sample data (Azouz, Rioux, Shu, & Lepage, 2006). Because accurate data has been generated from the available technology, the next logical step in the progression of the field of somatology is the development of a methodology to classify body form.

Problem and Purpose Statements

Limited research has focused on human body form, with the primary studies focusing on shape analysis for the female body. No studies have searched for a way to perform human body form classification using 3D body scan point cloud data. Therefore, the purpose of this study was to explore a methodology for male body form classification using 3D data, statistical algorithms, and analysis by experts in the field of somatology.

This research is important because it expanded current knowledge in the field of somatology. Exploring body form classification methodology is important because form represents most accurately the actual space the human body occupies. This study was a step toward identifying human body form somatotypes that can be used in future research studies. Successful development of the methodology has the potential to generate body form somatotypes that can serve as a foundation for improving the fit of apparel targeting specific sample demographics.

In product development, the design and fit of apparel items is accomplished on a human model or a dress form. A better understanding of the human form by the apparel industry could lead to improved sizing strategies, reduced inventory costs, and a decrease in unsold items (Azouz, Rioux, & Lepage, 2002). It could also allow businesses to longitudinally monitor target demographics for body form variability.

This study contributes to academic scholarship by providing a means for quantitatively generating scales of body forms for specific samples.
This creates the potential for relating a body form model to data analysis of other constructs such as body image and purchase intention. The methodology will generate research techniques concerning overall body form and component body form analysis and classification. As the worldwide database of body scans grows, a new methodology provides the possibility of progressing toward universal body form categories or somatotypes for males and females.

In addition to a focus on the body in relation to apparel, this study has the potential to provide researchers with a tool to investigate possible links of human lifestyle to physical health by exploring the effects of physical activity interventions on body form variability. Thus, in several potential ways, the findings of this research support interdisciplinary collaboration across academic fields such as kinesiology, medicine, business, and psychology.

Research Questions

This study addressed the problem and satisfied the purpose by exploring the following research questions:

1. Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data?
2. What are the statistical characteristics of each cluster?
3. What are the visual characteristics of each cluster?
4. Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated?

LITERATURE REVIEW

The purpose of this study was to explore a methodology to analyze and classify human body form. A thorough understanding of scholarly research, including a review of scholarly works in the field of somatology, was necessary to develop and provide direction for this study. Because terminology for this discipline is not well defined, the review of literature also focused on bringing consistency and clarity to terms currently used in somatology, in order to establish a link between what has been done previously and the proposed body form classification.
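As a rough illustration of what research questions one and two ask for, the following sketch performs an unsupervised hierarchical (agglomerative) clustering and then reports per-cluster averages. It is a minimal sketch under stated assumptions, not the procedure used in this study: the average-linkage rule, the Euclidean distance, the function name agglomerate, and the five height and weight pairs are all invented for the example.

```python
from math import dist
from statistics import mean


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering (an assumed linkage rule).

    Starts with one cluster per subject and repeatedly merges the two
    clusters with the smallest mean pairwise Euclidean distance.
    Returns lists of subject indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean distance over every pair drawn from the two clusters.
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical subjects as (height in inches, weight in pounds).
# In practice the variables would be normalized first so that no single
# unit dominates the distance; that step is omitted here for brevity.
subjects = [(66.0, 140.0), (67.0, 145.0), (72.0, 210.0), (73.0, 215.0), (70.0, 180.0)]

groups = agglomerate(subjects, n_clusters=2)

# Per-cluster summary in the spirit of research question two.
summary = [
    {
        "members": len(g),
        "mean_height": mean(subjects[i][0] for i in g),
        "mean_weight": mean(subjects[i][1] for i in g),
    }
    for g in groups
]
```

With these invented values, the two shortest and lightest subjects end up in one group and the remaining three in the other. The number of clusters is fixed in advance here only to keep the sketch short; in a hierarchical analysis it is more commonly chosen by inspecting the dendrogram.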
Biological Shape Classification Methodology

Humans identify biological objects by visual analysis and categorization. Costa and Cesar (2001) developed a methodology to analyze and categorize biological objects. The classification approach they outlined focused on 2D shapes but can also be applied to 3D form. Costa and Cesar (2001) identify supervised and unsupervised classification as two methods of creating categories of similar biological objects. Supervised classification is the method of placing observed biological objects into categories that are known a priori (Costa & Cesar, 2001). Unsupervised classification, or cluster analysis, is the method of discovering categories in the data based on various observation techniques of a particular sample (Lele & Richtsmeier, 2001). A category identified by cluster analysis is a collection of objects that are similar to each other and dissimilar to objects identified as belonging to other categories (Godil, 2009). Cluster analysis has been used in studies related to facial recognition techniques involving principal component analysis (PCA) and k-means algorithms for clustering the data (Godil, 2009). Clustering can be considered more of an art than a science, and the results can vary depending on the statistical tools chosen for the analysis (Lele & Richtsmeier, 2001). This study used a panel of experts to visually recognize the variability of the results of the cluster analysis procedure, relating the variations to the fit of garments on the human body form.

Physical Anthropology

The appearance of ready-made apparel in the 19th and early 20th centuries precipitated study of the human body in the context of apparel fit (Brannon, 1971). Physical anthropology is the study of human classification and variation through measurement (Physical Anthropology, n.d.). Anthropometry and somatometry are both considered branches of physical anthropology.
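The distinction between supervised and unsupervised classification described under Biological Shape Classification Methodology can be illustrated with a small sketch. It is not taken from Costa and Cesar (2001) or from this study: the two-value feature vectors, the category names "slim" and "broad", and the choice of a nearest-mean rule and a basic k-means loop are all assumptions made for illustration.

```python
import math

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def supervised_classify(point, labeled_examples):
    """Supervised: categories are known a priori; assign to the nearest category mean."""
    names = sorted(labeled_examples)
    centers = [mean_point(labeled_examples[n]) for n in names]
    return names[nearest(point, centers)]

def kmeans(points, k, iterations=20):
    """Unsupervised: discover k groups; the first k points seed the centers."""
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    return labels

# Hypothetical 2D feature vectors (two made-up measurements per subject).
data = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
# Hypothetical a priori categories used only by the supervised rule.
known = {"slim": [(1.0, 1.0)], "broad": [(5.0, 5.0)]}

print(supervised_classify((1.1, 0.9), known))
print(kmeans(data, k=2))
```

The supervised rule needs the categories in advance, whereas the k-means loop receives only the data and the number of groups. This mirrors the difference between placing subjects into predefined shapes and letting categories emerge from the data.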
Anthropometry is the study of human body measurements on a comparative basis (Webster's, 1993). This area of study includes the collection of sizing data used to compare human bodies in relation to the fit of apparel. Sizing data are represented by linear measurements between specific points on the human body (landmarks) using graduated measuring devices like tape measures (Kidwell, 1979). Somatology is a broader term derived from the roots soma (body) and logia (the study of). It is defined as the comparative study of human variation and classification based on measurement and observation (Webster's, 1993). The term somatology, as it describes the broad field of evaluating and classifying dimensions of the human body, has the greatest application for this research study.

Human Body Measurement Methods

Methods to measure the human body can be categorized as (a) linear, (b) multiple probe, and (c) body form (Bye et al., 2006). The linear method of body measurement uses the distance between two points on the body, or landmarks, to quantify the size of body components like hips, thighs, waist, or neck. The identification of landmarks presents the potential for human error because of the need for agreement on the locations of the landmarks on the body (Lele & Richtsmeier, 2001). The multiple probe method of body measurement combines linear measurements with tools that evaluate and describe the relationships between body contours (Bye et al., 2006). By evaluating and describing multiple dimensions (linear measurement and contour) of the human body, researchers are able to study parameters more closely related to the fit of garments. The body form method of describing the human body relies on surface and volume evaluation rather than numerical descriptions (Bye et al., 2006). Body form methods include draping and 3D body scanning. A 3D body scanner is a device used to create a dimensionally accurate digital representation of the human form (Bye et al., 2006).
Table 1 is a list of measurement methods, the tools used for each method, and the resulting data formats used to describe the human body.

Table 1
Methods to Measure the Human Body
________________________________________________________________________
Method                      Point   Length   Surface   Shape   Volume
________________________________________________________________________
Linear
  Tape Measure                -       X         -        -       -
  Proportional Measures       -       X         -        -       -
  Anthropometer               -       X         -        -       -
  Calipers                    -       X         -        -       -
Multiple Probe
  Complex Anthropometer       X       X         -        X       -
  Photography                 X       X         -        X       -
  Somatography                X       X         -        X       -
  Minott Method               -       X         -        X       -
Body Form
  Draping                     -       -         X        X       -
  Body Scanning               X       X         X        X       X
________________________________________________________________________
Note. Methods and tools to measure the human body and the resulting data format. Adapted from "Analysis of Body Measurement Systems for Apparel" by Bye et al., 2006, Clothing and Textiles Research Journal, 24(2), pp. 66-79.

The data collected by the various methods are represented in 1D point (identification of landmarks), 1D length, 2D surface, 2D shape, and 3D volume formats (Bye et al., 2006). Table 1 shows that body scanning is a valuable tool in somatological and anthropometric research because it provides data related to each dimension of measurement. The multi-dimensional data that 3D body scanning provides constitute a 3D digital representation of the human form. Therefore, body scanning is applicable to the analysis of body form for the purpose of classification.

Analytical Methods in Somatology

Early Figure Typing

Using visual observation, Hippocrates recorded two distinct human body shapes in the 3rd century B.C. as thin/tall and short/thick (Croney, 1971). From this apparent beginning, the methods of describing and classifying the various attributes of the human body expanded.
Cartesian Grid Analysis

D'Arcy Thompson's work at the end of the 19th and beginning of the 20th century was considered groundbreaking in the biological shape classification field. His analysis technique was based on 2D shapes of biological objects drawn on grid patterns. Grids were deformed to morph one biological object into a related object and show how the objects were related (Thompson, 1917). Figure 1 is an example of Thompson's transformation grid technique showing relationships between biological objects. Pioneers in the field of shape classification also include Medwar and Sneath, who contributed to shape analysis by using mathematics and rescaling techniques to compare similar biological figures (Medwar, 1944; Sneath, 1967).

Figure 1. Thompson's Cartesian Grid Transformation (1917). Adapted from On Growth and Form by D. W. Thompson, 1917, UK, Cambridge University Press.

Anthropometric Database Studies

Early anthropometric studies that focused on developing or improving apparel sizing systems tended to consist of length measures collected using linear methodology. The O'Brien and Shelton (1941) study of women's measurements was one of the first to systematically collect linear body measurement data to be used for sizing apparel. Their sizing system continues to be the foundation for women's apparel sizing today. These data were updated and converted into the Commercial Standard: Voluntary Product Standard CS 215-58 (U.S. Department of Commerce, 1958). What followed were more efforts to maintain updated information on both males and females, as evidenced by the 1976 U.S. Air Force anthropometry survey (Churchill & McConville, 1976). Recent anthropometric studies (e.g., Size USA, Size Europe, Size Mexico, CAESAR, and Size Asia) continue to build body measurement data to describe the human body based on ethnicity as well as gender. These studies are currently limited to apparel sizing information.
With the limited information provided by length measures, it is not possible to develop the deep understanding of 3D body form that is necessary to address fit of apparel (Bye et al., 2006).

Male Body Shape Analysis

In 1940, a psychologist at Harvard University published research defining three different human body shapes based on visual analysis and classification of photographs of human male subjects (Sheldon, 1940). Sheldon developed three somatotypes: endomorph (short/fat), mesomorph (lean/muscular), and ectomorph (tall/thin), linking them to various psychological disorders. Examples of these somatotypes are shown in Figure 2.

Figure 2. Sheldon's somatotypes (1940). Sheldon's work is an example of early visual body type classification combining photography with Cartesian grid structure. Adapted from The Varieties of Human Physique by W. H. Sheldon, 1940, NY: Harper and Brothers.

Sheldon's technique included analyzing photographic images of human male subjects taken from the front, side, and back view and visually synthesizing the 2D images of shape into 3D mental representations of body form. A limitation of this technique is that gaps in information can occur in attempting to describe the human body by qualitatively combining two or more 1D size or 2D shape dimensions (Bye et al., 2006). Therefore, the need exists to investigate more accurate ways to represent the human body. As measurement technology developed, other studies in somatology followed Sheldon's, utilizing different measurements and methodologies.

Body Build and Posture Analysis

Graphic somatometry is an example of a technology-based technique using photography combined with a Cartesian grid structure to project shadows of the human body (Douty, 1968). Douty (1968) used this technology to study female body build and posture. Brannon (1971) expanded this research to the study of male body build and posture.
While these studies advanced the technology available for shape analysis, they focused on the body build and posture constructs, and they relied on human experts in the field to perform the classification tasks visually. Douty, Moore, and Hartford (1974) provided a different perspective on the analysis of the human body for fit of apparel. They studied the human body by investigating components such as body build, bust size, body tension, lower back curve, pelvic tilt, knee tension, upper back curve, head position, shoulder slope, global posture quality, and figure impression in female subjects (Douty et al., 1974). This work shares a limitation with Sheldon's (1940) work in that gaps in information occur when attempting to qualitatively synthesize two or more 2D shapes to represent 3D form. Douty's female build and posture scales are shown in Figure 3.

Figure 3. Douty's Body Build and Posture Scales (1968). Douty's body build and posture scales were derived through visual analysis of photographs projected onto a Cartesian grid structure for reference. Adapted from "Silhouette Photography for the Study of Visual Somatometry and Body Image," by H. I. Douty, 1968, paper presented at The National Textiles and Clothing Meeting, Minneapolis, Minnesota.

Dressmaking Focused Size and Shape Analysis

Minott categorized female human body shapes in components to aid in patternmaking for apparel (Minott, 1972, 1978). In the development of the Minott Method of fitting apparel patterns, she observed shoulder and hip size in relation to other body parts. Posture was also taken into account in order to adjust measurement data for more accurate patterns. August (1981) assessed female body shape in relation to dressmaking. She developed four categories of body type designated as A, X, V, and H. These were observed from a front view of the subject.
Side views were qualitatively evaluated as well, using lowercase designations like b, d, i, and r to indicate categories (August, 1981). The August method of categorizing body shapes was based on landmark identification and recognition by component (RBC). In her book Patternmaking for Fashion Designers, Armstrong (1987) described four female body shapes based on the shoulder/hip relationship. The categories used by Armstrong included hourglass, straight line, wide shoulders, and narrow shoulders. While these categories could be advantageous to patternmaking, they are limited to that application.

Recurrent Body Shape Analyses

Minott's (1972, 1978), August's (1981), and Armstrong's (1987) investigations were all based on their accumulated experience with female body shape and apparel. The Female Figure Identification Technique (FFIT) was published in 2004 and was the result of one of the first studies to utilize 3D body scanning data to classify human female body shapes in relation to fit of apparel (Simmons et al., 2004a). This study, focused on the female body shape, was an exercise in software development that was verified by experts in the field of somatology. Researchers studied 222 female body scans and performed supervised classification of the body shapes. This means that the categories of shape were known a priori, and the subjects were evaluated and placed into those categories. The predefined body shapes were triangle, inverted triangle, rectangle, hourglass, and oval. After the initial evaluation, these predefined shapes appeared to be insufficient to encompass all the subject shapes in the study, and researchers added shape categories termed spoon, diamond, bottom hourglass, and top hourglass. Examples of the triangle, rectangle, and hourglass FFIT shape categories are shown in Figure 4. The ultimate goal of FFIT for apparel was to define every shape using the fewest number of categories (Simmons et al., 2004).
Figure 4. Female Figure Identification Technique (FFIT) shape categories (2004a): triangle, rectangle, and hourglass. These images are examples of three of the six shape categories identified by the FFIT study in 2004. Adapted from "Female Figure Identification Technique (FFIT) for Apparel. Part I: Describing Female Shapes," by K. Simmons, C. Istook, & P. Devarajan, 2004a, Journal of Textile and Apparel, Technology and Management, 4(1), pp. 1-16.

Additional work in human body shape classification related to apparel included the development of the Body Shape Assessment Scale (BSAS). This scale development study was based on quantitative landmark coordinate data derived from 3D body scanning combined with expert knowledge to qualitatively classify female body shape based on nine classifications of whole body and component body shapes (Connell et al., 2006). The nine classifications established were body build, body shape, hip shape, shoulder slope, front torso shape, bust prominence, buttocks shape, back curvature, and posture. Figure 5 is a visual representation of a portion of the BSAS.

Figure 5. Body Shape Assessment Scale (BSAS) (2006). These images are shape categories of shoulder, waist, and hip relationships generated from 3D body scan data and visual expert evaluation. See Appendix I for examples of other component and whole body measures developed in the BSAS study. Adapted from "Body Shape Assessment Scale: Instrument Development for Analyzing Female Figures" by L. J. Connell, P. V. Ulrich, E. L. Brannon, M. Alexander, A. B. Presley, 2006, Clothing and Textiles Research Journal, 24(2), pp. 80-95.

Connell et al. (2006) analyzed and synthesized previous body shape classification studies to develop the BSAS shape categories. Two-dimensional images of whole and component bodies were printed from 3D body scan data files and visually placed in the synthesized categories by experts in the field.
Later, algorithms were developed in a software program to classify female bodies based on shape categories in the BSAS. Alexander (2003) contributed to the study of body shape by utilizing an early version of the BSAS to investigate the relationship of whole and component body shapes with other variables like body mass index (BMI), age, ethnicity, body build, and body posture. The FFIT (Simmons et al., 2004b), BSAS (Connell et al., 2006), and Alexander (2003) studies represent the most significant uses of the point cloud data produced by 3D body scanning systems to analyze human body shape related to the fit of apparel. However, 2D body shape analysis was the foundation used to place subjects into established shape categories.

Historical Somatology Summary

Table 2 is a historical representation of the human body construct classification systems and the categories of evaluation for each system. The list is not exhaustive, but it represents some seminal works in somatology. Each of the size, build, and shape related studies evaluated constructs other than form and is limited by the origin of the data. Body form is the most comprehensive descriptor of the human body. Body scanning technology has advanced and now allows researchers to examine body form in a more comprehensive manner.

Table 2
Historical Somatology Studies and Measurement Constructs
________________________________________________________________________
Study (Year)                  Categories Evaluated
________________________________________________________________________
Hippocrates (3rd Century)     thin/tall; short/thick
Sheldon (1940)                ectomorph; mesomorph; endomorph
Minott (1972)                 shoulder size; hip size; posture
Douty et al. (1974)           body build; bust size; body tension;
                              lower back curve; pelvic tilt; knee tension;
                              upper back curve; head position;
                              shoulder slope; global posture quality;
                              figure impression
August (1981)                 Front: A, X, V, H
                              Side: b, d, i, r
Armstrong (1987)              hourglass; straight line; wide shoulders;
                              narrow shoulders
Simmons et al. (2004a)        triangle; inverted triangle; rectangle;
                              hourglass; oval; spoon; diamond;
                              bottom hourglass; top hourglass
Connell et al. (2006)         body build; body shape; hip shape;
                              shoulder slope; front torso shape;
                              bust prominence; buttocks shape;
                              back curve; posture
________________________________________________________________________

Body Form Analysis

The FFIT (Simmons et al., 2004b) and BSAS (Connell et al., 2006) studies used algorithms to analyze 3D body scanning data to classify 2D body shape. Though initially these studies visually reviewed 2D computer generated images, ultimately they used algorithms based on 3D form to identify shapes. Each of these studies used supervised classification techniques.

Studies have emerged recently that are intended to analyze the 3D data generated by body scanning. Azouz et al. (2006) developed a methodology to identify contributors to the variation of body form by applying unsupervised classification or clustering techniques to a sample of 300 males. The results of this study revealed five modes of variation, including height and weight (33.9% of variation); posture (15.1% of variation); mass distribution and muscularity (8.9% of variation); space between arms and torso (4% of variation); and head position (3.6% of variation). The study also found that height was not related to form. After the data were normalized for height, weight was found to be the largest contributor to form (Azouz et al., 2006).
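As a quick arithmetic aid, the percentages reported above for the five modes of variation can be accumulated to see how much of the total variation they cover together. The figures are those quoted from Azouz et al. (2006); the running-total presentation is only an illustration added here.

```python
# Percent of body form variation per mode, as quoted from Azouz et al. (2006).
modes = {
    "height and weight": 33.9,
    "posture": 15.1,
    "mass distribution and muscularity": 8.9,
    "space between arms and torso": 4.0,
    "head position": 3.6,
}

running = 0.0
cumulative = {}
for name, pct in modes.items():
    running += pct
    cumulative[name] = round(running, 1)
    print(f"{name}: {pct}% (cumulative {cumulative[name]}%)")

# Share of variation not covered by the five listed modes.
unexplained = round(100.0 - running, 1)
print(f"not covered by these modes: {unexplained}%")
```

Taken together, the five listed modes account for roughly two thirds of the variation, so a substantial share lies outside them.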
Normalization is the process that ensures that each subject's scan data contains the same number of data points and has a consistent point of origin in 3D space (Costa & Cesar, 2001). Figure 6 shows the results achieved by Azouz et al. (2006), in which weight becomes the major contributor to body form variation after normalizing the subject data for height.

Figure 6. Body scan images normalized for height. Each subject appears to be the same height while variations in weight become more prevalent. Top images are morphed representations of actual images on bottom row. Adapted from "Characterizing Human Shape Variations using 3D Anthropometric Data," by Z. B. Azouz, M. Rioux, C. Shu, R. Lepage, 2006, Visual Computing, 22, pp. 302-314.

The researchers stated that this study was one step leading toward faster and more reliable characterization of the whole human body, but that results had not been verified by field experts (Azouz et al., 2006). The lack of expert verification is significant; my study used experts in the field of somatology to verify the clusters of body form generated by statistical analysis methodology in relation to the fit of apparel.

Studies have analyzed 3D body scan data in the search for major form contributors and variations, but they have stopped short of establishing form clusters for the sample data. Unsupervised data clustering has been suggested as showing potential for determining the "true" landscape of human shape variations (Allen, Curless, & Popovic, 2003; Azouz et al., 2006). My study is a step in the investigation of this human landscape in that it establishes unsupervised clustering methodology that could lead to the development and description of universal somatotypes.

Summary

Much of the analytical methodology in somatology utilizes the linear or multiple probe methods for data generation.
Some studies have utilized the body form method (namely 3D body scanning) in research but tend to convert the data into one of the other formats by extracting measurements based on landmark coordinate data, or by visually analyzing the 2D images generated. Gaps in information can occur when describing the body by attempting to combine two or more size or shape dimensions (Bye et al., 2006). Body scanning provides the data that best describe the human body form. My study used those data to develop a methodology that advances the field of somatology by applying statistical cluster analysis techniques to classify human body form, and it used field experts to verify the form categories.

METHODOLOGY

The purpose of this study was to develop a methodology to explore body form analysis and recognition using 3D digital data generated through body scanning. Concepts in shape analysis and classification adapted from the work of Costa and Cesar (2001) framed the study. This framework closely follows techniques involved in the emerging field of statistical learning. Statistical learning involves the use of computer based algorithms to discover patterns in data sets (McCue, 2007). It is applied in this study by allowing body scan data to reveal categories of human body form. The exploratory research design consisted of a pretest; unsupervised body form classification, or clustering, of male body scan data; and expert recognition of the categories of male body form resulting from the unsupervised clustering.

Methodological Framework

The methodological framework is adapted from Costa and Cesar's (2001) framework for computational shape analysis. In their book, Shape Analysis and Classification: Theory and Practice, they provided a schematic illustrating shape analysis using three classes of typical shape analysis tasks used in classification.
Costa and Cesar (2001) use the term shape in their work, but the same framework can be applied to the analysis of form, as operationalized in the current study. Costa and Cesar (2001) identified three main stages in shape analysis, adapted here as (1) form preprocessing, (2) form transformation, and (3) form classification. A pictorial outline of the framework is shown in Figure 7.

Figure 7. Computational shape analysis framework. Adapted from "Shape Analysis and Classification: Theory and Practice," by L. F. Costa, & R. M. Cesar, 2001, New York: CRC Press.

Stage One: Form Preprocessing

Acquisition, detection, noise filtering, and operations make up Stage One, or form preprocessing. Acquisition of accurate data is critical in this stage of processing, and 3D body scanning has been shown to be reliable in accomplishing this task (Simmons & Istook, 2003). The Textile Clothing Technology Corporation ([TC]2) NX-16 body scanner uses white light technology to obtain the raw point cloud data that represent the human subject's form ([TC]2, 2011). There can be up to one million data points contained in each subject's raw body scan data (see Figures 8 & 9).

Figure 8. Point cloud visual representation. This figure is a visual 2D representation of the 3D point cloud data generated by body scanning technology ([TC]2, 2011). While 3D digital data will be used to statistically investigate body form, 3D computer generated images will be used in this study to aid experts to visually recognize the form categories generated from the cluster analysis.

Figure 9. X, Y, Z digital data point array. This figure is a numerical representation of the point cloud data generated by 3D body scanning technology in X, Y, Z data point format.
The data array is further processed by the body scan software to filter unrelated data points (or noise), resulting in a data file for each subject consisting of an array of approximately 144,000 digital X, Y, Z data points suitable for processing in the next stage of the framework.

Stage Two: Form Transformation

Form transformation involves converting the preprocessed point cloud data into a normalized format and reducing the number of data points to a level manageable for processing on common university computing systems. The process of normalization results in each subject's data file having a common X, Y, Z spatial point of origin and containing an identical number of data points (Costa & Cesar, 2001). This normalization is accomplished by 3D body scanner software via conversion of each subject's data file to an avatar mesh. The main result of an avatar mesh conversion is to provide a lifelike visual representation of the subject. A secondary result is normalization of the data as described above. The avatar mesh selected from the [TC]2 software options was the optitexadam mesh.

An additional function of form transformation is to reduce the data volume for each subject to a manageable level. The optitexadam avatar mesh contains in excess of 32,000 data points. While this is a significant reduction from the approximately 144,000 points presented from the form preprocessing stage, the volume remains too large for efficient processing on common computing systems. During further analysis it was found that the data points in the optitexadam mesh are distributed evenly in layers from bottom to top throughout the array, thus forming ellipses at each layer ([TC]2, 2011). Techniques of principal component analysis (PCA) can be applied to reduce the number of data points while maintaining the descriptive integrity of the subject's body form. This PCA application results in a data file for each subject containing approximately 3,000 points.
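The two ideas in this stage, a common spatial origin and a PCA-based reduction, can be illustrated on synthetic data. The sketch below is not the study's procedure. In the study, normalization is performed by the [TC]2 avatar mesh conversion, and the text does not specify how PCA was applied to reach approximately 3,000 points. Here, subtracting each subject's centroid stands in for the common origin, and PCA is applied across subjects to compress flattened coordinate vectors, which reduces dimensionality rather than the point count. The array sizes, the random data, and the value of `k` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for scans: 6 subjects, 50 points each, X/Y/Z coordinates,
# with a random per-subject offset so the clouds start at different origins.
n_subjects, n_points = 6, 50
offsets = rng.normal(size=(n_subjects, 1, 3)) * 10
scans = rng.normal(size=(n_subjects, n_points, 3)) + offsets

# Common origin: subtract each subject's centroid so every cloud is centered at (0, 0, 0).
centered = scans - scans.mean(axis=1, keepdims=True)

# Flatten each subject to one row of length 3 * n_points.
X = centered.reshape(n_subjects, -1)

# PCA via singular value decomposition of the column-centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance per component

k = 3                             # components kept (arbitrary for illustration)
scores = Xc @ Vt[:k].T            # reduced representation, one row per subject

print(X.shape, "->", scores.shape)
print(np.round(explained[:k], 3))
```

Because every subject has the same number of points in the same order, the rows of `X` are directly comparable. This is the property that the avatar mesh conversion provides for the real scan data.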
The result of form transformation is a file for each subject that has a common spatial point of origin, an equal number of data points, and a volume of data points that is manageable in further processing.

Stage Three: Form Classification

Classification involves choosing from several measurement criteria with the understanding that each choice could lead to different classifications (Costa & Cesar, 2001). There are no specified rules indicating how to make the best choices. Some techniques used to classify form are similarity, matching, unsupervised classification (clustering), and supervised classification. Similarity and matching can be achieved by placing forms into classes through visual expert analysis. Supervised classification is done when the researcher has an established concept of the desired classes of form and forces the data into those desired classes. Unsupervised classification, or clustering, is a technique that considers the overall data and allows a mathematical algorithm to reveal patterns or clusters of similar form representations within the data. The subjects placed within each shape cluster are more similar to those within that cluster and more different from those placed outside that cluster (Lele & Richtsmeier, 2001). Unsupervised classification, or clustering, is the method utilized in this study to develop a hierarchical clustering of subjects along the male body form continuum. To proceed, a pretest of the method was completed to explore the potential for success and check for procedural issues.

Pretest

Purpose

The purpose of the pretest was to determine whether the body scan data could be acquired, transformed, and classified. The pretest measured the cluster analysis algorithm's ability to classify the data set into male and female body form categories. Male and female form can be considered two basic categories into which the human body form can be classified.
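The check implied by the pretest, whether clusters found without labels line up with the known male and female categories, can be sketched as a simple cross-tabulation. The cluster ids and sex codes below are hypothetical, and the `purity` summary is an illustrative measure added here, not one reported in the study.

```python
from collections import Counter

def cross_tabulate(cluster_labels, known_labels):
    """Count how many subjects of each known category fall in each cluster."""
    return Counter(zip(cluster_labels, known_labels))

def purity(cluster_labels, known_labels):
    """Fraction of subjects that belong to the majority category of their cluster."""
    table = cross_tabulate(cluster_labels, known_labels)
    majority = {}
    for (cluster, _), count in table.items():
        majority[cluster] = max(majority.get(cluster, 0), count)
    return sum(majority.values()) / len(cluster_labels)

# Hypothetical result for six subjects: cluster ids produced by the algorithm,
# and the sex recorded for each subject in a master code list.
clusters = [0, 0, 1, 1, 0, 1]
sexes = ["M", "M", "F", "F", "M", "F"]

print(cross_tabulate(clusters, sexes))
print(purity(clusters, sexes))
```

A purity of 1.0 means each cluster contains subjects of only one known category, which is the outcome the pretest was designed to detect.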
This procedure was an attempt to establish the algorithm's ability to perform this basic classification task.

Sample Description

A sample of 10 male and 10 female body scans was selected by taking every third qualifying data file contained in the Spring 2010 scan data collection of the Freshman 15 study at Auburn University (Connell, Ulrich, Simmons, & Gropper, 2010). A qualified data file in this case was defined as a data file (x.rbd) that was saved by the [TC]2 NX-16 body scanner software and was loadable into spreadsheet format. This sample was appropriate for the pretest because the objective was to normalize the data and analyze it based on a priori categories.

Procedure

The following steps were necessary to convert the raw data file (.rbd) provided by the initial body scan of each subject into an avatar mesh format.

1. Load the file (x.rbd) from the existing scans into [TC]2 NX-16 body scanner software.
2. Select "create avatar" from the file options.
3. Select the appropriate mesh (optitexadam).
4. Save the file in spreadsheet format.

The conversion to the avatar mesh is the step that normalizes the unique X, Y, Z data point cloud contained in each file. Normalization ensures that each subject's body scan file has the same point of spatial origin and contains the same number of data points. Without this normalization step, it would be statistically impossible to compare one file to another. The avatar mesh selected for this purpose is the optitexadam mesh, which is based on a male format and includes whole body scan data. Both male and female body scans were normalized into the same mesh because the main purpose of conversion was to enable the normalization process and not to view the avatar. The files were selected and converted via the batch process function in the [TC]2 software in order to process the data as efficiently as possible. The resulting spreadsheet file contains approximately 138,941 lines of data representing the whole body.
The data map for each file is shown in Table 3. Based on a conversation with Dr. David Bruner, VP of Technology Development of [TC]2 (personal communication, October 12, 2010), the X, Y, Z data points represented by lines 9 to 21,471 of the avatar mesh represent the body form of each subject. Other lines of data represent parts of the subject's body like the head and facial features. Since this study is concerned with human body form, the pretest focused on the segment of the data representing the subject's body and not on facial features and texture data. The relevant data represent a digital point cloud of 21,463 X, Y, Z data points for each of the subjects that were used for analysis in the pretest.

Table 3
Data Map of Spreadsheet File Containing Normalized Data
________________________________________________________________________
Spreadsheet Line       Description           Prefix   # of Data Points
________________________________________________________________________
9 - 21,471             gadambody vertices    V        3
21,474 - 44,489        texture vertices      Vt       2
44,492 - 67,939        vertex normals        Vn       3
67,946 - 89,381        Faces                 F        4
89,386 - 95,212        Vertices              V        3
95,215 - 100,693       texture vertices      Vt       2
100,696 - 107,123      vertex normals        Vn       3
107,130 - 112,875      Faces                 F        4
112,880 - 113,050      Vertices              V        3
113,053 - 113,223      texture vertices      Vt       2
113,226 - 113,412      vertex normals        Vn       3
113,419 - 113,572      Faces                 F        4
113,577 - 113,747      Vertices              V        3
113,750 - 113,920      texture vertices      Vt       2
113,923 - 114,108      vertex normals        Vn       3
114,115 - 114,268      Faces                 F        4
114,273 - 119,322      Vertices              V        3
119,325 - 124,375      texture vertices      Vt       2
124,378 - 134,051      vertex normals        Vn       3
134,058 - 138,941      Faces                 F        4
________________________________________________________________________

The 21,463 points for each subject were batched together, imported into a statistical software package, and analyzed by coding developed for this cluster analysis exercise.
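Selecting the body vertex rows identified in Table 3 can be sketched as a line-range filter. This is a minimal illustration under stated assumptions: the function name `read_body_vertices` is hypothetical, each row is assumed to be whitespace separated with the prefix first (the actual spreadsheet layout exported by the [TC]2 software may differ), and the sample rows are made up rather than real scan data.

```python
def read_body_vertices(lines, first_line=9, last_line=21471):
    """Return (x, y, z) tuples from the 1-indexed, inclusive line range."""
    vertices = []
    for number, text in enumerate(lines, start=1):
        if number < first_line:
            continue
        if number > last_line:
            break
        fields = text.split()
        # Keep only geometric vertex rows: prefix "v" followed by three numbers.
        if len(fields) == 4 and fields[0].lower() == "v":
            vertices.append(tuple(float(f) for f in fields[1:]))
    return vertices

# Tiny synthetic stand-in for an exported mesh file (not real scan data).
sample = ["# header"] * 2 + ["v 0.0 1.0 2.0", "v 3.0 4.0 5.0", "vt 0.5 0.5", "vn 0 0 1"]

print(read_body_vertices(sample, first_line=3, last_line=4))
print(21471 - 9 + 1)  # number of rows in the body vertex range of Table 3
```

The default range matches the first row of Table 3, and the row count it implies agrees with the 21,463 points stated in the text.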
The purpose of the trial was to establish grounds for making the assumptions necessary to proceed with the cluster analysis of the full male-only data set. The assumptions satisfied were:

1. Raw 3D body scan data from the NX-16 3D body scanner can be normalized for statistical human body form analysis. This was accomplished by the conversion of the raw data files (.rbd) to the avatar mesh files (.obj).
2. It is possible to identify usable data points within the normalized data in order to perform statistical human body form analysis. This was accomplished by working with [TC]2 to determine which lines of data are applicable to the process.
3. The normalized data points identified for analysis can be easily imported and processed in typical statistical software processing tools. Testing this assumption was accomplished by batching the data into a statistical software package.

Results

The ability of statistical algorithms to distinguish between the male and female form purely from numerical point cloud data was the measure evaluated. Two distinct form categories emerged from the application of the unsupervised clustering methodology, but the question remained as to whether the actual male and female data were separated into the two clusters. A manual comparison of the content of the two clusters based on the master code number for each subject revealed that the clusters were in fact separated along gender lines.

Clustering of Male Body Form

Purpose

The purpose of this procedure was to apply the unsupervised hierarchical clustering methodology developed in the pretest to a database of male body scan forms. The procedure followed the stages of form classification defined in the methodological framework, including all the steps established and tested in the pretest.

Sample Description

The sample used in this stage of the research was taken from previously acquired 3D body scan data generated from the NX-16 body scanner at Auburn University.
Male body scan data were used from the Men's Mentoring Study (Simmons, Chattaraman, & Ulrich, 2009). This study focused exclusively on male subjects aged 18 or older and was determined to be demographically diverse enough to provide a good range of body form variability. Approximately 157 subject data sets were available from this database. General combined demographic data are shown in Table 4.

Table 4
Anticipated Demographic Data of Sample (N = 157)
________________________________________________________________________
                           Range
                    ____________________
                    Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)         18         66         21.22      6.44
Weight (pounds)     120        332        171.57     32.39
Height (inches)     61         78         70.28      2.70
BMI                 18         52         24.45      4.28
________________________________________________________________________

Procedure

The procedure for the unsupervised clustering of male body form followed the guidelines established in the pretest with the exception of the sample data set. This part of the study focused on answering the following research questions.

1. Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data?
2. What are the statistical characteristics of each cluster?

Clustering methodology does not necessarily produce an absolute cluster structure. In fact, cluster analysis can produce clusters at different levels of complexity. The results can range from all subjects in the data set being placed into one large cluster to each subject's data being placed into a cluster of its own.

Unsupervised Classification

The result of the unsupervised hierarchical clustering performed in this research study was the identification of several unique categories of male body form. These categories were allowed to emerge based on unique statistical characteristics associated with the data array.
Figure 10 is a dendrogram that visually represents the conceptual results of the hierarchical cluster analysis executed and used to answer Research Question one.

Figure 10. Dendrogram of hierarchical cluster analysis. This figure represents the results of the application of the cluster analysis algorithm to the 3D body scan data. Various numbers of clusters are revealed at different levels of clustering complexity (Lele & Richtsmeier, 2001).
[Figure 10 labels: Single Cluster of Human Male Body Forms; Human Male Body Form Clusters; Individual Subject 3-D Scan Data Arrays; axis: Levels of Hierarchical Clustering]

The dendrogram illustrates how clusters begin with each subject's data (represented by the dark dots) being placed in its own individual cluster. As the cluster analysis proceeds, individual subjects' data are then grouped based on similarities in 3D body form digital data arrays. The highest level of complexity results in all the dots being linked together into one cluster representing the overall male body form. Representative statistics for each identified cluster were evaluated to answer Research Question two. Once statistical analysis was completed, the study proceeded to visual analysis of each cluster by a panel of experts.

Expert Recognition of Clusters

Purpose

The purpose of this step in the research was to answer the following research questions:

3. What are the visual characteristics of each cluster?
4. Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated?

The overall purpose of this study was to develop a methodology for use with 3D body scan data to reveal male body form categories that can be applied to the assessment of bodies for the improvement of the sizing and fit of apparel. Therefore, the clusters generated by the cluster analysis were validated by experts in the field of somatology.

Procedure

Visual identification of clusters addressed Research Question three.
Three subjects were selected within each cluster to represent the mean body form and the forms two standard deviations above and below the mean. Under a normal distribution, the interval spanned by these three subjects covers approximately 95.44% of the subjects within each cluster; they were therefore considered appropriate for the visual analysis procedure. Images representative of the three identified subjects within each cluster were made available to the panel of experts. These three images were evaluated via scanning software that generated 3D images taken from the original scan data file of each subject. The visual data for each cluster that were presented to the panel of experts for analysis resembled the representation shown in Figure 11.

Figure 11. Statistical and visual representation of individual clusters. This figure represents the statistical and visual results of the application of the unsupervised clustering algorithm. This is the format that was presented to the panel of experts for recognition.
[Figure 11 labels: Individual Cluster Distribution; Cluster Mean Image; -2 and +2 (standard deviations from the cluster mean)]

The panel of experts had access to the 3D Body Scan Laboratory at Auburn University to view the 3D computer-generated images of each subject in the sample. These images were rotated and turned to perform a 360° visual analysis of each subject's body. The panelists viewed three images within each cluster to visually verify that subjects placed within the cluster have similar body forms to each other. Additional visual evaluation of subjects in different clusters was completed to evaluate body form differences between clusters. Experts drew on their explicit and tacit knowledge of apparel and pattern fit to visually examine the images both in gestalt and in component parts (hip, buttock, shoulders, stomach, and waist prominence). Visual comparisons within and between clusters served to answer Research Question four.
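The coverage figure quoted in the Procedure for the span of two standard deviations on either side of the mean can be checked numerically. The sketch below assumes that the quantity used to order subjects within a cluster is normally distributed, which the text does not state explicitly; `normal_coverage` is a hypothetical helper name.

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


coverage = normal_coverage(2.0)
print(f"Share of subjects expected within +/-2 SD: {coverage:.2%}")
```

The result agrees with the quoted value to within rounding of standard normal tables. For small clusters the normality assumption is difficult to verify, so the figure is best read as a nominal coverage rather than an observed one.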
Background

Published studies of human shape classification have utilized expert knowledge to produce and verify results (Connell et al., 2006; Simmons et al., 2004a). The FFIT© for Apparel study utilized expert knowledge of apparel and pattern fit to determine the fewest and most elemental landmarks on the body on which to base the shape categories (Simmons et al., 2004a). The knowledge of experts was also used in the development of the BSAS©. In the case of the BSAS©, a panel of experts evaluated and synthesized existing female body shape scales, and developed revised and new whole and component female shape categories by visually evaluating 2D images of body scans (Connell et al., 2006). The panel of experts used in this study consisted of three of the researchers involved in the FFIT© and BSAS© studies.

Expected Results

It was anticipated that the panel of experts would visually recognize the clusters of male body form generated by the unsupervised cluster analysis exercise.

DATA ANALYSIS AND FINDINGS

The primary focus of this study was to develop a methodology to explore body form analysis using 3D digital data generated through body scanning. The exploratory research design consisted of a pretest, unsupervised clustering of male body scan data, and expert recognition of the clusters of male body form. Because the three phases of the study have distinct purposes that are designed to build on each other, the data analysis and findings are presented separately for each phase. The findings of each phase are viewed holistically to draw final conclusions and are reported in the Conclusions, Discussion, and Recommendations chapter.

Pretest Analysis and Findings

Pretest Sample

A sample of 10 male and 10 female body scans was selected by taking every third qualifying data file contained in the Spring 2010 scan data collection of the Freshman 15 study at Auburn University (Connell et al., 2010).
Qualified data files in the pretest were defined as files with .rbd file extensions, because they were convertible into an avatar mesh and loadable into a spreadsheet format. This sample was appropriate for the pretest because the objective was to normalize the data, reduce the quantity of 3D body scan data points to a manageable level, and segregate the data based on a priori classifications (male and female). Table 5 shows descriptive statistics associated with the pretest subjects. Age is not shown because it was concentrated in the 20-21 year old category; these subjects were all entering college freshmen in 2007 or 2008. It is worth noting that the ranges in the weight, height, and BMI categories were narrower than anticipated for the primary study demographics, as the main study includes males in a broader age range. The pretest data included both male and female subjects because the initial step in the methodology was to determine whether the 3D body scan data could be manipulated and evaluated to distinguish between two distinct body forms, male and female.

Table 5
Pretest Demographic Data (N = 20)
________________________________________________________________________
                           Range
                    ____________________
                    Minimum    Maximum    Mean       SD
________________________________________________________________________
Weight (pounds)     92         193        147.45     28.89
Height (inches)     62         73         67.20      3.90
BMI                 16         28         23.00      2.94
________________________________________________________________________

The frequency distributions of pretest subjects' weights, heights, and BMIs were plotted in histograms (see Figures 12, 13, and 14 respectively) to describe the sample and to see whether separation along gender lines was observable.

Figure 12. Frequencies of pretest males' and females' weights.

Figure 13. Frequencies of pretest males' and females' heights.

Figure 14. Frequencies of pretest males' and females' BMIs.
For heights in Figure 13, frequency spikes at the 64 and 73 inch levels could indicate one data cluster each of females and males (respectively). The same is true when evaluating weight, shown in Figure 12. It is possible that the lower weight subjects were female and the higher weight subjects were male, but this is not always the case. BMI would be more difficult to segregate along gender lines than either height or weight, as its frequency distribution more closely follows a normal curve. The BMI variable is shown here as a descriptor of the sample set used in the pretest. It was excluded from the cluster analysis because by definition it is a combination of the height and weight variables and would be considered redundant in the analysis. A combination of the demographics would indicate that it may be possible to distinguish between the male and female body form from height and weight alone for certain distinctive populations. However, other dimensional data like 3D body scan data would be necessary for cluster analysis of similar body forms. Testing this was the purpose of the pretest.

Pretest Methodology

To perform cluster analysis on the 3D body scan data, it was necessary to reduce the number of X, Y, Z data points further than originally anticipated. The raw data resulting from a single 3D body scan can contain in excess of one million X, Y, Z points. The initial data reduction was accomplished utilizing the 3D body scanner processing software capability via an avatar mesh conversion. Conversion to the avatar mesh normalized the form representative data and reduced the number of data points to approximately 64,000 (the 21,463 body vertices, each with X, Y, and Z values). Even with that reduction, the data volume was still too large for processing on conventional university computing systems.
Therefore, principal component analysis (PCA) was used to further reduce the data volume for each subject to 1,034 X, Y, Z data points each (3,102 total points) while maintaining a meaningful 3D representation of the subject's body form. The data were then analyzed to test the cluster methodology's ability to distinguish male and female body form via Statistical Package for the Social Sciences (SPSS Statistics 18 v. 18.0.0) software.

Data reduction. During personal communication with [TC]2, it was discovered that the data points in the avatar mesh were distributed from bottom to top in slices (see depiction in Figure 15).

Figure 15. Graphical representation of elliptical slices of the human form generated by 3D body scanning technology. (image from http://www.bodyscan.human.cornell.edu/scene8d5e.html)

The vertical range consisted of 215 ellipses, each being a horizontal slice of the form that was uniform in frequency from bottom to top, representing the circumference of a section of the subject's body. By summarizing the width and depth values using the range, the long and short axes of each ellipse could be approximated. Developing an understanding of how the raw data points were arranged in the avatar mesh allowed the data reduction to be executed without losing significant information that represented overall body form. The data transformation was performed by principal component analysis using three linear transformations. The three principal components accounted for 38%, 33%, and 29% of the variance, respectively. They were characterized, in order, as a weighted average representing overall size, a strong positive weight on the width of the subject, and a contrast between depth and height. The final body scan data used in the cluster analysis procedure were representative of the three principal components weighted for each component's contribution to the overall variability of the sample descriptive data. This enabled the subjects'
data to be further reduced from approximately 64,000 X, Y, Z points contained in the avatar mesh to 1,034 X, Y, Z points each (3,102 total data points). This reduction exercise provided a more manageable data load for use in the cluster analysis phase of the exploratory pretest study. Details of the SAS coding used in the data reduction exercise are shown in Appendix B.

Cluster analysis. Once the data were reduced to a manageable level, it was found that the cluster analysis capabilities of SPSS were able to segregate the data files into two classifications using height, weight, and the 3D data generated by body scanning. The classification option was selected from the cluster analysis tools available in SPSS, which are based on data type and outcome desired. The unsupervised hierarchical clustering option, between-groups distance, and the squared Euclidean distance measure were chosen to perform the analysis. The unsupervised clustering technique ensured that the number of clusters was allowed to emerge from the data and was not established a priori. The hierarchical clustering technique ensured that once a subject was placed within a cluster, he or she remained within that cluster throughout the analysis. Euclidean distance is the linear distance between two values (Lele & Richtsmeier, 2001). The squared Euclidean distance value allows greater weight to be placed on values that are further apart (Lele & Richtsmeier, 2001). In other words, this clustering technique searches for the difference between the corresponding data values of each subject. The smaller the distance between the value of a particular variable for one subject and the value of the same variable for another subject, the more closely the two subjects are related on that single variable. The statistical analysis attempted to evaluate 3,104 variables for each subject in order to cluster similar subjects and separate dissimilar subjects.
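The distance measure and merging rule described above can be sketched in a few lines. This is an illustration in plain Python, not the SPSS procedure used in the study. It assumes that "between-groups distance" refers to average linkage between groups, it applies no standardization to the variables, and the subject labels and height and weight values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between corresponding variable values."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(subjects: Dict[str, Sequence[float]],
                    n_clusters: int) -> List[List[str]]:
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters: List[List[str]] = [[name] for name in subjects]

    def between_groups(c1: List[str], c2: List[str]) -> float:
        # Mean squared Euclidean distance over all cross-cluster pairs.
        dists = [squared_euclidean(subjects[a], subjects[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: between_groups(clusters[p[0]], clusters[p[1]]))
        # Once merged, members stay together for the rest of the analysis.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (height in inches, weight in pounds) values for four subjects.
sample = {"A": (64, 120), "B": (65, 125), "C": (72, 190), "D": (73, 200)}
groups = average_linkage(sample, n_clusters=2)
print(groups)
```

Because the variables are left on their original scales here, the one with the largest numeric spread (weight, in this toy example) dominates the squared distance. Whether and how the variables were scaled in the original analysis is not stated in this section.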
The linking of the individual subjects into clusters is shown on a dendrogram (refer to Figure 16) by vertical and horizontal lines. Each subject initially is its own cluster, indicated by short horizontal lines extending from the master code (four digits) and subject (one or two digits) numbers. However, as the cluster analysis iterations progress, subjects become linked to other subjects based on the squared Euclidean distance between the 3,104 variable values for each subject. Subjects whose variable values are closer together are placed in the same cluster, and subjects whose data values are further apart (that is, whose squared Euclidean distance is greater) are placed in different clusters. The linkages between subjects are graphically shown by the vertical lines on the Figure 16 dendrogram. Long vertical lines connecting several subjects at early cluster iterations are indicative of well-formed or well-defined clusters. This can be visualized in Figure 16 by the linkage between female subjects 1054-13, 1138-17, 1085-15, and 1161-20. In contrast, subject 1149-19 does not appear to be linked to any particular cluster other than the overall female cluster. Cluster analysis contains a degree of subjectivity (Lele & Richtsmeier, 2001). The clusters identified on the dendrogram were the result of the researcher performing a subjective review of the cluster structure as described above.

Three different clustering exercises were performed on the transformed data using this methodology; the intent was to determine the contribution presented by three different data sets. Height and weight values for each subject were clustered alone, 3D body scan generated variables (3,102) for each subject were clustered alone, and height, weight, and 3D body scan variables (3,104) combined for each subject were clustered.

Figure 16. Dendrogram depicting pretest cluster analysis (height and weight).
The variable dimensions of height and weight alone were used first in the clustering application. It was anticipated that, by themselves, these one-dimensional measurements would allow for some level of separation between male and female body form. This assumption appeared correct, as evidenced by the separation of males and females within the dendrogram shown in Figure 16. This was true with the exception of two male subjects, indicated in Figure 16 by the highlighted code numbers 1065-5 and 1028-1, who were classified as female. A review of the demographic data of the two male subjects who were erroneously classified as female by the clustering technique revealed that they were 68 inches and 69 inches tall and weighed 138 and 151 pounds (1065-5 and 1028-1 respectively). The images in Figure 17 show the frontal body scan image of each of these subjects.

Figure 17. Body scan images of male subjects classified as female.
[Figure 17 panels: Subject 1065-5; Subject 1028-1]

Visual analysis of the two images shows subjects with rounded shoulders, a high, defined waist, and an overall hourglass appearance (characteristic of some human females), thus providing evidence of how these two subjects could have been misclassified. The ability to visually analyze subjects is a vital part of the methodology that allows experts in the field to visually verify the results of the data analysis.

The second application of the SPSS cluster analysis technique focused only on the 3,102 body scan variables identified in the data reduction exercise. This application is shown graphically in Figure 18. The data were separated along the Y axis (master code / subject number), with females toward the top and males toward the bottom, with the exception of subject 1065-5. However, most notable is that no distinct clusters seemed to emerge within the male and female body forms.

Figure 18. Dendrogram depicting pretest cluster analysis (form data).
The third and final application of the SPSS clustering technique combined height, weight, and transformed body form data (3,104 total variables) and revealed two distinct form categories (see Figure 19). The resulting key question was whether or not the actual male and female data were separated cleanly into the two clusters. A manual comparison of the content of the two clusters shown in Figure 19 showed that the clusters were separated along gender lines with the exception of one subject. This subject (1065-5) was one of the two subjects misclassified in the first clustering application and was also misclassified in the second application. However, the other subject who was originally misclassified (1028-1) was now classified correctly. This suggested that considering all three dimensions (height, weight, and form) provided a more accurate analysis.

Figure 19. Dendrogram depicting pretest cluster analysis (combined height, weight, and form data).

The level of separation between males and females that was found with the combined height, weight, and body scan data cluster analysis was considered an acceptable level of separation that justified progression to the main study. It provided more defined separation into clusters than did either height and weight or body scan data alone. For example, Figure 19 shows that five clusters emerge at iteration four for females and five clusters emerge at iteration nine for males.

The findings of the pretest were that the body scan raw data could be collected, condensed, combined with other variables like height and weight, and analyzed to allow male and female body form clusters to emerge from the data. What remained was to apply the methodology established in the pretest to a sample of 117 males to see if clusters would emerge from the data within the human male body form continuum.
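The manual comparison of cluster content against the known male and female labels amounts to a cross-tabulation. The sketch below shows one way such a check could be automated; it is not part of the original methodology, and the subject codes, cluster assignments, and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def cross_tabulate(cluster_of: Dict[str, int], group_of: Dict[str, str]) -> Counter:
    """Count subjects for every (cluster, known group) pair."""
    return Counter((cluster_of[s], group_of[s]) for s in cluster_of)


def misplaced_subjects(cluster_of: Dict[str, int], group_of: Dict[str, str]) -> List[str]:
    """List subjects whose known group is not the majority group of their cluster."""
    table = cross_tabulate(cluster_of, group_of)
    majority = {}
    for cluster in set(cluster_of.values()):
        counts = {g: n for (c, g), n in table.items() if c == cluster}
        majority[cluster] = max(counts, key=counts.get)
    return sorted(s for s, c in cluster_of.items() if group_of[s] != majority[c])


# Hypothetical subject codes and cluster assignments, for illustration only.
cluster_of = {"S1": 1, "S2": 1, "S3": 1, "S4": 2, "S5": 2, "S6": 2}
group_of = {"S1": "F", "S2": "F", "S3": "M", "S4": "M", "S5": "M", "S6": "M"}

table = cross_tabulate(cluster_of, group_of)
outliers = misplaced_subjects(cluster_of, group_of)
print(dict(table), outliers)
```

Subjects flagged this way are the natural candidates for the kind of visual follow-up described for the misclassified pretest subjects.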
Clustering of Male Body Form Analysis and Findings

Sample Description

The sample used in the main study consisted of male body scan data from the Auburn University Men's Mentoring Study (Simmons et al., 2009). All examples were generated from 3D body scans using the same NX-16 body scanner. The Men's Mentoring Study focused exclusively on male subjects aged 18 or older and displayed adequate age diversity for the analysis. Of the 157 total files collected in the study, 117 body scan files were convertible to the avatar mesh. The sample's demographic data are shown in Table 6.

Table 6
Demographic Data of Sample (N = 117)
________________________________________________________________________
                           Range
                    ____________________
                    Minimum    Maximum    Mean       SD
________________________________________________________________________
Age (years)         18         59         24.92      7.89
Weight (pounds)     125        325        183.85     37.77
Height (inches)     61         78         70.65      2.73
BMI                 19         47         25.87      4.96
________________________________________________________________________

The sample of 117 male subjects was anticipated to contain sufficient diversity to provide enough variation in body forms to support the clustering methodology developed in the pretest. Figures 20, 21, 22, and 23 are histograms that depict the age, weight, height, and BMI frequency distributions of the sample, respectively. The frequency distribution of the age variable does not follow a normal distribution. Figure 20 shows that age was highly concentrated between 20 and 24 years old. Although the mean age for the sample was young, the range extended to age 59, so the chosen sample provided greater age variability than the pretest sample and was determined sufficient to test the cluster analysis methodology.

Figure 20. Frequencies of subject age.

Figure 21 shows that the weight variable follows a more normal distribution than the age variable; it is centered on a mean weight of 183.85 pounds.
The existence of values at the upper end of the scale suggested that the potential existed for varying body form clusters. Weight has been shown to account for about 33% of human body form variability when subject 3D scan data are normalized for height (Azouz et al., 2006).

Figure 21. Frequencies of subject weight.

A previous study showed that height and weight combined account for approximately 35% of body form variation, with height being the dominant contributor (Azouz et al., 2006). Figure 22 provides a graphical representation of the frequency distribution of height for the main study sample. The relatively normal distribution curve suggested the possibility of sufficient levels of body form variability.

Figure 22. Frequencies of subject height.

The frequency distribution of BMI for the main study sample is shown in Figure 23. Because of the relatively normal distribution curves for both weight and height, the same was expected for BMI. The presence of some outliers in the distribution indicated a potential for human body form variability. The BMI variable is included here to describe the sample but, as in the pretest, was excluded from the clustering technique because of redundancy.

Figure 23. Frequencies of subject BMI.

Findings

The cluster analysis methodology established in the pretest of this study was applied to the main study sample data set with the purpose of answering the following research questions.

1. Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data?
2. What are the statistical characteristics of each cluster?

The SAS data reduction technique used in the pretest was applied to the 117 subjects' 3D body scan avatar data and yielded 1,034 X, Y, Z data points (3,102 total points or variables) for each subject.
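The first stage of the data reduction applied here, summarizing each horizontal slice of the avatar mesh by the range of its width and depth values, can be sketched as follows. This is not the SAS coding from Appendix B. It assumes that X is width, Y is height, and Z is depth, and that slices can be formed by dividing the height range into equal bands; the function name `slice_ranges` and the sample points are hypothetical, and the principal component step is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def slice_ranges(points: Sequence[Point], n_slices: int) -> List[Tuple[float, float]]:
    """Return (width range, depth range) for each of n_slices equal-height bands."""
    heights = [p[1] for p in points]
    low, high = min(heights), max(heights)
    span = (high - low) or 1.0  # avoid division by zero for a flat point set
    bands: List[List[Point]] = [[] for _ in range(n_slices)]
    for p in points:
        # The topmost point falls exactly on the upper edge, so clamp the index.
        index = min(int((p[1] - low) / span * n_slices), n_slices - 1)
        bands[index].append(p)

    summary: List[Tuple[float, float]] = []
    for band in bands:
        if not band:
            summary.append((0.0, 0.0))  # empty band: no extent to report
            continue
        xs = [p[0] for p in band]
        zs = [p[2] for p in band]
        # The ranges approximate the long and short axes of the slice's ellipse.
        summary.append((max(xs) - min(xs), max(zs) - min(zs)))
    return summary


# Four hypothetical points forming a narrow lower slice and a wider upper slice.
cloud = [(-1.0, 0.0, -0.5), (1.0, 0.0, 0.5), (-2.0, 10.0, -1.0), (2.0, 10.0, 1.0)]
axes = slice_ranges(cloud, n_slices=2)
print(axes)
```

Calling the function with `n_slices=215` would mirror the slice count reported in the pretest, although the exact slicing rule used by the scanner software is not documented in this section.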
The 3,102 body form variables were combined with the height and weight variables for each subject (3,104 total variables) to perform the unsupervised cluster analysis using the SPSS software program described for the pretest.

Research question one. Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data? Applying the SPSS clustering technique established in the pretest to the 3,104 variables (height, weight, and 3D body scan data), seven distinct body form clusters emerged. The dendrogram shown in Figure 24 depicts the results of the cluster analysis.

Figure 24. Dendrogram depicting cluster analysis results. SPSS cluster analysis using combined height, weight, and 3D body scan variables from the main study subjects' data.

The seven distinct clusters are shown in Figure 24 by the arched lines labeled by cluster number. The arched lines are the result of the researcher's subjective visual analysis of the cluster structure, following the guidelines established in the pretest. Clusters one, three, and five contain the largest number of members and appear to be well defined by the long vertical lines connecting the subjects. Clusters two, four, six, and seven have fewer members, but clusters six and seven seem to have vertical linkages similar to those in clusters one, three, and five. In contrast, the linkages between the members of clusters two and four appear less well defined, as indicated by long horizontal lines. Long horizontal lines on a clustering dendrogram mean that the subjects were not placed into a cluster until later iterations of the clustering application. This could be indicative of a less cohesive cluster. Visual evaluation of the cluster dendrogram provided valuable insight into the cluster structures and was an aid in further analysis of the sample. Using the methodology previously established, visual analysis of the dendrogram (Figure 24) revealed that distinct data clusters of subjects did emerge.
Thus, the answer to research question one was yes: unique and distinct clusters emerged from the data as a result of the methodology application. The next step in the methodology was to review and analyze the statistical description of the resulting clusters in order to answer research question two.

Research question two. What are the statistical characteristics of each cluster? At this point in the process, only digital data representing each subject were analyzed. This represents statistical learning, where patterns in the data were allowed to emerge rather than placing the representative data into categories known a priori. Once the clusters were formed by the unsupervised hierarchical clustering technique, the analysis focused on the cluster verification process. Answering research question two was the next step to determine if the revealed data clusters related to actual human male body forms. The statistical description of the variable data was represented as the average (mean) values of age, weight, height, and BMI. Arrangement of these values by cluster, shown in Table 7, allowed for a deeper understanding of the individual clusters.

Table 7
Average value of variables by cluster (N = 117)
________________________________________________________________________
                     Age        Weight     Height
Cluster    n         (years)    (lbs)      (in)       BMI
________________________________________________________________________
1          38        23.55      152.39     69.42      22.34
2          3         25.67      159.33     72.00      21.67
3          45        24.38      180.49     70.71      25.36
4          5         24.80      177.80     70.00      25.60
5          15        27.00      213.07     71.93      29.20
6          5         27.80      248.80     73.80      32.20
7          6         29.33      298.50     72.00      40.83
________________________________________________________________________

The purpose of answering research question two was to further verify that the clusters identified in the data analysis represented body form variation within the sample and that the identified clusters were in fact unique and distinct.
A review of the contents of Table 7 was sufficient to make that determination because trends in the variables between the clusters can be seen. For example, mean weight differs between clusters; it is higher in each successive cluster with only one exception. After height (which does not appear to change drastically between clusters), weight is the largest contributor to body form variation (Azouz et al., 2006). Therefore, weight variation across clusters could be correlated with body form variation across the same clusters. With weight changes shown between the identified clusters and relative height consistency between those clusters, it could be expected that BMI values would also change between clusters. This was verified in the representative data. An interesting trend revealed in the statistical data is that the age of the subjects tended to change in conjunction with changes in weight, which is the largest contributor to body form. This information could aid future investigation into the relationship between variations in body form, weight, age, and BMI.

Table 7 shows the number of subjects in each cluster (n) along with the average values of the age, weight, height, and BMI variables by cluster. In Appendix C, range values and standard deviations for each of the variables are arranged by cluster. Appendix C also contains tables showing the variable values for each of the three specific subjects identified for use in the expert verification phase of the methodology. It was shown that the body form clusters revealed in the digital data analysis do each have distinct statistical characteristics, thus answering research question two. The final methodological step was for a panel of experts to visually analyze images representative of the clusters.

Expert Recognition of Clusters Analysis and Findings

Purpose

Expert visual review of the clusters revealed by the cluster analysis technique was used to answer the following research questions:

3.
What are the visual characteristics of each cluster?
4. Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated?

The original intent of this study was to develop a new methodology for use with 3D body scan data to reveal male body form categories that could be applied to the assessment of human bodies for the improvement of the sizing and fit of men's apparel. The clusters generated by the cluster analysis were evaluated by experts in somatology to accomplish this end.

Method

The panel of experts was presented with the dendrogram shown in Figure 24 along with the detailed statistical descriptions of each cluster shown in Appendix C. Additionally, the panel was given body scan images of three subjects from each cluster. The three images were of the first, middle, and last subject in each cluster; they represented the high (H), median (M), and low (L) members of the cluster. The appearance of the images viewed is illustrated by the body scan printout shown in Figure 25.

Figure 25. Typical body scan image. Generated from subject scan data collected via [TC]² NX-16 Body Scanner at Auburn University.

The images representing the subject members of each cluster were arranged side by side in cluster order and evaluated visually by the panel of experts. The panel used their explicit and tacit knowledge of apparel fit as well as techniques used in the development of the BSAS© to answer research questions three and four. The assessment of the clusters was heavily based on recognition by component (RBC) theory (Biederman, 1985). The panel used this visual analysis along with the statistical data contained in Appendix C to develop the findings presented in the next section.

Findings

Research question three. What are the visual characteristics of each cluster? The panel of experts arrived at the following descriptions of each cluster.
Cluster one contained 38 subjects who are visually represented by the three images contained in Figure 26. The experts described this cluster as visually having a higher and more defined waist, sloped shoulders, and flat buttocks in relation to the other identified clusters. The side view shows a flat chest and stomach area. The frontal view reveals hips and shoulders that are balanced in width. The body measurements of these subjects seemed small relative to subsequent clusters.

Figure 26. Cluster one body scan images. H, M, and L members of cluster one (front & side view).

Cluster two contained three subjects (see Figure 27). This cluster appeared on the dendrogram to be the least cohesive of the identified clusters because the long horizontal lines extending from the three subjects' master code/subject numbers indicated that the clustering algorithm did not connect these subjects until later cluster iterations. This observation based on information presented in the dendrogram can be verified by a visual evaluation of the images of the three subjects shown in Figure 27. These three subjects do not appear visually closely related in form but do appear to be related to subjects placed in other clusters. Subjects 2M and 2L seem to relate to cluster three because their shoulders are squared, upper chest is broad, and buttocks are more prominent than cluster one. Subject 2H has more rounded shoulders, a more prominent stomach area, and flatter buttocks than subjects 2M and 2L. However, his upper chest prominence is similar to 2M and 2L.

Figure 27. Cluster two body scan images. H, M, and L members of cluster two (front & side view).

Cluster three contained 45 subjects represented by the three images shown in Figure 28. This cluster appears to be characterized by broad and relatively square shoulders in comparison to hip width. The subjects seem more muscular and to have more prominent chests and buttocks than those in other clusters.
They present a high and defined waist. However, subject 3H could be considered visually closer to 4L and 4H due to his less defined waist. The relatively large number of subjects placed in this cluster and the well-defined cluster structure shown in the dendrogram in Figure 24 tend to indicate the cohesiveness of this cluster.

Figure 28. Cluster three body scan images. H, M, and L members of cluster three (front & side view).

Cluster four consisted of five subjects. Like cluster two, it does not appear as cohesive in body form characteristics as clusters one and three; it was more similar in structure to cluster two. The three subjects representing cluster four show square shoulders like cluster three, but they do not appear as broad relative to hip width. Subject 4M seems more like subjects 3L and 3M in his more defined waist and more muscular appearing shoulder, chest, and thighs. Subject 4H is visually more like those subjects represented in cluster five. This cluster appears similar in structure to cluster two on the dendrogram, indicating low cohesion. Also, the panel of experts identified characteristics of the representative subjects in this cluster as more similar to those of other clusters. This could indicate the need to eliminate this cluster and merge the subjects into more appropriate clusters.

Figure 29. Cluster four body scan images. H, M, and L members of cluster four (front & side view).

Cluster five consisted of 15 subjects. The subjects in this cluster appear to have less square shoulders, with shoulder slopes similar to cluster one. From a frontal view, these subjects are relatively proportional and rectangular in width from shoulders to waist to hips. Their waists appear thicker, and the upper chest does not appear more prominent than the stomach area. This group seems less muscular than cluster three and thicker than cluster one. Overall, this appears to be a cohesive cluster.

Figure 30. Cluster five body scan images.
H, M, and L members of cluster five (front & side view).

Although cluster six contains only five subjects, it appears better defined in the Figure 24 dendrogram than clusters two and four but not as well defined as clusters one and three. This could indicate that the number of subjects with this particular body form was lower in the total sample count than subjects with the body forms belonging to clusters one and three. Cluster six subjects appear rectangular in shape with a larger midriff (than previous clusters) that contributes to a fuller figure appearance. On a side view evaluation, subjects 6M and 6L have central torso (between upper chest and high hip) protrusions; subject 6H appears more generally large but does not have the degree of protrusion that the others do. The shoulders appeared more square than those of cluster five but not broad or muscular.

Figure 31. Cluster six body scan images. H, M, and L members of cluster six (front & side view).

Cluster seven was created with six subjects. It is well defined in the Figure 24 dendrogram. Like cluster six, this could be indicative of fewer comparable subjects in the sample. Expert visual analysis identified subject 7H as the most obvious single outlier in the study. A review of the demographic data in Appendix C shows that subject 7H was at the minimum cluster values for weight (272) and BMI (37). Cluster seven's mean values for weight (298.5) and BMI (40.83) include the values for subject 7H. Therefore, subject 7H's values for these variables pulled the mean values of cluster seven toward the low side. Without subject 7H's influence, the overall cluster means for weight and BMI would be even higher. Evaluation of both the visual image and the variables suggests the need to reclassify subject 7H into a more appropriate cluster.
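
The size of subject 7H's influence can be estimated from the figures quoted above: a mean over n members, multiplied by n, gives the cluster total, and removing one known value leaves the total for the remaining n - 1 members. The short Python sketch below works through that arithmetic. The function name is illustrative, and the results are approximate because the published means and the BMI of 37 are rounded.

```python
def mean_without(mean, n, value):
    """Mean of the remaining n - 1 members after removing one known value."""
    return (mean * n - value) / (n - 1)

n_members = 6                          # cluster seven size (Table 7)
mean_weight, mean_bmi = 298.5, 40.83   # cluster seven means (Table 7)
weight_7h, bmi_7h = 272.0, 37.0        # subject 7H, as quoted in the text

weight_rest = mean_without(mean_weight, n_members, weight_7h)
bmi_rest = mean_without(mean_bmi, n_members, bmi_7h)

print(round(weight_rest, 2))  # 303.8
print(round(bmi_rest, 2))     # 41.6
```

Under these assumptions, the five remaining members would average roughly 303.8 lbs and a BMI of about 41.6, both above the reported cluster means.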
Subjects 7M and 7L can be described as appearing larger in circumference through the middle torso and having more clearly defined upper chest fullness than those in cluster six. The shoulder structure varied between the two remaining representative subjects in this cluster. Subject 7L has square and 7M has sloped shoulders.

Figure 32. Cluster seven body scan images. H, M, and L members of cluster seven (front & side view).

During the visual analysis of the clusters, it was discovered that adding body measurement data to the content of the evaluation had the potential to add clarity to the classifications. Experts' observations led to the recommendation to calculate the difference measurements among the three major circumferences of chest, waist, and hips. Thus, indicators of relative differences could be compared across clusters by considering the chest-to-waist, chest-to-hips, and hips-to-waist variables. Table 8 shows a comparison of these measurement differences arranged by cluster subject representative. The greater the value, the greater is the difference in relative size to be observed, implying possible variations in body form among clusters. The trend shown in the table indicates that the larger the subject, the smaller the relative differences between landmarks, leading to body forms that are more evenly cylindrical or full from chest to hips. This finding supports those from the visual analysis of frontal and side views.
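
Because each of the three variables is a difference between two of the same three circumferences, they are linked: (chest - waist) - (chest - hip) = hip - waist. A negative value simply means that the second circumference is the larger one. The Python sketch below checks that relation on a few rows transcribed from Table 8. It assumes each column is a plain subtraction in the order named, and the helper name is illustrative.

```python
def is_consistent(chest_waist, chest_hip, hip_waist, tol=0.005):
    """True when (chest - waist) - (chest - hip) matches hip - waist."""
    return abs((chest_waist - chest_hip) - hip_waist) < tol

# Rows transcribed from Table 8: label -> (Chest-Waist, Chest-Hip, Hip-Waist)
rows = {
    "1H": (7.05, 1.98, 5.07),
    "1M": (4.22, 0.12, 4.10),
    "1L": (9.17, 0.29, 8.88),
    "2H": (4.11, -0.78, 4.89),   # negative Chest-Hip: hips larger than chest
    "7M": (2.57, 6.19, -3.62),   # negative Hip-Waist: waist larger than hips
}

checks = {label: is_consistent(*values) for label, values in rows.items()}
print(checks)
```

A check of this kind is a quick way to catch transcription errors when such a table is keyed in by hand.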
Table 8
Body Measurement Difference by cluster in inches (N = 117)
________________________________________________________________________
Cluster   Member         n    Chest-Waist    Chest-Hip    Hip-Waist
________________________________________________________________________
1         High (H)      38        7.05          1.98         5.07
          Median (M)              4.22          0.12         4.10
          Low (L)                 9.17          0.29         8.88
2         H              3        4.11         -0.78         4.89
          M                       8.86          2.24         6.62
          L                      13.12         10.04         3.08
3         H             45        6.28          1.45         4.83
          M                      10.89          4.83         6.06
          L                      14.29          5.65         8.64
4         H              5        4.41          0.88         3.53
          M                      11.95          4.59         7.36
          L                      10.04          3.83         6.21
5         H             15        4.60         -0.39         4.99
          M                       6.21         -0.33         6.54
          L                       8.41          3.59         4.82
6         H              5        7.73          4.55         3.18
          M                       3.53          1.98         1.55
          L                       2.15          1.17         0.98
7         H              6        3.24         -1.15         4.39
          M                       2.57          6.19        -3.62
          L                       5.54          3.31         2.23
________________________________________________________________________

A tabular summary of the information accumulated during the expert visual analysis phase of the study was most helpful in answering research question three (What are the visual characteristics of each cluster?). This summary also provided the panel of experts with the basic information used to fully answer research question four (Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated?). Figure 33 contains the summary of the visual, written, and numerical descriptive data of the clusters identified in answering research questions one and two.

Figure 33. Expert visual analysis summary.

Research question four. Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated? Using research question three as the basis, the panel of experts found that clusters one and three (which contain the largest number of subjects) appear most visually cohesive in form. These two clusters have the most visually similar and identifiable characteristics. The same is true to a lesser degree for cluster five. Clusters two and four seem to be the least cohesive.
The subjects within these clusters could easily be reclassified into adjacent clusters with which they share more common definable traits. Clusters six and seven are less well defined than clusters one, three, and five. This could be the result of the lower number of subjects in the sample that fit these categories. However, these clusters remain somewhat cohesive with the one exception of subject 7H. Subject 7H seems to fit best the descriptors of cluster five. Therefore, most of the clusters have some degree of cohesion but not complete cohesion. It appears from the analysis that the sample could be clustered into five well-defined clusters.

CONCLUSIONS, DISCUSSION, & RECOMMENDATIONS

To classify is human (Costa & Cesar, 2001). This study explored the development of a methodology to apply unsupervised statistical clustering to classify the male body form using data generated through 3D body scanning. Though the female body form has been studied and classified using body proportions and expert recognition, little information exists on classification of the male body form. Digital technology presents a new possibility for data analysis. The purpose of this study was to find a viable statistical path to make male body form identification and classification achievable. One obstacle discovered during the exploration is that individual digital files that identify body form contain massive amounts of data. Much of the development of the methodology was spent transforming the data to a volume that could be easily processed on university computing systems. Steps involved in the methodology were (1) data collection via 3D body scanning technology, (2) normalization via conversion to an avatar mesh, (3) data reduction via PCA, (4) unsupervised hierarchical cluster analysis via SPSS, and (5) confirmation of clusters via expert analysis.
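
Steps (3) and (4) can be illustrated on a toy scale. The following is a minimal Python sketch and not the study's SAS or SPSS processing. It assumes a single principal component found by power iteration, average (between-groups) linkage, unstandardized variables, and six hypothetical subjects described by three made-up measurements; the function names, the linkage choice, and the data are all assumptions introduced for illustration.

```python
import math

def center(rows):
    """Subtract each column mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_component(rows, iters=100):
    """Approximate the leading principal direction by power iteration."""
    w = [1.0] * len(rows[0])
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, w)) for row in rows]      # X w
        w = [sum(s * row[j] for s, row in zip(scores, rows))               # X^T (X w)
             for j in range(len(w))]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w

def average_linkage(scores, k):
    """Agglomerative clustering: merge the closest pair of clusters until k remain."""
    clusters = [[i] for i in range(len(scores))]

    def dist(a, b):
        # Mean absolute difference over all cross-cluster pairs.
        return sum(abs(scores[i] - scores[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        _, i, j = min((dist(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] += clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical subjects: (height in, weight lbs, waist in). Not study data.
subjects = [
    [69.0, 150.0, 30.0],
    [70.0, 155.0, 31.0],
    [69.5, 152.0, 30.5],
    [72.0, 250.0, 44.0],
    [73.0, 255.0, 45.0],
    [72.5, 248.0, 43.5],
]

centered = center(subjects)
direction = first_component(centered)
scores = [sum(a * b for a, b in zip(row, direction)) for row in centered]
groups = average_linkage(scores, k=2)
print(groups)  # [[0, 1, 2], [3, 4, 5]]
```

With real scan data, the reduction step would keep several components rather than one, and the number of clusters would be read from the dendrogram instead of being fixed in advance.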
Summary

The methodology was deemed successful and may be considered a step toward a statistically based methodology for classification of the human body form. Four research questions guided this exploratory study.

Research Question One

Will body form categories emerge from an unsupervised hierarchical clustering of 3D male body scan data? Seven clusters emerged when the variables of subject height, subject weight, and 3D body scan measurement data were used in the unsupervised hierarchical clustering process. Some clusters (one, three, and five) appeared to be visually well defined (using guidelines established in the pretest), but others showed evidence of loose cohesion, with each member of the cluster seemingly having limited to no linkage to the others (e.g., cluster two). Cluster analysis is an inherently subjective process (Lele & Richtsmeier, 2001) and must be defined by the characteristics associated within and between clusters. Therefore, further analysis was necessary to verify that the clusters revealed made sense in terms of measurable and visible characteristics.

Research Question Two

What are the statistical characteristics of each cluster? Height and weight are known to be the greatest contributors to body form variation (Azouz, et al., 2006). Thus, the statistics considered were the basic demographic and body measurement data along with age, weight, height, and BMI. These are summarized in Table 6 and detailed in Appendix C; they were used to more thoroughly define the clusters shown in Figure 24. Analysis showed that there were clear differences in subject height and weight among clusters, especially between cluster one and cluster seven. However, use of height and weight constructs alone was insufficient to provide uniquely identifiable and describable clusters.

Research Question Three

What are the visual characteristics of each cluster?
Two experts in the field of somatology analyzed the visual characteristics of each cluster using images generated by 3D body scanning. Body scan images of three male forms representing the range and average body form of each cluster were visually analyzed to distinguish common visual body form characteristics. This step was intended to determine whether or not the clusters' representative body forms appeared similar within the clusters and distinct and unique between the clusters. Physical characteristics such as prominent chest, high defined waist, narrow hips, square shoulders, prominent buttocks, and flat tummy (cluster three) were identified and compared. For example, descriptors of cluster seven were full chest, full waist, balanced hips, sloped shoulders, flat buttocks, and prominent tummy; these differed from cluster three. Distinctions revealed during analysis of research questions two and three provided evidence that the clusters were visually as well as statistically distinct and unique.

Research Question Four

Do experts in the field of somatology recognize the various clusters from the statistical and visual characteristics generated? The expert evaluation showed that emergent clusters varied in degree of cohesiveness; those with the smallest number of sample subjects tended to be the least cohesive. Subjects in two clusters in particular (two and four) could be assigned both visually and statistically to other clusters. Thus, the expert review suggested that the number of clusters be reduced from seven to five by reclassifying the subjects in cluster two and cluster four into adjacent clusters.

The answers to the four research questions showed that the methodology was successful in accurately reducing a large volume of statistical data to identifiable clusters that represent distinctive male body form characteristics. This could be considered a valuable tool in the classification of the male body form in relation to the fit of apparel.
Discussion

Historical Comparison

Beginning in prerecorded history, humans exercised their innate need to classify objects. This need stems from identifying security threats and recognizing edible foods for survival. As time progressed, humans visually processed and recorded large numbers of objects. Entire sciences were developed around the classification of objects that humans encounter. Constructs used by humans to identify and classify objects can include size, build, shape, and form. Over time, methodology was developed to measure these constructs and store the data collected in order to build on the capability of the human brain to process the information. Linear measures (tape measures) were developed to evaluate size, multiple probe methods (photography and somatography) were developed to combine with the linear method to evaluate build and shape, and body form methods (draping and body scanning) were developed to evaluate body form (Bye, et al., 2006). Form is the best construct for use in the fit of apparel to the human body as the actual human body is a 3D form and not a 2D shape nor a 1D size measurement.

As body form classification evolved, researchers utilized the most current technology to add to the body of knowledge in the field of somatology. In 1940, Sheldon developed the somatotype categories of endomorph, mesomorph, and ectomorph by using front and side view photography to derive 3D form from 2D shape images (Sheldon, 1940). Later studies by Douty and Brannon combined photography, Cartesian grid structure, and RBC theory to derive body form in order to evaluate body build and posture in relation to the fit of apparel (Douty, 1968; Brannon, 1971). Contemporary studies related to human body form include the development of the BSAS© and FFIT© (Connell, et al., 2006; Simmons, et al., 2004).
These two systems were developed utilizing 3D body scan data and RBC theory to derive body form information by establishing shape categories for the whole body or for component parts. The current study builds on the knowledge of somatology developed thus far by extending statistical learning techniques in the quest to reveal clusters of human body form using 3D body scan data.

Compared to past research, this study provides a new step toward quantitatively developing body form clusters for specific samples that may be generalizable to the associated population. Some studies have used subjective measurement techniques like visual analysis of photographs and visual analysis of body scan images to categorize body shapes (Brannon, 1971; Connell, et al., 2006; Douty, 1968). Another study used 3D body scan data to define the characteristics of body shape categories known a priori in the development of software to perform supervised clustering of subsequent body scan data (Simmons, et al., 2004). The current study reduces reliance on subjective measurement techniques by using an unsupervised clustering technique to statistically generate preliminary male human body form clusters from raw 3D body scan data. Identification of preliminary clusters revealed by the digital data can drastically reduce visual analysis time. Thus, as study sample sizes increase, initial statistical clustering can contribute to the quicker, and perhaps more precise, distinguishing of realistic clusters.

Significance

In developing this research study, the researcher found that the somatological literature was often blurry in its definitions and applications of body measurement constructs. Terms with different meanings in other fields (like shape and form in the arts) were often used interchangeably by researchers across somatological studies. This study should be helpful to future researchers by bringing clarity and consistency to the terminology used in the field of somatology.
The most commonly used constructs are size, build, shape, and form. It is easiest to differentiate these constructs using dimensional descriptions (see Table 9).

Table 9
Somatological Constructs Dimensional Descriptions
________________________________________________________________________
Construct     Dimensional Description
________________________________________________________________________
Size          1D (inches and pounds)
Build         No Dimension (ratios like hip/waist)
Shape         2D (triangle and rectangle)
Form          3D (volume)
________________________________________________________________________
Note: (Fiore, 2010; Webster's, 1993).

The human body occupies space in three dimensions. Humans are 3D objects and are best described by the form construct. When a garment is fit to the human body, three-dimensional body form is a vital consideration in achieving a good fit. Though size, build, and/or shape are facets of achieving good fit, the final test is whether the garment molds comfortably to each individual body form. This study utilized raw data that the NX-16 software brings together to show an individual's body form. Without converting it to a lower dimensional construct (e.g., 1D or 2D), subjects' scans were defined as elliptical layers and clustered into body form categories. One advantage of this method is that time and error can be reduced by utilizing raw 3D body scan data to statistically reveal the clusters of form as opposed to an estimation of derivative form from an analysis of lower dimensional constructs. Clustering subjects within a sample based on elliptical data can help researchers develop a deeper understanding of varying body forms in the population. The current study provides a methodology to further develop that understanding.
This exploratory study is also significant because it provides the foundational work for a methodology that reduces processing time of body form studies involving large amounts of individual 3D body scan data and can be used to generate clusters with similar body form characteristics. This process is completed more quickly than if researchers had to individually and manually view subjects within a sample and place them in clusters. Prior studies have developed software with the capability of evaluating subject 3D body scan data and placing forms into a priori categories (Connell, et al., 2006; Simmons, et al., 2004). Statistically, these systems are based on using body measurements and ratios to perform supervised clustering. Supervised clustering is limited by the appropriateness and accuracy of the a priori categories being applied. This study evaluated a sample of male subjects and allowed the 3D body scan data to reveal male body form clusters in an unsupervised approach. Unsupervised clustering is significant because it allows updated body form clusters to emerge as more sample data are analyzed. This could provide the field of somatology with more appropriate and accurate human body form clusters. The time saved in analysis could prove to be the most important contribution of this study. The vast amounts of existing data generated by 3D body scanning technology can now be quickly analyzed to provide a deeper understanding of the human body form.

Body form occurs along a continuum. This means that there are as many nuanced or clearly different human body forms as there are individuals. Application of the methodology developed in this study provides cohesive clusters representing body form, effectively converting a continuous construct to a categorical construct. This outcome is important and relates directly to the study of consumer behavior, human health risks, and physical activity intervention.
For example, much of the consumer behavior research today is conducted utilizing categorical variables. Presently, variables like body image, body satisfaction, or purchase intention are measured with questionnaires that utilize some kind of incremental measurement scale. If a researcher intends to relate any number of categorical constructs to body form, the only option is to visually place the subjects into universal form categories developed a priori. Application of the unsupervised clustering methodology developed in this study will allow quantitatively derived body form categories to emerge from a study's specific data, which can be compared statistically to a study's other constructs.

The researcher found during this study that 3D body scan data files can contain in excess of one million data points that represent the human form. It was planned initially to convert the raw data to an avatar mesh with the purpose of normalizing and reducing the data volume. However, it became evident that it was necessary to reduce the data even further to provide a methodology that could be easily performed on smaller university computing systems. This further data reduction exercise turned out to be a significant outcome of the study. When data reduction exercises are necessary, it is important to maintain the descriptive integrity of the subject data. To do this, an in-depth understanding of the original data format is mandatory. Data reduction was accomplished using techniques of principal component analysis (PCA) and provided the most accurate data for the cluster analysis technique.

Recommendations

Limitations

This study's contributions include construct terminology clarification and quantitatively derived body form scales, but it is not without limitations. Prior to considering recommendations for future study, researchers must consider steps to reduce or eliminate the acknowledged limitations in the current study.
This serves as a road map to strengthen body form research and a point of origin for future studies. The acknowledged limitations of this study include sample demographics, degree of subjectivity, and software programming time.

Sample demographics. Data were limited to 117 male subjects from a small southeastern US location. The sample did not present a high degree of variability in age and body form, but it was considered to contain sufficient variability to test the methodology's ability to use data reduction techniques and unsupervised clustering to cluster male body form. The limitations of sample size and sample demographic could be easily remedied by increasing the number of subjects and utilizing subjects who offer more potential variability in body form. This can be accomplished by utilizing larger and more diverse existing data files collected in studies like Size USA, Size UK, and Size Mexico or collecting data for specific populations. The methodology developed in this study can be applied to larger data sets with more diverse 3D populations. Building sample size and potential variability will allow progress toward defining specific somatotypes among and between varying populations.

Degree of subjectivity. This study did not eliminate all subjectivity in finalizing form categories in that experts evaluated the statistically generated clusters. This step was necessary to verify cluster cohesiveness by looking for common characteristics related to human body form. The methodology does, however, reduce subjectivity by first allowing clusters to emerge from the raw 3D body scan data rather than visually analyzing actual human bodies, photographs, or body scan generated images. This one contribution will allow researchers to process large amounts of 3D body scan data in a relatively short period of time. Human body form clustering for the fit of apparel is a subjective process, and removing all subjectivity would be neither possible nor advisable.
Software programming time. The SAS coding in this study requires the manual input of a line of code to read each avatar file associated with study subjects. This was not considered a problem in processing the 117 files of this study, but other studies involving greater sample sizes could significantly increase the time to code the program. Manual coding input also increases the possibility of human error. This limitation can be addressed in future studies by employing SAS macros to automatically read in the data for each subject. The conversion of raw 3D body scan files into the avatar mesh presented a time limitation as well. One hundred seventeen files had to be converted to the avatar mesh. There was not a problem in processing 117 files in this study, but it would be increasingly time consuming to process 500, 1000, or more files in this manner. This limitation can be addressed within the macro programming capability of the [TC]² software. The two software and data processing limitations of the study are easily addressed by the refinement of coding involved in the process. This is a step that must be considered as a type of continuous improvement each time the methodology is applied to different sample sets.

Future Study

Identifying and addressing limitations of a study is the first step to improving subsequent research, but there may be opportunities for expansion and improvement beyond alleviating limitations. This study is the seminal basis for several possible areas of future research. Opportunities can be identified as studies to improve the current methodology (academic application), expand analysis of each cluster (academic application), apply the methodology to other samples (industry application), and collaborate with other fields of study (academic and practice application).

Improve current methodology. The methodology developed in this study can be improved by exploring new data reduction and clustering techniques.
For example, there are prior studies that use PCA data reduction techniques differently compared to the current study. These studies sought to identify the major contributors to human body form variability but stopped short of clustering the forms (Allen, Curless, & Popovic, 2003; Azouz, et al., 2006). An exploration into contemporary data reduction techniques would require a degree of collaboration with experts in the field of statistical methods. Future research should continue to monitor varied methods being tried in somatology studies. One area to be considered is how the image of the average subject assigned to each cluster is visually represented. This study used the actual body scan image generated by the [TC]² NX-16 whole body scanner for the median member subject to represent the entire cluster. Some researchers have developed methodology with the ability to morph (or combine) several 3D body scan images into a single image representing the entire group (Allen et al., 2003). Applying this technique could prove to be a valuable tool in visual analysis and cluster verification. It could also provide a more accurate single representative body form for use by researchers or manufacturers.

Expand current analysis. This study used simple demographic data of height, weight, age, and selected circumference measures to describe clusters. Future studies could utilize the measurement extraction profile (MEP) capability of the 3D body scan software to isolate certain component part measures that might provide a more detailed description of the individual clusters. This could allow for the comparison of constructs like size and build (ratios) within the individual clusters as well as more detailed descriptions to be used for comparisons between clusters.

Apply methodology. The methodology used in this research has potential value to the apparel industry in the area of product development.
Many apparel companies spend large amounts of money researching their target customer. They develop a deep understanding of their customer's demographics, purchase patterns and preferences, and anthropometric measurements, and they apply the information to marketing and product development efforts. A primary concern for any apparel company is whether its garments will actually fit its identified target customers. The fit of garments to the human body is best evaluated on an actual body form (either a surrogate or a live fit model). The statistical human body form clustering methodology developed in this study has the potential to provide apparel companies with body form categories for use in fit analysis exercises based on the demographics of their specific target customers. Cost savings might be generated by reducing the proportion of unsold inventory and product returns.

Collaborate. Collaboration across fields is an important trend in academia. Kinesiology is one field that could have applications for the methodology developed in this study. Human physical activity is often related to various anthropometric measures (Jackson, Howton, Grable, & Collins, 2006; Kang, Marshall, Barreira, & Lee, 2009). The methodology developed in this study makes it possible to cluster whole and component human body form and relate the identified clusters to certain physical activity measurements within a sample. The methodology also lends itself well to the longitudinal study of variations in body form clusters caused by variations in physical activity intervention.

Healthcare is another field where human body form cluster methodology could be applied. It is commonly understood that certain health risk factors are related to human body form types. This study provides a methodology to pair with research into human health risks in fields like cardiology, endocrinology, oncology, and psychology.
Examples of how the cluster methodology applies to health risk investigations include, but are not limited to, the relation of body form or component body form to heart disease, the occurrence of diabetes, and certain cancers. Also, prior studies have attempted to relate psychological disorders to human body type (Sheldon, 1940). The cluster methodology developed in this study could provide psychologists with quantitatively derived body form categories to revisit prior theories.

Exploratory studies have a propensity to generate more questions than answers. The researcher must remain focused on answering the research questions specific to the study at hand in order to set a stable foundation from which to branch out into future investigation opportunities. This exploration contributed a foundational methodology that can be expanded. The conclusions drawn from this investigation have far-reaching applicability across many fields of study including, but not limited to, somatology, fit of apparel, human health risk analysis, and kinesiology.

References

Alexander, M. (2003). Applying three-dimensional body scanning technologies to body shape analysis (Unpublished doctoral dissertation). Auburn University, Alabama.

Allen, B., Curless, B., & Popovic, Z. (2003). The space of human body shapes: Reconstruction and parameterization from range scans. ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH 2003, 22(3), 587-594.

Armstrong, H. J. (1987). Patternmaking for fashion designers (5th ed., pp. 35-43). New York: Harper Collins.

August, B. (1981). Complete Bonnie August dress thin system. New York: Rawson, Wade Publishers, Inc.

Azouz, Z. B., Rioux, M., & Lepage, R. (2002). 3D description of the human body shape using Karhunen-Loeve expansion. International Journal of Information Technology, 8(2), 26-35.

Azouz, Z. B., Rioux, M., Shu, C., & Lepage, R. (2006). Characterizing human shape variations using 3D anthropometric data. Visual Computing, 22, 302-314.

Biederman, I. (1985). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147.

Brannon, E. L. (1971). Graphic somatometry in the development and application of posture and body build scales for men (Unpublished master's thesis). Auburn University, Alabama.

Bye, E., LaBat, K. L., & DeLong, M. R. (2006). Analysis of body measurement systems for apparel. Clothing & Textiles Research Journal, 24(2), 66-79.

Churchill, E., & McConville, J. T. (1976). Sampling and data gathering strategies for future USAF anthropometry (AMRL-TR-74102). Wright-Patterson Air Force Base, OH: Aerospace Medical Research Laboratory, Aerospace Medical Division (AFSC).

Connell, L. J., Ulrich, P. V., Brannon, E. L., Alexander, M., & Presley, A. B. (2006). Body shape assessment scale: Instrument development for analyzing female figures. Clothing and Textiles Research Journal, 24(2), 80-95.

Connell, L. J., Ulrich, P. V., Simmons, K. P., & Gropper, S. S. (2010). Freshman 15 Longitudinal Study [Department of Consumer Affairs 3D Body Scanning Database]. Auburn University: Alabama.

Costa, L. F., & Cesar, R. M. (2001). Shape analysis and classification: Theory and practice. New York: CRC Press.

Croney, J. (1971). Anthropometrics for designers. Van Nostrand Reinhold Company: New York.

Douty, H. I. (1968). Silhouette photography for the study of visual somatometry and body image. Paper presented at the National Textiles and Clothing Meeting, Minneapolis, Minnesota.

Douty, H. I., Moore, J. B., & Hartford, D. (1974). Body characteristics in relation to life adjustment, body-image and attitudes of college females. Perceptual and Motor Skills, 39, 499-521.

Fiore, A. M. (2010). Understanding aesthetics for the merchandising and design professional. New York: Fairchild Books.

Flegal, K. M., & Graubard, B. I. (2009). Estimates of excess deaths associated with body mass index and other anthropometric variables. American Journal of Clinical Nutrition, 89, 1213-1219.

Godil, A. (2009). Facial shape analysis and sizing system. In V. G. Duffy (Ed.), Digital Human Modeling (pp. 29-35). Berlin, Germany: Springer-Verlag.

Jackson, E. M., Howton, A., Grable, S., & Collins, M. A. (2006). Increasing walking in college students using pedometers: Differences according to body mass index [Abstract]. Medicine & Science in Sports & Exercise, 38, S121.

Kang, M., Marshall, S., Barreira, T., & Lee, J. (2009). Effect of pedometer based physical activity interventions: A meta-analysis. Research Quarterly for Exercise and Sport, 80(3), 648-655.

Kidwell, C. (1979). Cutting a fashionable fit: Dressmakers' drafting systems in the United States. Washington, DC: Smithsonian Institute Press.

Lele, S. R., & Richtsmeier, J. T. (2001). An invariant approach to statistical analysis of shapes. New York: Chapman & Hall/CRC.

McCue, C. (2007). Data mining and predictive analysis: Intelligence gathering and crime analysis. Amsterdam: Elsevier.

Medawar, P. B. (1944). The behavior and fate of skin autografts and skin homografts in rabbits. Journal of Anatomy, 78, 176-199.

Minott, J. (1972). Pants and skirts (2nd ed., pp. 10-12). Minnesota: Burgess Publishing Company.

Minott, J. (1978). Fitting commercial patterns: The Minott method. Minnesota: Burgess Publishing Company.

O'Brien, R., & Shelton, W. (1941). Women's measurements for garment and pattern construction (Miscellaneous Publication No. 454). Washington, DC: U.S. Government Printing Office.

Physical anthropology. (n.d.). In Medilexicon online medical dictionary. Retrieved from http://www.medilexicon.com/medicaldictionary.php

Sheldon, W. H. (1940). The varieties of human physique. New York: Harper and Brothers.

Simmons, K. P., Chattaraman, V., & Ulrich, P. V. (2009). Analysis of body shape and apparel fit preferences of male consumers [Department of Consumer Affairs 3D Body Scanning Database]. Auburn University: Alabama.

Simmons, K. P., & Istook, C. L. (2003). Body measurement techniques: Comparing 3D body-scanning and anthropometric methods for apparel applications. Journal of Fashion Marketing and Management, 7(3), 306-332.

Simmons, K., Istook, C., & Devarajan, P. (2004a). Female figure identification technique (FFIT) for apparel. Part I: Describing female shapes. Journal of Textile and Apparel, Technology and Management, 4(1), 1-16.

Simmons, K., Istook, C., & Devarajan, P. (2004b). Female figure identification technique (FFIT) for apparel. Part II: Development of shape sorting software. Journal of Textile and Apparel, Technology and Management, 4(1), 1-16.

Sneath, P. H. A. (1967). Trend surface analysis of transformation grids. Journal of Zoology, Proceedings of the Zoological Society of London, 151, 65-122.

Textile Technology Corporation [TC]2. (2011). Body scanner specifications: NX-16. Retrieved from http://www.tc2.com/products/body_scanner.html

Thompson, D. W. (1917). On growth and form. Cambridge, UK: Cambridge University Press.

U.S. Department of Commerce. (1958). Body measurements for the sizing of women's patterns and apparel (Commercial standard CS215-58). Washington, DC: U.S. Government Printing Office.

Webster's. (1993). Webster's third new international dictionary, unabridged: The great library of the English language. Springfield, MA: Merriam.

Appendix A

Body Shape Assessment Scale BSAS©
(2006) Images

Appendix B

FSCottle Dissertation Main Study SAS Coding

%LET DIR = E:\Ricks data\Men Ment Normal obj\;
LIBNAME pretest "&DIR";

%MACRO readin(ID, file);
DATA WORK.data1 ;
  infile "&DIR.\&file" lrecl=32767 firstobs=29267 obs=51321 ;
  informat ID 8. pix best12. VAR1 $2. x best16. y best16. z best16.;
  format ID 8. pix best12. VAR1 $2. x best16. y best16. z best16.;
  input VAR1 x y z;
  if var1='v';
  DROP var1;
  ID="&ID";
  pix=_n_;
RUN;
PROC SORT data=data1; by z x y; RUN;
data data1; set data1; srt1=_n_; RUN;
PROC SORT data=data1; By x; RUN;
DATA data1; set data1; srt2=_n_; RUN;
PROC SORT data=data1; By y; RUN;
DATA data1; set data1; srt3=_n_; RUN;
DATA data1; set data1;
  if 5000< srt1 le 21400;
  /*if 500< srt2 le 30000; */
  IF 3000< srt3 le 29678;
RUN;
*IF srt le 21400 and srt> 1000;
PROC SORT data=data1; by z; run;
DATA data1; set data1; newindex=_n_; if mod(newindex,10)=0; RUN;
PROC APPEND data=data1 base=all1 force; RUN;
%MEND readin;

%Readin(03100901, 03100901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100902, 03100902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100903, 03100903mm1_OptiTexAdam_Texture_standard.obj);
%Readin(03100904, 03100904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100905, 03100905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100906, 03100906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100907, 03100907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100908, 03100908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100909, 03100909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100910, 03100910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100911, 03100911mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100912, 03100912mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100913, 03100913mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100915, 03100915mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100916, 03100916mm_OptiTexAdam_Texture_standard.obj);
%Readin(03100917, 03100917mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110901,
03110901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110902, 03110902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110903, 03110903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110904, 03110904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110906, 03110906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110907, 03110907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110908, 03110908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110909, 03110909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03110910, 03110910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120901, 03120901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120902, 03120902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120903, 03120903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120904, 03120904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120905, 03120905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120906, 03120906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120907, 03120907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120908, 03120908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120909, 03120909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120910, 03120910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120912, 03120912mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120914, 03120914mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120915, 03120915mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120916, 03120916mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120917, 03120917mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120918, 03120918mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120919, 03120919mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120920, 03120920mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120921, 03120921mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120922, 03120922mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120923, 03120923mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120924, 03120924mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120925,
03120925mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120926, 03120926mm_OptiTexAdam_Texture_standard.obj);
%Readin(03120927, 03120927mm_OptiTexAdam_Texture_standard.obj);
%Readin(03230901, 03230901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03230903, 03230903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03230904, 03230904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03230905, 03230905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03230906, 03230906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240901, 03240901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240902, 03240902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240903, 03240903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240904, 03240904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240905, 03240905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240906, 03240906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240907, 03240907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240908, 03240908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240909, 03240909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240910, 03240910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240911, 03240911amm_OptiTexAdam_Texture_standard.obj);
%Readin(03240912, 03240912mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240913, 03240913amm_OptiTexAdam_Texture_standard.obj);
%Readin(03240914, 03240914mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240915, 03240915mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240916, 03240916mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240917, 03240917mm_OptiTexAdam_Texture_standard.obj);
%Readin(03240918, 03240918mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250901, 03250901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250902, 03250902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250903, 03250903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250904, 03250904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250905, 03250905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250906,
03250906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250907, 03250907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250908, 03250908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250909, 03250909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250910, 03250910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250911, 03250911mm_a_OptiTexAdam_Texture_standard.obj);
%Readin(03250912, 03250912mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250913, 03250913mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250914, 03250914mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250915, 03250915mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250916, 03250916mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250917, 03250917mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250918, 03250918mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250919, 03250919mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250920, 03250920mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250921, 03250921mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250922, 03250922mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250923, 03250923mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250924, 03250924mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250925, 03250925mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250926, 03250926mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250927, 03250927mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250928, 03250928mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250929, 03250929mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250930, 03250930mm_OptiTexAdam_Texture_standard.obj);
%Readin(03250931, 03250931mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260901, 03260901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260902, 03260902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260903, 03260903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260904, 03260904mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260905, 03260905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260906,
03260906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260907, 03260907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260908, 03260908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260909, 03260909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260910, 03260910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260911, 03260911mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260912, 03260912mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260913, 03260913mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260914, 03260914mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260915, 03260915mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260917, 03260917mm_a_OptiTexAdam_Texture_standard.obj);
%Readin(03260918, 03260918mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260919, 03260919mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260920, 03260920mm_OptiTexAdam_Texture_standard.obj);
%Readin(03260921, 03260921mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300901, 03300901mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300902, 03300902mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300903, 03300903mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300905, 03300905mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300906, 03300906mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300907, 03300907mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300908, 03300908mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300909, 03300909mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300910, 03300910mm_OptiTexAdam_Texture_standard.obj);
%Readin(03300911, 03300911mm_OptiTexAdam_Texture_standard.obj);

/*
proc means data=data1; var srt1 srt2 srt3; run;
goptions device=win gunit=in vsize=7 in hsize=3 in noborder noprompt display;
SYMBOL1 v=dot c=black h=.01; run;
proc gplot data=all; plot z*y; where id=1005; run;
goptions device=win gunit=in vsize=7 in hsize=3 in noborder noprompt display;
SYMBOL1 v=dot c=black h=.01;
proc gplot data=all; plot z*x; where id=1005; run;
proc means data=all; class id; var x y z;
run;
*/

DATA out3; set ALL1;
  *gender=(id in (1005,1045,1054,1079,1085, 1114,1138, 1144,1149,1161));
  *format gender gen.;
RUN;

/*
PROC LOGISTIC DATA=OUT3; model gender=x y z/selection=stepwise ctable; run;

       Prin1      Prin2      Prin3
x   0.690220   -.179663   0.700941
y   0.188150   0.979927   0.065900
z   0.698710   -.086396   -.710169

PROC PRINCOMP data=out3 out=prin ; var x y z; RUN;
proc gplot data=prin; plot prin1*prin2=gender prin1*prin3=gender prin2*prin3=gender; RUN;
proc gplot data=prin; plot prin1*prin2; where id=1005; run;
PROC PRINCOMP data=out3 out=prin ; var x y z; RUN;
PROC FASTCLUS data=prin out=clus maxc=2; var prin1-prin3; RUN;
PROC FREQ data=clus; TABLE gender * cluster; RUn;
proc logistic data=prin; model gender(event='Male')=prin1 prin2 prin3/ctable; run;
PROC PRINCOMP data=out3 out=princov ; var x y z; RUN;
/*
proc gplot data=prin; plot prin1*prin2=gender prin1*prin3=gender prin2*prin3=gender; run;
proc gplot data=prin; plot prin1*prin2; where id=1239; run;
proc logistic data=prin; model gender(event='Male')=prin1 prin2 prin3/ctable; run;
*/

PROC PRINCOMP data=out3 out=prin ; /*PRIN dataset is output of PROC PRINCOMP on all obs*/
  var x y z;
RUN;
PROC SORT data=prin; BY /*gender*/ id newindex; RUN;
PROC TRANSPOSE data=prin out=x prefix=x; var prin1; by /*gender*/ ID; RUN;
PROC TRANSPOSE data=prin out=y prefix=y; var prin2; by /*gender*/ ID; RUN;
PROC TRANSPOSE data=prin out=z prefix=z; var prin3; by /*gender*/ ID; RUN;
data analysis; merge x y z; RUN;
PROC PRINCOMP data=analysis out=prin2;
  var x1 x3 x5 x7 x9 y1 y3 y5 y7 y9 z1 z3 z5 z7 z9 ;
RUN;

/*
PROC LOGISTIC data=analysis; model gender = x1-x10 y1-y10 z1-z10/selection=stepwise ctable; RUN;
PROC LOGISTIC data=prin2; model gender = prin1-prin15/selection=stepwise ctable; RUN;
PROC DISCRIM data=prin2; class gender; var prin1-prin10; run;
PROC stepdisc data=prin2; class gender; var prin1-prin10; run;
PROC DISCRIM data=prin2; class gender; var prin1 prin8 prin5 prin9 prin3; run;
DATA demo;
  INPUT ID gender height weight BMI;
  format gender gen.;
  gender=(id in (1005,1045,1054,1079,1085, 1114,1138, 1144,1149,1161));
DATALINES;
1028 0 69 151 22
1049 0 73 167 22
1053 0 73 184 24
1059 0 72 193 26
1065 0 68 138 21
1102 0 73 168 23
1162 0 68 184 28
1174 0 67 173 27
1220 0 73 175 24
1239 0 69 181 27
1005 1 62 113 21
1045 1 63 136 24
1054 1 64 128 22
1079 1 64 143 25
1085 1 63 123 22
1114 1 67 115 18
1138 1 64 129 23
1144 1 64 136 24
1149 1 64 92 16
1161 1 64 120 21
;
proc princomp data=demo out=prind; VAR height weight; RUN;
PROC GPLOT data=prind; plot prin1*prin2=gender;
PROC LOGISTIC data=prind; MODEL gender=prin1 prin2/ctable; run;
proc discrim data=prind; class gender ; var prin1-prin2; run;
PROC SORT data=out3; by ID;
PROC SORT data=demo; by ID;
DATA both; merge demo out3; by ID;
PROC LOGISTIC data=both; model gender= weight height x y z/selection=stepwise ctable; RUN;
proc princomp data=both out=pd ; var weight height x y z; RUN;
PROC GPLOT data=pd;
  plot prin1*prin2=gender prin3*prin2=gender prin3*prin4=gender prin5*prin4=gender;
  where id =1005;
run;
PROC logistic data=pd; model gender=prin1-prin5/selection=stepwise ctable; run;
proc glm data=both ; model height = gender weight x y z gender*x gender*y gender*z; run;
proc gplot data=both; plot height*z=gender; run;
proc discrim data=pd; class gender ; var prin1-prin5; run;
*/

Appendix C

Main Study Demographic Data (N = 117)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        18         59         24.92      7.89
Weight (pounds)    125        325        183.85     37.77
Height (inches)    61         78         70.65      2.73
BMI                19         47         25.87      4.96
________________________________________________________________________

Cluster #1 Demographic Data (N = 38)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        19         59         23.55      6.80
Weight (pounds)    125        170        152.39     11.08
Height (inches)    61         76         69.42      2.90
BMI                19         30         22.34      2.26
________________________________________________________________________

Cluster #1 Subjects (N = 38)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03120922    20     129       62        24
Mean     03240917    21     160       68        24
Low      03240908    22     135       69        20
________________________________________________________________________

Cluster #2 Demographic Data (N = 3)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        19         38         25.67      10.69
Weight (pounds)    145        168        159.33     12.50
Height (inches)    69         74         72.00      2.65
BMI                19         24         21.67      2.52
________________________________________________________________________

Cluster #2 Subjects (N = 3)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03300908    38     145       74        19
Mean     03250901    20     165       69        24
Low      03260921    19     168       73        22
________________________________________________________________________

Cluster #3 Demographic Data (N = 45)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        18         57         24.38      8.00
Weight (pounds)    165        195        180.49     7.35
Height (inches)    65         75         70.71      2.34
BMI                20         32         25.36      2.17
________________________________________________________________________

Cluster #3 Subjects (N = 45)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03120921    22     195       71        27
Mean     03260906    21     191       73        25
Low      03120919    19     190       67        30
________________________________________________________________________

Cluster #4 Demographic Data (N = 5)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        19         36         24.80      6.61
Weight (pounds)    137        195        177.80     24.08
Height (inches)    67         73         70.00      2.55
BMI                21         30         25.60      3.65
________________________________________________________________________

Cluster #4 Subjects (N = 5)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03240911    36     137       67        21
Mean     03250915    24     196       68        30
Low      03100907    24     192       70        28
________________________________________________________________________

Cluster #5 Demographic Data (N = 15)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        20         46         27.00      8.38
Weight (pounds)    200        235        213.07     9.18
Height (inches)    69         78         71.93      2.60
BMI                24         32         29.20      2.18
________________________________________________________________________

Cluster #5 Subjects (N = 15)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03250908    37     218       71        30
Mean     03260914    22     205       71        29
Low      03100913    33     225       76        27
________________________________________________________________________

Cluster #6 Demographic Data (N = 5)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        19         50         27.80      12.72
Weight (pounds)    240        256        248.80     5.76
Height (inches)    72         75         73.80      1.10
BMI                31         35         32.20      1.64
________________________________________________________________________

Cluster #6 Subjects (N = 5)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03100905    20     240       74        31
Mean     03260911    25     250       74        32
Low      03230904    50     250       74        32
________________________________________________________________________

Cluster #7 Demographic Data (N = 6)
________________________________________________________________________
                         Range
                   ____________________
                   Minimum    Maximum    Average    SD
________________________________________________________________________
Age (years)        22         45         29.33      8.34
Weight (pounds)    272        325        298.50     18.58
Height (inches)    70         76         72.00      2.10
BMI                37         47         40.83      3.55
________________________________________________________________________

Cluster #7 Subjects (N = 6)
________________________________________________________________________
         Code        Age    Weight    Height    BMI
________________________________________________________________________
High     03260912    45     272       72        37
Mean     03240907    26     310       76        38
Low      03100910    22     301       72        41
________________________________________________________________________
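
As a consistency check on the subject tables in this appendix, BMI can be recomputed from the reported weight and height. The sketch below is written in Python rather than the SAS of Appendix B and is not part of the study's code. It assumes the standard imperial formula, BMI = 703 x weight (lb) / height (in)^2, which this appendix does not state explicitly. The function names `bmi_imperial` and `check_rows` are illustrative only. The three rows are the Cluster #1 subjects listed above.

```python
def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """Body mass index from pounds and inches (assumed standard imperial formula)."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return 703.0 * weight_lb / (height_in ** 2)


# (subject code, weight in pounds, height in inches, BMI as reported in the table)
CLUSTER_1_SUBJECTS = [
    ("03120922", 129, 62, 24),
    ("03240917", 160, 68, 24),
    ("03240908", 135, 69, 20),
]


def check_rows(rows):
    """Return (code, recomputed BMI rounded to a whole number, reported BMI) per row."""
    return [
        (code, round(bmi_imperial(weight, height)), reported)
        for code, weight, height, reported in rows
    ]


if __name__ == "__main__":
    for code, recomputed, reported in check_rows(CLUSTER_1_SUBJECTS):
        print(code, recomputed, reported)
```

Under that assumption, the recomputed values for these three rows round to the whole-number BMI reported in the table.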