A PROPERTY BASED APPROACH TO INTEGRATED PROCESS AND MOLECULAR DESIGN

Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information.

Fadwa Tahra Eljack

Certificate of Approval:

W. Robert Ashurst, Assistant Professor, Chemical Engineering
Mario R. Eden, Chair, Assistant Professor, Chemical Engineering
Ram B. Gupta, Professor, Chemical Engineering
Christopher B. Roberts, Professor and Chair, Chemical Engineering
Mahmoud M. El-Halwagi, Professor, Chemical Engineering, Texas A&M University, College Station, TX
George T. Flowers, Interim Dean, Graduate School

A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
May 10, 2007

Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon request of individuals or institutions and at their expense. The author reserves all publication rights.

VITA

Fadwa Tahra Eljack, daughter of Rashida Eljack and Abdalla Eljack, wife of Nazar Suliman, and mother of Nour Suliman, was born on June 6, 1977, in Saskatoon, Canada. She graduated from Auburn High School as a National Merit Finalist. She pursued a Bachelor's degree in Chemical Engineering at Auburn University and graduated in June 1999. She later entered the graduate doctoral program at Auburn University in 2003.
Fadwa is currently a new faculty member at Qatar University, Qatar.

DISSERTATION ABSTRACT

Fadwa Tahra Eljack
Doctor of Philosophy, May 10, 2007
(B.Sc., Auburn University, 1999)
191 Typed Pages
Directed by Mario Richard Eden

In this work, a simple yet effective systematic method to synthesize and design molecules is presented. Visualization of the problem is achieved through an extension of the recently developed property clustering techniques, which, together with the concepts of reverse problem formulation, allows a high-dimensional problem to be visualized in two or three dimensions. Group contribution methods are used to predict the properties of the formulated molecule. For the molecular design problem, the target properties as well as the molecular groups that make up the formulations are identified on a ternary diagram. The target properties are represented as individual points if given as discrete values or as a region if given as intervals. The formulation of the desired molecule is achieved via linear "mixing" of molecular groups in order to match the desired performance. A significant advantage of the developed methodology is that for problems that can be satisfactorily described by just three properties, the process and molecular design problems can be solved simultaneously and visually on ternary diagrams, irrespective of how many molecular fragments are included in the search space.

The process design problem is solved for the desired target properties using property clusters. This is the solution of a reverse simulation problem, where the process design problem is solved in terms of constitutive variables and without having to commit to any component a priori. The target properties as well as a selection of molecular building blocks (groups) are used as input to the molecular design algorithm. The problem is now visualized on a molecular ternary cluster diagram.
The structure and identity of candidate components are then identified by combining or "mixing" molecular fragments until the resulting properties match the targets. The designed candidate formulations are screened using the developed necessary and sufficient conditions for the synthesis of molecules. Finally, the feasible molecular formulations are mapped back to the process domain for verification.

Although the molecular property clustering framework provides a property interface for the simultaneous consideration of process and molecular design problems, it should be emphasized that the developed tools can also be used to solve purely molecular synthesis problems (e.g. solvent design). As a CAMD tool, this algorithm has the added feature of visual synthesis for those problems that can be described using three clusters or properties; for those requiring more than three, an algebraic approach for the formulation and solution of molecular design problems is outlined.

ACKNOWLEDGEMENT

The author would like to dedicate this work to her daughter, Nour. Special recognition is given to Dr. Mario R. Eden for his guidance and direction. He has been a great source of information and inspiration. Thanks to Dr. Mahmoud M. El-Halwagi at Texas A&M University for the encouragement and motivation throughout my undergraduate and graduate careers. Special gratitude and appreciation are given to my parents, Rashida and Abdalla Eljack, my husband, Nazar Suliman, and my brother, Amin Eljack, for all their love, encouragement and faith. Without their support, the work presented here would not have been possible. Thanks are also due to my friends and co-workers, Kristin McGlocklin, Norman Sammons, Charles Solvason, Jeff Seay, Wei Yuan, Nishanth Chemmangattuvalappil, and Jennifer Wilder at Auburn University. To Dr. Vasiliki Kazantzi and Dr. Xiaoyun Qin, thank you both for the fruitful collaborative work and for your friendship. To my friend and colleague Dr.
Nimir Elbashir, thank you for your friendship, guidance and support; you are a true inspiration. Finally, I would like to recognize and thank the faculty and staff of the Chemical Engineering department at Auburn University for making my graduate research experience at Auburn a memorable and rewarding one.

Style manual or journal style: Computers and Chemical Engineering Journal
Computer software used: Microsoft Word, Excel, and Visio

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
2. THEORETICAL BACKGROUND
2.1. Process and Product Design
2.2. Scope of the Problem
2.3. General Problem Definition
2.4. Process Synthesis and Design Approaches
2.5. Process Integration
2.5.1. Heat Integration
2.5.2. Mass Integration
2.6. Computer Aided Molecular Design Methods (CAMD)
2.7. Roles of Property Models & Reverse Problem Formulation
2.8. Property Integration - Motivation, Need & Limitations
2.9. Property Prediction and Group Contribution
2.10. Design of Experiments
2.11. Ternary Diagrams for Visualization
2.12. Summary
3. UNIFIED PROPERTY INTEGRATION FRAMEWORK
3.1. Property Clustering Fundamentals
3.1.1. Property Operator Description
3.1.2. Cluster Formulation
3.1.3. Lever Arm Analysis
3.1.4. Ternary Diagram and Cartesian Coordinate Conversion
3.1.5. Feasibility Region Boundaries
3.2. Molecular Property Clusters
3.2.1. Group Contribution
3.2.2. Bridging the Gap between Process and Molecular Design
3.2.3. Molecular Property Operators
3.2.4. Conservation Rules for Molecular Clusters
3.3. Visual Molecular Design using Property Clusters
3.4. Algebraic Property Clustering Technique for Molecular Design
3.4.1. Proof of Concept Example
4. MOLECULAR SYNTHESIS APPLICATION EXAMPLES
4.1. Example 1 - Aniline Extraction Solvent Design
4.1.1. Problem Statement
4.1.2. Molecular Synthesis
4.2. Example 2 - Blanket Wash Solvent Design
4.2.1. Problem Statement
4.2.2. Property Prediction (GCM)
4.2.3. Molecular Property Operators
4.2.4. Molecular Synthesis
4.3. Summary
5. SIMULTANEOUS PROCESS AND MOLECULAR DESIGN
5.1. Application Example 1 - Metal Degreasing Process
5.1.1. Process Design
5.1.2. Molecular Design: Fresh Solvent Synthesis
5.1.3. Summary
5.2. Application Example 2 - Gas Purification
5.2.1. Problem Statement
5.2.2. Process Design
5.2.3. Molecular Design
5.2.4. Summary
6. CONCLUSIONS AND FUTURE WORK
6.1. Achievements
6.2. Future Directions
6.2.1. Property Model Development
6.2.2. Defining the Search Space
6.2.3. Expanding the Application Range
REFERENCES
APPENDICES
Appendix A: Group Contribution
Appendix B: Solubility Estimation Method

LIST OF FIGURES

Figure 2.1: Product design approach according to Cussler and Moggridge
Figure 2.2: Traditional approach to molecular and process design
Figure 2.3: Integrated approach to process and molecular design
Figure 2.4: Mass-energy matrix of a process (Garrison et al., 1996)
Figure 2.5: Representation of hot composite stream
Figure 2.6A: Composite heat diagram with partial integration
Figure 2.6B: Thermal pinch diagram - maximum heat integration
Figure 2.7: Mass pinch diagram
Figure 2.8: Process source-sink mapping diagram
Figure 2.9: Flow diagram of the multi-level CAMD framework (Harper, 2000)
Figure 2.10: Formulation and solution of a CAMD problem
Figure 2.11: Property models presented in various roles (Eden, 2003)
Figure 2.12: Property model in service, advice and solve role
Figure 2.13: Conventional approach for process and molecular design problems
Figure 2.14: New approach to process and molecular design problems
Figure 2.15: Response surface plot
Figure 2.16: Generic ternary diagram
Figure 3.1: Reverse problem formulation methodology
Figure 3.2: Intra-stream conservation of clusters
Figure 3.3: Inter-stream conservation of clusters
Figure 3.4: Converting ternary to Cartesian coordinates
Figure 3.5: Overestimation of feasibility region
Figure 3.6: True feasibility region of a sink
Figure 3.7: Property driven approach to integrated process and molecular design
Figure 3.8: Group addition on ternary cluster diagram
Figure 3.9A: Group addition path A for formulation of Butyl methyl ether
Figure 3.9B: Group addition path B for formulation of Butyl methyl ether
Figure 3.10: Molecular property cluster framework
Figure 4.1: Feasibility region for aniline extraction solvent
Figure 4.2: Aniline extraction solvent synthesis problem
Figure 4.3: Candidates formulated for aniline extraction solvent
Figure 4.4: Feasibility region for blanket wash solvent problem
Figure 4.5: Blanket wash solvent synthesis problem
Figure 4.6: Candidate formulations for blanket wash solvent
Figure 4.7: Valid formulations for blanket wash solvents
Figure 5.1: Original metal degreasing process
Figure 5.2: Metal degreasing process after property integration
Figure 5.3: Metal degreasing problem in process design
Figure 5.4: Property targets of solvent for maximum condensate recycle
Figure 5.5: Metal degreasing solvent problem
Figure 5.6: Candidate metal degreasing solvents
Figure 5.7: Selection of metal degreasing solvent
Figure 5.8: Gas purification process - feasibility regions and streams
Figure 5.9: New feasibility region - reflects mixture/blend design constraints
Figure 5.10: Identification of mixture (new) feasibility region
Figure 5.11: New feasibility region - Gas Purification Example
Figure 5.12: Molecular synthesis of gas purification solvent
Figure 5.13: Candidate molecules for gas purification solvent
Figure 5.14: Verification of candidate molecules in process domain

LIST OF TABLES

Table 3.1: Calculation of cluster values from physical property data
Table 3.2: Property functions for Group Contribution Methods
Table 3.3: Listed values of GCM universal constants
Table 3.4: Calculation of cluster M values from GCM predicted property data
Table 3.5: Outline of algebraic molecular cluster approach
Table 3.6: Property data for each molecular group
Table 3.7: Calculated ? for the given property constraints
Table 3.8. Result of solving to the molecular synthesis problem
Table 4.1: Property data and molecular groups for aniline design problem
Table 4.2: Candidate solvents for aniline extraction
Table 4.3: Accuracy of predicted properties values
Table 4.4: Property constraints for blanket wash solvent
Table 4.5: Property operators for blanket wash solvent problem
Table 4.6: Candidate blanket wash solvents
Table 5.1: Degreaser feed constraints
Table 5.2: Property constraints obtained from process design problem
Table 5.3: Revised property constraints for fresh solvent synthesis
Table 5.4: Property operators needed for molecular synthesis
Table 5.5: Candidate molecules for metal degreasing problem
Table 5.6: Property data for gas purification example
Table 5.7: Mixture property data of lumped source (S_L)
Table 5.8: Calculation data for new feasibility region
Table 5.9: New feasibility region data
Table 5.10: Determined property constraints for molecular design algorithm
Table 5.11: Property operators for gas purification molecular synthesis
Table 5.12: Candidate property data for gas purification solvent
Table A.1: Listed values of GCM universal constants
Table A.2: Property functions for Group Contribution Methods
Table A.3: 1st order groups and their contributions (Marrero and Gani, 2001)
Table A.4: 1st order groups and their V_m contributions (Constantinou et al., 1995)
Table B.1: Parameters for estimation of Hansen solubility (van Krevelen, 1990)
Table B.2: Solubility calculations for candidate solvent

1. Introduction

The terms chemical (product) synthesis and design designate problems involving the identification and selection of formulations (compounds) or mixtures that are capable of performing certain tasks or possess certain qualities (properties). Since the properties of the compound or mixture dictate whether or not the design is useful, solution approaches in this area should be based on the properties themselves. In fact, for molecular design techniques, e.g. Computer Aided Molecular Design (CAMD), the desired target properties are required inputs to the algorithm. The performance requirements for the formulations are usually determined by process needs. Thus, the identification of the desired formulation properties should be driven by the desired process performance. Although this integrated relationship between product and process design problems is recognized, the two have traditionally been treated as separate problems, with little or no feedback between them.

Generally, the objective in the design or optimization of processes is to find a balance between satisfying process unit requirements and using appropriate raw materials in order to maximize profit and minimize cost. The raw materials are selected from a list of pre-defined candidate components, which limits performance to the listed components. The problem is that these decisions are made ahead of design and are usually based on qualitative process knowledge and/or experience, and thus may yield a sub-optimal design. This re-emphasizes that the main obstacle to finding an optimal solution is that the process and molecular design problems have been decoupled. Each problem has been conveniently isolated.
Why does the simultaneous consideration of process and molecular design problems present such a challenge? When product and process design are interfaced using conventional methods such as mathematical programming, most algorithms face a bottleneck when it comes to using property models; suitable models for product design may not be available for process design and vice versa (Gani, 2001). In addition, once a property model is selected for inclusion in the process model, its application range is restricted by the availability of model parameters for the molecules of interest. Mathematical programming techniques suffer from discontinuities in the solution trajectory in response to changes in the model equations. Inclusion of multiple models for the same property in the algorithm may make it more difficult to achieve convergence. Hence, understanding how property models fit into design is key to resolving some of these issues.

Recent contributions to understanding the roles of property models in the solution of Computer Aided Process Engineering (CAPE) problems brought about the development of the reverse problem formulation (RPF) framework (Eden, 2003). In principle, process models are made up of balance equations, constraint equations, and constitutive equations. The constitutive equations are used to represent property models in terms of intensive variables (e.g. temperature, pressure, and composition). Often, the complexity of the property models is reflected in non-linear behavior of the process model equations, leading to intensive computations and, consequently, difficulty in reaching convergence. The RPF methodology decouples the constitutive equations from the balance and constraint equations, so the traditional forward problem can be solved as two reverse problems. It is a technique analogous to how a molecular design problem is formulated as a reverse property prediction problem.
Here, the first problem (reverse simulation) solves the balance and constraint equations in terms of the constitutive variables (properties), providing the design targets. The second reverse problem (reverse property prediction) solves the constitutive equations to identify unit operations, operating conditions and/or products that possess the targeted property values set forth by the first reverse problem. The key advantage of this targeting approach is that excluding the constitutive equations leaves only the balance and constraint equations, which are generally linear and therefore easy to solve. The algorithm thus gains the freedom to use different property models at any point during the solution step as long as the property targets are matched, making for a robust design algorithm. Only once the design targets are identified are the constitutive equations solved to identify the intensive variables. Thus RPF lowers the complexity of the design problem without sacrificing accuracy of the design.

The use of RPF in design is the first step towards the development of a simultaneous approach for solving process and molecular design problems. However, there is still the question of how to link the two problems. There is a need for a method to facilitate the flow of information from the process level to the molecular level, and vice versa. Recall that process unit performance is gauged by properties such as boiling temperature, heat of vaporization, etc. Furthermore, properties are required inputs for solution algorithms in molecular design. Therefore, it makes sense to use a property based platform as the link between process and molecular design approaches. Introduction of the property clustering framework allows for the representation of process streams and units from a properties perspective (Shelley and El-Halwagi, 2000).
Recognizing that properties are not conserved and thus cannot be tracked directly, the property clustering concept introduced property operators, which are functions of physical properties. The property clusters derived from these operators are conserved and possess the unique feature of linear mixing rules, even though the operators themselves may not be linear functions of the properties (e.g. the inverse of the mixture density is the fraction-weighted sum of the inverse densities of the individual components). The property integration framework is based on reverse problem formulations and utilizes property clusters to provide a representation of the constitutive variables of a system. Within this framework, only the process design algorithm has been developed so far, where process needs are targeted ahead of design and used as input. For cases where the system can be described by just three properties, the process design problem can be visualized on a ternary cluster diagram. Discrete points are used to represent property values, while feasibility regions are used for a range of accepted property values. Visualization of the problem allows for easy identification of optimum recycle strategies, while the linear mixing rules allow the use of simple lever arm analysis to solve the problem. Thus, this framework allows for the representation and solution of process design problems that are driven by properties.

In chemical process design, there is a general need for reliable and accurate property estimation methods. Such methods are critical to the solution of most simulation problems, where convergence is often dependent on the reliability of predicted physical and thermodynamic properties. This is even more true for molecular design algorithms, where predictive property models are at the heart of all solution strategies.
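The cluster construction and lever arm mixing discussed in this chapter can be illustrated with a minimal sketch. It assumes the definitions commonly used in the property clustering literature, which are not stated in this introduction: dimensionless (normalized) operator values, an augmented property index (AUP) equal to their sum, and clusters obtained by dividing each operator by the AUP. The numerical values and the function names are hypothetical and serve only to show that mixing operators and mixing clusters give the same result.

```python
# Minimal sketch of property clusters and lever-arm mixing.
# Assumed definitions (not given in this section): clusters are normalized
# operators divided by their sum, the augmented property index (AUP).

def clusters(omega):
    """Return (AUP, cluster values) for one stream of normalized operators."""
    aup = sum(omega)
    return aup, [w / aup for w in omega]

def mix_operators(fractions, omegas):
    """Linear mixing rule on operators: omega_mix_j = sum_s x_s * omega_sj."""
    n_props = len(omegas[0])
    return [
        sum(x * om[j] for x, om in zip(fractions, omegas))
        for j in range(n_props)
    ]

# Hypothetical normalized operator values: two streams, three properties.
omega_1 = [0.5, 1.5, 2.0]
omega_2 = [1.0, 0.4, 0.6]
x = [0.3, 0.7]  # fractional contributions of the two streams (sum to 1)

aup_1, c_1 = clusters(omega_1)
aup_2, c_2 = clusters(omega_2)

# Route 1: mix the operators first, then convert the mixture to clusters.
omega_mix = mix_operators(x, [omega_1, omega_2])
aup_mix, c_mix = clusters(omega_mix)

# Route 2: lever-arm rule directly on clusters, with cluster-domain
# fractions beta_s = x_s * AUP_s / AUP_mix.
beta = [x[0] * aup_1 / aup_mix, x[1] * aup_2 / aup_mix]
c_lever = [beta[0] * a + beta[1] * b for a, b in zip(c_1, c_2)]

print("clusters via operators:", c_mix)
print("clusters via lever arm:", c_lever)
```

Because the clusters of each stream sum to one, any three-property stream is a point on a ternary diagram, and the mixture lies on the straight line between the two streams at the position given by the lever arms.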
Almost all property estimation methods used in CAMD techniques are based on Group Contribution Methods (GCM), where the properties of a compound are expressed as a function of the number of occurrences of predefined groups in the molecule.

The novel techniques developed in this work merge the concepts of group contribution methods with those of the property integration framework to provide a property based platform capable of handling process and molecular synthesis/design needs simultaneously. In chemical processes, such an approach enables identification of the desired system properties by targeting the optimum process performance without committing to any components during the solution step. The identified property targets can then be used as inputs for solving the molecular design problem, which returns the corresponding components.

The purpose of the work presented here is to develop a property based molecular design algorithm within the general property clustering framework. The mixing rules will invariably be functionally different for molecular groups and process streams; however, since they represent the same property, they can still be visualized on the same diagram. Once visualized, it is possible to solve the process design problem by identifying the system/product properties corresponding to the desired process performance. On the ternary diagram, the target product properties are represented as either a single point or a region, depending on whether the target properties are discrete or given as intervals. The structure and identity of candidate molecules are then identified by combining or "mixing" molecular fragments until the resulting properties match the targets.
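The idea of "mixing" molecular groups until the targets are matched can be sketched as a small enumeration. This is only an illustration under stated assumptions: each molecular property operator is taken to be a first-order linear sum of group contributions, the group names (g1, g2, g3) and all contribution values are hypothetical placeholders rather than entries from any GCM table, and the structural feasibility screening of candidate molecules is omitted.

```python
from itertools import product

# Hypothetical first-order group contributions to two property operators.
# Placeholder numbers for illustration only, not values from a GCM table.
CONTRIB = {
    "g1": (1.0, 0.5),
    "g2": (0.6, 0.9),
    "g3": (2.5, 3.0),
}

def operators(counts):
    """Linear 'mixing' of groups: operator_j = sum_g n_g * contribution_gj."""
    n_props = len(next(iter(CONTRIB.values())))
    return tuple(
        sum(n * CONTRIB[g][j] for g, n in counts.items())
        for j in range(n_props)
    )

def candidates(targets, max_count=3):
    """Enumerate group counts whose operators fall inside the target intervals."""
    groups = list(CONTRIB)
    hits = []
    for ns in product(range(max_count + 1), repeat=len(groups)):
        if sum(ns) == 0:
            continue  # skip the empty formulation
        counts = dict(zip(groups, ns))
        ops = operators(counts)
        if all(lo <= v <= hi for v, (lo, hi) in zip(ops, targets)):
            hits.append(counts)
    return hits

# Hypothetical target intervals for the two operators (a "feasibility region").
TARGETS = [(4.0, 5.0), (4.0, 5.0)]
found = candidates(TARGETS)

for counts in found:
    print(counts, operators(counts))
```

In this sketch a point target corresponds to an interval with equal bounds, and an interval target plays the role of the feasibility region on the ternary diagram. The brute-force enumeration is a stand-in chosen for brevity; it is not the visual or algebraic procedure developed in this work.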
A key advantage of the developed technique is that for problems that can be satisfactorily described by just three properties, the process and molecular design problems are solved visually and simultaneously on ternary diagrams, irrespective of how many molecular fragments are included in the search space.

Furthermore, the molecular property cluster framework can be used as a visual CAMD tool for solvent design. By visualizing the property constraints as a feasibility region on the ternary cluster diagram along with a wide range of molecular groups, the search space is reduced by excluding those fragments that do not aid in reaching the property targets. In addition, an algebraic approach has been developed in recognition of the fact that not all design problems can be described in terms of just three properties. To take advantage of the developed molecular property operators in lowering the dimensionality of the design problem, this technique provides a simple method for formulating the molecular design problem as a set of linear algebraic equalities and inequalities. The benefit of this technique is that the molecular design problem, traditionally formulated as a mixed integer non-linear program (MINLP), can now be presented as a simple linear program (LP).

The conventional process and molecular design methods, tools and techniques upon which the molecular property technique was developed are discussed in Chapter 2. The property clustering technique that provided the building blocks for a property driven approach to solving process design problems is presented in Chapter 3. Section 3.2 introduces the foundations of the molecular property clustering methodology developed in this work, including the incorporation of group contribution methods and the detailed development of the molecular property operators.
In Sections 3.3 and 3.4 the methodology behind the visual synthesis of molecular formulations and the algebraic approach are outlined. Chapters 4 and 5 provide application examples of the developed methodology. In Chapter 4, aniline extraction and blanket wash solvent design problems illustrate the advantages of using this new methodology for molecular design problems. The chapter highlights how the algorithm handles cases where group contribution property models do not exist, demonstrating the algorithm's ability to handle multiple models. The blanket wash solvent design problem has previously been solved as a mixed-integer non-linear program (MINLP), which allows the formulation designed using the molecular cluster technique to be compared with those from other established methods. In Chapter 5, the simultaneous consideration of process and molecular design using the clustering framework is presented via two application examples. The metal degreasing example supplies property targets as well as the constraints placed on the process design problem. The problem is solved as a maximization of the use of the available in-house resource. Using lever-arm analysis, the process design problem is solved for the cluster values that optimize the process needs. Those cluster values are then converted to property targets and used as input to the molecular framework, where a set of candidate formulations is generated and screened using a set of conditions established as part of the framework. The resulting three valid formulations are mapped back to the process design level, where the validity of the solution is established. The gas purification example aims at identifying a solvent to replace methyl diethanol amine, one of three streams currently fed to the gas treatment process unit (sink). The process design objectives and requirements dictate that the mixed stream properties must match those of the sink.
Lever-arm analysis is used to determine the mixture property requirements, and the calculation steps are highlighted in Section 5.2.2. Finally, Chapter 6 summarizes the conclusions of the thesis and provides directions for further development of the framework.

2. Theoretical Background

2.1. Process and Product Design

In the chemical processing industry, product design arises from a "need" and involves finding a product that exhibits certain desirable behavior, or finding an additive (chemical) that, when added to another chemical or product, enhances its desirable functional properties (Achenie et al., 2003). In product design, the identity of the final product is unknown; however, the general behavior or characteristics of the product (the goal) are known. The objective is to find the most appropriate chemical or mixture of chemicals that will satisfy this goal. Once possible solutions to the problem are generated, the next step is processing of the raw materials. Cussler and Moggridge (2001) suggested the following steps for product design:

1. Define needs (formulate the problem)
2. Generate ideas to meet the needs (generate molecular or flowsheet structures matching the problem)
3. Select among ideas (rank the generated alternatives to get the best alternative)
4. Manufacture product

The first step is to understand the design objectives. Regardless of whether it is a molecular design problem or a mixture/blend design problem, the solution strategy is the same (see Figure 2.1). Product designers/chemists rely on their understanding and knowledge of the matter to suggest a list of raw materials they believe will lead to the desired product(s); in other words, they generate possible solutions and select among those alternatives.

Figure 2.1: Product design approach according to Cussler and Moggridge

Next, product designers transfer this information to process designers in order to manufacture the final desired product, i.e.
satisfy the initial consumer "needs". This final step is labeled process design: process designers take the list of suggested alternatives provided by the product designers and perform feasibility and profitability analyses. Engineers here gain a detailed understanding of how the process flowsheet will function; a common objective is to determine feasible and preferably optimal configurations in terms of selecting equipment and conditions of operation for the parts of the process being considered (Hostrup, 2002). Process designers also consider environmental impact as well as health and safety issues. After addressing all these tasks, they determine whether the generated alternatives (steps 2 and 3) are practical. If they conclude that the designs are infeasible, all of their findings are passed back to the product designers, who make further alterations to their design. Such changes may include altering the chemical makeup or the starting raw materials. Once new alternative designs are generated, they are again passed to the process engineers for further study, thus making the approach iterative (see Figure 2.1). Iteration can lead to inefficiency and is a result of the gap that exists between product/molecular and process design approaches. The work presented in this thesis aims at bridging the gap between process and molecular design approaches through the development of design tools that address both design objectives. First, the scope of the problem will be clearly identified and the overall objectives will be stated. The following sections provide an overview of some of the current approaches and techniques used in process synthesis/design (Section 2.4) and molecular synthesis/design (Section 2.6). Furthermore, this chapter discusses the methods and tools that provided the foundation for the design techniques developed in this thesis.

2.2.
Scope of the Problem

Traditionally, process and molecular design problems have been treated as two separate entities. In molecular design, the general approach is often based on trial and error experimentation. Although not efficient, this method is the only available option in cases where property models are not available to predict the properties of the desired components. In cases where models do exist, the algorithm (see Figure 2.2) is given the overall objectives and a set of molecular building blocks (e.g. -CH3, -OH, etc.), and the goal is to identify a set of candidate components that meet a given set of criteria (e.g. physical or chemical properties). The importance of property models to design is discussed further in Section 2.7. In process design, the objective is generally to find a balance between satisfying process unit requirements/constraints and the use of appropriate raw materials and processing chemicals, e.g. solvents, in order to maximize profit and minimize cost (see Figure 2.2). The chemicals used as input to process algorithms are selected from a list of pre-defined candidate components, which limits performance to the listed components. The problem here is that (1) molecular/product designers make these decisions ahead of process design and (2) these decisions are often based on qualitative process knowledge and/or experience, and thus possibly yield a sub-optimal design. Hence a major obstacle in finding an optimal design is that the process and molecular design problems have been decoupled from each other; each problem has been conveniently isolated.

Figure 2.2: Traditional approach to molecular and process design

Figure 2.3: Integrated approach to process and molecular design

The work of researchers like Cussler and Moggridge (2001), highlighted at the beginning of this chapter, recognizes the potential benefits to be gained by allowing the flow/exchange of information between process and molecular design algorithms.
It is the objective of this thesis to introduce a unified design approach that overcomes the limitations of decoupling the two problems. Figure 2.3 outlines the proposed approach for handling design problems: the process design algorithm takes in only the desired process performance requirements and solves the design problem for optimal process operating conditions and desired functionalities. These requirements, along with a set of molecular building blocks, are then the input to the molecular design algorithm, where the objective is to formulate candidate components that possess these properties. The generated candidate components are guaranteed to meet all of the required criteria, and other screening criteria (e.g. environmental impact, cost, etc.) can be used to rank them.

2.3. General Problem Definition

The general formulation of process/product synthesis/design problems can be described by the following set of equations, with x and y as the real and integer optimization variables, respectively:

$$F_{obj} = \min \{A^{T} y + f(x)\} \qquad \text{Objective function} \qquad (2.1)$$

subject to

$$h_1\left(\frac{\partial x}{\partial z}, x, y\right) = 0 \qquad \text{Process/product model} \qquad (2.2)$$

$$h_2(x, y) = 0 \qquad \text{Equality constraints} \qquad (2.3)$$

$$g_1(x) > 0 \qquad \text{Process inequality constraints} \qquad (2.4)$$

$$g_2(x) > 0 \qquad \text{Product inequality constraints} \qquad (2.5)$$

$$B \cdot y + C \cdot x > d \qquad \text{Structural constraints} \qquad (2.6)$$

Many variations of the above mathematical formulation can be derived to represent different synthesis and/or design problems. Some of the equations or terms may be excluded, depending on the type of problem solved. If the objective is simply to generate a feasible solution to the process/molecular design problem, then only the equality, inequality and structural constraints are considered. However, some approaches utilize mathematical optimization tools that aim at identifying optimal solutions, which requires solving equations 2.1-2.6. The various approaches for process synthesis/design are reviewed in the following section.

2.4.
Process Synthesis and Design Approaches

Process synthesis deals with the activities where the various process elements are integrated and the flowsheet of the system is generated to meet certain objectives. To gain a detailed understanding of how the process behaves and whether the process objectives are met, process analysis tools such as ASPEN Plus, PRO/II, and HYSYS are often utilized (El-Halwagi, 2006). The common objective is to determine feasible and preferably optimal configurations in terms of the selection of equipment and conditions of operation for each part of the process (Hostrup, 2002). Once a feasible flowsheet has been identified, it is analyzed/tested to make sure the process objectives are met. Iterations between process synthesis and analysis are continued until the desired goals are met (El-Halwagi, 1997). Biegler et al. (1997) list the basic steps in flowsheet/process synthesis as: (1) gathering information, (2) representation of alternatives, (3) assessment of preliminary design, and (4) the generation and search among alternatives. Several approaches exist that aim at developing and improving ideas for the design of processes, including:

• Brainstorming different scenarios by a select group of experts in the scientific and engineering fields dealing with the specific process. The quality of the "optimal" design generated using this approach is determined by the ability to generate alternatives and the absence of bias towards a specific solution. The problem here is that these decisions are made ahead of design, which might lead to a sub-optimal solution.

• Another method of solving a process design problem is adapting an old solution to a similar design challenge and then improving on it (El-Halwagi, 2006). The limitation here is that this solution cannot guarantee optimality.

• Heuristic based approaches: Process engineers have classified most processes into groups or categories, and each is assigned a group of possible solutions.
This approach uses rules to analyze the problem and to fix many of the discrete variables a priori in order to reduce the size of the search space. The rules come about through direct observation of recurring behavior in a given type of problem. Heuristics are used as a tool to aid in choosing how decisions should be made and which decisions should be made. Without heuristics, design problems are often too difficult to converge and/or too large to search; however, here again the optimality of the generated solution is not guaranteed (Westerberg, 2004). The approach is useful only in cases where the problem at hand is closely related to the class of problems for which the solution has been derived (El-Halwagi, 1997).

• Mathematical optimization approaches for process design require (Westerberg, 2004): a problem formulation that can express goals and describe them as an actionable task; the ability to enumerate all alternatives; and the capability of narrowing down the search space by eliminating alternatives. Mathematical optimization approaches are attractive because they can guarantee optimal solutions; however, they cannot always guarantee convergence, i.e. the approach cannot be relied upon to always generate a solution. Usually, such large optimization problems are represented in the form of Mixed Integer Non-Linear Programs (MINLPs). The algorithm identifies the integer variables (e.g. those that determine the existence or absence of a certain piece of equipment) and the continuous variables that determine design and operating parameters such as temperature, pressure, flowrate and the size of the equipment.

Process synthesis reviews are readily available in the open literature. The various approaches are organized into two categories: those that are structure-independent (also known as targeting approaches), and those that are structure based. All the approaches follow the basic steps of process synthesis and design summarized by Biegler and co-workers (1997).
The approaches vary in the generation of alternatives and the manner in which optimal solutions are identified from amongst all the alternatives. The first category solves the synthesis problem by breaking it down into multiple stages to reduce the dimensionality of the problem. Within each stage, the design targets are identified and used in the following stage. The structure-dependent approaches, like superstructures, include the structure of the process design (i.e., equipment identity and connectivity) as well as all the design and operating parameters for each piece of equipment as part of the formulation; therefore the superstructure encompasses many redundant paths and equipment alternatives for achieving the design objectives. Superstructure optimization is the process of stripping away the unessential pathways and equipment alternatives to find the "best" solution. Two separate and distinct problems still limit the use of superstructure optimization techniques: (1) how to generate the initial superstructure to guarantee that it contains the "best" solution, and (2) how to solve the large optimization problems inherent in practical synthesis problems (Barnicki and Siirola, 2004). Recognizing that structure-dependent approaches are generally more robust than their structure-independent counterparts, El-Halwagi (1997) and Westerberg (2004) identified several issues that process design algorithms should address. First, the methodology has to be able to enumerate all alternatives and represent them in a common space. Failure to include some possible configuration can lead to sub-optimal solutions. This is related to the ability to systematically narrow down the search space. The second issue is that mathematical optimization problems of such magnitude often fail to converge due to the complexity of the non-linear properties included in the formulation.
Finally, to avoid obtaining a biased configuration due to the influence of personal or engineering evaluation, all insights should already be part of the problem formulation. The novel concept of property clusters (Shelley and El-Halwagi, 2000; Eden, 2003) provides design tools that can address the issues raised by El-Halwagi (1997) and Westerberg (2004). Property clustering methods lower the complexity of the design problem by mapping properties to a lower-dimensional domain and by using the given information as guidelines for placing bounds on the search space. This methodology provides the platform on which the tools presented in this thesis are built. Because clustering methods have only recently been developed, a detailed review of the concept is given in Section 3.1.

2.5. Process Integration

The first attempts at optimization of processes came in the form of process integration. Process integration is defined as "a holistic approach to process design and optimization, which encompasses design, retrofitting and operations of the process" (El-Halwagi, 1997). The aim is to allow us to see "the big picture first, and the details later". Integration requires the ability to state the objective as "actionable tasks" or in terms of quantified engineering parameters (e.g. maximizing profit can be translated into minimizing raw material usage or waste material generation, etc.). The designer needs to identify global performance targets ahead of any development activity and identify the optimal strategy to reach them (Srinivas, 1997). It is important to find and evaluate the maximum performance benchmarks ahead of synthesizing the design in order to obtain insights about potential opportunities. In that sense, an efficient methodology must include the ability to identify the search space, to generate solutions knowledgeably, and finally to select amongst the alternatives. Processes are generally characterized by the flow of materials/mass and energy.
Mass flow includes the flow of raw materials, including solvents, feed material, etc., utilized within the process to make the products. Energy flow in the form of water, heating and cooling power, coal or gases, etc. is needed to process the mass flow into the desired products (see Figure 2.4) (Srinivas, 1993). Energy and mass integration are systematic methods for identifying energy and mass performance targets, respectively. Energy integration aims at heat recovery within a process. It can also identify the optimal system configuration for minimal energy consumption. Mass integration techniques/tools provide a means of identifying optimum performance targets by generating and selecting among alternatives for allocating the flow of material (species) in the process.

Figure 2.4: Mass-energy matrix of a process (Garrison et al., 1996)

Numerous successful applications worldwide in a range of industries testify to the value of process integration technology in reducing energy costs and increasing capacity through debottlenecking (Gundersen and Ness, 1988). Some of the principal tools used by the two integration fields are highlighted below.

2.5.1. Heat Integration

Process integration efforts began in the late 1970s and were initially rooted in energy conservation and, in particular, the design of heat exchanger networks (HEN) (Linnhoff and Hindmarsh, 1983; Papoulias and Grossmann, 1983; Gundersen and Ness, 1988; and Cerda et al., 1983). Tools were developed to find ways to increase energy conservation/utilization in response to the rise in energy cost. These efforts led to the development of a variety of tools, the best known of which are composite curves and pinch design methods or analysis, used to identify minimum utility targets ahead of designing the HEN. In a chemical process there are generally several streams that require heating or cooling before they satisfy the process unit requirements. The use of external utilities (e.g.
steam and cooling water) for each stream requiring cooling or heating is not economically efficient; consequently, it is desirable to lower the use of external utilities by maximizing the transfer of available internal energy from hot to cold streams prior to implementation. The synthesis of HENs involves optimum allocation of energy within a process by maximizing the exchange of internal energy between process hot streams and cold streams, which are process streams that require cooling and heating, respectively. In synthesizing such networks the following process information is given:

• Number of hot and cold streams
• Heat capacity flowrate of hot (HH) and cold (HC) streams = flowrate (F) x specific heat (C_p)
• Hot stream supply (inlet) temperature (T_s) and target (outlet) temperature (T_t)
• Cold stream supply (inlet) temperature (t_s) and target (outlet) temperature (t_t)
• T_s and T_t of the available external heating and cooling utilities

The design tasks are as follows (El-Halwagi, 2006):

• Which heating and/or cooling utilities should be employed?
• What is the optimal heat load to be removed/added by each utility?
• How should the hot and cold streams be matched, i.e. stream pairings?
• What is the optimal system configuration, e.g. how should the heat exchangers be arranged? Should any streams be mixed or split?

Hohmann (1971) introduced the "thermal pinch diagram", the first graphical approach aimed at identifying the minimum utility requirements. Linnhoff and collaborators led the efforts to advance the development of this technique (Linnhoff et al., 1982; Linnhoff and Hindmarsh, 1983). The method is based on the thermodynamic feasibility of transferring heat from any hot stream with temperature T to any cold stream with temperature t, with a minimum driving force of ΔT_min.
The minimum hot stream temperature at which heat transfer is feasible is given by equation 2.7:

$$T = t + \Delta T_{min} \qquad (2.7)$$

First, a hot composite stream representing all hot process streams must be constructed. The diagram represents the amount of enthalpy exchanged by each hot stream vs. temperature, assuming ideal thermodynamics and constant heat capacities. The composite stream is a global representation of all the hot process streams as a function of the heat they exchange vs. temperature. An example of a hot composite stream for two hot streams is shown in Figure 2.5, where the tail and head of each arrow represent the supply (T_s) and target (T_t) temperatures, respectively. The amount of energy or heat lost by a hot stream (HH) and analogously gained by a cold stream (HC) is calculated according to equations 2.8 and 2.9:

$$HH = F C_p (T_s - T_t) \qquad (2.8)$$

$$HC = F C_p (t_t - t_s) \qquad (2.9)$$

The hot composite stream is created by superposition, see Figure 2.5. In a similar manner a cold composite stream can be constructed. Next, both streams are plotted on the same diagram; this is possible by having two temperature scales, where the cold composite temperature scale is shifted by ΔT_min. The hot composite stream always lies to the right of the cold composite stream, because the temperature of the hot stream is always higher than or equal to the temperature of the cold stream plus the minimum temperature difference ΔT_min, as seen in equation 2.7.

Figure 2.5: Representation of hot composite stream

The overlap between the composite curves provides a target for the heat recovery opportunities, labeled integrated heat exchange, see Figure 2.6A. Hence, the overlap in enthalpy that occurs between the two curves on this diagram guarantees the ability to exchange heat from hot to cold streams without the use of external utilities. Those duties that cannot be satisfied by internal energy recovery must be serviced by external heating and cooling utilities.
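The superposition step can be sketched in a few lines of Python. This is only an illustrative sketch under the constant heat capacity assumption stated above; the function name `hot_composite` and the stream data are hypothetical and do not come from the thesis.

```python
def hot_composite(streams):
    """Build a hot composite curve by superposition.

    streams: list of (FCp, T_supply, T_target) with T_supply > T_target.
    Returns (cumulative_enthalpy, temperature) points, starting at the
    lowest temperature with zero enthalpy.
    """
    # Every supply and target temperature bounds a temperature interval.
    temps = sorted({T for _, Ts, Tt in streams for T in (Ts, Tt)})
    points = [(0.0, temps[0])]
    cumulative = 0.0
    for lo, hi in zip(temps, temps[1:]):
        # Superposition: add the FCp of every stream spanning this interval.
        fcp_total = sum(fcp for fcp, Ts, Tt in streams if Tt <= lo and Ts >= hi)
        cumulative += fcp_total * (hi - lo)  # equation 2.8 applied to the interval
        points.append((cumulative, hi))
    return points


# Hypothetical data: (FCp in kW/K, supply temperature in K, target temperature in K)
hot_streams = [(2.0, 450.0, 350.0), (3.0, 400.0, 320.0)]
curve = hot_composite(hot_streams)
```

The last point of `curve` carries the total heat to be removed from the hot streams, which should equal the sum of the individual HH values from equation 2.8. A cold composite curve could be built the same way from equation 2.9.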
The cold composite stream can be moved up/down on the diagram, where the final location of the stream determines the amount of heat being exchanged, see Figure 2.6A. Here, the construction of the diagram provides a tool for the determination of maximum heat recovery targets. The minimum external utility requirements can be identified by sliding the cold composite curve all the way down until the two curves touch, Figure 2.6B. This point is named "the thermal pinch point". If the cold composite curve is moved up and away from the pinch, this signifies a penalty in terms of the amount of energy being exchanged between the streams, and thus additional external utilities are required. If the cold composite curve is moved down past the pinch point, heat integration potential is lost, again resulting in a need for additional external utilities. To avoid loss of potential integration, Linnhoff et al. (1982) developed rules to identify minimum external heating and cooling utilities once the pinch point has been identified:

• No heat should be passed through the pinch
• No external cooling utilities should be used above the pinch
• No external heating utilities should be used below the pinch

Figure 2.6A: Composite heat diagram with partial integration (heat exchanged vs. temperature T, with the cold scale t = T - ΔT_min)

Figure 2.6B: Thermal pinch diagram, maximum heat integration (heat exchanged vs. temperature T, with the cold scale t = T - ΔT_min)

It should be recognized that the identification of minimum external utilities does not necessarily translate to minimum total cost. The required amount of external utilities can be lowered by decreasing ΔT_min; however, the decrease in driving force translates to larger heat exchanger area and in turn higher exchanger unit cost.
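The same targets can also be obtained numerically. The sketch below does not reproduce the graphical sliding procedure; it swaps in the temperature-interval cascade (often called the problem table), an algebraic counterpart of that procedure, and assumes constant heat capacities. The function name `utility_targets`, the stream data and the value of ΔT_min are hypothetical.

```python
def utility_targets(hot, cold, dt_min):
    """Minimum heating/cooling utilities from a temperature-interval cascade.

    hot, cold: lists of (FCp, T_supply, T_target).
    Cold temperatures are shifted up by dt_min so that both stream sets
    share the hot temperature scale (T = t + dt_min, equation 2.7).
    Returns (minimum heating, minimum cooling, pinch temperature on the hot scale).
    """
    shifted_cold = [(fcp, ts + dt_min, tt + dt_min) for fcp, ts, tt in cold]
    bounds = sorted({T for _, a, b in hot + shifted_cold for T in (a, b)},
                    reverse=True)
    cascade, residual = [0.0], 0.0
    for hi, lo in zip(bounds, bounds[1:]):
        hot_fcp = sum(f for f, ts, tt in hot if ts >= hi and tt <= lo)
        cold_fcp = sum(f for f, ts, tt in shifted_cold if tt >= hi and ts <= lo)
        residual += (hot_fcp - cold_fcp) * (hi - lo)  # surplus passed downward
        cascade.append(residual)
    q_heating = max(0.0, -min(cascade))   # lifts the most negative residual to zero
    q_cooling = cascade[-1] + q_heating   # heat leaving the lowest interval
    pinch = bounds[cascade.index(min(cascade))]
    return q_heating, q_cooling, pinch


# Hypothetical data: (FCp in kW/K, supply temperature in K, target temperature in K)
hot_streams = [(2.0, 450.0, 350.0), (3.0, 400.0, 320.0)]
cold_streams = [(2.5, 310.0, 440.0)]
targets = utility_targets(hot_streams, cold_streams, dt_min=10.0)
# -> (25.0, 140.0, 400.0): 25 kW heating, 140 kW cooling, pinch at 400 K (hot scale)
```

For these numbers the overall energy balance closes: 440 kW released by the hot streams plus 25 kW of heating equals 325 kW absorbed by the cold stream plus 140 kW of cooling.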
Thus there is a trade-off between minimum utility requirements and the number/size of the heat exchangers that will need to be implemented. The optimal solution to heat integration problems has also been successfully identified using mathematical optimization methods (Floudas, 1995). Non-Linear Program (NLP) and Linear Program (LP) transshipment models representing superstructures for each possible heat exchanger network were solved to give the optimal number of heat exchangers and minimal external duties (Papoulias and Grossmann, 1983); however, these problems suffer from nonconvexities that can result in suboptimal solutions. Reformulation/Linearization techniques (RLTs) have also been used to solve such nonconvex problems (Sherali and Adams, 1999); however, such methods are harder to implement than the pinch analysis methods. Pinch analysis is a powerful tool, which illustrates the cumulative cooling and heating requirements of the process in a single diagram. It is, however, based on an assumption of ideal thermodynamics and constant heat capacities. To address this limitation, simulated annealing methods have been employed to include detailed thermodynamics in addition to property correlations (Nielsen et al., 1996).

2.5.2. Mass Integration

By the end of the 1980s, the development of process integration tools was extended beyond just heat integration. El-Halwagi and Manousiouthakis (1989) created new tools for designing mass exchange networks using the same philosophy as utilized in the thermodynamic analysis of heat exchanger networks. Due to more stringent environmental regulations on the chemical industry, later work focused on the particular sub-problem of water networks (Takama et al., 1980; Wang et al., 1994; and Doyle et al., 1997). The design objective in water-reusing networks is to minimize water consumption by maximizing water reuse.
This led to the development of new general design methodologies, specifically mass pinch analysis and source-sink mapping diagrams. This new paradigm is collectively referred to as mass integration (El-Halwagi and Manousiouthakis, 1989; El-Halwagi and Spriggs, 1996; and El-Halwagi, 1997). Mass integration enables identification of the optimal path for the recovery and allocation of process species or resources by the use of systematic design and analysis tools (El-Halwagi, 2006). Mass integration aims at improving yield, debottlenecking the process, conserving energy and reducing waste in a cost effective manner. In other words, it aims at determining achievable performance targets ahead of detailed design by the use of fundamentals such as thermodynamics, transport phenomena and mathematical optimization (Srinivas, 1993). Mass integration uses mass-separating agents (MSAs) to remove undesirable materials from waste streams (rich streams). The challenge in synthesizing mass exchange networks (MENs) is to understand exactly where these MSAs (lean streams) should be used and which streams need to be intercepted. Mass exchange network synthesis aims at:

• Selecting the mass exchange operation needed
• Choosing the MSA
• Matching MSAs with waste streams
• Deciding the arrangement of mass exchangers and where to split and mix streams

The approach to solving such tasks needs to be very systematic, due to the combinatorial nature of each of the tasks. Several approaches have been used to solve such problems, e.g. enumeration techniques (discussed in Section 2.6), which proved to be complicated and are not able to guarantee a feasible, much less an optimal, solution because of all the decisions that are involved (El-Halwagi, 1997). On the other hand, "targeting approaches"
simplify the design challenge by identifying performance targets, such as minimizing the cost of MSAs and the number of mass exchange units, ahead of design and without committing to a MEN configuration. Mass pinch analysis is a graphical approach for analyzing available process MSAs along with mass exchangers from the point of view of thermodynamic limitations (El-Halwagi and Manousiouthakis, 1989). By understanding the available MSAs within a process and maximizing their use, the goal of minimizing the cost of external MSAs can be realized. Mass pinch analysis is very similar to thermal pinch analysis in its approach. The construction of a pinch diagram is as follows:

• Each rich and lean stream is represented by an arrow on a mass exchanged vs. composition diagram.
• The slope of the line corresponds to the flowrate of the stream. The tail and the head of the arrow on the diagram represent the maximum (supply) and minimum (target) compositions of the rich streams, and vice versa for the lean streams, where the minimum is labeled as the supply composition, while the maximum is the target composition.
• Construct the composite rich and lean streams by using the "diagonal rule" of superposition, also known as linear superposition, to add up the mass exchanged in regions where overlapping occurs. Rich streams are stacked by placing the streams with the lowest target composition first. Lean streams are stacked with the MSA having the lowest supply composition first.

On the diagram, the rich composite curve represents the cumulative mass of the pollutant lost by all rich streams. Similarly, the lean composite curve represents the mass of pollutant gained by all process MSAs. Next, both the rich and lean composite streams are placed on the same graph, and the lean stream is slid down vertically on the graph until the two curves meet.
If the lean composite stream is to the left of the rich composite stream, mass can be exchanged between the two streams (see Figure 2.7).

Figure 2.7: Mass pinch diagram

No external MSAs should be used above the pinch. This point is emphasized because mass exchanged above the pinch would require the use of additional external MSAs, which translates to a higher cost of external MSAs. However, the load below the lean composite stream must be removed using external MSAs, and the vertical overlap between the two composite curves represents the maximum amount of mass that can be exchanged internally with the use of already available MSAs. This region is labeled integrated mass exchange in Figure 2.7. Anything above the integrated mass exchange is excess capacity of the internal process streams, which can be eliminated by lowering the flowrates of those streams or lowering the outlet composition.

Figure 2.8: Process source-sink mapping diagram

The source-sink mapping diagram is a visualization tool used to determine feasible recycle strategies within a process (El-Halwagi and Spriggs, 1996; El-Halwagi, 2006). The sources correspond to process waste streams that are available for recycle and whose flowrate and targeted pollutant composition are known, while the sinks are process units that have certain constraints in terms of input flowrate and allowed maximum composition of pollutant species. The diagram is constructed as pollutant load/flowrate vs. composition, with sources and sinks represented by shaded and hollow circles, respectively (see Figure 2.8). The sink flowrate and composition constraints are represented by the horizontal and vertical bands, respectively, with the shared region representing the area of acceptable loads and compositions available for recycle. For instance, source A (Figure 2.8) can be directly recycled into sink S (El-Halwagi, 2006).
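A minimal sketch of the feasibility test implied by the source-sink diagram is given below, together with the component material balance that underlies mixing a source with a fresh stream. All names (`Source`, `Sink`, `blend_with_fresh`) and all numerical values are hypothetical and chosen only for illustration; compositions are given in ppm.

```python
from dataclasses import dataclass


@dataclass
class Source:
    flowrate: float      # available flowrate, kg/s
    composition: float   # pollutant composition, ppm


@dataclass
class Sink:
    min_flow: float         # kg/s
    max_flow: float         # kg/s
    max_composition: float  # highest pollutant composition accepted, ppm


def is_feasible(stream, sink):
    """True when the stream lies inside both the flowrate and composition bands."""
    return (sink.min_flow <= stream.flowrate <= sink.max_flow
            and stream.composition <= sink.max_composition)


def mix(a, flow_a, b, flow_b):
    """Blend flow_a of stream a with flow_b of stream b (component balance)."""
    total = flow_a + flow_b
    composition = (flow_a * a.composition + flow_b * b.composition) / total
    return Source(total, composition)


def blend_with_fresh(source, fresh, sink, demand):
    """Recycle as much of `source` as the sink composition limit allows.

    Assumes source.composition > fresh.composition. The fraction follows
    from the lever-arm relation between source, fresh stream and sink limit.
    """
    fraction = ((sink.max_composition - fresh.composition)
                / (source.composition - fresh.composition))
    recycled = min(source.flowrate, min(1.0, fraction) * demand)
    return mix(source, recycled, fresh, demand - recycled)


S = Sink(min_flow=4.0, max_flow=6.0, max_composition=50.0)
A = Source(flowrate=5.0, composition=30.0)    # inside the bands of S
D = Source(flowrate=5.0, composition=100.0)   # too concentrated on its own
fresh = Source(flowrate=10.0, composition=0.0)
blend = blend_with_fresh(D, fresh, S, demand=5.0)
```

With these numbers, half of the sink demand can be met by source D and the rest by the fresh stream, which puts the blend exactly on the composition limit of the sink.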
The location of a mixture of sources B and C can be determined using lever-arm analysis; if the mixture is located within the band, then it is also a feasible recycle stream into sink S. Source D, located above the band, can be rerouted to sink S by mixing it with a fresh source in order to lower its load while minimizing the use of the fresh source, again following lever-arm rules and material balance equations. As a general rule, sources with the shortest arm to the sink should be recycled first. A rich volume of literature covers the development and uses of energy and mass integration tools (Cerda et al., 1983; Linnhoff and Hindmarsh, 1983; Gundersen and Ness, 1988; Douglas, 1988; Shenoy, 1995; El-Halwagi, 1997; Dunn and El-Halwagi, 2003; and Rossiter, 2004). The previous sections gave an overview of currently utilized process synthesis and design methods. The next sections concentrate on the methodology behind molecular design algorithms, the importance of property models to design, and how these models challenge the developed methodologies.
2.6. Computer Aided Molecular Design Methods (CAMD)
By definition, a CAMD problem is (Brignole and Cismondi, 2003): Given a set of building blocks and a specified set of target properties, determine the molecule or molecular structure that matches these properties.
A class of CAMD software for chemical synthesis developed by Molecular Knowledge Systems Inc. focuses on three major steps in the formulation of molecules, which illustrate the general methodology behind most CAMD methods (Joback, 2006):
1. Identifying target physical property constraints. Performance requirements are translated into constraints on properties; e.g. if a certain chemical must be liquid at certain conditions, this requirement is expressed as constraints on the melting and boiling temperatures.
2. Automatically generating molecular structures.
CAMD software is used to generate molecular structures from the molecular building blocks (groups) given as input. The types of molecules generated can be controlled; e.g. alcohols can be removed simply by excluding the -OH group from the pool of building blocks, and the same can be done for amines, amides, chlorines, etc.
3. Estimating physical properties. Using structural groups as building blocks enables the use of group contribution estimation techniques to predict the properties of all generated formulations (more on group contribution in Section 3.2.1).
In property prediction, the component's structural information is used to predict its properties; therefore, the identity of the formulation is required as input to the algorithm. The solutions generated by design algorithms that employ these methods are limited to the list of "pre-selected" components, which can lead to "sub-optimal" designs. CAMD avoids this limitation by solving the reverse of the property prediction problem. It uses available property models to formulate the design problem in terms of target values for an identifiable set of properties. The property constraints are used as input to the algorithm, which then determines candidate molecules (or mixtures of molecules) that match the specified property targets without limiting the search space (Eden, 2003). Hence, with the problem well defined in terms of properties, CAMD methods are able to design novel formulations that otherwise might not have been part of the available database.
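The first two steps above can be illustrated with a minimal Python sketch. It is not the Molecular Knowledge Systems implementation; the names `PropertyWindow`, `liquid_at`, `restrict_pool`, and `satisfies` are hypothetical and introduced here only for illustration. The sketch assumes that "liquid at temperature T" is encoded as a melting point at or below T and a boiling point at or above T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyWindow:
    """Closed interval [lower, upper] that a property value must fall in."""
    lower: float
    upper: float

    def contains(self, value: float) -> bool:
        return self.lower <= value <= self.upper


def liquid_at(temperature_k: float) -> dict:
    """Step 1 (assumed encoding): 'liquid at T' becomes bounds on Tm and Tb."""
    return {
        "melting_point_k": PropertyWindow(0.0, temperature_k),
        "boiling_point_k": PropertyWindow(temperature_k, float("inf")),
    }


def restrict_pool(pool, excluded):
    """Step 2 control: dropping a group removes its compound class from the search."""
    return [group for group in pool if group not in excluded]


def satisfies(properties: dict, constraints: dict) -> bool:
    """True if every constrained property lies inside its window."""
    return all(window.contains(properties[name])
               for name, window in constraints.items())
```

For example, `restrict_pool(["CH3", "CH2", "OH"], {"OH"})` leaves only hydrocarbon groups, so no alcohols can be generated, while `satisfies` screens estimated properties against the windows returned by `liquid_at(298.15)`.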
A rich volume of research on CAMD is available in the literature and can be grouped into three main categories: mathematical programming, stochastic optimization, and enumeration techniques (Harper, 2000; Harper and Gani, 2000):
Mathematical programming solves the CAMD problem as an optimization problem in which the property constraints are used as mathematical bounds and the performance requirements are defined by an objective function. Solution techniques for such optimization problems include Mixed Integer Non-linear Programming (MINLP) methods. Although widely used and proven to be effective, MINLP methods suffer from a large computational load and do not guarantee a globally optimal solution (Odele and Macchietto, 1993; Vaidyanathan and El-Halwagi, 1994; Duvedi and Achenie, 1996; Pistikopoulos and Stefanis, 1998).
Stochastic optimization generates solution alternatives by successive pseudo-random generation. Like mathematical programming, it aims at finding the optimal value of the objective function, but the technique it uses differs. One important aspect is that stochastic optimization methods do not require any gradient information, which gives them the freedom to specify discontinuous properties as design targets. There are two forms of stochastic optimization: (1) the Simulated Annealing (SA) method and (2) the Genetic Algorithm, which is based on Darwin's evolutionary theory. The Simulated Annealing technique requires the problem to be formulated in terms of states and moves. A state is an instance of the design parameters, and the moves are the possible parameter modifications. The algorithm runs as an iterative process in which moves generate new states according to a set of perturbation probabilities (Marcoulaki and Kokossis, 1998). Each newly generated state is tested against the previous state and accepted only if it satisfies a probability criterion.
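The state/move/acceptance loop can be sketched as follows. The text does not specify the probability criterion, so this sketch assumes the commonly used Metropolis rule with a geometric cooling schedule; the function names and default parameters are hypothetical and not taken from Marcoulaki and Kokossis (1998).

```python
import math
import random


def acceptance_probability(delta: float, temperature: float) -> float:
    """Metropolis rule (assumed): always accept improvements, sometimes accept worse states."""
    if delta <= 0.0:
        return 1.0
    return math.exp(-delta / temperature)


def anneal(initial, objective, propose, t_start=1.0, cooling=0.95, steps=500, seed=0):
    """Minimize `objective` by repeatedly applying random moves to the current state."""
    rng = random.Random(seed)
    state, value = initial, objective(initial)
    best, best_value = state, value
    temperature = t_start
    for _ in range(steps):
        candidate = propose(state, rng)          # a move generates a new state
        candidate_value = objective(candidate)
        if rng.random() < acceptance_probability(candidate_value - value, temperature):
            state, value = candidate, candidate_value
            if value < best_value:
                best, best_value = state, value
        temperature *= cooling                   # geometric cooling (assumed schedule)
    return best, best_value
```

No gradient of `objective` is ever evaluated, which is why discontinuous property targets pose no special difficulty for this class of methods.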
The advantage of using SA is its ability to deal easily with highly non-linear models (e.g. predictive property models) and large numbers of decision variables (e.g. numerous alternative molecular structures). In the second approach, the Genetic Algorithm, populations of potential solutions are obtained from previous populations based on "survival of the fittest"; the approach also takes into account how attributes are passed from "parent" to "offspring", i.e. from one solution population to the next. Because of their stochastic nature, both approaches are capable of handling non-linear models, although the genetic algorithm approach becomes limited by computational time as the problem complexity increases (Holland, 1975; Venkatasubramanian et al., 1994; Marcoulaki and Kokossis, 1998; Ourique and Telles, 1998).
Enumeration techniques aim at satisfying the feasibility and property constraints by first generating molecules using a combinatorial approach and then testing them against the specifications; molecules that fail to satisfy the constraints are eliminated. Thus, the generation and screening of molecules are performed separately. As with the stochastic methods, no gradient information is needed; however, a disadvantage of this approach is that even a simple enumeration of a CAMD problem can lead to combinatorial explosion, meaning that excessive computational time is needed even with today's fast computers (Gani et al., 1991; Pretel et al., 1994; Joback and Stephanopoulos, 1995; Constantinou et al., 1996; Friedler et al., 1998).
Another method, labeled the "generate and test" approach, was introduced by Harper (2000). In this approach only feasible formulations are generated from molecular building blocks, using a rule-based combinatorial approach. The difference between this and the enumeration techniques mentioned previously is that this method uses a multi-level CAMD approach that controls the generation and testing of molecules.
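To make the enumerate-then-screen idea concrete, the Python sketch below enumerates group vectors by brute force and applies a single feasibility rule. The rule used is the standard valence condition for a connected acyclic structure with no free bonds, sum over groups of (2 - valence) equal to 2; it is only a necessary condition and is not the full rule set of any cited method. Function names are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Number of free bonds (valence) of a few common first-order groups.
VALENCE = {"CH3": 1, "CH2": 2, "CH": 3, "OH": 1}


def is_acyclic_feasible(groups) -> bool:
    """Necessary valence condition for an acyclic molecule: sum(2 - v_j) == 2."""
    return sum(2 - VALENCE[g] for g in groups) == 2


def enumerate_group_vectors(pool, max_groups):
    """Generation step: every multiset of 2..max_groups groups from the pool."""
    for size in range(2, max_groups + 1):
        for combo in combinations_with_replacement(sorted(pool), size):
            yield Counter(combo)


def generate_and_screen(pool, max_groups):
    """Screening step, performed separately from generation."""
    candidates = list(enumerate_group_vectors(pool, max_groups))
    feasible = [c for c in candidates if is_acyclic_feasible(c.elements())]
    return len(candidates), feasible
```

The number of candidates grows combinatorially with the pool size and the maximum number of groups, while only a small fraction passes the valence screen. This is the motivation for approaches that apply feasibility rules during generation rather than afterwards.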
Harper (2000) showed that a solution algorithm of the "generate and test" type can be successful without suffering from "combinatorial explosion", even when considering detailed molecular models. The employed method consists of four levels (see Figure 2.9). Each level has a generation step and a screening step. In the generation step the molecular structures are created, while in the screening step the properties of the generated compounds are predicted and compared against the design specifications. The first two levels operate on molecular descriptions based on groups, while the latter two rely on atomic representations. The individual levels have the following characteristics (Harper, 2000):
Figure 2.9: Flow diagram of the multi-level CAMD framework (Harper, 2000)
Level 1
In the first level, a group contribution approach (generation of group vectors) is used with group contribution property prediction methods. Group vectors are generated from the set of building blocks identified in the pre-design step. The generation step does not suffer from the so-called "combinatorial explosion" because it is controlled by rules regarding the feasibility of a compound consisting of a given set of groups (Gani et al., 1991). Only the candidate molecules fulfilling all the requirements are allowed to proceed to the next level.
Level 2
At the second level, corrective terms to the property predictions are introduced. These terms are based on identifying substructures in molecules. At this level molecular structures are generated using the output from the first level as a starting point, and the second-order groups are identified using a pattern matching algorithm. The generation step at this level is a tree building process in which all the possible legal combinations of the groups in each group vector are generated.
Level 3
In the third level the molecular structures are converted into an atomic representation by expanding the group representations.
The conversion into an atomic representation enables the use of molecular encoding techniques (Harper & Gani, 1999). These techniques make it possible to re-describe the candidate compounds using other group contribution schemes, thereby broadening the range of properties that can be estimated and giving the opportunity to estimate the same properties using different methods for comparison.
Level 4
In the fourth level the atomic representations from level three are further refined to 3-dimensional representations. This conversion can create further isomer variations and enables the use of molecular modeling techniques, as well as creating molecules ready for structural database searches in the post-design step.
This methodology has been implemented by the Computer Aided Process Engineering Center (CAPEC) in their ProCAMD software as part of the Integrated Computer Aided System (ICAS) (CAPEC, 2006). Regardless of the method of choice, stating the objective of the problem (Pre-Design Phase) is a prerequisite for solving any CAMD problem. Here the goals/targets of the design (numerical property constraints) and a selection of molecular building blocks are used as input to the CAMD algorithm (see Figure 2.10). The design phase includes the generation of molecular formulations and the testing of their ability to satisfy the property constraints placed on the problem. Next, the post-design phase involves using other prediction methods, database sources, engineering insight, and, if possible, simulation in order to screen and rank the designed compound(s) based on suitability and capabilities (e.g. environmental impact, health and safety aspects, production cost or availability).
Figure 2.10: Formulation and solution of a CAMD problem
2.7. Roles of Property Models & Reverse Problem Formulation
In design, whether process or molecular, the formulation of the problem will always include a property model (see equations 2.1-2.6).
Property models play an important role in design, and it is the non-linear nature of these models that often leads to complications within design calculations. The following section takes a closer look at property models. The widespread availability of powerful computers and user-friendly software has made process modeling an integral part of chemical engineering practice. An essential requirement of successful process modeling is the availability of thermo-physical property models that are accurate, reliable, and computationally efficient over a very large range of temperatures, pressures, and compositions. However, property models can be more than just generators of property values. They can provide insight and guidance for the efficient solution of process engineering problems. Gani and O'Connell (2001) describe property models as having three distinctive roles in computer-aided product and process engineering (CAPE). The first is a service role, where the property models provide the needed property values when prompted by the process model. The second is a service plus advice role: in addition to providing property information, the models advise on feasibility. The third role, considered the most comprehensive in CAPE problems, is the service/advice/solve role, where in addition to the previous roles the property models take part in the solution of design problems. In the advice role, the choice of property model dictates the resulting properties, and the complexity of the property model usually gives rise to non-linear behavior of the process model equations, causing difficulties in achieving convergence. The use of an inappropriate property model and/or property parameters can lead to erroneous numerical results that cause further complications such as bottlenecking, over-sizing, and even wrong process configurations (Whiting and Xin, 1999).
In CAMD, the property model first plays an advice role in the formulation of the design problem, by advising on which properties to target. Once the CAMD algorithm has generated the candidate formulations, it prompts the property model in the service role to verify that the properties of the designed molecules satisfy the targets. Notice that without the advice role, the search space for candidate molecules would be too great to handle, regardless of the particular method of solution for the CAMD problem. Thus, the advice role of property models helps to narrow down the search space for solutions to the design problem. Eden (2003) described how the various roles of property (constitutive) models fit into the overall solution of the design problem. In Figure 2.11, the process model provides intensive variables such as temperature, pressure and composition to the property model and requests property values during the solution step. Here the property models are in the service role. The process model can act as the basis for a process simulator, where the effects of changing various parameters can be analyzed. Furthermore, the simulator can be connected to a process synthesis/design algorithm in order to update the process parameters based on the results from each simulation. Thus, the operating conditions that yield the desired process performance are identified.
Figure 2.11: Property models presented in various roles (Eden, 2003)
Property models now advise the synthesis/design algorithm on feasible and optimal operation/process conditions, thereby narrowing down the search space for the process design problem. The ability to provide design targets and feasibility constraints enables the property model to be included as part of the solution routine. In the service/advice/solve role, the property models are decoupled from the process model and solved separately. Figure 2.12 shows how the information to and from the process model is reversed, i.e.
the process model is solved in terms of constitutive variables and the property model is called upon to determine the corresponding intensive properties.
Figure 2.12: Property model in service, advice and solve role.
Decoupling the property/constitutive model from the process model is key to lowering the dimensionality of design problems because, as previously mentioned, the nonlinearity of the process model is generally attributed to the complexity of the constitutive model. Different constitutive models can now be used at different stages of solving the design problem, with the selection of the model depending on the data given by the process model. Taking full advantage of property models as a powerful tool in the solve role requires a methodology capable of handling multiple property models. Recently, Gani and Pistikopoulos (2002) and Eden et al. (2002) proposed the solution of process as well as product design problems as a series of reverse problems (since the approach is part of the technique developed in this thesis, more details on this methodology are given in Section 3.1). Based on an understanding of the roles of property models discussed above, Gani and Pistikopoulos (2002) and Eden et al. (2002) have shown that applying the reverse problem formulation to process or product design does not require the use of property models in the process model equations, since the unknown design targets are functions of the target properties. This means that the target properties can be determined from the solution of the reverse simulation problem by solving a set of linear equations (in most cases), and from these the design targets are calculated. As long as these targets are matched, any number of property models may be used at the various stages of the solution step. The advantage of using the reverse problem formulation is that the problem complexity is significantly reduced (by decoupling the constitutive property models from the design step) without sacrificing solution accuracy.
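The decoupling can be written schematically as follows. This is a generic sketch, not a reproduction of equations 2.1-2.6: the symbols g, \phi, x, d, and \theta are notation assumed here for the balance/constraint equations, the constitutive model, the process variables, the design variables, and the constitutive (property) variables, respectively.

```latex
% Forward problem: balance and constitutive equations are solved together,
% so the nonlinearity of \phi enters the process model directly.
\begin{aligned}
g(x, d, \theta) &= 0, \\
\theta &= \phi(T, P, x).
\end{aligned}

% Reverse problem 1: solve the balance and constraint equations alone for the
% property targets \theta^{*} that deliver the required process performance.
g(x, d, \theta^{*}) = 0

% Reverse problem 2: find conditions and/or components whose predicted
% properties match the targets; any suitable property model \phi may be used here.
\phi(T, P, x) = \theta^{*}
```

Because \phi appears only in the second reverse problem, exchanging one property model for another does not alter the first problem or the targets it produces.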
The design tools developed in this thesis recognize the importance of reverse problem formulations in design and in property integration.
2.8. Property Integration - Motivation, Need & Limitations
While integrating process and product design problems presents a logical approach to optimality, there is still the problem of finding a common platform that allows for such integration. As Figure 2.13 shows, current process design algorithms take in overall objectives along with available data (e.g. a pre-selected list of possible components or mixture data) and, through the various methods discussed in Section 2.4, generate one or more solutions to meet the performance requirements. In molecular design, advances in the area of property prediction, especially using GCM, proved valuable to the success of CAMD tools such as the multi-level generate and test approach (Harper, 2000). In product/process development, the pre-design needs are described in terms of the product's performance or properties. Process unit performance is measured as a function of properties or "functionalities" (e.g. a condenser's dependence on vapor pressure, a reactor's on heat capacities, etc.). Moreover, in molecular design, the properties of the designed formulations are later checked to make sure that they satisfy these needs. Therefore, it makes sense to use a property-based platform to link the two previously decoupled design problems. The need for systematic methodologies based on properties was recognized by El-Halwagi and collaborators. They introduced the concept of property integration, defined as "a functionality-based holistic approach for the allocation and manipulation of streams and processing units, which is based on functionality tracking, adjustment, and matching of functionalities throughout the process" (Shelley and El-Halwagi, 2000; El-Halwagi et al., 2004 and Eden, 2003).
Introduction of the property integration framework enabled the representation of processes and products from a properties perspective. Solving process design problems in terms of properties involves addressing the challenge that, while chemical components are conserved, properties are not conserved entities. Another difference between component-based "chemo-centric" approaches and property-based approaches is that the mixing of components is linear, while the mixing of properties is not necessarily linear.
Figure 2.13: Conventional approach for process and molecular design problems.
To overcome this limitation, Shelley and El-Halwagi (2000) introduced conserved quantities called clusters. The clustering concept utilizes property operators, defined as functionalities of the non-conserved raw physical properties. The operators are formulated to possess simple linear mixing rules, even though the properties themselves might mix nonlinearly (e.g. the reciprocal of the density of a mixture of two streams is the fraction-weighted sum of the reciprocal densities of the individual streams). Property clusters are the basis for the developed property integration framework, which allows for the representation of processes and products from a properties perspective. The technique constitutes the basis of the methods developed in this research; hence, more in-depth information regarding the property integration framework is given in Chapter 3. Utilizing the property clustering concept as the basis, Eden (2003) introduced a systematic property integration framework for the formulation and solution of property-driven process design problems. Although the technique is analyzed in Chapter 3, its main aspects are briefly highlighted here:
• Using a reformulation strategy based on decoupling the constitutive equations from the balance and constraint equations, the traditional forward process design problem is converted into two reverse problems.
The first solves the balance and constraint equations in terms of the constitutive (property) variables to identify the design targets. The second solves the constitutive equations to reveal the unit operating conditions and/or components that match the design targets set by the first reverse problem.
• By solving the constitutive equations separately from the balance and constraint equations, the method provides an important feature: a framework capable of handling multiple property models as needed to describe entire processes.
• For problems that can be described using three properties, the problem and solution are visualized on a ternary diagram. Visualization insight can thus be used to identify the properties needed to satisfy optimum process performance targets without committing to any component during the solution step (Eden, 2003).
To overcome the limitation of using only three properties and to increase the application range, Qin et al. (2004) developed an algebraic approach for property integration. Process constraints and stream characterization are described using bounds on intensive properties and flows. The specific mathematical structure of the set of operator constraints is used to develop a constraint-reduction algorithm, which provides rigorous bounds on the feasibility region. This algebraic approach allows the design problem to be expanded to include any number of properties, and formulates the problem as a set of equalities and inequalities. By lowering the general dimensionality of the problem, the algebraic approach provides an easier solution to the design problem. Kazantzi et al. (2005) further extended the application of the clustering concept into a targeting procedure for material reuse in property-based applications. They developed new property-based pinch analysis and visualization techniques.
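The linear mixing rule that property operators are built to satisfy, introduced with the clustering concept earlier in this section, can be sketched in Python using the density example. The sketch assumes mass-fraction weighting and ideal (additive) volumes; the function names are hypothetical and this is not the cluster formulation of Shelley and El-Halwagi (2000), only the operator mixing step that precedes it.

```python
def mix_operator(fractions, operator_values):
    """Linear mixing rule: psi(mixture) = sum over streams of x_s * psi_s."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("stream fractions must sum to 1")
    return sum(x * psi for x, psi in zip(fractions, operator_values))


def mixture_density(mass_fractions, densities):
    """Density mixes non-linearly, but its operator psi = 1/rho mixes linearly."""
    inverse_density = mix_operator(mass_fractions, [1.0 / rho for rho in densities])
    return 1.0 / inverse_density
```

Mixing equal masses of streams with densities of 800 and 1000 kg/m3 in this way gives a density below the arithmetic mean of 900 kg/m3, which illustrates why the raw property cannot simply be averaged.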
For systems characterized by one key property, the developed technique determines rigorous targets for minimum fresh usage, maximum recycle, and minimum waste discharge (Kazantzi et al., 2004 a,b). This is a generalization of the conventional material reuse/recycle pinch diagram (Section 2.8), which can be modified to include property operators that track the properties as streams are segregated, mixed, and recycled. The graphical technique provides visualization insights for targeting and network synthesis (Foo et al., 2006). For solvent design, Hostrup et al. (1990) and Linke and Kokossis (2002) have reported problems with simultaneous solution approaches for process-product design. Mathematical programming approaches incorporating product and process design, while attractive, first needed to overcome a problem in handling property models. As pointed out by Gani (2001), the property models for product design may not be suitable for process design and vice versa. In addition, once a property model is selected for inclusion in the process model, the application range in terms of additional new mixtures (generated by the product design steps) is restricted, since either the model parameters may be unavailable or the property model may not be suitable for the generated molecules. Because changing the model equations (included as equality constraints) in mathematical programming techniques causes discontinuities in the solution trajectory, it may become extremely difficult to achieve convergence if multiple versions of models for the same properties are used (Gani et al., 2003). Eden (2003) proposed the simultaneous consideration of both process and molecular design problems, as illustrated in Figure 2.14. The molecular building blocks are used as input to the molecular design algorithm, while the process algorithm takes in the desired performance goals.
The result of the simultaneous consideration is the identification of the design variables needed to achieve the process performance targets and the generation of molecular formulations that aim at satisfying the property constraints determined from the solution of the process design algorithm. Eden (2003) succeeded in developing a systematic method for the solution of process design problems where the objectives are driven by properties. The methodology is also capable of identifying property values that correspond to optimum process performance without committing to any components ahead of design. Thus, Eden (2003) developed the process end of this integrated approach. This thesis explores the problem of identifying a property-based molecular design methodology capable of incorporating the clustering techniques developed by Eden (2003) in order to bridge the gap between the two design problems.
Figure 2.14: New approach to process and molecular design problems
Detailed coverage of property clusters and the property integration framework is presented in Section 3.1. These techniques constitute the foundation for the molecular design approach presented in Section 3.2.
2.9. Property Prediction and Group Contribution
Almost all CAMD algorithms rely on the ability to predict pure component and mixture properties for the analysis and design of formulations. In addition, reliable and accurate property estimation methods are critical to the solution of various simulation problems, where convergence is often related to failures in the reliability of predicted physical and thermodynamic properties (Constantinou and Gani, 1994). Most property estimation methods used in CAMD techniques are based on Group Contribution Methods (GCM), where the properties of a compound are expressed as a function of the number of occurrences of predefined groups in the molecule (Harper, 2000).
The group contribution method is totally predictive, meaning that, as long as the structure can be fully described with the groups, the properties of the structure are immediately available. The method can be used to synthesize new structures easily because the evaluation of the properties of a structure is straightforward, given the models and the group contributions (d'Anterroches and Gani, 2005). Group contribution based design methodologies are built on the following general premise: the structure is composed of groups and the targets are properties. A group contribution based problem is formulated as identifying structures that possess target properties (e.g. molecular weight, melting temperature, etc.) while matching structural constraints (e.g. no cyclic groups, no alcohols, etc.). The goal is to generate the molecular structures that match the target properties within the structural constraints. Group Contribution Methods (GCM) allow for the prediction of pure component physical properties from structural information. Properties of pure compounds were initially estimated as a summation of the contributions of simple first-order groups that occur in the molecular structure (Lydersen, 1955; Joback and Reid, 1983; Reid et al., 1987; Lyman et al., 1990; and Horvath, 1992). The advantage of such methods is that they provide quick estimates without requiring substantial computational resources; however, the molecular structure is often oversimplified, making isomers indistinguishable. To overcome such issues, Jalowka and Daubert (1986) employed another class of group contribution involving particulate grouping of atoms in the presence of other atoms. Despite the increase in accuracy, this method is complex, and in order to produce the desired properties it requires the use of other determined properties (Reid et al., 1987).
Other methods have proposed correlating critical properties and normal boiling point to the number of carbon atoms in the molecules of a homologous series (e.g. alkanes and alkanols) (Kreglenski and Zwolinski, 1961; Tsonopoulos, 1987; and Teja et al., 1990). An increase in accuracy was observed; however, the application range has been questioned (Tsonopoulos and Tan, 1993). Constantinou et al. (1993, 1995) proposed an additive method where the molecular structure of a compound is viewed as a hybrid of a number of alternative formal arrangements of valence electrons (conjugate forms), and the property of a compound is a linear combination of the contributions. The systematic inclusion of more molecular information allowed this method to substantially increase the accuracy of the predicted property and to capture the differences among isomers; however, a shortcoming of this method is that it requires a symbolic computing environment for the generation and enumeration of conjugate forms. Gani et al. (1990) reported that group contribution based computational tools need to accommodate two separate molecular structure descriptions: one for the prediction of properties of pure compounds and another for mixture property estimation. To overcome this limitation, Constantinou and Gani (1994) proposed the use of first-order groups, defined as the set of groups commonly used for the estimation of mixture properties, where each group has a single contribution, independent of the type of compound involved (e.g. acyclic or cyclic). The shortcoming here is that this method cannot differentiate between isomers. In their work, Constantinou and Gani (1994) also included a two-level approach to property estimation. The basic level has contributions from the first-order groups used for mixture properties, and the next level has a set of second-order groups that have the first-order groups as building blocks.
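A two-level additive estimate of this kind can be sketched in a few lines of Python. The sketch assumes the usual additive form, a sum of occurrence-weighted first-order contributions plus an optional sum of occurrence-weighted second-order corrections. The function name is hypothetical, and the contribution values in the usage note are placeholders chosen for illustration, not regressed values from any published table.

```python
def estimate_property_function(first_order, contributions,
                               second_order=None, corrections=None):
    """Return f(X) = sum_i N_i * C_i (+ sum_j M_j * D_j when second order is used).

    first_order / second_order map group name -> number of occurrences.
    contributions / corrections map group name -> contribution value.
    """
    value = sum(count * contributions[group]
                for group, count in first_order.items())
    if second_order:
        # Second-order groups correct the basic estimate (proximity effects, isomers).
        value += sum(count * corrections[group]
                     for group, count in second_order.items())
    return value
```

With placeholder contributions `{"CH3": 1.0, "CH2": 0.5}`, the group vector `{"CH3": 2, "CH2": 1}` gives a first-order value of 2.5. Two isomers with the same first-order groups obtain identical first-order values and are separated only by the second-order term. Note that f(X) is generally a simple function of the property X, so a final inversion step, omitted here, is needed to recover X itself.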
The role of the second-order groups is to account, to some extent, for proximity effects and to distinguish among isomers. Despite the advantages of second-order GC methods, the application range is limited by the relatively simple compounds that make up their small data bank (more details regarding available property models and groups for first-order estimation are included in Section 3.2.1). Marrero and Gani (2001) developed a GCM that performs estimation at three levels. The first level is made up of simple groups that describe a wide range of organic compounds; however, it still cannot distinguish between isomers. The second level involves groups that permit a better description of proximity effects and differentiation among isomers. The third level has groups that provide more structural information about the molecular fragments of compounds whose first- and second-level description is not sufficient. This level allows for the estimation of complex heterocyclic and large (C ? 60) polyfunctional acyclic compounds. The following is the full GC property-estimation model, which includes first-, second-, and third-order groups:
f(X) = \sum_i N_i C_i + \sum_j M_j D_j + \sum_k O_k E_k    (2.10)
C_i is the contribution of the first-order group of type i, which occurs N_i times; D_j is the contribution of the second-order group of type j, which occurs M_j times; and E_k is the contribution of the third-order group of type k, which occurs O_k times. Any application of group contribution relies on the availability of groups to describe the structure as well as tables giving the contributions of each group (Franklin, 1949). The group-contribution property data have been developed by regression using a large data bank of more than 2000 compounds collected at the Computer Aided Process Engineering Center at the Technical University of Denmark (CAPEC-DTU). Properties that are predicted using GCM (e.g. critical properties, boiling point temperature, etc.) are referred to as primary properties, and all other properties (e.g.
density, viscosity, vapor pressure, heat of vaporization etc.) are classified by Jaksland (1996) as secondary properties, which are usually predicted as functions of primary or critical properties (Marrero and Gani, 2001).

2.10. Design of Experiments

Industrial experimenters who deal with the formulation of mixtures are often forced to deal with a rising number of variations in data samples, usually arising from external influences (e.g. raw materials, environmental conditions or human operating error). To dampen these effects, some run large-scale experiments; however, these are expensive and time-consuming. Others hope to isolate the underlying cause of the variations by changing and then re-testing one parameter at a time; however, this one-factor-at-a-time (OFAT) approach provides no insight into the interaction of different factors (variables). In addition, OFAT is more expensive than large-scale experiments.

The two-level factorial design (TLFD) method was developed as a statistical strategy that simultaneously adjusts all factors at two levels, i.e. the low and high values of each factor. Using only two levels limits the number of experiments needed; however, more levels can be added if required, e.g. the mean (midpoint) value of a factor can be included to increase resolution. Box et al. (1978) and Cornell (1990) discussed the basics of TLFD, whereas Anderson and Whitcomb (1996) extended the application to chemical engineering problems. Advances in technology have brought TLFD strategies into advanced software tools, e.g. Design Expert (Stat-Ease, 1999). Whitcomb (1999) described the general methodology for identifying the optimal mixture Design of Experiments (DOE) as follows:

1. Specify the polynomial order, i.e. first, second, third or beyond, that is needed to model the response.
2. Generate a "candidate set" with more than enough points to fit the specified model.
3. Select the minimum number of points, from the candidate set, needed to fit the model.

Figure 2.15: Response surface plot

The algorithm is fed the factor constraints along with the specified polynomial order. After statistical analysis, the algorithm yields the subset of experiments that provides maximum information using the minimum number of experiments. Once the experiments are carried out and the response values measured, a model is generated for each response. Often the result of each design is represented by ternary diagrams that are plotted according to the generated response models. An example of these contour plots is shown in Figure 2.15; such plots are often called Response Surface Maps. The diagrams are used to identify optimum levels for each factor. The best formulation may be determined without having to prepare or test it. Because the solution is identified from predictive models, it should subsequently be verified by performing confirmation runs.

The advantage of Design of Experiments (DOE) is that factors and/or processes can be changed independently, so that main effects can be determined with fewer runs, saving both time and money.

2.11. Ternary Diagrams for Visualization

A ternary diagram uses an equilateral triangle to graphically depict the relationship among three data values that sum to a constant value, i.e. the ratios of three proportions. Geologists use ternary diagrams for a variety of purposes, such as the identification and classification of sedimentary, igneous, and metamorphic rocks. Ternary phase diagrams are used in chemistry to gain insight into the miscibility of a three component system (e.g. ethanol, water and phenol). In this representation the effects of properties like temperature and pressure on component miscibility at various compositions can be visualized. The ternary diagram is read counterclockwise. As an illustration, four different ternary mixtures are depicted in Figure 2.16.
The composition for each of these points is shown below.

Figure 2.16: Generic ternary diagram.

1. 60% A | 20% B | 20% C = 100%
2. 25% A | 40% B | 35% C = 100%
3. 10% A | 70% B | 20% C = 100%
4. 0% A | 25% B | 75% C = 100%

Constructing a ternary diagram can be tedious and time-consuming if a plotting program, e.g. Ternary Plot, is not available. However, a conversion methodology has been developed to plot the diagrams using Cartesian coordinates. The methodology is discussed in detail in Section 3.1.4.

2.12. Summary

From the review provided in this chapter it should be evident that process and product design problems are intertwined. Although solving the product or molecular design problem separately has direct benefits (e.g. the design problems are less complex), the overall objective in process/product synthesis and design is not just to find any chemical or mixture that satisfies the described objectives. The goal is to achieve an optimal design that addresses the cost of raw materials, operation, efficiency and environmental impact.

The targeting approaches used in heat and mass integration have proven very effective. The key feature of the targeting approach is that the design targets are identified without committing to a specific solution. Various targeting tools, such as pinch analysis and source-sink mapping diagrams, are widely used in industry and have proven very successful in maximizing profit margins while lowering processing cost (e.g. raw material consumption rate, utility cost, and waste generation).

The recently developed property clustering framework is a very powerful targeting tool that provides a platform for solving process design and optimization problems. The technique has enabled systematic tracking of properties (in the form of functionalities) throughout a process. The clustering concept has been implemented in various property integration tools, e.g. property-based pinch analysis (Kazantzi et al., 2005) and the property integration framework (Eden, 2003). To further expand the application range of the property integration framework, Qin et al. (2004) developed an algebraic approach. The clustering concept spawned a new generation of design tools that recognize the importance of property based design. The recognition came about as a direct result of the following observations:

• Many processes are driven by properties, NOT components
• Performance objectives are often described by properties
• Often objectives cannot be described by composition alone
• Product/molecular design is based on properties
• Insights are often hidden by not integrating properties directly

Current CAMD methods have the ability to design formulations that target property needs; however, the property targets are generally set by the performance needs of a single process unit (e.g. separator, distillation column etc.). Unlike mass and energy integration tools, CAMD methods fall short in taking the requirements of the entire process into account when setting up the design problem. Consider the simple case of designing a solvent for a certain process unit. The impact of this solvent is not limited to that specific processing unit; in fact the impact propagates throughout the entire process (e.g. to the remaining streams and other processing units). Hence, a truly effective molecular design approach is one that can integrate the needs of the entire process into its molecular design scheme.

A major contribution of the highlighted targeting tools is that they possess a visualization medium that helps in the formulation of the design/optimization problem and in the generation of solutions. Being able to visualize an entire chemical process in terms of its streams and units provides insights on direct recycle and interception opportunities that might otherwise be hidden.
The discussion presented here provides the motivation that guided the research. As a result, the objective of this dissertation is to develop methods and tools that accomplish the following:

• Integrate process and molecular/product design problems via a systematic methodology within the property integration paradigm.
• Set up the design performance requirements or "targets" a priori, i.e. follow a targeting approach.
• Incorporate the concepts of reverse problem formulation and property clusters to aid in the decomposition of the design problem.
• Take advantage of the benefits of visual tools in the formulation of the problem and as part of the solution algorithm.

3. Unified Property Integration Framework

As discussed in the previous chapter, the requirements dictate the need for a design approach that incorporates properties directly as part of the solution algorithm. The property integration framework (Shelley and El-Halwagi, 2000; Eden, 2003) was developed to address this need. It is a novel technique that has proven very useful in the optimization of various industrial processes as well as product blend problems, including binary and ternary mixtures (Shelley and El-Halwagi, 2000; El-Halwagi et al., 2003; Eden, 2003; Eden et al., 2004; and Eljack et al., 2005). This property based platform was developed as a reverse problem formulation framework with the ability to systematically reformulate design problems and generate solutions visually on a ternary diagram. It differs from conventional techniques because it is non-iterative. By understanding the roles of process and property models in design, and recognizing that the complexity of the design problem is a direct result of the constitutive equations, the framework reformulates the original design problem as two reverse problems and decouples the constitutive equations from the balance and constraint equations (see Figure 3.1).
First, the balance and constraint equations are solved in terms of the constitutive variables to obtain the design targets; this is the reverse of a simulation problem. Next, the second reverse problem solves the constitutive equations to identify the unit operations, operating conditions and components needed to satisfy the design constraints. The key here is that any model can be used to describe the constitutive variables as long as the design targets are matched. This means that more than one solution matching the targets can be identified; hence all feasible solutions to the design problem can be determined, and finally the optimal design can be selected based on a performance index (Eden, 2003).

Figure 3.1: Reverse problem formulation methodology

This methodology is a valuable and powerful tool because:

• It allows for the integration of the process and product design problems using properties as a common interface.
• It simplifies the design problem by decoupling the often complex constitutive equations from the process model.

3.1. Property Clustering Fundamentals

3.1.1. Property Operator Description

Property clusters are conserved surrogate properties that are functions of non-conserved properties. The clusters are obtained by mapping property relationships into a low dimensional domain, thus allowing for visualization of the problem (Shelley and El-Halwagi, 2000). The basis for the property clustering technique is the use of property operators. Although the operators themselves may be highly non-linear, they are tailored to possess linear mixing rules (Eden et al., 2004; El-Halwagi et al., 2004). The operator functions describe a class of properties that follow equation 3.1, in which the operator $\psi_j$ of property $j$ is determined for a mixture $M$. The mixture is made up of $N_s$ streams and is characterized by NP properties, indexed by $j$. The operator of the mixture is formulated as the summation over all streams of the stream flowrate fraction ($x_s$) multiplied by the operator value of property $j$ for stream $s$, $\psi_j(P_{js})$ (Eden, 2003).

$$ \psi_j(P_{jM}) = \sum_{s=1}^{N_s} \frac{F_s}{\sum_{s=1}^{N_s} F_s}\,\psi_j(P_{js}) = \sum_{s=1}^{N_s} x_s\,\psi_j(P_{js}) \tag{3.1} $$

The operators are always formulated so that the right hand side (RHS) of equation 3.1 exhibits a linear mixing rule. The operator can be defined directly as the actual property (see equation 3.2), in which case the operator of the mixed stream $M$ is simply $P_{jM}$ and that of stream $s$ is $P_{js}$. The operator can also describe functional relationships, as shown for density in equation 3.3. Thus, the property operators can be non-linear functionalities, but the mixing rules have to be linear.

$$ P_{jM} = \sum_{s=1}^{N_s} x_s P_{js}, \qquad \psi_j(P_{jM}) = P_{jM}, \quad \psi_j(P_{js}) = P_{js} \tag{3.2} $$

$$ \frac{1}{\rho_M} = \sum_{s=1}^{N_s} x_s \frac{1}{\rho_s}, \qquad \psi_j(P_{jM}) = \frac{1}{\rho_M}, \quad \psi_j(P_{js}) = \frac{1}{\rho_s} \tag{3.3} $$

In equation 3.4, the property operators are normalized to a dimensionless form by dividing by a reference value. This step is necessary because properties can have various functional forms and units. The reference value for each operator is chosen so that the various properties used to describe the system are of the same order of magnitude.

$$ \Omega_{js} = \frac{\psi_j(P_{js})}{\psi_j(P_j^{ref})} \tag{3.4} $$

An Augmented Property index (AUP) for each stream $s$ is defined as the summation of all the NP normalized property operators:

$$ AUP_s = \sum_{j=1}^{NP} \Omega_{js} \tag{3.5} $$

The property cluster $C_{js}$ for property $j$ of stream $s$ is defined as:

$$ C_{js} = \frac{\Omega_{js}}{AUP_s} \tag{3.6} $$

3.1.2. Cluster Formulation

Property clusters are formulated to exhibit two fundamental rules:

1. Intra-stream conservation: for each stream $s$, the NC clusters, which correspond to the NP property operators, add up to unity as shown in equation 3.7. For systems that can be described by only three properties, a ternary diagram can be used for visualization as seen in Figure 3.2.

$$ \sum_{j=1}^{N_C} C_{js} = 1 \tag{3.7} $$

Figure 3.2: Intra-stream conservation of clusters

2. Inter-stream conservation: the mixing of two streams should be performed so that the resulting individual clusters are conserved, corresponding to consistent additive rules as seen in equation 3.8.

$$ C_{jM} = \sum_{s=1}^{N_s} \beta_s C_{js} \tag{3.8} $$

To validate the inter-stream conservation rule in equation 3.8, it is first noted that the original definition of a property cluster $C_{js}$ is valid for any cluster. It therefore also applies to the individual cluster values of a mixture, as shown in equation 3.9:

$$ C_{jM} = \frac{\Omega_{jM}}{AUP_M} \tag{3.9} $$

Next, the generalized mixing rule given in equation 3.1 is divided by the reference operator value. Substituting the definition of the normalized property operator (equation 3.4) results in equation 3.10, which can be rearranged to give the normalized property operator of a mixture, as shown in equation 3.11.

$$ \frac{\psi_j(P_{jM})}{\psi_j(P_j^{ref})} = \sum_{s=1}^{N_s} x_s \frac{\psi_j(P_{js})}{\psi_j(P_j^{ref})} = \sum_{s=1}^{N_s} x_s\,\Omega_{js} \tag{3.10} $$

$$ \Omega_{jM} = \sum_{s=1}^{N_s} x_s\,\Omega_{js} \tag{3.11} $$

Inserting the expression for the normalized operator of a mixture (equation 3.11) into equation 3.9, and using the cluster definition in equation 3.6 rearranged as $\Omega_{js} = AUP_s\,C_{js}$, yields equation 3.12, which demonstrates inter-stream conservation of the clusters. Comparing the last two expressions in equation 3.12 yields equation 3.13, which defines the relative cluster arm for the individual stream $s$.

$$ C_{jM} = \frac{\Omega_{jM}}{AUP_M} = \frac{\sum_{s=1}^{N_s} x_s\,\Omega_{js}}{AUP_M} = \frac{\sum_{s=1}^{N_s} x_s\,AUP_s\,C_{js}}{AUP_M} = \sum_{s=1}^{N_s} \beta_s C_{js} \tag{3.12} $$

$$ \beta_s = \frac{x_s\,AUP_s}{AUP_M} \tag{3.13} $$

3.1.3. Lever Arm Analysis

Inter-stream conservation, as given in equation 3.12, indicates that the mixture of two streams S1 and S2 on the ternary diagram can be represented by a straight line, as shown in Figure 3.3. This line corresponds to all possible mixtures of the two streams, with the location of the mixture point, $C_{jM}$, being directly related to the fractional flowrate contributions of the streams, $x_s$.
The location of the mixture point splits the line into two segments, represented by $\beta_1$ and $\beta_2$, which correspond to the relative cluster arms for streams S1 and S2, respectively. The mixture cluster equations developed by Shelley and El-Halwagi (2000) are given in equations 3.14-3.15 (the property index $j$ is omitted in equation 3.14).

$$ \Omega_M = x_1\,\Omega_1 + x_2\,\Omega_2 \tag{3.14} $$

$$ AUP_M = x_1\,AUP_1 + x_2\,AUP_2, \qquad AUP_M = \sum_{s=1}^{N_s} x_s\,AUP_s \tag{3.15} $$

Figure 3.3: Inter-stream conservation of clusters

The relative cluster arm $\beta_s$ is a conserved entity and is defined as the fractional AUP contribution of each stream $s$ to the mixture stream (shown previously in equation 3.13). Merging equations 3.9 and 3.12 and summing over all clusters generates equation 3.16; the subsequent substitution of equation 3.7 yields the conservation rule for the general relative cluster arm (equation 3.17). The expression for the Augmented Property index of a mixture, $AUP_M$, shown in equation 3.15, results from combining equation 3.17 with the definition of the cluster arm (equation 3.13).

$$ \sum_{j=1}^{N_C} \sum_{s=1}^{N_s} \beta_s C_{js} = \sum_{s=1}^{N_s} \beta_s \sum_{j=1}^{N_C} C_{js} = 1 \tag{3.16} $$

$$ \sum_{s=1}^{N_s} \beta_s = 1 \tag{3.17} $$

These conservation rules are important features of the cluster formulation used in this methodology, as they allow clusters to be tracked visually on a ternary diagram. Because the conserved clusters are directly related to the raw properties, they enable the tracking of properties, which provides a unique way of representing processes and products from a properties perspective. The conversion of physical property data to cluster values is outlined in Table 3.1 (Eden, 2003).
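The conversion from operator values to clusters (equations 3.4 to 3.6) and the lever-arm mixing rules (equations 3.12, 3.13 and 3.15) can be sketched numerically. The following is a minimal illustration only: the function names, the reference values and the operator values of the two streams are hypothetical and are not taken from the thesis.

```python
def clusters(operators, refs):
    """Return (AUP_s, [C_1s, ..., C_NPs]) for one stream.

    `operators` holds the operator values psi_j(P_js); `refs` holds the
    reference operator values psi_j(P_j^ref).
    """
    omega = [psi / ref for psi, ref in zip(operators, refs)]  # eq 3.4
    aup = sum(omega)                                          # eq 3.5
    return aup, [o / aup for o in omega]                      # eq 3.6


def mix(fractions, streams, refs):
    """Mix streams with the lever-arm rule.

    Returns (AUP_M, mixture clusters, relative cluster arms beta_s).
    """
    parts = [clusters(ops, refs) for ops in streams]
    aup_m = sum(x * aup for x, (aup, _) in zip(fractions, parts))       # eq 3.15
    betas = [x * aup / aup_m for x, (aup, _) in zip(fractions, parts)]  # eq 3.13
    c_m = [
        sum(b * c[j] for b, (_, c) in zip(betas, parts))                # eq 3.12
        for j in range(len(refs))
    ]
    return aup_m, c_m, betas


# Hypothetical data: three properties, two streams, 30/70 flowrate split.
refs = [1.0, 2.0, 0.5]
s1 = [0.8, 3.0, 0.4]
s2 = [1.5, 1.0, 0.9]
x = [0.3, 0.7]

aup_m, c_m, betas = mix(x, [s1, s2], refs)

# Cross-check: mix the operators linearly first (eq 3.1), then convert.
mixed_ops = [x[0] * a + x[1] * b for a, b in zip(s1, s2)]
aup_direct, c_direct = clusters(mixed_ops, refs)

print(c_m, c_direct)
```

Both routes give the same mixture clusters, which is the inter-stream conservation property: the mixture point lies on the straight line between the two stream points, at a position set by the relative cluster arms.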
Step | Description | Equation
1 | Calculate dimensionless stream property value | 3.4
2 | Calculate stream AUP indices | 3.5
3 | Calculate ternary cluster values for each stream | 3.6
4 | Plot the points on the ternary cluster diagram | --

Table 3.1: Calculation of cluster values from physical property data

In summary, clusters are obtained by mapping property relationships into a low dimensional domain, thus allowing for visualization of the problem (Shelley and El-Halwagi, 2000). Although the operators themselves may be highly non-linear, they are tailored to possess linear mixing rules; e.g. density does not exhibit a linear mixing rule, but the reciprocal of density does (Eden et al., 2004; El-Halwagi et al., 2004). The operator expressions will invariably be different for molecular fragments and process streams; however, as they represent the same property, it is possible to visualize them in a similar fashion. This is part of the novel work presented in this thesis.

3.1.4. Ternary Diagram and Cartesian Coordinate Conversion

The ternary diagrams in this work are constructed by converting the cluster points from ternary to Cartesian coordinates. The conversion is used because software supporting ternary plot representations was not available. By converting the ternary coordinates to Cartesian coordinates, more common tools like Microsoft Excel can be used.

Figure 3.4: Converting ternary to Cartesian coordinates

Figure 3.4 is used to aid in describing the conversion methodology. Points on a ternary plot are represented by three coordinates (x, y, z), in this case ($C_{1s}$, $C_{2s}$, $C_{3s}$). All axes on a triangular diagram have a length of 1. $X_{cc}$, the x-value of the Cartesian coordinate set, is determined on the $C_3$-$C_1$ axis of the ternary diagram. On this axis, $C_{1s}$ and the value of $(1 - C_{3s})$ are known.
$X_{cc}$ is always the arithmetic mean of these two values, since the triangular plot is equilateral, as shown in equation 3.18.

$$ X_{cc,s} = \frac{C_{1s} + (1 - C_{3s})}{2} = \frac{C_{1s} + (C_{1s} + C_{2s})}{2} = C_{1s} + 0.5\,C_{2s} \tag{3.18} $$

The y-value of the Cartesian coordinate, $Y_{cc}$, is directly related to $C_{2s}$ by a scaling factor. From Figure 3.4, it is clear that the scaling factor has to be less than 1. The vertices $C_3$ and $C_2$, the point $X_{cc}$ directly below the $C_2$ vertex, and the Pythagorean Theorem are used to determine this scaling factor from triangular to Cartesian coordinates. The length of the $C_3$-$C_2$ axis is 1, and according to equation 3.18 the length of $X_{cc}$-$C_3$ is 0.5. Thus, the scaling factor is calculated to be $0.5\sqrt{3}$ using the Pythagorean Theorem (see equation 3.19); this value is constant because the sides of the triangle are of equal length. The scaling factor is used to convert triangular coordinates to Cartesian y-coordinates (Eden, 2003).

$$ Y_{scaling}^2 + \left(\frac{1}{2}\right)^2 = (1)^2 \;\Rightarrow\; Y_{scaling} = \frac{\sqrt{3}}{2}, \qquad Y_{cc,s} = \frac{\sqrt{3}}{2}\,C_{2s} \tag{3.19} $$

3.1.5. Feasibility Region Boundaries

Processes are made up of process units (sinks) and streams (sources). On the ternary diagram, the property values of streams (sources) are represented by discrete points, while ranges of property values or property constraints (sinks) are denoted by a feasibility region. For visualization purposes only systems that can be described by three properties are used, with each property bounded by a lower limit ($P_j^{min}$) and an upper limit ($P_j^{max}$), see equation 3.20. These limits can also be described in terms of dimensionless property operators, as shown in equations 3.21 and 3.22.

$$ P_{j,sink}^{min} \le P_{j,sink} \le P_{j,sink}^{max}, \qquad \Omega_{j,sink}^{min} \le \Omega_{j,sink} \le \Omega_{j,sink}^{max} \tag{3.20} $$

$$ \Omega_{j,sink}^{min} = \frac{\psi_j(P_{j,sink}^{min})}{\psi_j(P_j^{ref})} \tag{3.21} $$

$$ \Omega_{j,sink}^{max} = \frac{\psi_j(P_{j,sink}^{max})}{\psi_j(P_j^{ref})} \tag{3.22} $$

Using the definition of the augmented property index (AUP) given in equation 3.5, the visualization of the sink region is achieved by translating the above dimensionless operators into the following cluster expressions:

$$ C_{1,sink}^{min} = \frac{\Omega_{1,sink}^{min}}{\Omega_{1,sink}^{min} + \Omega_{2,sink}^{max} + \Omega_{3,sink}^{max}}, \qquad C_{1,sink}^{max} = \frac{\Omega_{1,sink}^{max}}{\Omega_{1,sink}^{max} + \Omega_{2,sink}^{min} + \Omega_{3,sink}^{min}} \tag{3.23} $$

$$ C_{2,sink}^{min} = \frac{\Omega_{2,sink}^{min}}{\Omega_{1,sink}^{max} + \Omega_{2,sink}^{min} + \Omega_{3,sink}^{max}}, \qquad C_{2,sink}^{max} = \frac{\Omega_{2,sink}^{max}}{\Omega_{1,sink}^{min} + \Omega_{2,sink}^{max} + \Omega_{3,sink}^{min}} \tag{3.24} $$

$$ C_{3,sink}^{min} = \frac{\Omega_{3,sink}^{min}}{\Omega_{1,sink}^{max} + \Omega_{2,sink}^{max} + \Omega_{3,sink}^{min}}, \qquad C_{3,sink}^{max} = \frac{\Omega_{3,sink}^{max}}{\Omega_{1,sink}^{min} + \Omega_{2,sink}^{min} + \Omega_{3,sink}^{max}} \tag{3.25} $$

In 2003, El-Halwagi et al. addressed the task of mapping the feasibility region from the property domain to the cluster domain. Although the region represents an infinite number of feasible points, the developed technique requires no enumeration. The feasibility region is first overestimated by using the minimum and maximum values of the clusters to bound the region with the six line segments shown in Figure 3.5. Although the overestimated region does not define the true feasibility region, it narrows down the search space and guarantees that no feasible point exists outside it. Subsequently, the six cluster points defined by equations 3.23-3.25 are plotted, one on each of the six line segments. Since the six points are part of the true feasibility region, any mixture of the points will also be part of the true region. Connecting the six points defines the underestimated region. According to the findings of El-Halwagi et al. (2003) and Eden (2003), the feasibility region is defined by six unique points; their findings are summarized in Rule 1 below.

Rule 1: Expressing property constraints as a Feasibility Region

• The boundary of the true feasibility region can be accurately represented by no more than six linear segments.
• When extended, the linear segments of the boundary of the true feasibility region constitute three convex hulls (cones) with their heads lying on the three vertices of the ternary cluster diagram.
• The six points defining the boundary of the true feasibility region are determined a priori and are characterized by the following values of dimensionless operators (see Figure 3.6):

$$ (\Omega_1^{min}, \Omega_2^{min}, \Omega_3^{max}) \qquad (\Omega_1^{max}, \Omega_2^{max}, \Omega_3^{min}) $$
$$ (\Omega_1^{min}, \Omega_2^{max}, \Omega_3^{max}) \qquad (\Omega_1^{max}, \Omega_2^{min}, \Omega_3^{min}) $$
$$ (\Omega_1^{min}, \Omega_2^{max}, \Omega_3^{min}) \qquad (\Omega_1^{max}, \Omega_2^{min}, \Omega_3^{max}) $$

The feasibility region boundary analysis described above provides the exact expression for the feasibility region a priori and without enumeration (El-Halwagi et al., 2003; Eden, 2003).

Figure 3.5: Overestimation of feasibility region

Figure 3.6: True feasibility region of a sink.

3.2. Molecular Property Clusters

3.2.1. Group Contribution

To provide a methodology for handling molecular design problems, the property integration framework is extended to include Group Contribution Methods (GCM), which allow for the prediction of physical properties (e.g. boiling and melting temperature, enthalpy, and heat of vaporization) from structural information. As stated in Section 2.9, the Group Contribution Methods were initially based on the contributions of the first-order groups that make up the molecule (Joback and Reid, 1983). To increase the accuracy of the predicted properties, the work by Constantinou and Gani (1994) and later by Marrero and Gani (2001) estimates properties using first-order, second-order, and third-order groups, where the higher-order groups use first-order groups as building blocks. Since the goal of this research is to develop the first implementation of a property based molecular design algorithm that can systematically generate formulations in response to property needs, the prediction accuracy of the first-order GCM is sufficient.
Once the proposed framework is established, implementation of higher-order GCM to enhance the accuracy of the selected property models will be explored. For now, the general group contribution model equation used to predict properties is:

$$ f(X) = \sum_i N_i C_i \tag{3.26} $$

$C_i$ is the contribution of the first-order group of type $i$, which occurs $N_i$ times, and $f(X)$ is a function of property $X$. Table 3.2 presents ten main properties predicted using GCM (Constantinou and Gani, 1994; Constantinou et al., 1995; and Marrero and Gani, 2001). The left hand side (LHS) of equation 3.26 for each property $X$ is shown in Table 3.2. The universal constants, e.g. $t_{mo}$, $t_{bo}$ etc., are part of the general model and their values for the various properties are listed in Table 3.3. Only the first-order group contribution terms are listed for the right hand side (RHS) of equation 3.26; data are available for second- and third-order terms as mentioned previously (Constantinou and Gani, 1994; Marrero and Gani, 2001). The group contribution property data used by this method have been determined by regression using a large data bank of more than 2000 compounds collected at CAPEC-DTU, see Appendix A (Marrero and Gani, 2001).

Property (X) | LHS of Eq. 3.26, function f(X) | RHS of Eq. 3.26, first-order GC term
Normal melting point ($T_m$) | $\exp(T_m / t_{mo})$ | $\sum_i N_i T_{m1i}$
Normal boiling point ($T_b$) | $\exp(T_b / t_{bo})$ | $\sum_i N_i T_{b1i}$
Critical temperature ($T_c$) | $\exp(T_c / t_{co})$ | $\sum_i N_i T_{c1i}$
Critical pressure ($P_c$) | $(P_c - p_{c1})^{-0.5}$ | $\sum_i N_i P_{c1i}$
Critical volume ($V_c$) | $V_c - v_{co}$ | $\sum_i N_i V_{c1i}$
Standard Gibbs energy¹ ($G_f$) | $G_f - g_{fo}$ | $\sum_i N_i G_{f1i}$
Standard enthalpy of formation¹ ($H_f$) | $H_f - h_{fo}$ | $\sum_i N_i H_{f1i}$
Standard enthalpy of vaporization¹ ($H_v$) | $H_v - h_{vo}$ | $\sum_i N_i H_{v1i}$
Standard enthalpy of fusion ($H_{fus}$) | $H_{fus} - h_{fuso}$ | $\sum_i N_i H_{fus1i}$
Liquid molar volume¹ ($V_l$) | $V_l - d$ | $\sum_i N_i v_{1i}$

¹ Properties predicted at 298 K

Table 3.2: Property functions for Group Contribution Methods

Universal constant | Value
$t_{mo}$ | 102.425 K
$t_{bo}$ | 204.359 K
$t_{co}$ | 181.128 K
$p_{c1}$ | 1.3705 bar
$v_{co}$ | 4.35 cm³/mol
$g_{fo}$ | -14.828 kJ/mol
$h_{fo}$ | 10.835 kJ/mol
$h_{vo}$ | 6.829 kJ/mol
$h_{fuso}$ | -2.806 kJ/mol
$d$ | 0.01211 m³/kmol

Table 3.3: Listed values of GCM universal constants

3.2.2. Bridging the Gap between Process and Molecular Design

Combining property clustering techniques and first-order group contribution methods (GCM) yields a systematic methodology that facilitates simultaneous consideration of process and molecular design. In the same manner that Eden (2003) reformulated the process design problem as two reverse problems, the process design problem is solved in terms of property values, with the design targets set as constraints. Figure 3.7 describes the general flow of information from process design to molecular design and back again; the output of the process design algorithm is a set of property values. These values are the property targets for the molecular design problem. The molecular design algorithm developed in this thesis and described in the following sections systematically generates molecular formulations that satisfy the property targets/constraints identified by solving the process design problem.

Figure 3.7: Property driven approach to integrated process and molecular design

3.2.3. Molecular Property Operators

Extending the property clustering technique to include GCM for molecular design introduces the need for molecular property operators. Like the original operators, they must be formulated so that simple linear additive rules apply when combining the groups, as described by the following:

$$ \psi_j^M(P_j) = \sum_{g=1}^{N_g} n_g P_{jg} \tag{3.27} $$

In equation 3.27, $\psi_j^M(P_j)$ is the molecular property operator of the $j$th property.
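As a small worked illustration of equation 3.27, consider the normal boiling point. Table 3.2 gives its operator as $\exp(T_b/t_{bo})$, so the group sum can be inverted to $T_b = t_{bo}\,\ln(\sum_g n_g t_{bg})$. The sketch below assumes this inversion and uses $t_{bo}$ from Table 3.3; the group contribution values and the function names are hypothetical placeholders chosen for illustration, not the regressed values of Appendix A.

```python
import math

T_B0 = 204.359  # K, universal constant t_bo from Table 3.3

# HYPOTHETICAL first-order contributions to the boiling point operator.
# Placeholder numbers for illustration only, not the regressed data.
TB_CONTRIB = {"CH3": 0.9, "CH2": 0.9, "CH3O": 2.2}


def boiling_point_operator(groups):
    """Right hand side of eq 3.27: sum of n_g * t_bg over all groups."""
    return sum(n * TB_CONTRIB[g] for g, n in groups.items())


def boiling_point(groups):
    """Invert the operator exp(T_b / t_bo) to recover T_b in kelvin."""
    return T_B0 * math.log(boiling_point_operator(groups))


# A hypothetical formulation: one CH3, three CH2 and one CH3O group.
formulation = {"CH3": 1, "CH2": 3, "CH3O": 1}
print(boiling_point_operator(formulation), boiling_point(formulation))
```

The operator value is additive in the groups even though the boiling point itself is not; this additivity is what allows groups to be combined linearly in the cluster domain.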
The molecular property operator describes the functional relationship of the group contribution property equations such that the right-hand side (RHS) of each equation is always a summation of the number of occurrences of each group ($n_g$) multiplied by the contribution of group g to property j ($P_{jg}$). Some properties cannot be predicted directly by group contribution methods (GCM) but can be estimated as functions of other properties that GCM does predict. For example, vapor pressure (VP) cannot be estimated directly, but it can be estimated from the boiling point, which is a GCM property, as shown in equations 3.28 and 3.29 (Sinha and Achenie, 2001).

$$\log VP = 5.58 - 2.7\left(\frac{T_{bp}}{T}\right)^{1.7} \qquad (3.28)$$

$$\psi^{M}(T_{bp}) = \exp\left(\frac{T_{bp}}{t_{bo}}\right) = \sum_{g=1}^{N_g} n_g \cdot t_{bg} \qquad (3.29)$$

Here T and $t_{bo}$ are the chosen condensing temperature and the group contribution boiling temperature constant, respectively. Notice that the property operator itself can be very complex, but molecular formulation on the ternary diagram remains simple because the property operators obey linear additive rules.

Next, the molecular property operators can be converted to clusters according to the procedures developed for the original property clusters, see Section 3.1.2 (Shelley and El-Halwagi, 2000; Eden et al., 2004; El-Halwagi et al., 2004). Since properties can have various functional forms and units, the molecular property operators, like the process property operators, are normalized into a dimensionless form by dividing by a reference operator. The reference is chosen such that the resulting dimensionless properties are all of the same order of magnitude. The normalized property operator for group g is given as:

$$\Omega_{jg}^{M} = \frac{\psi_{j}^{M}(P_{jg})}{\psi_{j}^{ref}(P_{jg})} \qquad (3.30)$$

An Augmented Property index $AUP_g^M$ for each group g is defined as the summation of all NP dimensionless property operators $\Omega_{jg}^{M}$:

$$AUP_{g}^{M} = \sum_{j=1}^{NP} \Omega_{jg}^{M} \qquad (3.31)$$

The property cluster $C_{jg}^{M}$ of molecular fragment g for property j is defined as the ratio of the normalized molecular property operator to $AUP_g^M$:

$$C_{jg}^{M} = \frac{\Omega_{jg}^{M}}{AUP_{g}^{M}} \qquad (3.32)$$

3.2.4. Conservation Rules for Molecular Clusters

Visualization of the molecular design problem is central to this methodology. To ensure that the molecular clusters are conserved, they have to possess both intra- and inter-molecular conservation. Similar to the intra-stream conservation rule for processes, the intra-molecular conservation rule requires that the individual clusters $C_j^M$ of each molecular formulation M sum to unity, as shown in equation 3.33. This is proven by summing the cluster values of equation 3.32 over all properties j for molecule M and substituting the AUP definition (see equation 3.34).

$$\sum_{j=1}^{N_C} C_{j}^{M} = 1 \qquad (3.33)$$

$$\sum_{j=1}^{NP} C_{j}^{M} = \sum_{j=1}^{NP} \frac{\Omega_{j}^{M}}{AUP^{M}} = \frac{AUP^{M}}{AUP^{M}} = 1 \qquad (3.34)$$

The inter-molecular conservation rule for adding molecular groups or fragments on the ternary diagram is derived analogously to inter-stream conservation. The general additive rule for molecular operators (equation 3.27) is normalized by a reference value, and the definition of the dimensionless molecular operator (equation 3.30) is substituted to yield the following mixing rule:

$$\Omega_{j,mix}^{M} = \sum_{g=1}^{N_g} n_g \cdot \Omega_{jg} \qquad (3.35)$$

Inter-molecular conservation requires that the cluster obtained by mixing two groups, $C_{j,mix}^{M}$, is conserved. For two groups, each possessing its own cluster values, a lever-arm rule such as equation 3.36 allows easy determination of the mixture cluster value for each property j.

$$C_{j,mix}^{M} = \sum_{g=1}^{N_g} \beta_g \cdot C_{jg} \qquad (3.36)$$

The definition of the molecular property cluster given in equation 3.32 applies to any cluster, including a combination of molecular fragments. The cluster of a mixture of two molecular groups or fragments is therefore:

$$C_{j,mix}^{M} = \frac{\Omega_{j,mix}^{M}}{AUP_{mix}^{M}} \qquad (3.37)$$

It is crucial to validate the inter-molecular conservation rule. First, the mixing rule for the dimensionless property operator (equation 3.35) is inserted into equation 3.37. Next, the definition of a molecular fragment cluster (equation 3.32) is rearranged and substituted. This proves the inter-molecular conservation rule according to equations 3.38-3.40. Summing equation 3.35 over all properties also gives $AUP_{mix}^{M} = \sum_g n_g \cdot AUP_g^M$, so the lever arms $\beta_g$ of equation 3.39 sum to unity.

$$C_{j,mix}^{M} = \frac{\sum_{g=1}^{N_g} n_g \cdot \Omega_{jg}^{M}}{AUP_{mix}^{M}} = \frac{\sum_{g=1}^{N_g} n_g \cdot C_{jg}^{M} \cdot AUP_{g}^{M}}{AUP_{mix}^{M}} \qquad (3.38)$$

$$\beta_g = \frac{n_g \cdot AUP_{g}^{M}}{AUP_{mix}^{M}} \qquad (3.39)$$

$$C_{j,mix}^{M} = \sum_{g=1}^{N_g} \beta_g \cdot C_{jg}^{M} \qquad (3.40)$$

3.3. Visual Molecular Design using Property Clusters

The conversion of property data to cluster values outlined in Section 3.1.2 for process design was developed by Eden et al. (2004). The conversion of molecular property data to cluster values follows a similar procedure, as given in Table 3.4.

| Step | Description | Equation |
|---|---|---|
| 1 | Calculate molecular property operators | 3.27 |
| 2 | Calculate dimensionless molecular property values | 3.30 |
| 3 | Calculate molecular AUP indices | 3.31 |
| 4 | Calculate molecular cluster values for each formulation | 3.32 |
| 5 | Plot the points on the ternary cluster diagram | -- |

Table 3.4: Calculation of molecular cluster values from GCM predicted property data

The primary visualization tool from the mass integration framework, the source-sink mapping discussed in Section 2.5.2, is utilized in the molecular synthesis framework. In the original cluster formulation for process design, mixing of two sources follows a straight line, i.e. the mixing operation can be optimized using lever-arm analysis. Analogously, combining or "mixing" two molecular fragments in the molecular cluster domain follows a straight line (an illustrative example is given in Figure 3.8). Design and optimization rules have been developed for property based process design problems (Eden et al., 2004; El-Halwagi et al., 2004); in the following, similar rules are presented for property based molecular design problems.

Figure 3.8: Group addition on ternary cluster diagram.
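The steps of Table 3.4 and the inter-molecular conservation rule can be illustrated with a minimal Python sketch. The two groups "G1" and "G2", their property contributions, and the reference values are hypothetical numbers chosen only to make the arithmetic easy to follow; they are not taken from any group contribution table, and the function names are assumptions of this sketch.

```python
from typing import Dict, List

def normalize(contrib: List[float], ref: List[float]) -> List[float]:
    """Eq. 3.30: divide each property contribution by its reference value."""
    return [p / r for p, r in zip(contrib, ref)]

def aup(omega: List[float]) -> float:
    """Eq. 3.31: augmented property index = sum of normalized operators."""
    return sum(omega)

def clusters(omega: List[float]) -> List[float]:
    """Eq. 3.32: cluster value = normalized operator / AUP."""
    total = aup(omega)
    return [w / total for w in omega]

def mix(groups: Dict[str, List[float]], counts: Dict[str, int]):
    """Combine groups: operators add linearly (eq. 3.35), and the mixture
    clusters are rebuilt from the lever arms of eq. 3.39 via eq. 3.40."""
    n_prop = len(next(iter(groups.values())))
    omega_mix = [sum(counts[g] * groups[g][j] for g in counts)
                 for j in range(n_prop)]
    aup_mix = aup(omega_mix)
    beta = {g: counts[g] * aup(groups[g]) / aup_mix for g in counts}
    c_lever = [sum(beta[g] * clusters(groups[g])[j] for g in counts)
               for j in range(n_prop)]
    return omega_mix, beta, c_lever

# Hypothetical contributions to three properties and their reference values.
REF = [2.0, 3.0, 4.0]
OMEGA = {
    "G1": normalize([2.0, 6.0, 12.0], REF),   # -> [1.0, 2.0, 3.0]
    "G2": normalize([8.0, 3.0, 4.0], REF),    # -> [4.0, 1.0, 1.0]
}

# A formulation made of two G1 groups and one G2 group.
omega_mix, beta, c_lever = mix(OMEGA, {"G1": 2, "G2": 1})
c_direct = clusters(omega_mix)   # eq. 3.37 applied to the mixture
```

In this sketch `c_lever` (equation 3.40) and `c_direct` (equation 3.37) coincide, which is the inter-molecular conservation rule, and each set of cluster values sums to one as required by equation 3.33.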
Design & Synthesis Rules

Rule 2: Two groups, G1 and G2, are added linearly on the ternary diagram, where the visualization arm $\beta_1$ describes the location of the G1-G2 molecule.

$$\beta_1 = \frac{n_1 \cdot AUP_1}{n_1 \cdot AUP_1 + n_2 \cdot AUP_2} \qquad (3.41)$$

Rule 3: More groups can be added as long as the Free Bond Number (FBN) is not zero.

$$FBN = \left[\sum_{g=1}^{N_g} n_g \cdot FBN_g\right] - 2\left[\sum_{g=1}^{N_g} n_g - 1\right] - 2 \cdot NO_{Rings} \qquad (3.42)$$

FBN is the free molecular bond number of the formulation, $n_g$ is the number of occurrences of group g, $FBN_g$ is the unique free bond number associated with group g, and $NO_{Rings}$ is the number of rings in the formulation.

Rule 4: The location of the final formulation is independent of the order of group addition. The location of the formulation is unique and depends only on the number of each group in the molecule.

For example, consider butyl methyl ether (C5H12O), which is made up of the groups CH3, CH2, and CH3O. This molecule can be constructed on the ternary cluster diagram, using three chosen properties, in a variety of ways. However, regardless of the sequence in which the groups are combined, the resulting molecule (CH3O-CH2-CH2-CH2-CH3) is located at the same unique point. To make sure Rule 3 is satisfied, the Free Bond Number (FBN) of each molecular fragment is placed within brackets on the ternary diagram. As proof of concept, an arbitrary feasibility region is represented by the dotted region in Figures 3.9A and 3.9B, expressing the targeted approach of building a molecular formulation to satisfy a set of given property constraints. In Figure 3.9A, the starting point is CH3O, followed by three CH2 fragments and then CH3. Figure 3.9B starts with CH3, then adds three CH2 groups, and finally CH3O. Both paths, as well as many others, end at the same point; hence the location of each molecular formulation is unique and independent of the group addition path.

Rule 5: For completeness, the final formulation must not have any free bonds, i.e. FBN has to be equal to zero.

Given a completed molecular formulation, three conditions must be satisfied for the designed molecule to be a valid solution to the process and molecular design problem. Rules 6 and 7 are the necessary conditions, while Rule 8 is the sufficient condition:

Rule 6: The cluster value of the formulation must be contained within the feasibility region of the sink on the ternary molecular cluster diagram.

Figure 3.9A: Group addition path A for formulation of butyl methyl ether (ternary cluster diagram with axes C1, C2, C3; path CH3O, CH3O-CH2, CH3O-CH2-CH2, CH3O-CH2-CH2-CH2, CH3O-CH2-CH2-CH2-CH3, with the free bond number of each fragment shown in brackets).

Figure 3.9B: Group addition path B for formulation of butyl methyl ether (ternary cluster diagram with axes C1, C2, C3; path CH3, CH3-CH2, CH3-CH2-CH2, CH3-CH2-CH2-CH2, CH3-CH2-CH2-CH2-CH3O, with the free bond number of each fragment shown in brackets).

Rule 7: The AUP value of the designed molecule must be within the range of the target. If the AUP value falls outside the range of the sink, the designed molecule is not a feasible solution.

Rule 8: For the designed molecule to match the target properties, the AUP value of the molecule has to match the AUP value of the sink at the same cluster location. If the design problem includes non-GC properties, those properties must be back calculated for the designed molecule using the corresponding GC property, and the resulting values have to match the target non-GC properties.

Now that the process and molecular design problems are both described in terms of clusters, a unifying framework exists for simultaneous solution of property driven design problems.
This is important because, as mentioned earlier, the clustering technique reduces the dimensionality of both problems, making it possible to identify the solutions visually, which is a significant advantage of this approach.

Figure 3.10 highlights the flow of information in the molecular property cluster framework for product and process design problems. The framework requires property data as input to the algorithm, but whether the data are dictated by the process, in terms of performance requirements, or by product design requirements does not affect the methodology behind the algorithm. Once the property targets are identified, they set the problem constraints and the selection of property operators used to represent the target. The availability of GC property models shapes how each property is used in the methodology. If models are available, then (primary) property operators are formulated directly; otherwise empirical equations are used to correlate the secondary property to a primary one. Having formulated the design problem, it is then mapped onto the molecular ternary cluster diagram. The property constraints are represented by a region, and the group building blocks by discrete points. Visual synthesis is performed by combining molecular fragments, followed by screening of the formulations using the developed necessary and sufficient conditions. The results are candidate molecules that possess the pre-determined property targets.

Figure 3.10: Molecular property cluster framework.

The contributions of the algorithm developed in this work are two-fold. First, the molecular design methodology bridges the gap between process and molecular design by incorporating property clusters into its CAMD strategy. Second, for systems that can be sufficiently described using three properties or functionalities, the molecular synthesis problem is solved visually by mixing molecular fragments on a ternary diagram using simple lever-arm rules.
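The free bond bookkeeping of Rules 3 and 5 and the necessary conditions of Rules 6 and 7 can be sketched in a few lines of Python. The group free bond numbers are the values shown in brackets for butyl methyl ether in Figures 3.9A and 3.9B; the function names, the `inside_sink` flag (standing in for the geometric test of Rule 6), and the cyclohexane check are assumptions introduced here for illustration only.

```python
def free_bond_number(counts, group_fbn, n_rings=0):
    """Eq. 3.42: FBN = sum(n_g * FBN_g) - 2 * (sum(n_g) - 1) - 2 * NO_Rings."""
    total_groups = sum(counts.values())
    bonds_offered = sum(n * group_fbn[g] for g, n in counts.items())
    return bonds_offered - 2 * (total_groups - 1) - 2 * n_rings

def passes_necessary_conditions(fbn, inside_sink, aup_value, aup_range):
    """Rule 5 (no free bonds), Rule 6 (cluster point inside the sink,
    supplied by the caller) and Rule 7 (AUP within the sink's AUP range)."""
    low, high = aup_range
    return fbn == 0 and inside_sink and low <= aup_value <= high

# Free bond numbers of the first order groups used for butyl methyl ether.
GROUP_FBN = {"CH3": 1, "CH2": 2, "CH3O": 1}

# Complete molecule CH3O-CH2-CH2-CH2-CH3 and a partial fragment CH3O-CH2-CH2.
ether = {"CH3O": 1, "CH2": 3, "CH3": 1}
fragment = {"CH3O": 1, "CH2": 2}

fbn_ether = free_bond_number(ether, GROUP_FBN)        # complete: no free bonds
fbn_fragment = free_bond_number(fragment, GROUP_FBN)  # one bond still open

# Ring closure consumes two further free bonds (six CH2 groups, one ring).
fbn_cyclohexane = free_bond_number({"CH2": 6}, GROUP_FBN, n_rings=1)
```

Because the free bond number depends only on the group counts, it is unaffected by the order in which the groups are added, which mirrors Rule 4.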
Application examples utilizing the developed techniques are presented in Chapters 4 and 5.

3.4. Algebraic Property Clustering Technique for Molecular Design

As stated previously, the ability to synthesize molecules within the clustering domain is key to bridging the gap between process and molecular design. However, the visualization approach limits the application range to cases that can be expressed using three properties, and not all design problems can be described by just three properties. For property integration through componentless design of processes, Qin et al. (2004) introduced an algebraic approach to overcome this bottleneck by taking advantage of the mathematical structure of the property clusters. Presented here is an analogous algebraic method that expands the application range of the molecular property clustering technique. The linear additive rules of the molecular operators are exploited further to set up the design problem as a set of linear algebraic equations.

Problem Statement: Synthesize molecular formulations, given a set of molecular building blocks (first order groups from GCM) represented by $n_g$ and a set of property performance requirements/constraints described by:

$$P_{ij}^{lower} \le P_{ij} \le P_{ij}^{upper} \qquad (3.43)$$

where i is the index of the molecular formulation and j is the index of the properties. The property constraints can be expressed in terms of the normalized property operators by combining the mixing rules for operators (equation 3.27) with the corresponding reference values.

$$\Omega_{j}^{min} \le \Omega_{ij} \le \Omega_{j}^{max} \qquad (3.44)$$

Recall that the generalized dimensionless additive rule for a given property j and $n_g$ molecular groups is written as:

$$\Omega_{j} = \sum_{g=1}^{N_g} n_g \cdot \Omega_{jg} \qquad (3.35)$$

Substituting equation 3.35 into the inequality given by equation 3.44 generates the following:

$$\Omega_{j}^{min} \le \sum_{g=1}^{N_g} n_g \cdot \Omega_{jg} \le \Omega_{j}^{max} \qquad (3.45)$$

Thus each property constraint can be expressed as a set of inequality expressions, which are the basis of the algebraic approach. These sets of equations place bounds on the feasibility region, referred to as the sink. Because each property is expressed in terms of two inequalities, each property can be combined with another property in two ways. In the original visualization approach for the molecular design framework, the bounds on three properties can be represented by a set of six points (Eden et al., 2004; Qin et al., 2004). Similarly, for systems made up of four properties, $\Omega_1$-$\Omega_4$, each with a lower and an upper limit, the bounds on the feasibility region can be described by eight points. These points are determined by the following rules (Eljack et al., 2007a):

Rule 9: Each property constraint is translated into the inequality expression of equation 3.45 and then split into two inequalities, one for the minimum (min) and one for the maximum (max).

$$\Omega_{j}^{min} \le \sum_{g=1}^{N_g} n_g \cdot \Omega_{jg} \qquad\qquad \sum_{g=1}^{N_g} n_g \cdot \Omega_{jg} \le \Omega_{j}^{max} \qquad (3.46)$$

Hence there will be 2NP inequalities (NP being the number of properties) that constitute the main set. The AUP values for this set of equations are calculated in order to determine the AUP range of the sink.

Rule 10: From the main set of equations, 2NP subsets are generated. Each subset contains one inequality for each of the properties used to describe the system.

For a four property system, the original set contains eight inequalities, from which eight subsets are developed. Each subset is made up of four inequalities, and only one of the two inequalities describing each property is used in each subset. For the normalized operators of the system ($\Omega_1$, $\Omega_2$, $\Omega_3$, $\Omega_4$), the following combinations from the original set should be used to generate the eight subsets of equations:

$$\begin{array}{ll}
(\Omega_1^{max},\Omega_2^{min},\Omega_3^{min},\Omega_4^{min}) & (\Omega_1^{min},\Omega_2^{max},\Omega_3^{max},\Omega_4^{max}) \\
(\Omega_1^{min},\Omega_2^{max},\Omega_3^{min},\Omega_4^{min}) & (\Omega_1^{max},\Omega_2^{min},\Omega_3^{max},\Omega_4^{max}) \\
(\Omega_1^{min},\Omega_2^{min},\Omega_3^{max},\Omega_4^{min}) & (\Omega_1^{max},\Omega_2^{max},\Omega_3^{min},\Omega_4^{max}) \\
(\Omega_1^{min},\Omega_2^{min},\Omega_3^{min},\Omega_4^{max}) & (\Omega_1^{max},\Omega_2^{max},\Omega_3^{max},\Omega_4^{min})
\end{array} \qquad (3.47)$$

As stated earlier, the subsets of equations are used to consider all possible ways in which the properties can be combined with each other to place bounds on the feasibility region.

Rule 11: The generated subsets of equations constitute the property constraints. In addition, structural constraints need to be included: non-negativity of the number of occurrences of each group (equation 3.48) and a limit on the length of a molecular formulation (equation 3.49).

$$n_g \ge 0, \qquad g \in \{1, \ldots, N_g\} \qquad (3.48)$$

$$\sum_{g=1}^{N_g} n_g \le NF \qquad (3.49)$$

Rule 12: For this algorithm, the limit on the number of first order group fragments (NF) needs to be specified ahead of design. To ensure that all valences in a molecule are satisfied, the following equation places another structural constraint on the design problem.

$$FBN = \left[\sum_{g=1}^{N_g} n_g \cdot FBN_g\right] - 2\left[\sum_{g=1}^{N_g} n_g - 1\right] \qquad (3.50)$$

Each group g has a free bond number (FBN) associated with it (e.g. CH3 has FBN = 1, CH2 has FBN = 2). It should be noted that equation 3.50 only takes non-cyclic compounds into account, as does the algebraic approach. Further studies are looking at how to include cyclic compounds within the framework.

Now that the main concepts behind this methodology have been established, an outline of the algebraic technique is given in Table 3.5. The proposed technique lacks the visualization aspects; however, it provides important contributions:

• It lowers the complexity of the design problem by setting it up as a set of linear algebraic equalities and inequalities.

• It expands the application range of the recently introduced molecular clustering technique to handle problems requiring more than three properties.
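The subset construction of Rule 10 can be sketched as a short Python helper. The function name `bound_subsets` and the ordering of the returned subsets are assumptions of this sketch: for each property in turn, one bound is flipped while all other properties keep the opposite bound, which yields 2NP combinations.

```python
def bound_subsets(n_prop):
    """Return the 2*NP bound combinations described by Rule 10.

    Each entry is a tuple with one label per property: "min" means the
    inequality sum(n_g * Omega_jg) >= Omega_j_min is used for that property,
    "max" means sum(n_g * Omega_jg) <= Omega_j_max is used.
    """
    subsets = []
    for flipped, base in (("max", "min"), ("min", "max")):
        for k in range(n_prop):
            subsets.append(tuple(flipped if j == k else base
                                 for j in range(n_prop)))
    return subsets

# Four properties give eight subsets of four inequalities each.
subsets = bound_subsets(4)
for number, combo in enumerate(subsets, start=1):
    print(number, combo)
```

For three properties the same helper returns six combinations, matching the six bounding points used in the visual approach.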
The algebraic approach opens a new area of research concentrating on tools that combine this algebraic method with other mathematical design approaches, e.g. MILP or LP optimization methods.

| Step | Description | Equation |
|---|---|---|
| 1 | Transform the given property data into molecular property operator terms | 3.27 |
| 2 | Express the property constraints as inequalities, forming the main set of inequality equations | 3.43 - 3.44 |
| 3 | Determine the AUP range of the sink | 3.31 |
| 4 | Develop the subsets of inequality equations following Rule 10 | -- |
| 5 | Generate the structural constraints | 3.48 - 3.50 |
| 6 | Solve each subset of linear inequalities together with the structural constraints in order to determine the minimum and maximum $n_g$ of each group g. The objective is first to minimize and then to maximize the AUP of each subset. This step can be solved using various programs (MATLAB, Visual C++, etc.). For the examples shown in this chapter, Microsoft Excel was used. | -- |
| 7 | Exclude the solutions of any subset whose AUP values do not fall within the AUP range of the sink. The range of valid $n_g$ values must then satisfy all remaining solutions. Thus, if one solution gives g1 between 3 and 6 and another between 2 and 10, the true range that satisfies all constraints is 3-6. | -- |
| 8 | Solutions for $n_g$ will not always be integer values; the solutions are therefore rounded up for minimum values and rounded down for maximum values. This step can be bypassed by adding a constraint that defines $n_1$, $n_2$, ..., $n_g$ as integers. | -- |
| 9 | Generate all the feasible formulations and perform the final check that all property constraints are satisfied | -- |

Table 3.5: Outline of algebraic molecular cluster approach.

3.4.1. Proof of Concept Example

To highlight the different aspects of this new algebraic molecular clustering method, a simple design problem is presented.
Problem statement: Given a system described by critical volume ($V_c$), heat of vaporization ($H_v$), heat of fusion ($H_{fus}$) and boiling temperature ($T_b$), and the molecular fragments CH2 and OH as building blocks, identify molecular formulations that satisfy the following performance requirements:

$$\begin{array}{c}
310 \le V_c \ (\mathrm{cm^3/mol}) \le 610 \\
90 \le H_v \ (\mathrm{kJ/mol}) \le 120 \\
20 \le H_{fus} \ (\mathrm{kJ/mol}) \le 64 \\
450 \le T_b \ (\mathrm{K}) \le 560
\end{array} \qquad (3.51)$$

| g | Group | FBN | $V_c$ (cm3/mol) | $H_v$ (kJ/mol) | $H_{fus}$ (kJ/mol) | $T_b$ (K) |
|---|---|---|---|---|---|---|
| 1 | CH2 | 2 | 56.28 | 4.91 | 2.64 | 0.9225 |
| 2 | OH | 1 | 30.61 | 24.21 | 4.79 | 3.21 |

Table 3.6: Property data for each molecular group.

The group contribution (GC) property data of the molecular groups are given in Table 3.6. In addition, the additive rules for the molecular operators of the targeted properties are given by equation 3.52 (Constantinou and Gani, 1994; Marrero and Gani, 2001). The formulation of the operators from GC property models is outlined in the original molecular clustering framework (Section 3.2.3; Eljack et al., 2006).

$$\begin{array}{ll}
V_c - v_{c0} = \sum_{g=1}^{N_g} n_g \cdot v_{c1} & H_v - h_{v0} = \sum_{g=1}^{N_g} n_g \cdot h_{v1} \\
H_{fus} - h_{fus0} = \sum_{g=1}^{N_g} n_g \cdot h_{fus1} & \exp\left(\dfrac{T_b}{t_{bo}}\right) = \sum_{g=1}^{N_g} n_g \cdot t_{b1}
\end{array} \qquad (3.52)$$

Other constraints are placed on the problem: the length of the molecule cannot exceed 15 fragments, and no cyclic compounds should be formed.

Given equations 3.51 and 3.52 and the information in Table 3.6, the data for the four properties (critical volume, heat of vaporization, heat of fusion and boiling temperature, indexed 1 to 4) can be transformed to $\Omega_1$, $\Omega_2$, $\Omega_3$, $\Omega_4$ using the normalized property operator definition (equation 3.30) along with the reference values 20, 1.0, 0.5 and 7.0, respectively. The same reference values are also used to convert the group data given in Table 3.6. These values were selected in order to keep the operators within the same order of magnitude. The resulting $\Omega$ values for all four property constraints are shown in Table 3.7.
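As a cross-check on this small problem, the design space can also be enumerated directly. The sketch below is not the subset-based procedure of Table 3.5: it is a brute-force alternative, written in Python under stated assumptions, that normalizes the group data of Table 3.6 with the reference values above, tests every integer combination of CH2 and OH against the normalized bounds reported in Table 3.7, and applies the structural constraints. The upper bound used for the heat of fusion operator (133.612) is recomputed here from equation 3.51 and the listed lower bound, and should be treated as an assumption; all function and variable names are hypothetical.

```python
from itertools import product

REF = (20.0, 1.0, 0.5, 7.0)   # reference values for Vc, Hv, Hfus, Tb
GROUPS = {                    # Table 3.6: free bond number and contributions
    "CH2": {"fbn": 2, "contrib": (56.28, 4.91, 2.64, 0.9225)},
    "OH":  {"fbn": 1, "contrib": (30.61, 24.21, 4.79, 3.21)},
}
OMEGA_MIN = (15.105, 78.26, 45.612, 1.291)     # normalized lower bounds
OMEGA_MAX = (30.102, 108.26, 133.612, 2.213)   # third entry is an assumption
MAX_GROUPS = 15                                # maximum formulation length

def omega(counts):
    """Normalized operators of a formulation (eq. 3.35 with eq. 3.30)."""
    return tuple(
        sum(n * GROUPS[g]["contrib"][j] / REF[j] for g, n in counts.items())
        for j in range(len(REF))
    )

def free_bond_number(counts):
    """Eq. 3.50 for acyclic formulations."""
    total = sum(counts.values())
    return sum(n * GROUPS[g]["fbn"] for g, n in counts.items()) - 2 * (total - 1)

feasible = []
for n_ch2, n_oh in product(range(MAX_GROUPS + 1), repeat=2):
    counts = {"CH2": n_ch2, "OH": n_oh}
    if not 1 <= n_ch2 + n_oh <= MAX_GROUPS:
        continue                      # length constraint (eq. 3.49)
    if free_bond_number(counts) != 0:
        continue                      # all valences must be satisfied
    values = omega(counts)
    if all(lo <= v <= hi for lo, v, hi in zip(OMEGA_MIN, values, OMEGA_MAX)):
        feasible.append((n_ch2, n_oh))

print(feasible)   # [(7, 2), (8, 2), (9, 2)]
```

Under these assumptions the enumeration returns two OH groups combined with seven, eight or nine CH2 groups. Enumeration is only practical for a handful of groups; it is shown here because it makes each constraint explicit.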
The AUP range of the feasibility region (sink) was calculated to be 141.19 - 273.27.

| | $\Omega_{Vc}$ | $\Omega_{Hv}$ | $\Omega_{Hfus}$ | $\Omega_{Tb}$ |
|---|---|---|---|---|
| $\Omega^{min}$ | 15.105 | 78.26 | 45.612 | 1.291 |
| $\Omega^{max}$ | 30.102 | 108.26 | 133.612 | 2.213 |

Table 3.7: Calculated $\Omega$ for the given property constraints.

Next, the provided data along with equation 3.46 are used to generate the main set of linear inequalities, from which eight subsets are generated. The inequalities of subset one, according to equation 3.47, are given in equation 3.53. The remaining seven subsets are generated in the same way. Finally, the structural constraints are given in equations 3.54 and 3.55.

$$\begin{array}{l}
2.81 \cdot g_1 + 1.53 \cdot g_2 \le 30.10 \\
4.91 \cdot g_1 + 24.21 \cdot g_2 \ge 78.26 \\
5.28 \cdot g_1 + 9.571 \cdot g_2 \ge 45.61 \\
0.131 \cdot g_1 + 0.459 \cdot g_2 \ge 1.29
\end{array} \qquad (3.53)$$

$$g_1 \ge 0, \qquad g_2 \ge 0, \qquad g_1 + g_2 \le 15 \qquad (3.54)$$

$$\left[g_1 \cdot FBN_1 + g_2 \cdot FBN_2\right] - 2\left[g_1 + g_2 - 1\right] = 0 \qquad (3.55)$$

The results from solving the subsets of equations are summarized in Table 3.8. The solutions to the minimization problem of subsets 2, 5, 7 and 8 are excluded because their AUP values are outside the AUP range of the feasibility region. The results show that HO-(CH2)7-OH, HO-(CH2)8-OH, and HO-(CH2)9-OH are the formulations that satisfy all of the property and structural constraints. The true physical properties of the three candidate molecules were back calculated from the operator values of the solution.

| Subset | $g_1+g_2$ | $g_1$ | $g_2$ | Objective | FBN | $\Omega_1$ | $\Omega_2$ | $\Omega_3$ | $\Omega_4$ | AUP |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8.1 | 6.1 | 2 | min | 0 | 20.2 | 78.3 | 51.2 | 1.7 | 151.4 |
| | 11.6 | 9.6 | 2 | max | 0 | 30.1 | 95.6 | 69.9 | 2.2 | 197.8 |
| 2 | 7 | 5 | 2 | min | 0 | 17.2 | 73.1 | 45.6 | 1.6 | 137.4 |
| | 14.2 | 12.2 | 2 | max | 0 | 37.4 | 108.3 | 83.5 | 2.5 | 231.6 |
| 3 | 8.1 | 6.1 | 2 | min | 0 | 20.2 | 78.3 | 51.2 | 1.7 | 151.4 |
| | 15 | 13 | 2 | max | 0 | 39.6 | 112.3 | 87.8 | 2.6 | 242.3 |
| 4 | 8.1 | 6.1 | 2 | min | 0 | 20.2 | 78.3 | 51.2 | 1.7 | 151.4 |
| | 15 | 13 | 2 | max | 0 | 39.6 | 112.3 | 87.8 | 2.6 | 242.3 |
| 5 | 6.3 | 4.3 | 2 | min | 0 | 15.1 | 69.4 | 41.7 | 1.5 | 127.8 |
| | 11.8 | 9.8 | 2 | max | 0 | 30.7 | 96.7 | 71.0 | 2.2 | 200.6 |
| 6 | 8.1 | 6.1 | 2 | min | 0 | 20.2 | 78.3 | 51.2 | 1.7 | 151.4 |
| | 11.6 | 9.6 | 2 | max | 0 | 30.1 | 95.6 | 69.9 | 2.2 | 197.8 |
| 7 | 7 | 5 | 2 | min | 0 | 17.2 | 73.1 | 45.6 | 1.6 | 137.4 |
| | 11.6 | 9.6 | 2 | max | 0 | 30.1 | 95.6 | 69.9 | 2.2 | 197.8 |
| 8 | 4.8 | 2.8 | 2 | min | 0 | 11.0 | 62.3 | 34.1 | 1.3 | 108.8 |
| | 11.6 | 9.6 | 2 | max | 0 | 30.1 | 95.6 | 69.9 | 2.2 | 197.8 |

Table 3.8: Results of solving the molecular synthesis problem.

In this section, an algebraic technique for molecular design based on molecular property clusters has been introduced. Using the developed concepts of molecular property operators (Section 3.2.3), this algebraic approach extends the application range of the original methodology to include more than three properties. The molecular design problem is solved to identify all possible formulations within the design space, given a set of molecular building blocks as well as property and structural constraints. The linearity of the molecular operators plays an important role because it lowers the complexity of the design problem, which is formulated as a set of linear algebraic equations. A simple example with constraints on four properties was solved successfully using this technique. The developed algebraic approach can be applied to problems that require a higher number of properties as well as additional groups. The resulting optimization problems are simply larger but still consist of linear algebraic equations, thus lowering the complexity from an MINLP to an LP problem (Eljack et al., 2007a).

4. Molecular Synthesis Application Examples

4.1. Example 1 - Aniline Extraction Solvent Design

Liquid-liquid extraction involves the transfer of a substance from one liquid phase to another based on solution preferences. The success of the extraction depends on the immiscibility of the two liquids and on the component's affinity for one liquid over the other. Often one of the liquid phases is an aqueous solution and the other is an organic solvent. The selectivity of the solvent is essential: the solute in the bulk solution (aqueous phase) must have more affinity towards the added solvent, allowing for mass transfer of the solute from the bulk solution to the solvent. In this molecular design application example, an aqueous solution containing aniline is investigated. Aniline must be removed from the solution in order to achieve product specifications (Eden et al., 2002).

4.1.1. Problem statement

Identify a solvent that will extract aniline from an aqueous solution. The required solvent characteristics are: it must be immiscible with water, its solubility in water must be below that of aniline, and its boiling point must differ from that of aniline to allow regeneration of the solvent after extraction. The property data and the molecular groups given as starting building blocks are summarized in Table 4.1.

| Property | Lower Bound | Upper Bound |
|---|---|---|
| $T_b$ (K) | 350 | 431 |
| $H_v$ (kJ/mol) | 36.7 | 46.8 |
| $V_m$ (cm3/mol) | 115 | 180 |
| $R_{ij}$ (MPa^1/2) | - | 24 |

Molecular building blocks: CH3, CH2, CH2CO, CH3CO, CH2O, CH3O

Table 4.1: Property data and molecular groups for aniline design problem

4.1.2. Molecular Synthesis

The system is described by boiling temperature ($T_b$), heat of vaporization ($H_v$), molar volume ($V_m$) and solubility parameter ($R_{ij}$). The visual tool only allows for three properties; thus the first three properties are used in the design of the solvent and the last is used as a screening criterion.
The property operators and the corresponding reference values required to transform the design data into molecular property clusters are given by equations 4.1-4.3.

$$\exp\left(\frac{T_b}{t_{bo}}\right) = \sum_{g=1}^{N_g} n_g \cdot t_{b1} \qquad T_{b,ref} = 0.1\ \mathrm{K} \qquad (4.1)$$

$$\Delta H_v - h_{vo} = \sum_{g=1}^{N_g} n_g \cdot h_{v1} \qquad H_{v,ref} = 0.5\ \mathrm{kJ/mol} \qquad (4.2)$$

$$V_m - d = \sum_{g=1}^{N_g} n_g \cdot v_{1} \qquad V_{m,ref} = 2.0\ \mathrm{cm^3/mol} \qquad (4.3)$$

Solubility is measured as a function of the interactions between two substances (i, j). According to Hansen (1967), in a three-dimensional space the solubility of component i is a function of non-polar ($\delta_d^i$), polar ($\delta_p^i$) and hydrogen-bonding ($\delta_h^i$) parameters. Solubility of component i in j is considered feasible if the radius of interaction ($R_{ij}$) of i is found within that of j, i.e. $R_{ij} \le R_j$. According to Barton (1985), the solubility parameter can be calculated as follows (see Appendix B):

$$R_{ij} = \sqrt{4\left(\delta_d^i - \delta_d^j\right)^2 + \left(\delta_p^i - \delta_p^j\right)^2 + \left(\delta_h^i - \delta_h^j\right)^2} \qquad (4.4)$$

According to the Hoftyzer and Van Krevelen method, the Hansen solubility parameters may be predicted from group contributions using the following set of equations (Van Krevelen, 1976):

$$\delta_d = \frac{\sum F_{di}}{V_m} \qquad \delta_p = \frac{\sqrt{\sum F_{pi}^2}}{V_m} \qquad \delta_h = \sqrt{\frac{\sum E_{hi}}{V_m}} \qquad (4.5)$$

$F_{di}$, $F_{pi}$ and $E_{hi}$ are the contributions of group i to the dispersion, polar, and hydrogen-bonding components of the solubility parameter, respectively (see Appendix B). Note that the molar volume ($V_m$) of a molecule is estimated by group contribution methods (Constantinou et al., 1995).

$$V_m - d = \sum_i g_i \cdot v_{1i} \qquad (4.6)$$

The bounded search space (sink), represented by the dotted line in Figure 4.1, is determined by six unique points according to Rule 1. Figure 4.2 illustrates the molecular fragments included in the solvent synthesis. Eight different molecules are formulated (M1-M8) (see Figure 4.3). All of the candidates are structurally sound molecules whose locus lies within the sink. Next, the AUP values of the candidates are checked to see whether they lie within the AUP range of the sink, 182.4 - 215.9.
Candidates M7 and M8 fail to satisfy this condition (see Table 4.2); the remaining molecules, M1-M6, satisfy all other criteria.

Figure 4.1: Feasibility region for aniline extraction solvent (ternary cluster diagram with axes C1, C2, C3).

Figure 4.2: Aniline extraction solvent synthesis problem (ternary cluster diagram with axes C1, C2, C3, showing the feasibility region and the molecular groups G1: CH3, G2: CH2, G3: CH2CO, G4: CH3CO, G5: CH2O, G6: CH3O).

Figure 4.3: Candidates formulated for aniline extraction solvent (ternary cluster diagram with axes C1, C2, C3). Candidate molecules:
M1: CH3-(CH2)4-CH3
M2: CH3-(CH2)5-CH3
M3: CH3-(CH2)6-CH3
M4: CH3-(CH2)7-CH3
M5: CH3-(CH2)3-CH2CO-CH3
M6: CH3-(CH2)4-CH3CO
M7: CH3-(CH2)3-CH3O
M8: CH3-(CH2)8-CH2O-CH3

| Formulation | AUP | $T_b$ (K) | $H_v$ (kJ/mol) | $V_m$ (cm3/mol) | $R_{ij}$ (MPa^1/2) |
|---|---|---|---|---|---|
| M1 | 153.8 | 347.2 | 31.8 | 130.0 | 15.5 |
| M2 | 181.0 | 379.1 | 36.7 | 146.4 | 15.3 |
| M3 | 208.3 | 406.6 | 41.6 | 162.9 | 15.1 |
| M4 | 235.5 | 430.9 | 46.5 | 179.3 | 15.0 |
| M5 | 218.4 | 436.0 | 46.3 | 141.8 | 14.9 |
| M6 | 215.7 | 428.7 | 46.8 | 140.4 | 9.6 |
| M7 | 310.6 | 486.0 | 61.4 | 218.8 | 10.7 |
| M8 | 154.6 | 363.1 | 32.5 | 120.2 | 11.6 |

Table 4.2: Candidate solvents for aniline extraction

The final step is to determine the solubility parameter values for each of the designed molecules to address the screening criterion. The calculation of the solubility parameters and all required information are provided in Appendix B. Table 4.2 summarizes the four property values of the designed molecules. The solubility parameter of every candidate is below the limit set by aniline ($R_{ij} \le 24$ MPa^1/2); thus the generated candidates satisfy the screening criterion.
For verification, the property values predicted by the algorithm, which are based on the contributions of the individual molecular fragments, are checked against experimental values (see Table 4.3). The predicted values of the first order group contribution properties boiling temperature ($T_b$) and molar volume ($V_m$) deviate from the experimental values by about 2% or less, those of the heat of vaporization ($H_v$) by up to about 15%, and those of the Hansen solubility parameter by less than 5%. The precision of the property prediction method (first order GCM) is sufficient for the design tools developed here, as the method is intended as a first-cut conceptual design approach to determining feasible candidates. The precision of the predicted properties plays a role in subsequent validation steps, as additional analysis and screening are needed to refine the list of candidates and to rule out impractical alternatives. In addition, the accuracy of the predicted properties depends only on the group contribution models and is not a reflection of the presented molecular clustering algorithm.

| Solvent | $T_b$ (K) [1] | $H_v$ (kJ/mol) [1] | $V_m$ (cm3/mol) [1] | $R_{ij}$ (MPa^1/2) [2] |
|---|---|---|---|---|
| Predicted values | | | | |
| n-hexane | 347.2 | 31.8 | 130.0 | 15.5 |
| n-heptane | 379.1 | 36.7 | 146.4 | 15.3 |
| n-octane | 406.6 | 41.6 | 162.9 | 15.1 |
| n-nonane | 430.9 | 46.5 | 179.3 | 15.0 |
| Experimental values | | | | |
| n-hexane | 342.2 | 37.6 | 130.5 | 14.9 |
| n-heptane | 371.6 | 42.6 | 147.4 | 15.3 |
| n-octane | 398.9 | 45.9 | - | 15.5 |
| n-nonane | 423.7 | 50.5 | 179.7 | 15.6 |
| Percent error of predicted values compared to experimental values | | | | |
| n-hexane | 1.5% | 15.4% | 0.4% | 4.1% |
| n-heptane | 2.0% | 13.8% | 0.7% | 0.1% |
| n-octane | 1.9% | 9.3% | - | 2.5% |
| n-nonane | 1.7% | 7.8% | 0.2% | 4.0% |

Experimental data obtained from the following references: [1] Perry's Chemical Engineering Handbook (Green, 1997); [2] Handbook of Solubility Parameters (Barton, 2000)

Table 4.3: Accuracy of predicted property values

Formulations 2-heptanone (M6) and 4-heptanone (M5), from the provided list of candidates, possess the same molecular formula (C7H14O), although the molecules are synthesized from different first order fragments. The molecular makeup of 2-heptanone involves CH3, CH2, and CH3CO, while 4-heptanone includes CH3, CH2 and CH2CO. Hence, the molecular clustering technique is able to synthesize isomers; however, in order to differentiate between the isomers in terms of predicted properties, the inclusion of second and third order groups is critical. Section 6.2 focuses on this issue as part of the future directions for advancing the molecular clustering methodology.

4.2. Example 2 - Blanket Wash Solvent Design

Lithographic printers are used to print a variety of products including books and newspapers. Offset presses in industry transfer the printed image from a plate to a rubber or plastic blanket and then to the paper or other medium being used. The quality of the printed images depends strongly on the cleanliness of the blanket. Blanket washes, consisting of a variety of solvents, are used to remove ink, paper dust, and other debris from the blanket cylinders. They are generally petroleum-based solvents that consist of volatile organic compounds (VOCs), which are found in the printing ink as well. Understandably, there is considerable concern regarding the effects of such solvents on the environment as well as their direct effect on human health.
Blanket wash solvents are designed to address specific needs: minimal drying time, liquid state at room temperature, and low vapor pressure (VP). Because the solvent must dissolve the ink, the solubility parameter ($R_{ij}$) of the solvent is also an important factor. Such demands on product performance can be described using properties. The drying time is related to the heat of vaporization ($H_v$), and the state of the solvent at room temperature is directly related to the melting ($T_m$) and boiling ($T_b$) temperatures.

4.2.1. Problem Statement

The objective of this case study is to design a blanket wash solvent for a phenolic resin printing ink, specifically "Super Bakacite 1001, Reichold". Formulations are designed from a bank of 7 possible groups, with a maximum formulation length of 7 groups. The design of the solvent involves the properties and constraints listed in Table 4.4. Sinha and Achenie (2001) solved this design problem as a mixed-integer non-linear programming (MINLP) problem. In this work, the problem is solved using the developed molecular property clusters (Eljack and Eden, 2007).

| Property | Lower Bound | Upper Bound |
|---|---|---|
| $H_v$ (kJ/mol) | 20 | 60 |
| $T_b$ (K) | 350 | 400 |
| $T_m$ (K) | 150 | 250 |
| VP (mmHg) | 100 | --- |
| $R_{ij}$ | 0 | 19.8 |

Table 4.4: Property constraints for blanket wash solvent

4.2.2. Property Prediction (GCM)

Visualization of the design problem on the ternary diagram dictates the use of only three properties. In this approach, heat of vaporization, boiling temperature and melting temperature are used, with vapor pressure and solubility as final screening properties. First order group contribution (GCM) equations are used to predict the first three properties (Constantinou and Gani, 1994):

$$\Delta H_v - h_{vo} = \sum_i g_i \cdot h_{vi} \qquad (4.7)$$

$$\exp\left(\frac{T_b}{t_{bo}}\right) = \sum_i g_i \cdot t_{bi} \qquad (4.8)$$

$$\exp\left(\frac{T_m}{t_{mo}}\right) = \sum_i g_i \cdot t_{mi} \qquad (4.9)$$

The group contribution method (GCM) does not include vapor pressure in its bank of properties.
Vapor pressure is predicted using the McGowon Hovarth equation as a function of the boiling and operating temperatures (Sinha and Achenie, 2001):

$$\log VP\ (\mathrm{mmHg}) = 5.58 - 2.7\left(\frac{T_{bp}}{T}\right)^{1.7} \qquad (4.10)$$

The effectiveness of the designed solvent depends strongly on its power to dissolve the ink, i.e. on the solubility power of the designed solvent. The interactions between the phenolic resin molecules (solute) and the solvent molecules are very important in this design problem. Solubility is determined according to the Hansen parameters (see equations 4.4 - 4.6 and Appendix B).

4.2.3. Molecular Property Operators

The properties used to achieve the targets are heat of vaporization ($H_v$), boiling temperature ($T_b$) and melting temperature ($T_m$). Only these properties are used initially, so that the design problem can be visualized on the ternary diagram; however, an algebraic and optimization based approach to solving molecular design problems with more than three properties has recently been introduced by Eljack and Eden (2007). The other properties, vapor pressure and solubility, are non-group contribution properties and are therefore used after molecular synthesis to screen the designed solvents. The property operators derived from equations 4.7-4.9 and their reference values are summarized in Table 4.5. Notice again that the RHS of each equation exhibits a linear additive rule.

| Property | LHS of equation, $\psi_j^M$ | RHS of equation, 1st order GC expression | Reference value |
|---|---|---|---|
| Standard heat of vaporization | $\Delta H_v - h_{vo}$ | $\sum_{g=1}^{N_g} n_g \cdot h_v$ | 20 |
| Normal boiling temperature | $\exp\left(T / t_{bo}\right)$ | $\sum_{g=1}^{N_g} n_g \cdot t_b$ | 7 |
| Melting temperature | $\exp\left(T / t_{mo}\right)$ | $\sum_{g=1}^{N_g} n_g \cdot t_m$ | 7 |

Table 4.5: Property operators for blanket wash solvent problem.

4.2.4. Molecular Synthesis

The problem is visualized by converting the property targets to cluster values following the methodology described in Section 3.2.
The property constraints are represented as a feasibility region, which is identified according to the feasibility rules highlighted in Section 3.1.5. The resulting ternary diagram is shown in Figure 4.4, where the dotted lines represent the feasibility region for the solvent design.

Figure 4.4: Feasibility region for blanket wash solvent problem. (Ternary cluster diagram with axes C1, C2 and C3.)

Figure 4.5: Blanket wash solvent synthesis problem. (Ternary cluster diagram with axes C1, C2 and C3 showing the molecular groups G1: CH3, G2: CH2, G3: CH2O, G4: CH3O, G5: CH2CO, G6: CH3CO, G7: COOH.)

Notice that even though some of the property operators formulated earlier are very complex, molecular synthesis on the ternary diagram is still simple because these operators obey simple linear additive rules. It should be noted again that the location of a formulated molecule is independent of the group addition path. The molecules designed in this domain can be made up of seven chemical groups, with carboxyl and methyl groups among the selection. All the groups used in this synthesis problem are presented in Figure 4.5. It should be emphasized that these are the same fragments used by Sinha and Achenie (2001) to solve their MINLP problem.

A number of candidate molecules, M1-M11, are formulated on the ternary diagram (see Figure 4.6). Exhausting all possible combinations of molecular building blocks to provide a complete list of candidates, however, requires a software implementation. The designed formulations are valid only if they satisfy the conditions summarized by Rules 4-8 in Section 3.2. The cluster values of the designed molecules are checked to make sure that they lie within the sink.
The value of the augmented property index (AUP) of each designed molecule must lie within the AUP range of the sink, which in this example is 2.28-5.09 (see Table 4.6). Molecules M9-M11 fail to satisfy this condition. The final necessary and sufficient condition is that the property values of the remaining formulations must lie within the upper and lower constraints placed on the molecular design problem, including the non-GC properties. The property values for the new formulations are back calculated using the methodology outlined earlier in Section 3.2. The remaining formulated solvents satisfy the necessary and sufficient conditions. Consequently, M1-M8 are the final valid formulations shown in Figure 4.7.

The candidate molecules M1-M7 identified visually in this work correspond to the solutions found by the MINLP approach used by Sinha and Achenie (2001). Formulation M1 is a cyclic molecule. Such molecules can be excluded ahead of design by simply placing another constraint on the problem. Molecules M2, M3 and M4 are ethers, and M7 is methyl ethyl ketone (MEK), commonly found in printing inks (Sinha and Achenie, 2001). The key point is that blanket wash solvents are usually ketones or ethers, and the aforementioned formulations are common components in commercial blanket wash solvents (United States Environmental Protection Agency (EPA), 1997). The final valid formulation, M8, is heptane. Although its property values match the targets, it is not an ideal solvent for this case because it is highly flammable (Material Safety Data Sheet, 2006).
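The chain from group counts to operators, AUP, cluster values and back-calculated properties can be illustrated for heptane (M8). The following is a minimal sketch, not the implementation used in this work. It assumes the first-order constants and the CH3 and CH2 contributions as recalled from Constantinou and Gani (1994); these numbers are not quoted in this chapter and should be checked against the original tables. The operating temperature of 373 K in equation 4.10 is also an assumption, chosen because it reproduces the tabulated vapor pressure. All function and variable names are hypothetical.

```python
import math

# ASSUMPTION: first-order Constantinou-Gani constants and group contributions,
# recalled from the published tables rather than taken from this chapter.
H_V0 = 6.829     # kJ/mol
T_B0 = 204.359   # K
T_M0 = 102.425   # K
GROUPS = {
    # group: (h_v contribution [kJ/mol], t_b contribution, t_m contribution)
    "CH3": (4.116, 0.8894, 0.4640),
    "CH2": (4.650, 0.9225, 0.9246),
}
REF = (20.0, 7.0, 7.0)  # operator reference values from Table 4.5


def operators(counts):
    """Normalized molecular operators: linear sums of group contributions."""
    sums = [0.0, 0.0, 0.0]
    for group, n in counts.items():
        for j, contribution in enumerate(GROUPS[group]):
            sums[j] += n * contribution
    return [s / ref for s, ref in zip(sums, REF)]


def properties(omega):
    """Back-calculate H_v, T_b and T_m from the operators (eqs. 4.7-4.9)."""
    h_v = H_V0 + omega[0] * REF[0]
    t_b = T_B0 * math.log(omega[1] * REF[1])
    t_m = T_M0 * math.log(omega[2] * REF[2])
    return h_v, t_b, t_m


def vapor_pressure(t_b, temperature):
    """Equation 4.10: log10 VP(mmHg) = 5.58 - 2.7 (T_b / T)^1.7."""
    return 10 ** (5.58 - 2.7 * (t_b / temperature) ** 1.7)


heptane = {"CH3": 2, "CH2": 5}
omega = operators(heptane)
aup = sum(omega)                       # augmented property index
clusters = [w / aup for w in omega]    # cluster values, which sum to one
h_v, t_b, t_m = properties(omega)
vp = vapor_pressure(t_b, 373.0)        # ASSUMPTION: operating temperature of 373 K

print(f"AUP = {aup:.2f}, clusters = {[round(c, 3) for c in clusters]}")
print(f"H_v = {h_v:.2f} kJ/mol, T_b = {t_b:.2f} K, T_m = {t_m:.2f} K, VP = {vp:.1f} mmHg")
```

Under these assumptions the sketch reproduces, within rounding, the M8 row of Table 4.6 (AUP 3.28, 38.31 kJ/mol, 379.07 K, 175.55 K, about 638 mmHg). Because the operators are plain sums over groups, the result does not depend on the order in which the groups are added.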
| Formulation | AUP | $H_v$ (kJ/mol) | $T_b$ (K) | $T_m$ (K) | VP (mmHg) | $R_{ij}$ (MPa^1/2) |
|---|---|---|---|---|---|---|
| M1 | 3.20 | 33.91 | 359.14 | 201.24 | 1117.85 | 10.88 |
| M2 | 3.08 | 33.99 | 355.34 | 189.86 | 1240.95 | 15.39 |
| M3 | 3.10 | 34.67 | 364.49 | 183.38 | 963.57 | 12.83 |
| M4 | 3.17 | 35.81 | 363.09 | 186.26 | 1001.85 | 18.09 |
| M5 | 3.47 | 36.15 | 370.61 | 211.95 | 811.54 | 15.82 |
| M6 | 3.61 | 36.74 | 382.51 | 216.49 | 578.05 | 11.31 |
| M7 | 3.17 | 35.10 | 354.80 | 193.13 | 1259.28 | 16.84 |
| M8 | 3.28 | 38.31 | 379.07 | 175.55 | 638.03 | 19.77 |
| M9 | 6.79 | 68.87 | 457.92 | 286.76 | 56.69 | 13.83 |
| M10 | 7.79 | 78.17 | 494.54 | 297.68 | 16.55 | 12.68 |
| M11 | 7.83 | 74.75 | 535.55 | 292.08 | 3.86 | 11.33 |

Table 4.6: Candidate blanket wash solvents.

Figure 4.6: Candidate formulations for blanket wash solvent. (Ternary cluster diagram with axes C1, C2 and C3 showing the candidate molecules M1: CH2O-CH2-(CH2O)2, M2: CH3-CH2-CH2O-CH3O, M3: CH3-CH2-(CH2O)2-CH3, M4: CH3O-(CH2)3-CH3, M5: CH3O-CH2O-CH3O, M6: CH2O-(CH2O)2-CH2O, M7: CH3-CH2CO-CH3, M8: CH3-(CH2)5-CH3, M9: CH3CO-COOH, M10: CH3CO-(CH2)2-COOH, M11: CH3-CH2-(CH2O)2.)

Figure 4.7: Valid formulations for blanket wash solvents. (Ternary cluster diagram with axes C1, C2 and C3 showing the valid formulations M1-M8, with the same structures as listed for Figure 4.6.)

4.3. Summary

A significant result of the developed methodology is that for problems that can be satisfactorily described by just three properties, the molecular design problem is solved visually on a ternary diagram, irrespective of how many molecular fragments are included in the search space.
This solvent design case study also showed that, regardless of the group addition path chosen, the final location of a designed formulation on the molecular ternary diagram is unique.

5. Simultaneous Process and Molecular Design

The molecular clustering technique was initially developed as a means of providing a bridge to facilitate the flow of information between process and molecular design algorithms. In this chapter two application examples are solved using the developed simultaneous technique.

5.1. Application Example 1 - Metal Degreasing Process

A case study is presented here to show the merits of using the simultaneous approach via GCM and property clusters. Figure 5.1 illustrates a metal degreasing facility that consists of an absorber and a degreaser. The fresh resources are in the form of two organic solvent streams (Shelley and El-Halwagi, 2000). The off-gas Volatile Organic Compounds (VOCs) are byproducts from the degreasing unit, and this stream is currently treated by flaring. Such treatment leads to economic loss and environmental pollution (Eden, 2003).

Figure 5.1: Original metal degreasing process.

In this case study, the objective is to explore the possibility of condensing the off-gas VOCs in order to (1) optimize the use of fresh solvent and (2) simultaneously identify candidate solvents for the degreaser (see Figure 5.2). Three properties are examined to determine the suitability of a given organic process fluid for use in the degreaser:

- Sulfur content (S): for corrosion considerations, expressed as weight percent.
- Molar volume ($V_m$): for hydrodynamic and pumping aspects.
- Vapor pressure (VP): for volatility, makeup and regeneration.

The synthesized solvents will be pure components; thus the sulfur content of these streams will be zero, as no sulfur containing groups will be included in the fragment search space.

Figure 5.2: Metal degreasing process after property integration

5.1.1.
Process Design

The constraints on the inlet streams to the degreaser are given in Table 5.1:

| Property | Lower Bound | Upper Bound |
|---|---|---|
| S (%) | 0.00 | 1.00 |
| $V_m$ (cm3/mol) | 90.09 | 487.80 |
| VP (mmHg) | 1596 | 3040 |
| $T_b$ (K) | 430.94 | 463.89 |
| Flow rate (kg/min) | 36.6 | 36.8 |

Table 5.1: Degreaser feed constraints

The process property operator mixing rules needed to describe the system are given by the following equations:

$$S_M = \sum_{s=1}^{N_s} x_s \cdot S_s, \qquad S^{ref} = 0.5\ \text{wt\%} \tag{5.1}$$

$$V_{m,M} = \sum_{s=1}^{N_s} x_s \cdot V_{m,s}, \qquad V_m^{ref} = 80\ \text{cm}^3/\text{mol} \tag{5.2}$$

$$VP_M^{1.44} = \sum_{s=1}^{N_s} x_s \cdot VP_s^{1.44}, \qquad VP^{1.44,\,ref} = 760\ \text{mmHg} \tag{5.3}$$

Samples of the off-gas were taken and condensed at various temperatures ranging from 400 to 550 K, providing measurements of the three properties as well as the flowrate of the condensate (Shelley and El-Halwagi, 2000). The data for the degreaser unit and for the VOC condensate are converted to cluster values according to the cluster methodology developed by Eden et al. (2004) and discussed in Section 3.1.2 (see Figure 5.3). The degreaser property constraints are translated into a feasibility region according to the procedure highlighted in Section 3.1.5.

Now that the problem has been mapped to the property domain and visualized on the ternary diagram, two additional constraints are placed on the process: the condensation temperature is set to 500 K, and the fresh synthesized solvents have no sulfur containing groups. With the condensation temperature fixed at 500 K, the locus of possible solvents is bounded by straight lines between the condensate and points A and B (see Figure 5.4). This adheres to the first constraint. Applying the second constraint (no sulfur in the fresh solvent) shows that the cluster solution to the degreaser problem corresponds to all points between points A and B on the C2-C3 axis (Eljack et al., 2007b).
Figure 5.3: Metal degreasing problem in process design. (Ternary cluster diagram with axes C1, C2 and C3 showing the degreaser feasibility region and the condensate locus for condensation temperatures from 480 K to 515 K.)

Figure 5.4: Property targets of solvent for maximum condensate recycle. (Same diagram as Figure 5.3 with points A and B marked.)

5.1.2. Molecular Design: Fresh Solvent Synthesis

Once all the constraints have been taken into account and the property targets for the molecular formulations have been fixed by the process design problem, the second phase of this case study begins. The cluster values associated with points A and B in Figure 5.4 are translated to physical property values using the methodology developed by Shelley and El-Halwagi (2000) and Eden (2003). The property targets obtained from solving the process design problem now become the upper and lower property constraints placed on the solvent/molecular design problem (see Table 5.2). The zero sulfur constraint placed on the problem provides an extra degree of freedom, so a heat of vaporization constraint is now placed on the fresh solvent problem. The properties used to describe the problem are therefore heat of vaporization ($H_v$), boiling temperature ($T_b$) and molar volume ($V_m$). Boiling temperature is used instead of vapor pressure since there is no direct group contribution method for predicting vapor pressure. However, according to equation 4.10, vapor pressure is a function of boiling temperature. Hence, the vapor pressure constraints are converted to upper and lower limits on boiling temperature. All of the property constraints on the molecular design problem are shown in Table 5.3.
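The conversion of the vapor pressure limits into boiling temperature limits amounts to inverting equation 4.10. The sketch below shows one way to do this. It assumes that the operating temperature in equation 4.10 is the 500 K condensation temperature, which is an inference rather than something stated explicitly, and the function name is hypothetical.

```python
import math


def boiling_point_from_vp(vp_mmhg, temperature_k):
    """Invert equation 4.10 for T_b, given VP (mmHg) at temperature T (K)."""
    ratio_to_power = (5.58 - math.log10(vp_mmhg)) / 2.7   # equals (T_b / T)^1.7
    return temperature_k * ratio_to_power ** (1.0 / 1.7)


T_OPERATING = 500.0  # K; ASSUMPTION: the condensation temperature is used as T

# A higher vapor pressure corresponds to a lower boiling temperature,
# so the VP lower bound gives the T_b upper bound and vice versa.
tb_upper = boiling_point_from_vp(1825.4, T_OPERATING)
tb_lower = boiling_point_from_vp(3878.7, T_OPERATING)

print(f"T_b range: {tb_lower:.2f} K to {tb_upper:.2f} K")
```

With this assumption the vapor pressure limits of 1825.4 and 3878.7 mmHg map to approximately 457.16 K and 418.01 K, the boiling temperature limits listed in Table 5.3.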
| | S (%) | VP (mmHg) | $V_m$ (cm3/mol) |
|---|---|---|---|
| Point A | 0.00 | 1825.4 | 720.8 |
| Point B | 0.00 | 3878.7 | 102.1 |

Table 5.2: Property constraints obtained from process design problem.

| Property | Lower Bound | Upper Bound |
|---|---|---|
| $H_v$ (kJ/mol) | 50 | 100 |
| VP (mmHg) | 1825.4 | 3878.7 |
| $V_m$ (cm3/mol) | 90.1 | 720.8 |
| $T_b$ (K) | 418.01 | 457.16 |

Table 5.3: Revised property constraints for fresh solvent synthesis.

The physical properties are predicted using the following first-order group contribution equations:

$$\Delta H_v = h_{v0} + \sum_i n_i \cdot h_{v1i} \tag{5.4}$$

$$V_m = d + \sum_i n_i \cdot v_{1i} \tag{5.5}$$

$$T_b = t_{b0} \cdot \ln \sum_i n_i \cdot t_{b1i} \tag{5.6}$$

The property operators derived from the above equations and their reference values are summarized in Table 5.4. Notice again that the right-hand sides (RHS) of the equations follow linear additive rules.

| Property | LHS of equation, $\psi_j^M$ | RHS of equation, 1st order GC expression | Reference value |
|---|---|---|---|
| Standard Heat of Vaporization | $\Delta H_v - h_{v0}$ | $\sum_{g=1}^{N_g} n_g \cdot h_{v1}$ | 20 |
| Molar Volume | $V_m - d$ | $\sum_{g=1}^{N_g} n_g \cdot v_1$ | 100 |
| Normal Boiling Temperature | $\exp\left(T_b / t_{b0}\right)$ | $\sum_{g=1}^{N_g} n_g \cdot t_{b1}$ | 7 |

Table 5.4: Property operators needed for molecular synthesis

The problem is visualized by converting the property targets to cluster values following the methodology described in Table 3.4. The property constraints are represented as a feasibility region, as outlined in Section 3.1.4. The resulting ternary diagram is shown in Figure 5.5, where the dotted lines represent the feasibility region in the molecular design domain. The molecules to be designed can be made up of eight chemical groups, with carboxyl, methyl, and amine groups among the selection. All the groups used in the molecular synthesis problem are shown in Figure 5.5. Notice that even though some of the property operators formulated earlier are very complex, molecular synthesis on the ternary diagram is still simple because these operators obey simple linear additive rules. Seven candidates, M1-M7, are formulated for this solvent design problem (see Figure 5.6).
However, the formulations are valid only if they satisfy the conditions summarized by Rules 4-8 in Section 3.3. The cluster values of the designed molecules, M1-M7, are checked to make sure that they lie within the sink. The value of the augmented property index of each designed molecule must lie within the AUP range of the sink; in the degreaser case study the AUP range of the sink is 4.22-12.65 (see Table 5.5). Molecules M5 and M6 fail to satisfy this condition.

Figure 5.5: Metal degreasing solvent problem. (Ternary cluster diagram with axes C1, C2 and C3 showing the molecular groups G1: CH3, G2: CH2, G3: CH2O, G4: CH2N, G5: CH3N, G6: CH3CO, G7: COOH, G8: CCl.)

The final necessary and sufficient condition is that the property values of the new formulations must lie within the upper and lower constraints placed on the molecular design problem, including the non-GC property constraints. The property values for the new formulations are back calculated using the methodology outlined in Section 3.3. Molecule M3 did not match the heat of vaporization range of the sink, and although M7 satisfies the three GC properties, $H_v$, $V_m$ and $T_b$, it fails to satisfy the non-GC property constraint on vapor pressure.

Figure 5.6: Candidate metal degreasing solvents. (Ternary cluster diagram with axes C1, C2 and C3 showing the candidate molecules M1: CH3-(CH2)5-CH3CO, M2: CH3CO-(CH2)2-CH3CO, M3: (CH3)3-(CH2)5-CH2N, M4: CH3-(CH2)2-COOH, M5: (CH3)2-CH3CO-CCl, M6: -(CH2O)5- ring, M7: CH3-(CH2)2-CH3N-COOH.)
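The screening sequence described above (AUP range first, then the property bounds, including the non-GC vapor pressure) can be sketched as follows, using the candidate values reported in Table 5.5 and the bounds of Table 5.3. This is an illustrative sketch only. The names are hypothetical, the cluster-location check on the ternary diagram is not reproduced, and the heat of vaporization bounds are treated as kJ/mol so that they are comparable with Table 5.5.

```python
# Candidate data from Table 5.5:
# AUP, T_b (K), H_v (kJ/mol), V_m (cm3/mol), VP (mmHg)
CANDIDATES = {
    "M1": (5.06, 450.58, 53.19, 156.85, 2078.98),
    "M2": (4.71, 448.54, 54.13, 118.03, 2163.90),
    "M3": (5.11, 437.29, 49.35, 189.41, 2692.07),
    "M4": (4.86, 438.97, 63.29, 93.39, 2606.12),
    "M5": (4.02, 413.20, 43.88, 121.14, 4241.48),
    "M6": (4.19, 428.11, 44.22, 127.66, 3208.12),
    "M7": (5.71, 485.01, 70.24, 112.52, 1037.99),
}

AUP_RANGE = (4.22, 12.65)      # AUP range of the sink
PROPERTY_BOUNDS = [            # Table 5.3, in the column order used above
    (418.01, 457.16),          # T_b
    (50.0, 100.0),             # H_v (ASSUMPTION: kJ/mol, as in Table 5.5)
    (90.1, 720.8),             # V_m
    (1825.4, 3878.7),          # VP (non-GC property)
]


def screen(candidates):
    """Return the valid names and, for each rejected name, the failed stage."""
    valid, rejected = [], {}
    for name, (aup, *props) in candidates.items():
        if not AUP_RANGE[0] <= aup <= AUP_RANGE[1]:
            rejected[name] = "AUP"
        elif not all(lo <= p <= hi for p, (lo, hi) in zip(props, PROPERTY_BOUNDS)):
            rejected[name] = "property bounds"
        else:
            valid.append(name)
    return valid, rejected


valid, rejected = screen(CANDIDATES)
print("valid:", valid)
print("rejected:", rejected)
```

Checking the AUP first is a cheap filter: M5 and M6 are discarded before any property is compared, and only the remaining candidates are tested against the individual bounds.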
| Formulation | AUP | $T_b$ (K) | $H_v$ (kJ/mol) | $V_m$ (cm3/mol) | VP (mmHg) |
|---|---|---|---|---|---|
| M1 | 5.06 | 450.58 | 53.19 | 156.85 | 2078.98 |
| M2 | 4.71 | 448.54 | 54.13 | 118.03 | 2163.90 |
| M3 | 5.11 | 437.29 | 49.35 | 189.41 | 2692.07 |
| M4 | 4.86 | 438.97 | 63.29 | 93.39 | 2606.12 |
| M5 | 4.02 | 413.20 | 43.88 | 121.14 | 4241.48 |
| M6 | 4.19 | 428.11 | 44.22 | 127.66 | 3208.12 |
| M7 | 5.71 | 485.01 | 70.24 | 112.52 | 1037.99 |

Table 5.5: Candidate molecules for metal degreasing problem.

Consequently, M1, M2, and M4 are the final valid formulations. A search of the ICAS database (CAPEC, 2006) shows that M1, M2 and M4 correspond to 2-octanone, 2,5-hexanedione, and butanoic acid, respectively. The valid molecular structures are shown in Figure 5.7. The three candidates are mapped back to the process design framework to identify the formulation that will maximize the recycle of condensate at 500 K (Eljack et al., 2007b). Using lever arm analysis, 19.36 kg/min of fresh 2,5-hexanedione will allow for a maximum condensate flow rate of 17.44 kg/min.

Figure 5.7: Selection of metal degreasing solvent. (Ternary cluster diagram with axes C1, C2 and C3 showing the degreaser region, the 500 K condensate, and the candidates M1, M2 and M4 with their molecular structures.)

5.1.3. Summary

This case study illustrates a systematic property based framework for the simultaneous solution of process and molecular design problems. Using property clusters, the process design problem is solved to identify the property targets corresponding to the desired process performance. The molecular design problem is then solved to generate structures that match these targets. The approach provides a unifying framework that uses physical properties to interface the process and molecular design problems.

5.2. Application Example 2 - Gas Purification

5.2.1. Problem Statement

A current gas treatment process uses fresh methyl diethanol amine, MDEA (HO-(CH2)4-CH3N-OH), and two other recycled process sources (S1 and S2) as feed to the acid gas removal unit (AGRU).
Another process stream, S3, currently a waste stream, could be recycled as a feed if mixed with a fresh source so that the mixed stream properties match the sink (Kazantzi, 2006; Kazantzi et al., 2007). The property and flowrate data for all streams (S1, S2 and S3) and the sink are summarized in Table 5.6.

Design objectives and requirements: identify a solvent that will replace MDEA as the fresh source and that will maximize the flowrate of all available sources (S1, S2 and S3). The solvent must possess characteristics similar to those of MDEA, and thus the molecular building blocks are limited to OH, CH3N and CH2. The designed solvent should be a diol in order to possess MDEA characteristics. The sink performance requirements are functions of critical volume ($V_c$), heat of vaporization ($H_v$) and heat of fusion ($H_{fus}$).

| Property | Sink Lower Bound | Sink Upper Bound | S1 | S2 | S3 |
|---|---|---|---|---|---|
| $V_c$ (cm3/mol) | 530 | 610 | 754 | 730 | 790 |
| $H_v$ (kJ/mol) | 100 | 115 | 113 | 125 | 70 |
| $H_{fus}$ (kJ/mol) | 20 | 40 | 15 | 15 | 20 |
| Flowrate (kmol/hr) | 300 | | 50 | 70 | 30 |

Table 5.6: Property data for gas purification example

5.2.2. Process Design

The first step in implementing the simultaneous clustering approach is the transformation of all process sources and sinks from the property domain to the cluster domain, using equations 3.2 and 3.4-3.6. The process property operator mixing rules for the three properties critical volume, heat of vaporization and heat of fusion ($\psi_1$, $\psi_2$, $\psi_3$) are defined by the following equations:

$$V_{c,M} = \sum_{s=1}^{N_s} x_s \cdot V_{c,s}, \qquad V_{c,ref} = 2.5\ \text{cm}^3/\text{mol} \tag{5.7}$$

$$H_{v,M} = \sum_{s=1}^{N_s} x_s \cdot H_{v,s}, \qquad H_{v,ref} = 0.35\ \text{kJ/mol} \tag{5.8}$$

$$H_{fus,M} = \sum_{s=1}^{N_s} x_s \cdot H_{fus,s}, \qquad H_{fus,ref} = 0.10\ \text{kJ/mol} \tag{5.9}$$

The boundary of the sink is determined according to Rule 1 by six unique points, shown as FP1-FP6 in Figure 5.8, while the sources are represented by discrete points.
Notice the lumped source ($S_L$) point on the diagram; it represents the mixture property values of the three recycle streams (S1, S2, S3). The resulting data are shown in Table 5.7.

Figure 5.8: Gas purification process - feasibility regions and streams. (Ternary cluster diagram with axes C1, C2 and C3 showing the original process feasibility region with points FP1-FP6, the sources S1, S2 and S3, and the lumped source $S_L$.)

| Source | $V_c$ (cm3/mol) | $H_v$ (kJ/mol) | $H_f$ (kJ/mol) | Flowrate (kmol/hr) | $\Omega_1$ | $\Omega_2$ | $\Omega_3$ | AUP |
|---|---|---|---|---|---|---|---|---|
| S1 | 754 | 113 | 15 | 50 | 301.6 | 322.9 | 150 | 774.5 |
| S2 | 730 | 125 | 15 | 70 | 292 | 357.1 | 150 | 799.1 |
| S3 | 790 | 70 | 20 | 30 | 316 | 200.0 | 200 | 716.0 |
| Lumped source ($S_L$) | 750 | 110 | 16 | 150 | 300 | 314.3 | 160 | 774.3 |

Table 5.7: Mixture property data of lumped source ($S_L$)

Figure 5.9: New feasibility region - reflects mixture/blend design constraints. (Ternary cluster diagram with axes C1, C2 and C3 showing the original feasibility region, which considers zero flowrate of recycle streams, the feasibility region that considers the $S_L$ recycle stream in the feed, the lumped source, and points FP1, FP2, FP3, FP6, A, B, C and D.)

The synthesis of new molecules depends on the process constraints, and the solvent will be designed as part of a blend/mixture formulation. Two streams will be fed to the process sink: the lumped source ($S_L$) at 150 kmol/hr and the newly designed solvent at 150 kmol/hr, in order to fulfill the 300 kmol/hr flowrate constraint of the sink. Figures 5.8 and 5.9 show two feasibility regions. The first reflects the sink's original property demands as given in Table 5.6, and the second is the newly defined search space, which integrates the process requirement that the newly designed molecule mix with the lumped source stream at a fractional flowrate contribution ($x_L$) of 0.5.
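The lumped source data in Table 5.7 and the effect of the 0.5 flow fraction can be checked numerically. The sketch below is illustrative only, with hypothetical names. It lumps S1-S3 with the linear mixing rules of equations 5.7-5.9 and then solves the same rules for the fresh solvent that would bring a 50/50 mixture onto one vertex of the sink region. The vertex with $V_c$ = 530, $H_v$ = 115 and $H_{fus}$ = 20, built from the bounds in Table 5.6, is used as the example target.

```python
REFS = (2.5, 0.35, 0.10)  # reference values for V_c, H_v, H_fus (eqs. 5.7-5.9)

# Recycle streams from Table 5.6: flowrate (kmol/hr), V_c, H_v, H_fus
SOURCES = {
    "S1": (50.0, 754.0, 113.0, 15.0),
    "S2": (70.0, 730.0, 125.0, 15.0),
    "S3": (30.0, 790.0, 70.0, 20.0),
}


def lump(sources):
    """Flow-weighted linear mixing of all sources (eqs. 5.7-5.9)."""
    total = sum(stream[0] for stream in sources.values())
    mixed = tuple(
        sum(stream[0] * stream[j] for stream in sources.values()) / total
        for j in (1, 2, 3)
    )
    return total, mixed


def clusters(props):
    """Normalized operators, AUP and cluster values for one stream."""
    omega = [p / ref for p, ref in zip(props, REFS)]
    aup = sum(omega)
    return omega, aup, [w / aup for w in omega]


def required_fresh(target, lumped, x_lumped):
    """Solve the linear mixing rule for the fresh-stream properties."""
    return tuple(
        (t - x_lumped * l) / (1.0 - x_lumped) for t, l in zip(target, lumped)
    )


flow_l, props_l = lump(SOURCES)
_, aup_l, c_l = clusters(props_l)

target = (530.0, 115.0, 20.0)   # sink vertex: min V_c, max H_v, min H_fus
x_l = flow_l / 300.0            # lumped-source share of the 300 kmol/hr sink feed
props_fresh = required_fresh(target, props_l, x_l)

# Same result in the cluster domain, using the relative cluster arm.
_, aup_m, c_m = clusters(target)
beta = x_l * aup_l / aup_m
c_fresh_lever = [(cm - beta * cl) / (1.0 - beta) for cm, cl in zip(c_m, c_l)]
_, _, c_fresh_direct = clusters(props_fresh)

print("lumped source:", props_l, "AUP =", round(aup_l, 1))
print("required fresh solvent:", props_fresh)
print("beta =", round(beta, 3), "clusters =", [round(c, 4) for c in c_fresh_direct])
```

The property-domain and cluster-domain routes give the same point, which is what allows the new search space to be constructed graphically with lever arms.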
Figure 5.10: Identification of mixture (new) feasibility region. (Ternary cluster diagram with axes C1, C2 and C3 showing the original feasibility region, which considers zero flowrate of recycle streams, the feasibility region that considers the $S_L$ recycle stream in the feed, the lumped source, and points FP1, FP2, FP3, FP6, A, B, C and D.)

Figure 5.11: New feasibility region - Gas Purification Example. (Same diagram with points FP1-FP6 and A-D, and a magnification of the line segment from the lumped source $S_L$ (point 1) to point A (point 2) with the mixture point M and the relative cluster arm $\beta_1$.)

The mixed feasibility region points (A, B, C and D) can be easily determined using lever arm analysis. Taking advantage of the visual aid, it is easily determined that the new region is bounded by points [FP4, FP3, A, C, D, B, FP6 and FP5]. Points A-D are the only unknown points; the remaining points are already established. The cluster values of points A-D are calculated using the lever-arm rules (Section 3.1.2). A step by step calculation is shown here for point A. For generalization, the line segment connecting points $S_L$ and A in Figure 5.11 has been magnified, with points $S_L$ and A shown as points 1 and 2, respectively, in the magnification. FP3 on the line marks the location of the mixture point, now represented by M. The cluster values for points 1 and M are given in Table 5.8. The mixture point M also marks the location of the relative cluster arm $\beta_1$ in the magnification. Given that $x_1$, $AUP_1$ and $AUP_M$ are known, equation 3.13 is used to calculate the value of $\beta_1$:

$$\beta_s = \frac{x_s \cdot AUP_s}{AUP_M} \tag{3.13}$$

Next, the cluster values ($C_{12}$, $C_{22}$ and $C_{32}$) for point 2 in Figure 5.11 are calculated according to the cluster conservation rule, equation 3.12:

$$C_{jM} = \sum_{s=1}^{N_s} \beta_s \cdot C_{js} \tag{3.12}$$

Expanding equation 3.12 for the two points results in the following:

$$\begin{aligned} C_{1M} &= \beta_1 \cdot C_{11} + (1 - \beta_1) \cdot C_{12} \\ C_{2M} &= \beta_1 \cdot C_{21} + (1 - \beta_1) \cdot C_{22} \\ C_{3M} &= \beta_1 \cdot C_{31} + (1 - \beta_1) \cdot C_{32} \end{aligned} \tag{5.10}$$

The steps outlined above are used to determine the remaining points B-D (see Table 5.8). The eight cluster points that bound the new feasibility region and their respective property values are summarized in Table 5.9. The property values are back calculated from the property operator expressions and reference values (equations 5.7-5.9) given in Section 5.2.2. Hence, the new property requirements specified by the process needs are back calculated from the determined cluster values, identified as the upper and lower bounds on the three properties (see Table 5.10), and used as input to the molecular design algorithm.

| Points | $V_c$ | $H_v$ | $H_{fus}$ | $\Omega_1$ | $\Omega_2$ | $\Omega_3$ | AUP | $C_1$ | $C_2$ | $C_3$ | $\sum C_{js}$ | $X_{cc}$ | $Y_{cc}$ | $x$ | $\beta$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lumped Source (1) | 750 | 110 | 16 | 300.0 | 314.3 | 160 | 774.3 | 0.3875 | 0.4059 | 0.2066 | 1.0 | 0.590 | 0.406 | | |
| PT 3 on Feasibility (M) | 530 | 115 | 20 | 212 | 328.6 | 200 | 740.6 | 0.2863 | 0.4437 | 0.2701 | 1.0 | 0.508 | 0.444 | 0.5 | 0.522 |
| Point A (2) | 310 | 120 | 24 | 124 | 342.9 | 240 | 706.9 | 0.1754 | 0.4850 | 0.3395 | 1.0 | 0.418 | 0.485 | | |
| PT 6 on Feasibility (M) | 610 | 100 | 40 | 244 | 285.7 | 400 | 929.7 | 0.2624 | 0.3073 | 0.4302 | 1.0 | 0.416 | 0.307 | 0.5 | 0.4164 |
| Point B (2) | 470 | 90 | 64 | 188 | 257.1 | 640 | 1085.1 | 0.1732 | 0.2370 | 0.5898 | 1.0 | 0.292 | 0.237 | | |
| PT 2 on Feasibility (M) | 530 | 115 | 40 | 212 | 328.6 | 400 | 940.6 | 0.2254 | 0.3493 | 0.4253 | 1.0 | 0.400 | 0.349 | 0.5 | 0.411 |
| Point C (2) | 310 | 120 | 64 | 124 | 342.9 | 640 | 1106.9 | 0.1120 | 0.3098 | 0.5782 | 1.0 | 0.267 | 0.310 | | |
| PT 1 on Feasibility (M) | 530 | 100 | 40 | 212 | 285.7 | 400 | 897.7 | 0.2362 | 0.3183 | 0.4456 | 1.0 | 0.395 | 0.318 | 0.5 | 0.431 |
| Point D (2) | 310 | 90 | 64 | 124 | 257.1 | 640 | 1021.1 | 0.1214 | 0.2518 | 0.6267 | 1.0 | 0.247 | 0.252 | | |

Table 5.8: Calculation data for new feasibility region

| Point | $H_{fus}$ | $\Omega_1$ | $\Omega_2$ | $\Omega_3$ | AUP | $C_1$ | $C_2$ | $C_3$ | $\sum C_{js}$ | $X_{cc}$ | $Y_{cc}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FP6 | 40 | 244 | 285.7 | 400 | 929.7 | 0.262 | 0.307 | 0.43 | 1 | 0.416 | 0.307 |
| FP5 | 20 | 244 | 285.7 | 200 | 729.7 | 0.334 | 0.392 | 0.274 | 1 | 0.530 | 0.392 |
| FP4 | 20 | 244 | 328.6 | 200 | 772.6 | 0.316 | 0.425 | 0.259 | 1 | 0.528 | 0.425 |
| FP3 | 20 | 212 | 328.6 | 200 | 740.6 | 0.286 | 0.444 | 0.27 | 1 | 0.508 | 0.444 |
| A | 24 | 124 | 342.9 | 240 | 706.9 | 0.175 | 0.485 | 0.34 | 1 | 0.418 | 0.485 |
| C | 64 | 124 | 342.9 | 640 | 1106.9 | 0.112 | 0.310 | 0.578 | 1 | 0.267 | 0.310 |
| D | 64 | 124 | 257.1 | 640 | 1021.1 | 0.121 | 0.252 | 0.627 | 1 | 0.247 | 0.252 |
| B | 64 | 188 | 257.1 | 640 | 1085.1 | 0.173 | 0.237 | 0.59 | 1 | 0.292 | 0.237 |

Table 5.9: New Feasibility Region Data

| Property | LL | UL |
|---|---|---|
| $V_c$ | 310 | 610 |
| $H_v$ | 90 | 120 |
| $H_{fus}$ | 20 | 64 |

Table 5.10: Determined property constraints for molecular design algorithm

5.2.3. Molecular Design

Property models for the three functionalities ($V_c$, $H_v$, and $H_{fus}$) are available in the bank of group contribution models and have been used in the formulation of the corresponding molecular property operators ($\psi_1^M$, $\psi_2^M$, $\psi_3^M$); see Table 5.11. The molecular feasibility region for the design problem is plotted in Figure 5.12. The molecular building blocks given as input to the algorithm are represented by the discrete points on the same plot. With the molecular synthesis problem represented visually, all that remains is to proceed with the addition of groups until molecular candidates (M1-M6) are generated whose loci fall within the sink, which satisfies the first feasibility condition (Rules 5-6) (Figure 5.13). For complete validation of the designed formulations all remaining conditions must be satisfied; in particular, the AUP of each formulation must fall within the AUP range of the sink, determined to be 172-306. The candidate formulations M1 and M2 fail to satisfy the AUP constraint (see Table 5.12). Hence, M3-M6 are the only molecules that satisfy all the necessary and sufficient conditions. As a final check the designed formulations are mapped back to the process design level and, as seen in Figure 5.14, all the formulations fall within the designated design space.
| j | Property (X) | GC Property Model | Property Operator | $\psi_{ref}$ |
|---|---|---|---|---|
| 1 | $V_c$ | $V_c - V_{c0} = \sum_i n_{g_i} \cdot V_{c1i}$ | $\sum_i n_{g_i} \cdot V_{c1i}$ | 20 |
| 2 | $H_v$ | $H_v - H_{v0} = \sum_i n_{g_i} \cdot H_{v1i}$ | $\sum_i n_{g_i} \cdot H_{v1i}$ | 1 |
| 3 | $H_{fus}$ | $H_{fus} - H_{fus0} = \sum_i n_{g_i} \cdot H_{fus1i}$ | $\sum_i n_{g_i} \cdot H_{fus1i}$ | 0.5 |

Table 5.11: Property operators for gas purification molecular synthesis

Figure 5.12: Molecular synthesis of gas purification solvent. (Ternary cluster diagram with axes C1, C2 and C3 showing the molecular groups G1: OH, G2: CH3N, G3: CH2.)

Figure 5.13: Candidate molecules for gas purification solvent. (Ternary cluster diagram with axes C1, C2 and C3 showing the candidate molecules M1: OH-(CH2)4-CH3N-OH, M2: OH-(CH2)5-CH3N-OH, M3: OH-(CH2)6-CH3N-OH, M4: OH-(CH2)4-(CH3N)2-OH, M5: OH-(CH2)5-(CH3N)2-OH, M6: OH-(CH2)7-CH3N-OH.)

| Candidate | AUP | $V_c$ (cm3/mol) | $H_v$ (kJ/mol) | $H_f$ (kJ/mol) |
|---|---|---|---|---|
| M1 | 148.9 | 389.23 | 89.294 | 23.33 |
| M2 | 161.9 | 445.51 | 94.204 | 25.969 |
| M3 | 174.9 | 501.79 | 99.114 | 28.608 |
| M4 | 175.2 | 484.17 | 98.787 | 29.338 |
| M5 | 188.2 | 540.45 | 103.697 | 31.977 |
| M6 | 187.9 | 558.07 | 104.024 | 31.247 |

Table 5.12: Candidate property data for gas purification solvent

Figure 5.14: Verification of candidate molecules in process domain. (Ternary cluster diagram with axes C1, C2 and C3 showing M3, M4, M5 and M6, the mixed feed feasibility region, and the lumped source.)

5.2.4. Summary

The solution of the gas purification study has illustrated the simultaneous approach and its ability to transfer information from the process level to the molecular level and back again. Formulation of the process design problem on the ternary diagram enabled the decomposition of the problem by identifying the optimal feasibility region through the simple use of lever arm analysis. The framework allows for the complete integration of process sources and sinks for identification of property targets.
The methodology facilitates the flow of information from the process domain to the molecular domain and back without the need for extensive calculation.

6. Conclusions and Future Work

6.1. Achievements

The main achievement of this work is the development of the Molecular Property Cluster algorithm, a property based framework that allows for the systematic synthesis of molecular formulations based on molecular fragments. The method is capable of simultaneously considering both process and molecular design needs. In that sense it is a truly integrated approach. Developed within the property clustering platform, it has established a systematic means of formulating molecular property operators, which helps in lowering the complexity of the design problem. In this work the property clustering technique has been combined with first-order Group Contribution Methods (GCM) to produce a systematic methodology capable of handling property design targets and synthesizing molecular options to satisfy them.

Current integrated solution strategies like mathematical optimization struggle with limitations on the flexibility of the property models. The problem becomes too complex, which makes it difficult to achieve convergence. In this approach the complex nature of the property models is hidden within the formulation of the molecular operators. The concept of linearizing non-linear functionalities aided in handling the convergence limitations due to complex property models. The development of the molecular operators was key in bridging the gap between the previously decoupled design problems.

It is a targeting approach that sets up the design problem as a reverse problem formulation, where the property performance requirements are obtained directly from the process clustering algorithm, as established by Eden (2003).
The process algorithm was developed within the clustering platform; the process design problem is solved in terms of the constitutive variables (properties) and the generated solutions are also in terms of the constitutive variables. Once again, the reverse approach plays an important role; here it allows for the solution of process design problems in the property domain without having to commit to any components ahead of design. The process needs (the solution in terms of functionalities) are now the input to the molecular design algorithm. Like the original property cluster operators used for processes, the formulation of the molecular property operators allows for simple linear additive rules for the individual molecular groups that make up the formulation. A systematic methodology to convert property data and constraints into molecular cluster data has been presented.

Furthermore, a significant contribution of the developed methodology is that for problems that can be adequately described by just three properties, the process and molecular design problems are solved visually and simultaneously on ternary diagrams, irrespective of how many process streams or molecular fragments are included in the search space. First the process design problem is visualized on a process ternary cluster diagram, where the clusters are formulated according to the process operator mixing rules. Next, the process design algorithm identifies the optimal design in terms of process property clusters, which are then converted to physical property data. The solution to the process design problem provides the property constraints used as input to the molecular design algorithm. Using the molecular cluster rules developed in this work, the data is converted to molecular property targets. Next, the molecular property targets are plotted as a feasibility region on the molecular ternary cluster diagram.
The set of molecular groups used as input to the algorithm is plotted as points on the ternary diagram. With regard to the selection of groups for the molecular synthesis, all available groups can be included unless restrictions or constraints are placed on the design problem (e.g. if only alcohols are desired, the -OH group would be included as one of the groups in the list of molecular fragments). The synthesis of candidate molecules is achieved in accordance with the necessary and sufficient conditions developed in this work. The rules describe how groups can be visually added on the diagram and how the location of the final formulation is independent of the group addition path.

Once the molecular formulation is completed, there are checks to guarantee the validity of the designed molecule. The cluster value of the formulation must be contained within the feasibility region of the sink on the molecular ternary cluster diagram. The AUP value of the designed molecule must be within the range of the target. If the AUP value falls outside the range of the sink, the designed molecule is not a feasible solution. The aforementioned conditions are necessary, but the sufficient condition for the designed molecule to match the target properties is that the AUP value of the molecule matches the AUP value of the sink at the same cluster location. In the case where the design problem includes non-GC properties, those properties must be back calculated for the designed molecule using the appropriate corresponding GC property, and those values have to match the target non-GC properties. The developed concepts have been illustrated through various application examples.

Although only those problems that can be described by three properties are covered by the visualization approach, the proposed molecular clustering methodology is capable of handling as many properties as needed to describe the system.
In such cases, the visualization tool is no longer available, but the design problem is still simplified. The algebraic molecular clustering approach is used to formulate the design problem with the molecular operators as the basis; as a result, the dimensionality and complexity of the problem are significantly lowered, from an MINLP to an LP. The molecular design problem is formulated as a set of equalities and inequalities that place bounds on the search space, while structural and non-structural constraints are also considered in the formulation. A proof-of-concept example has been solved to highlight the merits of the approach.

The Molecular Property Cluster algorithm has proven to be a powerful tool for the simultaneous consideration of molecular and process design problems. The methodology can also be used independently for molecular synthesis alone, e.g. solvent design as in the provided cases of the blanket wash and the gas purification solvents. The significant achievement of the methods presented in this thesis is the development of a systematic framework that enables a property based visual representation of the molecular synthesis problem. Molecular formulations are synthesized on the ternary diagram using lever-arm additive rules. The method enables molecules to be synthesized systematically from molecular fragments so as to satisfy the specific property needs of the process. The visual tool shows the designer which groups to include in the synthesis and which will not help in satisfying the target performance requirements. For cases that require more than three properties, the algebraic molecular clustering approach succeeds in reducing the design problem from a mixed integer non-linear program (MINLP) to a linear program (LP). The molecular property clusters are the key to bridging the gap between process and molecular design, thus allowing for a truly integrated design.

6.2. Future Directions

The work presented in this thesis has resulted in a vital tool for the areas of process and molecular integration, as well as molecular synthesis. Because the field of property clustering is fairly new, considerable work remains to be done. The property clustering techniques for process and molecular design were developed to aid in those cases where conventional component based algorithms fail. Several issues need to be addressed in order to increase the application range of this approach.

6.2.1. Property Model Development

The molecular property clustering methods developed in this work are based on the molecular property operator formulations, which are functions of additive mixing rules according to the available GC property models. As long as such models are available, the presented molecular synthesis rules are valid. Hence, to take advantage of this methodology in molecular synthesis and design, additional efforts should be devoted to expanding the availability of group contribution properties, because that would translate into a wider scope of industrial applications for these techniques. In the case of simultaneous process and molecular design, efforts need to concentrate on developing new property operator mixing rules with the same goal in mind: a well established bank of physical properties. For example, properties like the glass transition temperature of polymers already have mixing rules available, but mixing rules for other properties, such as Knoop hardness and degree of polymerization, still need to be developed.

6.2.2. Defining the Search Space

In synthesizing molecular formulations, property targets as well as molecular fragments are used as part of the input information for the algorithm. All possible fragments can be included as part of the synthesis problem.
The ternary cluster diagram that is used to visualize the design problem can help eliminate infeasible molecular fragments that do not help to reach the targeted feasibility region. Thus, the visualization tool helps to narrow down the search space, which in turn simplifies the synthesis problem. Automatic systems will need to be developed to guide how molecular fragments should be excluded and how the search space can be narrowed without risking the exclusion of optimal candidates.

6.2.3. Expanding the Application Range

This thesis has shown how the development of the molecular clustering methods can help bridge the gap between process and molecular design. Although the introduced approach can be utilized independently for molecular design, it can also be used in conjunction with other algorithms. The molecular cluster algorithm developed here can be used in combination with a wide variety of other process synthesis and optimization tools, such as the property based pinch analysis developed by Kazantzi and El-Halwagi (2005). The synthesis problem can then be reformulated in terms of properties having environmental impact (e.g. toxicity). In such cases, valid empirical equations that link the environmental properties to those predicted by GC methods are needed. Once those are identified, this methodology could be used to directly target waste minimization criteria, resulting in the synthesis of environmentally benign chemicals that might have been overlooked if the design relied only on the traditional dependence on laboratory experience.

The development of an algebraic method for the formulation of the molecular design problem is significant. Although visualization is no longer a viable tool, solution of the process design problem is achieved by solving a set of linear algebraic equalities and inequalities as a result of a constraint reduction approach.
Efforts should concentrate on developing simultaneous algebraic techniques for solving process and molecular design problems. The process algebraic methods were developed by Qin et al. (2004), and the algebraic formulation of the molecular design problem is introduced here. Therefore, all that remains is an outline of the merged approach.

References

Achenie, L. E. K. and M. Sinha (2004). "The design of blanket wash solvents with environmental considerations." Advances in Environmental Research 8(2): 213-227.
Achenie, L. E. K., R. Gani, V. Venkatasubramanian, Eds. (2003). Computer Aided Molecular Design: Theory and Practice. Computer Aided Chemical Engineering, 12, Elsevier.
Albanese, J. (2004). Optimizing Formulas by Experimental Design. The NY Chapter Society of Cosmetic Chemists' newsletter "Cosmetiscope".
Anderson, M. and P. Whitcomb (1996). "Optimize your Process Optimization Efforts." Chemical Engineering Progress 12: 51-60.
Barnicki, S. D. and J. J. Siirola (2004). "Process synthesis prospective." Computers & Chemical Engineering 28(4): 441.
Barton, A. F. (1985). Handbook of Solubility Parameters and other Cohesion Parameters. Boca Raton, Florida, CRC Press.
Box, G., W. Hunter, and J. Hunter (1978). Statistics for Experimenters. New York, Wiley.
Brignole, E. A. and M. Cismondi (2003). Molecular Design - Generation & Test Methods. Computer Aided Molecular Design: Theory and Practice. L. E. K. Achenie, R. Gani and V. Venkatasubramanian, Elsevier. 12: 23-41.
Burke, J. (1984). Solubility Parameters: Theory and Application. The Book and Paper Group Annual: The American Institute for Conservation Annual Meeting.
CAPEC (2006). ICAS database. CAPEC, Technical University of Denmark, Denmark.
Cerda, J., A. W. Westerberg, D. Mason, B. Linnhoff (1983). "Minimum utility usage in heat exchanger network synthesis: A transportation problem." Chemical Engineering Science 38(3): 373.
Constantinou, L. and R. Gani (1994). "New group contribution method for estimating properties of pure compounds." AIChE Journal 40(10): 1697-1710.
Constantinou, L., K. Bagherpour, R. Gani, J. Klein, D. Wu (1996). "Computer aided product design: problem formulations, methodology and applications." Computers & Chemical Engineering 20(6-7): 685-702.
Constantinou, L., R. Gani, J. O'Connell (1995). "Estimation of the acentric factor and the liquid molar volume at 298 K using a new group contribution method." Fluid Phase Equilibria 103(1): 11-22.
Constantinou, L., S. Prickett, and M. Mavrovouniotis (1993). "Estimation of thermodynamic and physical properties of acyclic hydrocarbons using the ABC approach and conjugation operators." Industrial & Engineering Chemistry Research 32(8): 1734-1746.
Cornell, J. (1990). Experiments with Mixtures. New York, John Wiley and Sons Inc.
CRC Handbook of Chemistry and Physics (1980). Boca Raton, FL, CRC Press.
Cussler, E. L. and G. D. Moggridge (2001). Chemical Product Design. New York, Cambridge University Press.
d'Anterroches, L. and R. Gani (2005). "Group contribution based process flowsheet synthesis, design and modelling." Fluid Phase Equilibria 228-229: 141-146.
d'Anterroches, L., R. Gani, P. Harper, and M. Hostrup (2005). Design of Molecules, Mixtures and Processes through a Novel Group Contribution Method. 7th World Congress of Chemical Engineering, Glasgow, UK.
Doyle, S. J. and R. Smith (1997). "Targeting Water Reuse with Multiple Contaminants." Trans. Inst. Chem. E., B75: 181.
Dunn, R. and G. Bush (2001). "Using Process Integration Technology for Cleaner Production." Journal of Cleaner Production 9: 1-23.
Duvedi, A. P. and L. E. K. Achenie (1996). "Designing environmentally safe refrigerants using mathematical programming." Chemical Engineering Science 51(15): 3727.
Eden, M. R. (2003). Property Based Process and Product Synthesis and Design. Ph.D. Thesis, CAPEC, Department of Chemical Engineering, Technical University of Denmark.
Eden, M. R., P. M.
Harper, R. Gani, and S. Jørgensen (2002). "Design of Separation Process for Synthesis A/S - Separation of Aniline from Water." Lyngby, CAPEC, Technical University of Denmark.
Eden, M. R., S. B. Jørgensen, R. Gani, and M. El-Halwagi (2003). "Reverse Problem Formulation based Techniques for Process and Product Design." Computer Aided Chemical Engineering, 15A.
Eden, M. R., S. B. Jørgensen, R. Gani, and M. El-Halwagi (2004). "A novel framework for simultaneous separation process and product design." Chemical Engineering and Processing 43(5): 595-608.
El-Halwagi, M. (1997). Pollution Prevention Through Process Integration: Systematic Design Tools. San Diego, CA, Academic Press.
El-Halwagi, M. (2006). Process Integration. Process Systems Engineering, 7. Amsterdam, Academic Press.
El-Halwagi, M. M. and H. D. Spriggs (1996). "An integrated approach to cost and energy efficient pollution prevention." Fifth World Congress of Chemical Engineering, San Diego, USA.
El-Halwagi, M. M. and H. D. Spriggs (1998). "Solve Design Puzzles with Mass Integration." Chemical Engineering Progress 94: 25-44.
El-Halwagi, M. M. and V. Manousiouthakis (1989). "Synthesis of mass exchange networks." AIChE Journal 35(8): 1233-1244.
El-Halwagi, M. M., I. M. Glasgow, X. Qin, M. R. Eden (2004). "Property integration: Componentless design techniques and visualization tools." AIChE Journal 50(8): 1854-1869.
Eljack, F. T., M. R. Eden, V. Kazantzi, M. M. El-Halwagi (2007a). "Molecular Design via Molecular Property Cluster - An Algebraic Approach." Computer Aided Chemical Engineering (accepted).
Eljack, F. T., M. R. Eden, V. Kazantzi, M. M. El-Halwagi (2007b). "Simultaneous Process and Molecular Design - A Property Based Approach." AIChE Journal 53(5): 1232-1239.
Eljack, F. T. and M. R. Eden (2007). "A Visual Approach to Molecular Design using Property Clusters and Group Contribution." Computers & Chemical Engineering (accepted).
Eljack, F. T., A. F. Abdelhady, M. Eden, F. Gabriel, X. Qin, and M.
El-Halwagi (2005). "Targeting optimum resource allocation using reverse problem formulations and property clustering techniques." Computers & Chemical Engineering 29(11-12): 2304-2317.
Eljack, F. T., M. R. Eden, V. Kazantzi, M. M. El-Halwagi (2006). "Property Clustering and Group Contribution for Process and Molecular Design." Computer Aided Chemical Engineering 21, Elsevier.
EPA, United States Environmental Protection Agency (1997). "Printing/Publishing Industry." www.epa.gov/region02/p2/printer.htm.
Floudas, C. A. (1995). Nonlinear and Mixed-Integer Optimization. New York, Oxford University Press.
Foo, D. C. Y., V. Kazantzi, M. M. El-Halwagi, Z. Abdul Manan (2006). "Surplus diagram and cascade analysis technique for targeting property-based material reuse network." Chemical Engineering Science 61(8): 2626.
Franklin, J. L. (1949). "Prediction of Heat and Free Energies of Organic Compounds." Industrial & Engineering Chemistry Research 41: 1070-6.
Friedler, F., L. T. Fan, L. Kalotai, A. Dallos (1998). "A combinatorial approach for generating candidate molecules with desired properties based on group contribution." Computers & Chemical Engineering 22(6): 809.
Gani, R., J. Perregaard, and H. Johansen (1990). "Simulation Strategies for Design and Analysis of Complex Chemical Processes." Trans. I. Chem. E., vol. 68A: 407-417.
Gani, R. (2001). "Computer aided process/product synthesis and design: Issues, needs and solution approaches." AIChE Annual Meeting, Reno.
Gani, R. and E. N. Pistikopoulos (2002). "Property modelling and simulation for product and process design." Fluid Phase Equilibria 194-197: 43-59.
Gani, R. and J. P. O'Connell (2001). "Properties and CAPE: from present uses to future challenges." Computers & Chemical Engineering 25(1): 3.
Gani, R., B. Nielsen, A. Fredenslund (1991). "A group contribution approach to computer-aided molecular design." AIChE Journal 37(9): 1318-1332.
Gani, R., L. Achenie, and V. Venkatasubramanian (2003).
Challenges and Opportunities for CAMD. Computer Aided Molecular Design: Theory and Practice. L. Achenie, R. Gani and V. Venkatasubramanian, Eds. Amsterdam, Elsevier. Computer Aided Chemical Engineering 12: 357.
Garrison, G. W., A. A. Hamad, M. M. El-Halwagi (1995). "Synthesis of waste interception networks." AIChE Annual Meeting, Miami.
Gundersen, T. and L. Naess (1988). "The synthesis of cost optimal heat exchanger networks: An industrial review of the state of the art." Computers & Chemical Engineering 12(6): 503.
Harper, P. M. and R. Gani (1999). "CAMD and Solvent Design: From Group Contribution to Molecular Encoding." AIChE Annual Meeting 1999, Dallas, TX.
Harper, P. M. (2000). A Multi-Phase, Multi-Level Framework for Computer Aided Molecular Design. Ph.D. Thesis, CAPEC, Department of Chemical Engineering, Technical University of Denmark.
Harper, P. M. and R. Gani (2000). "A multi-step and multi-level approach for computer aided molecular design." Computers & Chemical Engineering 24(2-7): 677-683.
Harper, P. M., R. Gani, P. Kolar, T. Ishikawa (1999). "Computer-aided molecular design with combined molecular modeling and group contribution." Fluid Phase Equilibria 158-160: 337.
Hohmann, E. (1971). Optimum Networks for Heat Exchange. Ph.D. Thesis, University of Southern California, Los Angeles.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor, University of Michigan Press.
Hostrup, M. (2002). Integrated Approaches to Computer Aided Molecular Design. Ph.D. Thesis, Computer Aided Process Engineering Center (CAPEC), Technical University of Denmark.
Hostrup, M., P. M. Harper, and R. Gani (1999). "Design of environmentally benign processes: integration of solvent design and separation process synthesis." Computers & Chemical Engineering 23(10): 1395-1414.
Horvath, A. L. (1992). Molecular Design. Amsterdam, Elsevier.
Jalowka, J. and T. Daubert (1986). "Group Contribution Method to Predict Critical Temperature and Pressure of Hydrocarbons." Industrial & Engineering Chemistry Process Design and Development 25(4): 139.
Joback, K. (2006). "Molecular Knowledge Systems, Inc. - Designing Better Chemical Products." From www.molknow.com.
Joback, K. G. and G. Stephanopoulos (1995). "Searching in Spaces of Discrete Solutions: The Design of Molecules Possessing Desired Physical Properties." Advances in Chemical Engineering, 21, Academic Press.
Joback, K. G. and R. C. Reid (1983). "Estimation of Pure-Component Properties from Group Contributions." Chemical Engineering Communications 57: 233.
Kazantzi, V., X. Qin, M. M. El-Halwagi, F. T. Eljack, M. R. Eden (2007). "Simultaneous Process and Molecular Design through Property Clustering Techniques." Industrial & Engineering Chemistry Research (published online April 14, 2007).
Kazantzi, V. and M. M. El-Halwagi (2005). "Targeting material reuse via property integration." Chemical Engineering Progress 101(8): 28-37.
Kazantzi, V., D. Harell, F. Gabriel, X. Qin, M. M. El-Halwagi (2004a). "Property-based integration for sustainable development." Computer-Aided Process Engineering 14, Elsevier, pp. 1069-1074.
Kazantzi, V., X. Qin, F. Gabriel, D. Harell, and M. M. El-Halwagi (2004b). "Process modification through visualization and inclusion techniques for property based integration." In: Floudas, C. A., Agrawal, R. (Eds.), Proceedings of the Sixth Foundations of Computer Aided Design (FOCAPD). CACHE Corp., pp. 279-282.
Kirkpatrick, S., C. D. Gelatt Jr., M. Vecchi (1983). "Optimization by Simulated Annealing." Science 220: 671-680.
Kreglewski, A. and B. Zwolinski (1961). "A New Relation for Physical Properties of n-Alkanes and n-Alkyl Compounds." Journal of Physical Chemistry 65(6): 1050.
Linke, P. and A. Kokossis (2002). "Simultaneous Synthesis and Design of Novel Chemicals and Chemical Process Flowsheets." Computer Aided Chemical Engineering 10: 115-121.
Linnhoff, B. and E.
Hindmarsh (1983). "The pinch design method for heat exchanger networks." Chemical Engineering Science 38(5): 745.
Linnhoff, B., D. Townsend, D. Boland, G. Hewitt, B. Thomas, A. Guy, R. Marsland (1982). A User Guide on Process Integration for the Efficient Use of Energy. Rugby, UK, Institution of Chemical Engineers.
Lydersen, A. (1955). Estimation of Critical Properties of Organic Compounds. Ph.D. University of Wisconsin, Madison, WI.
Lyman, W., W. Reehl, and D. Rosenblatt (1990). Handbook of Chemical Property Estimation Methods. American Chemical Society, Washington D.C.
Marcoulaki, E. C. and A. C. Kokossis (1998). "Molecular design synthesis using stochastic optimisation as a tool for scoping and screening." Computers & Chemical Engineering 22(Supplement 1): S11-S18.
Marrero, J. and R. Gani (2001). "Group-contribution based estimation of pure component properties." Fluid Phase Equilibria 183-184: 183-208.
Material Safety Data Sheets (2006). www.msdsonline.com. MSDSonline.
Nielsen, J., M. Hansen, et al. (1996). "Heat Exchanger Network Modeling Framework for Optimal Design and Retrofitting." Computers & Chemical Engineering 20: S249-S254.
Odele, O. and S. Macchietto (1993). "Computer aided molecular design: a novel method for optimal solvent selection." Fluid Phase Equilibria 82: 39.
Ourique, J. E. and A. S. Telles (1998). "Computer-Aided Molecular Design with Simulated Annealing and Molecular Graphs." Computers & Chemical Engineering 22: S615-S618.
Papoulias, S. A. and I. E. Grossmann (1983). "A structural optimization approach in process synthesis--II: Heat recovery networks." Computers & Chemical Engineering 7(6): 707.
Pistikopoulos, E. N. and S. K. Stefanis (1998). "Optimal solvent design for environmental impact minimization." Computers & Chemical Engineering 22(6): 717.
Pretel, E. J., P. A. López, S. Bottini, E. Brignole (1994). "Computer-aided molecular design of solvents for separation processes." AIChE Journal 40(8): 1349-1360.
Qin, X., F. Gabriel, D.
Harell, M. M. El-Halwagi (2004). "Algebraic Techniques for Property Integration via Componentless Design." Industrial & Engineering Chemistry Research 43: 3792-3798.
Reid, R. C., J. M. Prausnitz, B. Poling (1987). The Properties of Gases and Liquids. New York, McGraw-Hill.
Shelley, M. D. and M. M. El-Halwagi (2000). "Component-less design of recovery and allocation systems: a functionality-based clustering approach." Computers & Chemical Engineering 24(9-10): 2081-2091.
Shenoy, U. (1995). Heat exchange network synthesis: process optimization by energy and resource analysis. Houston, Gulf Publishing Company.
Sherali, H. D. and W. P. Adams (1999). A Reformulation-Linearization Technique for Solving Discrete and Continuous Non-Convex Problems. Dordrecht, Kluwer Academic Publishers.
Sinha, M. and L. E. K. Achenie (2001). "Systematic design of blanket wash solvents with recovery considerations." Advances in Environmental Research 5(3): 239-249.
Smith, R. (2004). Processing integration extends its reach. ChemicalProcessing.com: The digital resource of Chemical Processing magazine, Volume 359.
Srinivas, B. K. (1997). An overview of Mass Integration and its Application to Process Development, GE Research & Development Center.
Srinivas, B. K. and M. M. El-Halwagi (1993). "Optimal design of pervaporation systems for waste reduction." Computers & Chemical Engineering 17(10): 957-970.
Stat-Ease (1999). Design-Expert 6.0. Stat-Ease Inc.
Takama, N., T. Kuriyama, K. Shiroko and T. Umeda (1980). "Optimal Water Allocation in a Petroleum Refinery." Computers & Chemical Engineering, Vol. 4, p. 251.
Teja, A. S., R. J. Lee, D. Rosenthal, and M. Anselme (1990). "Correlation of the critical properties of alkanes and alkanols." Fluid Phase Equilibria 56: 153.
Tsonopoulos, C. (1987). "Critical Constants of Normal Alkanes from Methane to Polyethylene." AIChE Journal 33(12): 2080-2083.
Tsonopoulos, C. and Z. Tan (1993). "The Critical Constants of Normal Alkanes From Methane to Polyethylene: II. Application of the Flory Theory." Fluid Phase Equilibria 83: 127.
Vaidyanathan, R. and M. El-Halwagi (1994). "Global optimization of nonconvex nonlinear programs via interval analysis." Computers & Chemical Engineering 18(10): 889-897.
Van Krevelen, D. W. (1990). Properties of Polymers. Amsterdam, Elsevier.
Van Krevelen, D. W. and P. J. Hoftyzer (1976). Properties of Polymers: Their Estimation and Correlation with Chemical Structure. Amsterdam, Elsevier Scientific Publishing.
Venkatasubramanian, V., K. Chan, J. Caruthers (1994). "Computer-aided molecular design using genetic algorithms." Computers & Chemical Engineering 18(9): 833.
Wang, Y. P. and R. Smith (1994). "Wastewater minimisation." Chemical Engineering Science 49(7): 981.
Whiting, W. B. and Y. Xin (1999). Sensitivity and uncertainty of process simulation to thermodynamics data and models: case studies. American Institute of Chemical Engineers Spring Meeting, Houston.

Appendices

Appendix A: Group Contribution

A.1: 1st order GC Data

Constantinou and Gani (1994) estimate properties of pure organic compounds from their 1st and 2nd order groups. They have provided property models for the following properties:

• Normal boiling and melting temperatures
• Critical pressure, critical volume and critical temperature
• Standard enthalpy of vaporization, standard Gibbs energy, and standard enthalpy of formation

The general group contribution model equation used to predict properties is described by equation 3.26. The left hand side (LHS) of the equation is the property function $f(X)$, and the right hand side (RHS) sums the property contribution $C_i$ of each group $i$, weighted by the number of times $N_i$ the group occurs. Universal constants that are included in the property models are listed in Table A.1, the GC property models are summarized in Table A.2, and all data for the first order groups are listed in Table A.3 (Marrero and Gani, 2001).

$$f(X) = \sum_i N_i C_i \qquad (3.26)$$

Universal constant    Value
t_mo                  102.425 K
t_bo                  204.359 K
t_co                  181.128 K
p_c1                  1.3705 bar
v_co                  4.35 cm^3/mol
g_fo                  -14.828 kJ/mol
h_fo                  10.835 kJ/mol
h_vo                  6.829 kJ/mol
h_fuso                -2.806 kJ/mol
d                     0.01211 m^3/kmol

Table A.1: Listed values of GCM universal constants

Property (X)                                   LHS of Eq. 3.26: function f(X)    RHS of Eq. 3.26: 1st order GC term
Normal melting point (T_m)                     $\exp(T_m / t_{mo})$              $\sum_i N_i T_{m1i}$
Normal boiling point (T_b)                     $\exp(T_b / t_{bo})$              $\sum_i N_i T_{b1i}$
Critical temperature (T_c)                     $\exp(T_c / t_{co})$              $\sum_i N_i T_{c1i}$
Critical pressure (P_c)                        $(P_c - p_{c1})^{-0.5}$           $\sum_i N_i P_{c1i}$
Critical volume (V_c)                          $V_c - v_{co}$                    $\sum_i N_i V_{c1i}$
Standard Gibbs energy¹ (G_f)                   $G_f - g_{fo}$                    $\sum_i N_i G_{f1i}$
Standard enthalpy of formation¹ (H_f)          $H_f - h_{fo}$                    $\sum_i N_i H_{f1i}$
Standard enthalpy of vaporization¹ (H_v)       $H_v - h_{vo}$                    $\sum_i N_i H_{v1i}$
Standard enthalpy of fusion (H_fus)            $H_{fus} - h_{fuso}$              $\sum_i N_i H_{fus1i}$
¹ Properties predicted at 298 K

Table A.2: Property functions for Group Contribution Methods

Table A.3: 1st order groups and their contributions (Marrero and Gani, 2001)

A.2: Molar Volume GC data

The group contribution model for molar volume by Constantinou et al. (1995) was developed to include 1st and 2nd order groups; however, the molecular property clusters presented in this work consider only first order groups, see equation A.1, where $d$ is the universal constant listed in Table A.1, $n_g$ is the number of occurrences of group $g$, and $v_{mg}$ is its molar volume contribution.

$$V_m - d = \sum_{g=1}^{N_g} n_g v_{mg} \qquad (A.1)$$

The first order group contribution data for molar volume are provided in Table A.4.
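As an illustration of equation A.1, the following minimal Python sketch evaluates the molar volume of a formulation from its first order group counts. The function name `molar_volume` and the dictionary layout are hypothetical choices made for this example. The CH2, CH2O and CH3CO contributions are the Table A.4 values; the CH3 contribution of 0.02614 m^3/kmol is an assumption, chosen because it reproduces the molar volume of 130.03 cm^3/mol listed in Table B.2 for two CH3 and four CH2 groups.

```python
# Universal constant d of equation A.1 (Table A.1), in m^3/kmol.
D_CONSTANT = 0.01211

# First order molar volume contributions v_mg, in m^3/kmol.
V_M_CONTRIB = {
    "CH3": 0.02614,    # ASSUMED value, not read from Table A.4 (see lead-in)
    "CH2": 0.01641,    # Table A.4
    "CH2O": 0.02311,   # Table A.4
    "CH3CO": 0.03655,  # Table A.4
}


def molar_volume(group_counts):
    """Return V_m in m^3/kmol from equation A.1: V_m = d + sum_g n_g * v_mg."""
    return D_CONSTANT + sum(
        n_g * V_M_CONTRIB[group] for group, n_g in group_counts.items()
    )


if __name__ == "__main__":
    formulation = {"CH3": 2, "CH2": 4}  # two CH3 and four CH2 groups
    v_m = molar_volume(formulation)
    # 1 m^3/kmol equals 1000 cm^3/mol.
    print(f"V_m = {v_m:.5f} m^3/kmol = {v_m * 1000:.2f} cm^3/mol")
```

Because equation A.1 is linear in the group counts, adding one more CH2 group simply raises the result by 0.01641 m^3/kmol, which is the additive behaviour the molecular property operators rely on.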
Group       v_m (m^3/kmol)    Group        v_m (m^3/kmol)
CH2         0.01641           CHCl         0.02663
CH          0.00711           CCl          0.02020
C           -0.00380          CHCl2        0.04682
CH2=CH      0.03727           CCl2         ****
CH=CH       0.02692           CCl3         0.06202
CH2=C       0.02697           ACCl         0.02414
CH=C        0.01610           CH2NO2       0.03375
C=C         0.00296           CHNO2        0.02620
CH2=C=CH    0.04340           ACNO2        0.02505
ACH         0.01317           CH2SH        0.03446
AC          0.00440           I            0.02791
ACCH3       0.02888           Br           0.02143
ACCH2       0.01916           CH≡C         ****
ACCH        0.00993           C≡C          0.01451
OH          0.00551           ACF          0.01727
ACOH        0.01133           Cl-(C=C)     0.01533
CH3CO       0.03655           HCON(CH2)2   ****
CH2CO       0.02816           CF3          ****
CHO         0.02002           CF2          ****
CH3COO      0.04500           CF           ****
CH2COO      0.03567           COO          0.01917
HCOO        0.02667           CCl2F        0.05384
CH3O        0.03274           HCClF        ****
CH2O        0.02311           CClF2        0.05383
CH-O        0.01799           F            ****
FCH2O       0.02059           CONH2        ****
CH2NH2      0.02646           CONHCH3      ****
CHNH2       0.01952           CONHCH2      ****
CH3NH       0.02674           CON(CH3)2    0.05477
CH2NH       0.02318           CONCH3CH2    ****
CHNH        0.01813           CON(CH2)2    ****
CH3N        0.01913           C2H5O2       0.04104
CH2N        0.01683           C2H4O2       ****
ACNH2       0.01365           CH3S         0.03484
C5H4N       0.06082           CH2S         0.02732
C5H3N       0.05238           CHS          ****
CH2CN       0.03313           C4H3S        ****
COOH        0.02232           C4H2S        ****
CH2Cl       0.03371

Table A.4: 1st order groups and their V_m contributions (Constantinou et al., 1995)

Appendix B: Solubility Estimation Method

Hansen (1976) assumed that solubility is a function of the non-polar ($\delta_d$), polar ($\delta_p$) and hydrogen bonding ($\delta_h$) contributions to the cohesive energy. These solubility parameters can be determined from the molecular make-up according to equation 4.5 (van Krevelen, 1990).

$$\delta_d = \frac{\sum_i F_{di}}{V_m}, \qquad \delta_p = \frac{\sqrt{\sum_i F_{pi}^2}}{V_m}, \qquad \delta_h = \sqrt{\frac{\sum_i E_{hi}}{V_m}} \qquad (4.5)$$

The $F_{di}$, $F_{pi}$, and $E_{hi}$ values for a selection of molecular building blocks are listed in Table B.1. The molar volume ($V_m$) is predicted using group contribution equation (4.6), with the tabulated group contribution parameters published by Constantinou et al. (1995).
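The relations in equation 4.5 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this work: the function name `hansen_parameters` and the `GROUPS` dictionary are hypothetical, the sums are taken over all group occurrences (each contribution multiplied by its count), and the molar volume is computed from the first order molar volume model with constant d = 0.01211 m^3/kmol and converted to cm^3/mol so that the results come out in MPa^1/2. The F_di, F_pi and E_hi entries are the Table B.1 values and the CH2 and CH3CO volume contributions are the Table A.4 values; the CH3 volume contribution of 0.02614 m^3/kmol is an assumption.

```python
import math

D_CONSTANT = 0.01211  # universal constant d, m^3/kmol (Table A.1)

# Per-group data: F_d and F_p in J^1/2 cm^3/2 / mol, E_h in J/mol (Table B.1),
# v_m in m^3/kmol (Table A.4, except CH3 which is an ASSUMED value).
GROUPS = {
    "CH3":   {"F_d": 420.0, "F_p": 0.0,   "E_h": 0.0,    "v_m": 0.02614},
    "CH2":   {"F_d": 270.0, "F_p": 0.0,   "E_h": 0.0,    "v_m": 0.01641},
    "CH3CO": {"F_d": 710.0, "F_p": 770.0, "E_h": 2000.0, "v_m": 0.03655},
}


def hansen_parameters(group_counts):
    """Return (V_m [cm^3/mol], delta_d, delta_p, delta_h [MPa^1/2]) per equation 4.5."""
    v_m = 1000.0 * (
        D_CONSTANT + sum(n * GROUPS[g]["v_m"] for g, n in group_counts.items())
    )
    sum_f_d = sum(n * GROUPS[g]["F_d"] for g, n in group_counts.items())
    sum_f_p_sq = sum(n * GROUPS[g]["F_p"] ** 2 for g, n in group_counts.items())
    sum_e_h = sum(n * GROUPS[g]["E_h"] for g, n in group_counts.items())
    delta_d = sum_f_d / v_m
    delta_p = math.sqrt(sum_f_p_sq) / v_m
    delta_h = math.sqrt(sum_e_h / v_m)
    return v_m, delta_d, delta_p, delta_h


if __name__ == "__main__":
    candidate = {"CH3": 1, "CH2": 4, "CH3CO": 1}
    v_m, d_d, d_p, d_h = hansen_parameters(candidate)
    print(f"V_m = {v_m:.2f} cm^3/mol")
    print(f"delta_d = {d_d:.2f}, delta_p = {d_p:.2f}, delta_h = {d_h:.2f} MPa^1/2")
```

For the candidate shown (one CH3, four CH2 and one CH3CO group), the sketch gives 140.44 cm^3/mol and solubility parameters of 15.74, 5.48 and 3.77 MPa^1/2, which agree with the corresponding entries in Table B.2.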
Hildebrand parameters were originally expressed as a measure of cohesive energy density, in (cal/cm^3)^{1/2}; a newer form that conforms to the International System of Units (SI) is expressed in terms of cohesive pressure, in MPa^{1/2}. According to Burke (1984), the conversion between the two units is as follows:

$$\delta\,[\mathrm{MPa}^{1/2}] = 2.0455 \times \delta\,[(\mathrm{cal/cm^3})^{1/2}] \qquad (B.1)$$

Barton (1985) determined the solute-solvent constraint, $R_{ij}$, for synthesized molecules according to equation 4.5, where $i$ is the solute and $j$ is the solvent. For the aniline solvent application example, the calculated solubility parameters are listed in Table B.2. In this case study, $i$ represents the designed formulations (M1-M8) and $j$ is aniline (CAS 62-53-3), whose Hansen solubility parameters are obtained from the ICAS data bank and are also listed in Table B.2.

$$R_{ij} = \sqrt{4\left(\delta_d^i - \delta_d^j\right)^2 + \left(\delta_p^i - \delta_p^j\right)^2 + \left(\delta_h^i - \delta_h^j\right)^2} \qquad (4.)$$

Group       F_di (J^1/2 cm^3/2 /mol)    F_pi (J^1/2 cm^3/2 /mol)    E_hi (J/mol)
CH3-        420                         0                           0
-CH2-       270                         0                           0
-CH2O-      520                         400                         3000
CH3CH2-     690                         0                           0
CH3O-       520                         400                         3000
-CH2CO      560                         770                         2000
-CH3CO      710                         770                         2000
-O-         100                         400                         3000
-CO-        290                         770                         2000

Table B.1: Parameters for estimation of Hansen solubility (van Krevelen, 1990)

CH3   CH2   CH2O   CH3O   CH2CO   CH3CO   V_m (cm^3/mol)   δ_d (MPa^1/2)   δ_p (MPa^1/2)   δ_h (MPa^1/2)   R_ij (MPa^1/2)
2     4     -      -      -       -       130.03           14.77           0               0               15.50
2     5     -      -      -       -       146.44           14.95           0               0               15.29
2     6     -      -      -       -       162.85           15.11           0               0               15.11
2     7     -      -      -       -       179.26           15.23           0               0               14.98
1     3     -      -      1       -       141.78           12.63           5.43            3.76            14.88
1     4     -      -      -       1       140.44           15.74           5.48            3.77            9.65
2     8     1      -      -       -       218.78           16.09           1.83            3.70            10.73
1     3     -      1      -       -       120.22           14.56           3.33            5.00            11.58

Aniline solubility parameters (ICAS, 2006): δ_d = 19.32, δ_p = 7.86 and δ_h = 9.78 MPa^1/2

Table B.2: Solubility calculations for candidate solven