COLLECTIVE CREATIVITY IN SCIENTIFIC COMMUNITIES 
 
 
 
 
 
Except where reference is made to the work of others, the work described in this thesis is 
my own or was done in collaboration with my advisory committee. This thesis does not 
include proprietary or classified information. 
 
 
 
 
 
_____________________________________ 
Guangyu Zou 
 
 
 
Certificate of Approval: 
 
 
 
_____________________________
Jeffrey Smith
Professor
Industrial and Systems Engineering

_____________________________
Levent Yilmaz, Chair
Associate Professor
Computer Science and Software Engineering

_____________________________
Saeed Maghsoodloo
Professor
Industrial and Systems Engineering

_____________________________
George T. Flowers
Dean
Graduate School
 
COLLECTIVE CREATIVITY IN SCIENTIFIC COMMUNITIES 
 
 
 
Guangyu Zou 
 
 
 
A Thesis 
Submitted to 
the Graduate Faculty of  
Auburn University 
in Partial Fulfillment of 
the Requirements for the 
Degree of 
Master of Science 
 
 
Auburn, Alabama 
August 10, 2009 
 
COLLECTIVE CREATIVITY IN SCIENTIFIC COMMUNITIES 
 
 
Guangyu Zou 
 
 
Permission is granted to Auburn University to make copies of this thesis at its discretion, 
upon request of individuals or institutions and at their expense. The author reserves all 
publication rights. 
 
 
_____________________________ 
                                                                              Signature of Author 
 
 
 _____________________________ 
                                                                              Date of Graduation 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
VITA 
 
 
 
 
     Guangyu Zou, son of Baoyong Zou and Xiuzhi Kou, was born on May 12, 1979 
in Liaoyang, P. R. China. He graduated from Liaoyang No. 1 High School in 1997. He 
entered Northeastern University, Shenyang, P. R. China and graduated with a Bachelor of 
Science degree in Automation in July 2001. He was admitted into the Graduate School of 
Northeastern University upon recommendation with the entry examination waived. In 
March 2004 he graduated with a Master of Science in Systems Engineering. From April 
2004 to June 2007, he worked at ZTE CO., Ltd as a software engineer. From June 2007 
to December 2007, he worked at Alcatel-Lucent as a UMTS OAM Software Developer. 
After that, he entered Graduate School at Auburn University majoring in Industrial and 
Systems Engineering in January, 2008. During his graduate study at Auburn University, 
he performed well on both his academic study and research work. From January 2008 to 
December 2008, he was a graduate assistant in charge of keeping the department website 
updated and computer maintenance.  
 
 
 
 
 
 
 
 
 
 
 
THESIS ABSTRACT 
 
COLLECTIVE CREATIVITY IN SCIENTIFIC COMMUNITIES 
 
 
Guangyu Zou 
Master of Science, August 10, 2009 
(M.S., Northeastern University, March 2004) 
(B.S., Northeastern University, July 2001) 
 
131 Typed Pages 
Directed by Levent Yilmaz 
 
Innovation is the driving force of personal growth, national wealth and social
progress. Significant attention has been given to advancing cyber infrastructures, but little
is known about the factors at work in them, or about how those factors interact to produce a
context that fosters innovation. It is widely accepted that organizational creativity and
innovation rates are high in open scientific communities, so it is important to analyze such
communities in order to better understand their mode of operation. Our objective in this
study is to use agent simulation as a computational laboratory to understand the innovation
potential of scientific communities. The simulation model serves as a thinking tool for
policy analysis aimed at fostering innovation in scientific communities. The simulation
results show that centrality, as a measure of degree of connectedness, is positively
correlated with innovation in exploration-oriented and utility-oriented communities but
negatively correlated in service-oriented communities. Additionally, utility-oriented
communities have social networks with low density and high centrality, which suggests high
potential for innovation.
 
 
ACKNOWLEDGEMENTS
 
 
I would like to express special gratitude to my advisor Dr. Levent Yilmaz, associate 
professor, Department of Computer Science and Software Engineering at Auburn 
University, for his instruction, guidance, encouragement and patience in completion of 
the research and thesis. In particular, his suggestions, criticisms and materials greatly 
contributed to this thesis. 
Thanks also to my advisory committee members, Dr. Jeffrey Smith and Dr. Saeed
Maghsoodloo, and to the professors and staff members in the Department of Industrial and
Systems Engineering at Auburn University for their kindness and help through these two
years.
Finally, sincere thanks to my wife Ying Zhao. She gave her greatest support and 
encouragement to help me succeed in finishing all the research work. Also, I thank my 
parents, who poured enormous effort into supporting my study during these years. 
 
 
 
 
 
 
  
 
Style manual or journal used: Guide to Preparation and Submission of Thesis and 
Dissertation 
Computer software used: Microsoft Word 2007 
 
 
 
TABLE OF CONTENTS 
 
 
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 Introduction
1.1 Problem
1.2 Importance
1.3 Methodology
1.4 Organization of the Thesis
CHAPTER 2 Literature Review
2.1 Scientific Communities
2.2 Complex Adaptive Systems Perspective
2.3 Exploration of Scientific Communities Using Agent Based Modeling
CHAPTER 3 Design Concepts and Details
3.1 Purpose
3.2 State Variables and Scales
3.3 Process Overview and Scheduling
3.3.1 Entry and Enculturation
3.3.2 Innovation and Generators
3.3.3 Sub-Domain
3.3.4 Evaluator
3.3.5 Turnover
3.3.6 Scheduling
3.4 Framework for Communicating Individual Agent
3.4.1 Relationships
3.4.2 Fitness
3.4.3 Stochasticity
3.4.4 Observation
3.5 Details
3.5.1 Initialization
3.5.2 Types of Scientific Communities
CHAPTER 4 Implementation of Simulation Model
4.1 Introduction of Repast
4.1.1 Contexts
4.1.2 Projections
4.2 Implementation of Agents
4.2.1 Basic Agent
4.2.2 Individuals
4.2.3 Evaluators
4.2.4 Kenes
4.3 Implementation of Projection
4.3.1 Continuous Space
4.3.2 Network
4.4 Scheduler
4.4.1 Directly Schedule an Action
4.4.2 Schedule with Annotations
4.4.3 Global Method
4.5 Output
CHAPTER 5 Verification, Validation and Evaluation
5.1 Verification
5.1.1 Unit Testing
5.1.2 Integration Testing
5.2 Validation
5.3 Experiments
5.3.1 Innovation Metrics
5.3.2 Network Metrics
5.3.3 Sensitivity Analysis
5.3.4 Experiments between Different Types of Communities
5.3.5 What Distinguishes Innovative Communities?
CHAPTER 6 Conclusions
6.1 Extension
6.2 Future Research
REFERENCES
APPENDIX
 
LIST OF FIGURES 
Figure 1.1 Complexity in terms of randomness
Figure 1.2 Systems model of creativity
Figure 3.1 Flow chart of the life cycle of agents
Figure 3.2 Susceptibility to influence
Figure 3.3 Demonstration for elaboration
Figure 3.4 Location of new kenes
Figure 3.5 Demonstration for combination
Figure 3.6 Demonstration for kene selection
Figure 3.7 The pattern of kenes due to sub-domains
Figure 3.8 Three kinds of relationships
Figure 3.9 Flowchart of decision process of exploration-oriented community
Figure 3.10 Flowchart of decision process of utility-oriented community
Figure 3.11 Flow chart of decision type of service-oriented community
Figure 4.1 Contexts and projections
Figure 4.2 Class view of systems model of creativity
Figure 5.1 Overview of simulation model development
Figure 5.2 The number of kenes over time
Figure 5.3 Percentage of kenes
Figure 5.4 The pattern of kenes with the size of 10
Figure 5.5 The pattern of kenes with the size of 20
Figure 5.6 The pattern of kenes with the length of 10
Figure 5.7 The pattern of kenes with mainly using elaboration
Figure 5.8 The pattern of kenes with balance
Figure 5.9 Impacts of growth rate on density
Figure 5.10 Impacts of growth rate on centrality
Figure 5.11 Impacts of recruitment on density
Figure 5.12 Impacts of recruitment on centrality
Figure 5.13 Impacts of turnover on density
Figure 5.14 Impacts of turnover on centrality
Figure 5.15 Histogram of the number of agents who create the same number of kenes
Figure 5.16 Plot of the number of kenes created by each agent
Figure 5.17 Emergent pattern in GEM layout
Figure 5.18 Impact factor over time on exploration-oriented community
Figure 5.19 Histogram of the number of agents who create the same number of kenes
Figure 5.20 Plot of the number of kenes created by each agent
Figure 5.21 Emergent pattern in GEM layout
Figure 5.22 Impact factor over time on utility-oriented community
Figure 5.23 Histogram of the number of agents who create the same number of kenes
Figure 5.24 Plot of the number of kenes created by each agent
Figure 5.25 Emergent pattern in GEM style
Figure 5.26 Impact factor over time on utility-oriented community
Figure 5.27 Histogram of the number of agents who create the same number of kenes
Figure 5.28 Plot of the number of kenes created by each agent
Figure 5.29 Emergent pattern in GEM style
Figure 5.30 Impact factor over time on service-oriented community
Figure 5.31 Number of accepted kenes
Figure 5.32 Average kene fitness
Figure 5.33 Average diffusion for kenes with different types of communities
Figure 5.34 Density with different types of communities
Figure 5.35 Centrality with different types of communities
Figure 5.36 Proportion of strong ties with different communities
Figure 5.37 Clustering coefficient for agents with different types of communities
Figure 5.38 Total clustering coefficient with different types of communities
Figure 5.39 Network metrics of exploration-oriented community grouped by average kene fitness
Figure 5.40 Network metrics of utility-oriented community grouped by average kene fitness
Figure 5.41 Network metrics of service-oriented community grouped by average kene fitness
 
 
 
LIST OF TABLES 
 
Table 3.1 State variables and scales
Table 3.2 Parameters of three kinds of communities
Table 3.3 Parameters of exploration-oriented community
Table 3.4 Parameters of utility-oriented community
Table 3.5 Parameters of service-oriented community
Table 4.1 Definition of basic agent
Table 4.2 Definition of individual
Table 4.3 Definition of evaluator
Table 4.4 Definition of kene
Table 5.1 Range of parameters for communities
Table 5.2 Predefined parameters for exploration-oriented community
Table 5.3 Results on exploration-oriented community
Table 5.4 Predefined parameters for utility-oriented community
Table 5.5 Results on utility-oriented community with equal probabilities
Table 5.6 Results on utility-oriented community with unequal probabilities
Table 5.7 Predefined parameters for service-oriented community
Table 5.8 Results on service-oriented community
Table 5.9 Experiment results
Table 5.10 Network metrics of exploration-oriented community grouped by average kene fitness
Table 5.11 Network metrics of utility-oriented community grouped by average kene fitness
Table 5.12 Network metrics of service-oriented community grouped by average kene fitness
 
 
 
CHAPTER 1                                                                   
Introduction 
1.1 Problem 
Innovation is the driving force of personal growth [1], national wealth [2] and
social progress [3]. Creativity is the ability to produce work that is both novel and
appropriate. It is a topic of wide scope that is important at both the individual and
societal levels for a wide range of task domains [4]. From a theoretical standpoint,
understanding creative capacity is integral to a complete account of human cognition for
the simple reason that human thought is so essentially generative [5]. Significant attention
has been given to advancing cyber infrastructures, but little is known about whether virtual
open science communities are capable of producing a context that contributes to
innovation [6]. The proposition is that virtual organizations can more efficiently and
effectively leverage the combination of diverse information, knowledge, skills and
resources from different locations and thereby enhance the individual opportunity to learn
and the organizational capacity to innovate. To date, however, these claims remain largely
untested. Furthermore, the knowledge creation process in such communities is poorly
understood [7].
 
 
 
The problem this thesis focuses on is how innovation emerges based on the 
connections among members in the scientific community and how to determine whether 
or not a specific configuration has potential to lead to innovation. 
1.2 Importance 
Scientific communities consist of members who not only work on a common
product, but also work together and adjust their actions to new information. Because
organizational creativity and innovation rates are high in such community forms, the study
of scientific communities is important to solving problems and sharing knowledge.
However, we know little about how social collectives govern and coordinate the actions
of individuals to produce innovation output in an effective and efficient way. This study
aims to explore alternative forms of organizing and governance to improve the innovation
and sustainability of community forms such as exploration-oriented, utility-oriented and
service-oriented communities [6].
1.3 Methodology 
The strategy adopted here is to explore how and why communities of innovation
form and evolve, using agent based modeling and viewing such communities as
complex adaptive systems.
A complex system is a system composed of interconnected parts that as a whole
exhibits one or more properties (behavior among the possible properties) not obvious from
the properties of the individual parts. In essence, complexity is concerned with
emergence, the process by which the global behavior of a system results from the
actions and interactions of agents [8].
A number of scientists have proposed candidate measures of effective complexity.
Most of the proposed measures differ from each other but share at least one important
characteristic: strictly regular things as well as strictly irregular things are simple,
while things that are neither strictly regular nor strictly irregular are complex.
Figure 1.1 illustrates how information, compressibility and
randomness relate to any of these useful notions of complexity [9].


Complexity thus lies between order and randomness.
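
As a rough illustration of the two simple extremes, the following Python sketch uses
compressed size as a stand-in for randomness: a strictly regular sequence compresses to
almost nothing, while a pseudo-random one barely compresses at all. This is not a measure
of effective complexity. The use of zlib, the function names and the sample inputs are
assumptions made for this sketch; they do not come from [9] or from the simulation model.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more regular)."""
    return len(zlib.compress(data, 9)) / len(data)


def make_samples(n: int = 10_000, seed: int = 0) -> dict:
    """Build one strictly regular and one pseudo-random byte sequence of length n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    regular = b"ab" * (n // 2)  # hypothetical "orderly" input
    irregular = bytes(rng.randrange(256) for _ in range(n))  # hypothetical "random" input
    return {"regular": regular, "random": irregular}


if __name__ == "__main__":
    for name, data in make_samples().items():
        print(f"{name:8s} compression ratio: {compression_ratio(data):.3f}")
```

Compressed size keeps growing with randomness, so on its own it cannot single out the
intermediate region; the measures of effective complexity discussed above are meant to be
low at both ends.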
Figure 1.1 Complexity in terms of randomness (diagram labels: Orderly, Random, Complexity)

Scientific communities have properties of a complex system, such as unpredictable
creativity. At the same time, even a simple agent-based model (ABM) can exhibit
complex behavior patterns and provide valuable information about the dynamics of the
real-world system that it emulates [10]. Agent based modeling is therefore used here to
study a scientific community. ABMs provide theoretical leverage where global patterns of
interest are more than the aggregation of individual attributes, but at the same time, the
emergent pattern cannot be understood without a bottom up dynamic model of the micro
foundations at the relational level [11].
Each agent has its own state and acts independently of the others, so the agent
population is heterogeneous; at the same time, every agent behaves according to the same
predefined rule imposed on the system.
A scientific community has three main components: individuals, evaluators
and knowledge units (i.e., kenes) such as articles, experiments and various other artifacts
produced during the scientific process. Their relationships are shown in Figure 1.2.
 
 
Figure 1.2 Systems model of creativity [adapted and extended from [12]] (diagram labels:
Domain, Evaluator, Individual, Kenes, Climate and Structure, Selected Novelty, Transmits
Information)

The components shown in Figure 1.2 make up the systems model of creativity,
which is useful for explaining the innovation process in an open scientific society. The
first component is the "individual", which plays the role of generator and makes
contributions to the society. The second component is the "evaluator", whose function is to
judge whether the contributions created by individuals are appropriate. The last component
is the domain, which contains the knowledge that members of the society have created and
are interested in; the domain is composed of kenes, which represent knowledge units.
As shown in Figure 1.2, the simulation model presented in this thesis also has three
major components: individual, evaluator and kene. Kenes are created by individuals and
evaluated by evaluators, so kenes have no behaviors of their own; they only hold state
variables. Both individual and evaluator share a common superclass called basic agent,
because they have some behaviors in common, such as moving in the grid. There are
therefore two types of agents, individual and evaluator. Our purpose is to analyze how
these two kinds of agents interact with each other to generate patterns of kenes and
socio-technical networks.
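
The class relationships described above might be sketched as follows. This is a minimal
Python illustration only: the thesis implements the model in Repast, and the attribute
names, the fitness field and the acceptance threshold used here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Kene:
    """A knowledge unit: state only, no behavior (fields are hypothetical)."""
    creator_id: int
    fitness: float
    accepted: bool = False


class BasicAgent:
    """Common superclass holding the behavior shared by both agent types."""

    def __init__(self, agent_id: int, x: float = 0.0, y: float = 0.0):
        self.agent_id = agent_id
        self.x = x
        self.y = y

    def move(self, dx: float, dy: float) -> None:
        """Shared behavior: change position in the space."""
        self.x += dx
        self.y += dy


class Individual(BasicAgent):
    """Generator: creates kenes."""

    def create_kene(self, fitness: float) -> Kene:
        return Kene(creator_id=self.agent_id, fitness=fitness)


class Evaluator(BasicAgent):
    """Judges whether a kene is appropriate (the threshold rule is an assumption)."""

    def __init__(self, agent_id: int, threshold: float = 0.5, **kwargs):
        super().__init__(agent_id, **kwargs)
        self.threshold = threshold

    def evaluate(self, kene: Kene) -> bool:
        kene.accepted = kene.fitness >= self.threshold
        return kene.accepted
```

Keeping Kene as a plain data holder mirrors the statement that kenes only own state
variables, while the shared move method is the kind of behavior that motivates a common
superclass.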
Repast is used as a computational laboratory to simulate the activities of
individuals in the scientific community, where two kinds of agents represent individuals
and evaluators respectively, and three sorts of networks correspond to the relationships
among agents. Agents with different states follow the same rule and move
independently. The simulation results show that centrality has significant effects on
innovation across all communities. Additionally, the social network of a utility-oriented
community has low density and high centrality, which yields more potential for innovation.
Based on the analysis of impact factors changing over time, research topics in the
utility-oriented community lose public interest in a shorter period than in the other two
types of communities. At the same time, utility-oriented communities show the greatest
variation in average kene fitness. Finally, elaboration plays an important role in
innovation. However, extensive divergence may mean a lack of coherence, so balance is
needed.
1.4 Organization of the Thesis 
The remaining chapters of this thesis are organized as follows. Chapter 2 is a
literature review covering the background knowledge and circumstances pertinent to
creativity in scientific communities. Chapter 3 follows the ODD (Overview, Design
concepts, Details) template and describes the overall design of the simulation model.
Chapter 4 focuses on the implementation of the simulation model using Repast, followed
by Chapter 5, which verifies, validates and evaluates the simulation model. The last
chapter summarizes conclusions about using Repast to simulate creativity in scientific
communities.
 
 
 
CHAPTER 2                                                                   
Literature Review 
2.1 Scientific Communities 
The scientific community consists of scientists, domain knowledge as well as their 
relationships and interactions. It is normally divided into "sub-communities" each of 
which works on a particular field within science, and objectivity is expected to be 
achieved by the scientific method. Peer review, through discussion and debate within 
journals and conferences, assists in this objectivity by maintaining the quality of research 
methodology and interpretation of results.  
Promoting affiliation between scientists is relatively easy, but creating larger
organizational structures is more difficult, due to traditions of scientific independence,
difficulties of sharing implicit knowledge, and formal organizational barriers [13].
An Open Community System is an open project that aggregates the efforts of many
geographically separate individuals toward a common research problem [13]. Open
scientific community forms of organization, which have emerged in recent years, are
based on open research conducted in the spirit of free and open source software. Much as
open source schemes are built around source code that is made public, the central theme of
open research is to make clear accounts of the methodology, along with the data and
results, available via the internet. This permits massively distributed collaboration. Such
a scientific form has affected almost every area of science, including the social sciences
(anthropology, economics, psychology, geography, linguistics, philosophy, political
science, sociology, history, education, law and management) and the applied sciences
(architecture, cognitive sciences, engineering, health sciences such as medicine, and
military science).
Open source software is the most prominent example of open source development. 
The success of Linux, an open source operating system, is currently receiving much 
attention from software developers and software users alike. Linux is touted as highly stable 
and reliable. It has steadily increased its market share and has led to a consolidation 
among UNIX operating systems. Typically, open source software is developed by an 
internet-based community of programmers. Participation is voluntary and participants do 
not receive direct compensation for their work. In addition, the full source code is made 
available to the public [14].  
Open source systems are built by potentially large numbers (i.e., hundreds or even 
thousands) of volunteers. Work is not assigned; people undertake the work they choose to 
undertake. There is no explicit system-level design, or even detailed design.  There is no 
project plan, schedule, or list of deliverables [15]. Work on open source projects 
can therefore be summarized as 
• a creative exercise 
• leading to useful output 
• where the creativity is a lead driver of individual effort. [16]  
The value of any kind of data is greatly enhanced when it exists in a form that 
allows it to be integrated with other data. One approach to integration is to annotate 
data using common controlled vocabularies or ontologies. Unfortunately, this approach 
has led to a proliferation of ontologies, which itself hinders integration. In the biomedical 
domain, for instance, the Open Biomedical Ontologies (OBO) consortium is pursuing a 
strategy to overcome this problem. Existing OBO 
ontologies, including the Gene Ontology, are undergoing coordinated reform, and new 
ontologies are being created on the basis of an evolving set of shared principles 
governing ontology development. The result is an expanding family of ontologies 
designed to be interoperable and logically well formed and to incorporate accurate 
representations of biological reality. [17] 
Any individual who wishes to join the OBO needs to follow these steps. 
1. First join one or more mailing lists in salient areas as a way to become 
familiar with the collaborative. 
2. This will be followed by a period in which compliance with the principles is 
addressed, especially as it concerns potential conflicts in areas of overlap. 
3. By joining the initiative, the authors of an ontology commit to working with 
other members to ensure that, for any particular domain, there is convergence 
on a single ontology. 
There are two central features of collective invention [18]: firms release to their 
competitors information about the design and efficiency of new plants or technologies; 
and individual firms devote very few resources explicitly to the discovery of new 
knowledge. Thus the key to understanding collective invention is in the exchange and 
free circulation of knowledge and information within groups rather than in the inventive 
efforts of particular firms or individuals. 
The individuals in our model, discussed in Chapter 3, follow a life cycle similar to the 
joining process described above. 
 
Prior research suggests that open scientific communities can be considered 
complex, self-organizing systems that typically comprise large numbers of locally 
interacting elements. Although the rules describing those local interactions may be few 
and simple, global properties often emerge that are unexpected and difficult to predict. 
[19] The next section discusses the complex adaptive systems perspective in detail. 
2.2 Complex Adaptive Systems Perspective 
Human societies are complex in that there are many non-linear interactions 
between their units, that is, between people. The interactions involve the transmission of 
knowledge and materials that often affect the behavior of the recipients. As a result, 
it is impossible to analyze a society as a whole by studying the individuals within 
it one at a time. The behavior of the society is said to emerge from the actions of its units 
[20]. It is suggested in [21] that a hierarchical architecture tends to emerge spontaneously 
in complex systems, because complex systems are born out of simple ones and because 
simple systems then tend to be included in more complex ones. The rationale for the 
emergence of a hierarchy of modules in our model of scientific community development 
is very similar: a complex system is dynamically born out of a simple one; new modules 
are created out of existing ones to supplement them, by developing existing 
functionalities or adding new ones; and these new modules can be included in higher 
ones during the compilation process or at least are called as sub-systems. [22] 
 
2.3 Exploration of Scientific Communities Using Agent Based Modeling 
In order to study open science communities, agent based modeling (ABM) is used 
here. ABMs provide theoretical leverage where the global patterns of interest are more 
than the aggregation of individual attributes, but at the same time, the emergent pattern 
cannot be understood without a bottom up dynamic model of the micro foundations at the 
relational level. [11]  
Agent-based modeling relies on a novel view of the creation of structure in social 
systems. Traditional social science generally assumes that social facts such as markets or 
cooperative behavior exist, and that they produce various forms of social organization 
and structure. Agent-based modeling assumes that both social structure and such social 
facts as markets or cooperative behavior are created from the bottom up via the 
interactions of individual agents. Rather than examining how social structure shapes 
behavior, agent-based modeling focuses on how local interactions among agents serve to 
create larger and perhaps global social structures and patterns of behavior. [23] ABM 
has been applied successfully in many areas. For instance, agent-based 
computational economics (ACE) is the computational study of economies modeled as 
evolving systems of autonomous interacting agents. Thus, ACE is a specialization of 
economics to the basic complex adaptive systems paradigm [24]. 
An understanding of organizational creativity will necessarily involve 
understanding the creative process, the creative product, the creative person, the creative 
situation and the way in which each of these components interacts with the others [25].  
 
 
The following chapter is organized according to the ODD protocol, which stands 
for overview, design concepts and details [26]. 
 
 
 
 
 
 
CHAPTER 3                                                                   
Design Concepts and Details 
3.1 Purpose 
Applications-oriented social scientific simulation models are characterized by 
relatively complex agents, as they contain as a rule several social scientific theories, and 
by the fact that, at the same time, the simulations work with large populations of tens of 
thousands of agents. The simulations may be grounded in empirical data and are utilized 
to model real social systems, whereby social networks must also be modeled in a 
functionally equivalent way to the real social world. The goal of computer modeling is 
the optimization of social interventions. [27]  
Simulating the process of knowledge production has two goals. One is to use 
agent-based simulation as a computational laboratory for studying innovation in a 
scientific community. The other is to explore the innovation output of different simulated 
communities under a variety of configurations and parameter settings. 
3.2 State Variables and Scales 
Table 3.1 presents the attributes of the individual and the evaluator. An individual has 
seven attributes associated with it. The first three are probabilities that indicate how 
frequently each of the three kinds of generators for contribution, discussed in section 
3.3.2, is invoked. These three probabilities sum to one. Motivation represents 
the probability that an agent is activated in each time interval. Motivation is based on a 
reputation that corresponds to the agent's contribution. In general, the more kenes an 
agent creates, the higher his reputation. The tenured attribute denotes whether or not 
an agent has passed the enculturation process. Only agents who are tenured can 
make contributions to the community in which they exist. In addition, the time at which 
the individual is created is used to find the time in the future at which to judge whether he 
has met the requirements to be tenured. The last attribute is the sub-domain the agent is 
most interested in. 
The evaluator has three state variables associated with it. The first two are weight 
factors used to compute the fitness of a specific kene, and the third is a threshold. If the 
fitness of a kene is greater than or equal to the threshold, the kene is retained in the 
domain. Otherwise it is removed. 
Table 3.1 State variables and scales 
Components    State Variables 
Individual    Probability of elaboration 
              Probability of creation 
              Probability of combination 
              Reputation 
              Motivation 
              Tenured 
              Time at which the individual is created 
              Sub-domain 
Evaluator     Weight of input links of kenes 
              Weight of output links of kenes 
              Threshold for a kene to be retained 
 
There are also some predefined scales for this simulation model. 
• The size of the grid 
This parameter defines the grid in which all the agents are located. 
• The length of a kene 
This parameter denotes the size of the bit vector that is used to represent a kene. 
• The stop tick 
This parameter defines the time at which the simulation ends. 
3.3 Process Overview and Scheduling 
There are three novel factors in Neil Smith's model: the complexity of software 
modules as a limiting factor in productivity, the fitness of the software to its requirements, 
and the motivation of the developers. [28] In the model presented here, the kene's bit 
vector, the kene's fitness and the motivation of the individual correspond to these three 
factors, respectively. 
 
[Figure 3.1: flow chart with the nodes Begin, Entry of agents with random number, 
Enculturation, the decision Reputation > threshold (branches Yes and No), Innovation, 
and Turnover] 
Figure 3.1 shows the flow chart of an agent's life cycle. 
At each time interval, a random number of new agents enter the community. The 
maximum number of new agents depends on the type of society (exploration-oriented, 
utility-oriented or service-oriented): a utility-oriented community has the highest 
growth rate, a service-oriented community has a moderate growth rate, and an 
exploration-oriented community has the lowest. 
After a new agent moves into the society, he begins the enculturation process, 
during which he gradually becomes familiar with the society. After a constant time 
interval, an evaluation determines whether or not the new agent has reached a sufficient 
enculturation level to contribute to the domain. The new agent is qualified if and only if 
his enculturation level is greater than a threshold. Otherwise, the new agent leaves the 
society. The enculturation threshold also depends on the type of community: an 
exploration-oriented community has a higher threshold than a utility-oriented 
community, whose threshold is in turn higher than that of a service-oriented community. 
The following sections discuss the entry, enculturation, innovation and 
turnover processes in detail. 
3.3.1 Entry and Enculturation 
At every time interval, a random number of new agents enter the community and 
begin the enculturation process. Agents go through the enculturation process to become 
familiar with the domain before making contributions. 
Figure 3.1 Flow chart of the life cycle of agents 
 
During the enculturation process, agents move randomly within the knowledge 
space. As they move, at each time interval they select a random number of community 
members in their neighborhood to interact with. The change in fitness level of the agent is 
a function of its susceptibility to influence and the intensity of influence it receives from 
the agents that it interacts with. 
The first parameter is the susceptibility of agents, which is defined as follows. 

$$S_i(t) = \alpha_0 + e^{-\alpha_1 - \alpha_2 \tau_i(t-1)} \tag{3.1}$$

where $\tau_i(t-1)$ is the time period during which the agent has been in the 
community. The parameters $\alpha_0$, $\alpha_1$ and $\alpha_2$ ensure that the initial 
susceptibility is high and that it decreases exponentially. 
Figure 3.2 shows the curve of this function. 
 
Figure 3.2 Susceptibility to influence 
 
The second parameter is the inclination of agents, which is defined in equation 3.2. 

$$I_i(t) = \sum_{j} I_{ij}(t), \qquad I_{ij}(t) = \left(F_j(t-1) - F_i(t-1)\right)\gamma \tag{3.2}$$

where $\gamma$ is the rate at which the enculturation level is pulled toward the peer and 
$I_{ij}(t)$ indicates the influence of peer j on agent i. The influence that an agent receives 
is the sum of the influences from all its peers. 
The enculturation level of an agent is then specified as a function of the 
susceptibility and inclination. 

$$F_i(t) = F_i(t-1) + S_i(t)\, I_i(t) \tag{3.3}$$
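The enculturation update can be illustrated with a short Python sketch. It is not the Repast implementation used in this thesis. It assumes that equation 3.1 has the form S_i(t) = a0 + exp(-a1 - a2 * tenure), and the parameter values, the function names and the name pull_rate for the peer pull rate are hypothetical choices made for illustration.

```python
import math


def susceptibility(tenure, a0=0.1, a1=0.5, a2=0.3):
    """Assumed form of equation 3.1: high at entry, decaying with tenure.

    a0, a1 and a2 are hypothetical values, not taken from the thesis.
    """
    return a0 + math.exp(-a1 - a2 * tenure)


def inclination(own_level, peer_levels, pull_rate=0.2):
    """Equation 3.2: sum of the pulls toward each peer interacted with."""
    return sum((peer - own_level) * pull_rate for peer in peer_levels)


def update_enculturation(own_level, tenure, peer_levels):
    """Equation 3.3: F_i(t) = F_i(t-1) + S_i(t) * I_i(t)."""
    return own_level + susceptibility(tenure) * inclination(own_level, peer_levels)
```

Under these assumptions, a newcomer surrounded by more enculturated peers moves toward them quickly, while a long-standing member with the same peers moves much less.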
3.3.2 Innovation and Generators 
Evolutionary theory of technical change often contains the following components. 
1. The point of departure is the existence and reproduction of entities like 
genotypes in biology or a certain set-up of technologies and organizational forms in 
innovation studies.  
2. There are mechanisms that introduce novelties in the system. These include 
significant random elements, but may also produce predictable novelties.  
3. There are mechanisms that select from among the entities present in the system. 
The selection process reduces diversity and the mechanism operating may be the natural 
selection of biology. [29] In this model, all three components are present, which indicates 
its potential to generate innovation to some extent. 
At each time interval, the motivation of agents determines whether or not this agent 
will make a contribution. The agents with higher motivation are more likely to gain the 
opportunity of innovation. There are three types of generators including creation, 
elaboration and combination. 
 
3.3.2.1 Kene Creation 
The creation operator is to create a new kene independently. The steps are as 
follows. 
1. Randomly select a number n between 0 and the length of kene. 
2. Randomly select n bits from a kene and assign every selected bit as 1. 
3. Calculate the distance between the new kene and reference point, which is the 
number of bits with the value 1. 
4. Put the new kene at one of the four possible locations. 
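As an illustration only, the creation steps can be sketched in Python. The text does not say how bits map to the two grid dimensions, so the split of the bit vector into an x half and a y half is an assumption, as are all function names.

```python
import random

KENE_LENGTH = 10  # the initialization section sets the kene length to 10


def create_kene(rng, length=KENE_LENGTH):
    """Steps 1 and 2: pick n in [0, length] and set n randomly chosen bits to 1."""
    n = rng.randint(0, length)
    kene = [0] * length
    for index in rng.sample(range(length), n):
        kene[index] = 1
    return kene


def distance_to_reference(kene):
    """Step 3: the distance to the reference point is the number of 1 bits."""
    return sum(kene)


def place_kene(kene, reference, rng):
    """Step 4: choose one of the four locations around the reference point.

    Assumption: the first half of the bits is the x dimension, the rest is y.
    """
    half = len(kene) // 2
    dx, dy = sum(kene[:half]), sum(kene[half:])
    sign_x, sign_y = rng.choice([-1, 1]), rng.choice([-1, 1])
    return reference[0] + sign_x * dx, reference[1] + sign_y * dy
```

With this split, the grid offsets of a kene always add up to its distance from the reference point.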
Knowledge is highly idiosyncratic: it does not diffuse automatically and freely 
among agents, and it has to be absorbed by agents through their differential 
abilities accumulated over time [30]. Therefore, every kene has its own location and can 
be learned by other individuals. 
3.3.2.2 Kene Elaboration 
Production system based agents have the potential to learn about their environment 
and about other agents through adding to the knowledge held in their working memories. 
The agents' rules themselves, however, always remain unchanged. For some problems, it 
is desirable to create agents that are capable of more fundamental learning: where the 
internal structure and processing of the agents adapt to changing circumstances. There are 
two techniques commonly used for this: neural networks and evolutionary algorithms 
such as the genetic algorithm (GA). [31] In my model an evolution mechanism similar to 
GA is used to create new kenes. 
 
The elaboration is to generate a new kene based on an already existing kene. The 
steps are as follows. 
1. Select a kene retained in the domain randomly and copy it to a new kene.  
2. Randomly select the beginning position and end position in the new kene to 
mutate. 
3. Randomly change the value of selected bits in the kene between the beginning 
position and the end position, as is illustrated in Figure 3.3. 
4. Calculate the distance from new kene to original kene. 
5. Put the new kene at one of the four possible locations. 
The location of the new kene is calculated as follows. 

$$b_x = a_x + D_x \quad \text{or} \quad b_x = a_x - D_x \tag{3.4}$$

$$b_y = a_y + D_y \quad \text{or} \quad b_y = a_y - D_y \tag{3.5}$$

where $[a_x, a_y]$ is the coordinate of kene a and $[b_x, b_y]$ is the coordinate of kene 
b. $D_x$ is the number of x-dimensional bits that differ between kene a and kene b. 
Similarly, $D_y$ is the number of y-dimensional bits that differ between kene a and 
kene b. The new kene is therefore placed at one of these four possible positions, 
as depicted in Figure 3.4.  
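The elaboration steps and equations 3.4 and 3.5 can be sketched in Python as follows. This is an illustrative reading, not the thesis code: "randomly change the value of selected bits" is interpreted as reassigning each bit in the segment to a random value, and the first half of the bit vector is assumed to be the x dimension. Both choices and all names are hypothetical.

```python
import random


def elaborate(parent, rng):
    """Copy the parent kene and randomly reassign the bits in [begin, end]."""
    child = list(parent)
    begin = rng.randrange(len(child))
    end = rng.randrange(begin, len(child))
    for index in range(begin, end + 1):
        child[index] = rng.randint(0, 1)
    return child


def candidate_locations(parent, child, parent_xy):
    """Equations 3.4 and 3.5: the four positions a_x +/- D_x, a_y +/- D_y.

    Assumption: the first half of the bits is the x dimension, the rest is y.
    """
    half = len(parent) // 2
    dx = sum(a != b for a, b in zip(parent[:half], child[:half]))
    dy = sum(a != b for a, b in zip(parent[half:], child[half:]))
    ax, ay = parent_xy
    return [(ax + dx, ay + dy), (ax + dx, ay - dy),
            (ax - dx, ay + dy), (ax - dx, ay - dy)]
```

One of the four returned positions would then be chosen at random, as in step 5.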
 
 
Figure 3.3 Demonstration of elaboration (bit vector with the Begin and End positions marked) 
 
 
 
 
3.3.2.3 Kene Combination 
Combination generates a new kene from more than two existing kenes. 
The steps are as follows. 
1. Randomly select a number of existing kenes. In this model, the number is 
selected arbitrarily as 3. 
2. Create a new kene and randomly select two positions in the bit vector of this 
kene. 
3. Copy the bits before the first position from the first kene. Copy the bits 
between the first position and the second position from the second kene. Copy 
the bits after the second position from the third kene. The process is 
demonstrated in Figure 3.5. 
4. Calculate the position of the new kene based on equation 3.6, which is also 
repeated for the y dimension.  

$$x = w_1 x_1 + w_2 x_2 + w_3 x_3 \tag{3.6}$$

Here $x_1$ is the x-position of the first kene and $w_1$ is the weight of the first kene, 
which is the ratio of the number of bits it contributes to the new kene to the length of 
the kene. The weights satisfy $w_1 + w_2 + w_3 = 1$. 
5. Put the new kene at the location calculated using the method above. 

Figure 3.4 Location of new kenes (kene a, kene b and the offsets D_x and D_y) 
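The combination steps can be sketched in Python as below. The sketch is illustrative: the function name and the way the two cut positions are drawn (two distinct positions between 0 and the kene length) are assumptions, since the text only says the positions are selected randomly.

```python
import random


def combine(k1, k2, k3, positions, rng):
    """Build a kene from three parents and place it by equation 3.6.

    positions holds the (x, y) grid locations of the three parent kenes.
    """
    length = len(k1)
    # Two distinct cut points; bits before p1 come from k1, bits in
    # [p1, p2) from k2, and the remaining bits from k3.
    p1, p2 = sorted(rng.sample(range(length + 1), 2))
    child = k1[:p1] + k2[p1:p2] + k3[p2:]
    # Each weight is the share of bits contributed by that parent.
    weights = (p1 / length, (p2 - p1) / length, (length - p2) / length)
    x = sum(w * px for w, (px, _) in zip(weights, positions))
    y = sum(w * py for w, (_, py) in zip(weights, positions))
    return child, weights, (x, y)
```

Because the weights sum to one, the new kene always lies within the region spanned by its three parents.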
 
3.3.2.4 Principle of Kene Selection 
Kenes with higher fitness are cited more frequently than those with lower fitness. 
To implement this, we define a probability distribution over the kenes that may be 
chosen for combination or elaboration, based on the fitness of each kene. Assume that the 
fitness of kene i is $f_i$ and the total number of kenes is N. Then the probability that 
kene i is chosen is calculated using the following equation. 

$$p_i = \frac{f_i}{\sum_{j=1}^{N} f_j} \tag{3.7}$$
 
Figure 3.5 Demonstration for combination 
 
At each time interval, a random number between 0 and 1 is generated. Each kene 
covers a sub-range of [0, 1) whose width equals its selection probability, and the kene 
whose range contains the random number is selected for combination or elaboration. In 
Figure 3.6, there are four kenes whose selection probabilities are 0.1, 0.2, 0.3 and 0.4 
respectively. The kene in blue covers the range [0, 0.1); the kene in dark red covers the 
range [0.1, 0.3); the kene in green covers the range [0.3, 0.6); and the kene in violet 
covers the range [0.6, 1). If the generated number is 0.4, then the green kene is selected. 
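This fitness-proportional selection can be sketched in Python as follows. The function name and the choice to pass the random draw u as an argument are illustrative assumptions that make the example reproducible.

```python
def select_kene(fitnesses, u):
    """Return the index of the kene whose cumulative range contains u.

    u is a random draw in [0, 1); kene i covers a range of width
    f_i / sum(f), as in equation 3.7.
    """
    target = u * sum(fitnesses)
    cumulative = 0.0
    for index, fitness in enumerate(fitnesses):
        cumulative += fitness
        if target < cumulative:
            return index
    return len(fitnesses) - 1  # guard against rounding when u is close to 1
```

With the fitness values 0.1, 0.2, 0.3 and 0.4 from Figure 3.6, a draw of 0.4 selects the third kene (index 2), which is the green kene in the example.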
 
Figure 3.6 Demonstration for kene selection 
 
Additionally, at every time interval, a similar mechanism is used to determine 
whether or not a specific individual will contribute to the domain knowledge. The 
probability that an individual makes a contribution is based on his or her reputation, 
which is assigned randomly in the initialization process. The reputation changes 
dynamically during the simulation: it increases if a kene published by the individual is 
accepted and decreases if the kene is declined. The reputation rule is discussed in detail 
in the following section. 
3.3.2.5 Credits on Contributions 
If the kene created by an agent is accepted by the evaluator, the reputation of this 
agent will increase.  

$$R_i(t) = R_i(t-1) + \left(1 - \delta(t)\right)\left(1 - R_i(t-1)\right) \tag{3.8}$$

where $R_i(t)$ is the reputation of agent i at time t and $R_i(t-1)$ is the reputation 
of agent i at time t - 1. 
On the other hand, if the kene is declined by the domain, the corresponding 
reputation level will decrease. 

$$R_i(t) = R_i(t-1) - \delta(t)\, R_i(t-1) \tag{3.9}$$

where $\delta$ is the proportion by which the reputation of agent i changes. The 
proportion increases monotonically with successive acceptances or refusals, following 
equation 3.10. Once the sequence of successive acceptances or refusals is interrupted, 
$\delta$ is reset to $\delta(0)$. 

$$\delta(t) = \delta(t-1) + 0.5\left(1 - \delta(t-1)\right), \qquad \delta(0) = 0.1 \tag{3.10}$$
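The step-size rule of equation 3.10 and the decrease rule of equation 3.9 can be illustrated with a short Python sketch. It covers only these two rules. The function names are hypothetical, and the assumption that the first event of a new streak uses the initial proportion 0.1 is one reading of the text, not a confirmed detail of the implementation.

```python
DELTA_0 = 0.1  # initial proportion given with equation 3.10


def next_delta(previous_delta, streak_continues):
    """Equation 3.10: the proportion grows while a streak of acceptances or
    refusals continues and is reset to DELTA_0 once the streak is interrupted."""
    if not streak_continues:
        return DELTA_0
    return previous_delta + 0.5 * (1.0 - previous_delta)


def reputation_after_decline(reputation, delta):
    """Equation 3.9: a declined kene removes a share delta of the reputation."""
    return reputation - delta * reputation
```

Under this reading, three refusals in a row use proportions of 0.1, 0.55 and 0.775, so repeated refusals reduce reputation increasingly quickly.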
3.3.3 Sub-Domain 
Each individual has its own sub-domain, which indicates the field that the individual is 
interested in. In this model the spaces of different sub-domains have different reference 
points. All the kenes created by individuals in the same sub-domain are positioned 
relative to that sub-domain's single reference point. The sub-domain is a kind of trait [32] 
that takes one of a set of distinct values. For example, OBO (Open Biomedical 
Ontologies) has several sub-domains, such as anatomy, biochemistry and taxonomy. 
Anatomy in turn has its own sub-domains, such as amphibian gross anatomy, fungal 
gross anatomy, cell type and cell component. 
Figure 3.7 depicts the situation where there are four sub-domains. Every individual 
belongs to one of these four sub-domains, so the locations of its contributions are 
based on the corresponding reference point. 
 
Figure 3.7 The pattern of kenes due to sub-domains 
3.3.4 Evaluator 
Each evaluator selects a kene randomly and judges whether or not it can be 
retained based on its fitness. One kind of fitness, called individual fitness, is a random 
number between 0 and 1 assigned to the kene. The other kind, relational fitness, 
is calculated from the number of input and output links of the kene. If the total fitness of 
a kene is less than a particular threshold, the kene is discarded. Otherwise, the 
kene is retained in the domain so that it can be reused by other individuals. 
 
$$\text{Relational Fitness} = w_{out} N_{out} + w_{in} N_{in}, \qquad w_{out} + w_{in} = 1 \tag{3.11}$$

$$\text{Ratio of Relational Fitness} = \frac{\text{Relational Fitness}}{\text{Max Relational Fitness}} \tag{3.12}$$

where $w_{out}$ and $w_{in}$ are the weights associated with influence and dependence 
respectively, and $N_{out}$ and $N_{in}$ are the numbers of outward and inward links 
respectively. The max relational fitness is the maximum relational fitness over all kenes. 
The final fitness equation is as follows. 

$$F(k) = a\, I(k) + (1 - a)\, R(k) \tag{3.13}$$

where a is the weight of individual fitness, F(k) is the final fitness of kene k, I(k) is 
the individual fitness of kene k, and R(k) is the ratio of relational fitness of kene k. 
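Equations 3.11 to 3.13 can be illustrated with a small Python sketch. The default weights of 0.5 and the handling of a zero maximum relational fitness are assumptions made for the example. The thesis does not state the weight values at this point.

```python
def relational_fitness(n_out, n_in, w_out=0.5):
    """Equation 3.11 with w_in = 1 - w_out (0.5 is a hypothetical weight)."""
    w_in = 1.0 - w_out
    return w_out * n_out + w_in * n_in


def final_fitness(individual_fitness, relational, max_relational, a=0.5):
    """Equations 3.12 and 3.13: blend individual fitness with the ratio of
    relational fitness. A zero maximum is treated as ratio 0 (assumption)."""
    ratio = relational / max_relational if max_relational > 0 else 0.0
    return a * individual_fitness + (1.0 - a) * ratio
```

For example, a kene with 4 outward and 2 inward links has relational fitness 3.0. If the largest relational fitness in the domain is 6.0 and the individual fitness is 0.6, the final fitness is 0.5 * 0.6 + 0.5 * 0.5 = 0.55.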
3.3.5 Turnover 
If the motivation of an agent is less than the exit threshold, it leaves the community 
or transfers to another community in which it has more potential to increase motivation 
and make contributions. 
The exit threshold differs among the types of communities (exploration-oriented, 
utility-oriented and service-oriented). The service-oriented community has the 
maximum threshold, the utility-oriented community has a moderate value and the 
exploration-oriented community has the minimum threshold. The higher the threshold 
for turnover, the more easily an agent will leave the community. 
3.3.6 Scheduling 
The scheduling deals with the order of processes and in turn the order in which the 
state variables are updated. In my model the characteristics of scheduling are as follows. 
• Discrete time steps are used. 
• A global method calculates the rank of each individual, which determines the 
probability of making contributions. The rank of an individual is based on his or 
her reputation, which reflects the individual's contribution to and familiarity with 
the domain.  
• At each time interval the execution sequence of individuals is random, and 
every individual updates its state asynchronously according to this order.  
• Every individual acts on the current context, i.e. asynchronous updating 
[33].  
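The random-order, asynchronous updating described above can be sketched in Python as follows. This is not Repast scheduling code. The function name and the shared state argument are hypothetical and only illustrate that each agent sees the changes already made by agents that acted earlier in the same tick.

```python
import random


def run_tick(agents, step, state, rng):
    """Run one discrete time step with a freshly shuffled execution order.

    step(agent, state) is called for each agent in turn and may modify
    state, so later agents act on the already updated context.
    """
    order = list(agents)
    rng.shuffle(order)
    for agent in order:
        step(agent, state)
    return order
```

A synchronous scheme would instead compute all updates from a frozen copy of the state and apply them together at the end of the tick.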
3.4 Framework for Communicating Individual Agent 
The activities of scientific communities are simulated with Repast, an integrated 
simulation framework for creating agent-based simulations [10] 
in the Java language. 
All the kenes and agents, including individuals and evaluators, exist in a 
two-dimensional grid. At each time interval agents move randomly in the grid.  
 
 
Three staged models of scientific society are developed with increasing levels of 
sophistication to study innovation and sustainability.  
 The first model considers the creation and development of knowledge in the 
scientific society, including the contribution of new knowledge, growth of the domain, 
citation behavior, and the clustering of knowledge into specialties. For the contribution of 
new knowledge three kinds of generators are available to use including creation, 
combination and elaboration. In addition, the generator of combination and elaboration 
will lead to citation behavior that includes not only the connection among kenes but also 
the relationship involving the associated owners of kenes.  
The second model views a scientific community as an autonomous system through 
the introduction of new individuals and the interconnection between normal individuals 
and evaluators. There are three life stages for an individual including enculturation, 
innovation and turnover process, where the innovation process is implemented in the first 
model. So the activities of agents in the enculturation and turnover process will be the 
key points of research at this stage. Different parameters will be investigated such as 
susceptibility for influence of other agents, decay rate and enculturation rate. 
The third model extends the original model to simulate different types of 
societies, such as exploration-oriented, utility-oriented and service-oriented, based on a 
variety of cultural parameters. These types differ from one another in the following 
aspects: recruitment selectivity, growth rate, turnover rate and decision-making style. 
 
 
3.4.1 Relationships 
There are three types of relationships including kene and kene, individual and 
individual, kene and individual. 
 Figure 3.8 shows these three kinds of relationships. 
i. Relation among kenes 
The relationship between kenes indicates that a kene is elaborated or combined 
with others. 
ii. Relation among individuals 
The owner of a new kene is related to the owner of each kene used to derive 
the new kene. This relation captures the influence between individuals. 
iii. Relation between kenes and individuals 
An individual is related to each kene that he or she created, which indicates that the 
individual is the owner of the kene. 
 
 
 
3.4.2 Fitness 
There are two simple rules for the fitness of agents. 
If the kene created by an agent is accepted by the evaluator, the fitness of this agent 
will increase. On the other hand, if the kene is declined by the domain, the corresponding 
fitness level will decrease. In the model, fitness is equivalent to reputation. 
3.4.3 Stochasticity 
In this model there are various stochastic aspects as described below. 
• Each individual randomly selects one of the three kinds of generators. 
• The kenes to combine and elaborate are selected randomly. 
• Individuals and evaluators move randomly in the context. 
• The location of a new kene is chosen randomly among the four possible locations. 
3.4.4 Observation 
This section describes how data are collected from the agent-based model for 
testing, understanding, and analysis. Three metrics are observed over time: the number 
of kenes, the input and output links of kenes, and the number of kenes created by each 
individual. 
3.5 Details 
In this section we discuss model elements (initialization, input, sub-models) that 
present the details that were not discussed in the overview. 
Figure 3.8 Three kinds of relationships 
 
3.5.1 Initialization 
This part deals with how the environment and the individuals are created at the 
start of a simulation run. 
• The fitness of each individual is set randomly. 
• The sub-domain of each individual is randomly selected. 
• The fitness of each kene is also set randomly. 
• The length of a kene is 10. 
• The width and height of the grid are both 200. 
• The three generators have the same probability. 
• The initial locations of individuals and evaluators are random. 
• The reference point is the center of the grid.  
3.5.2 Types of Scientific Communities 
To test the creative output of an open scientific community under 
different evaluation strategies, we considered three types of open source society: 
exploration-oriented, utility-oriented and service-oriented. The objective of an 
exploration-oriented community is to share innovations and knowledge. One example is 
the OBO Foundry, whose goal is to create a suite of orthogonal interoperable reference 
ontologies in the biomedical domain, thereby enabling scientists and their instruments to 
communicate with minimum ambiguity. In this way the data generated in the course of 
biomedical research will form a single, consistent, cumulatively expanding whole. The 
objective of a utility-oriented community is to satisfy an individual need. An example is 
the nanoHUB organization, whose vision is to pioneer the development of 
nanotechnology. The members of nanoHUB are developing resources to help others learn 
about nanotechnology while making use of it in their own research and education. The 
purpose of a service-oriented community is to provide stable services. An example is the 
Ontology Lookup Service (OLS), which provides a web service interface to query multiple 
ontologies from a single location with a unified output format. 
These three community cultures differ from each other in terms of recruitment 
selectivity, growth rate, turnover and decision making style.  
The table below defines the three kinds of communities. 
Table 3.2 Parameters of three kinds of communities 
Community Type          Characteristic             Value 
Exploration-oriented    Recruitment selectivity    High 
                        Growth rate                Low 
                        Turnover                   Low 
                        Decision-making style      Centralized 
Utility-oriented        Recruitment selectivity    Moderate 
                        Growth rate                High 
                        Turnover                   Moderate 
                        Decision-making style      Emergent selection 
Service-oriented        Recruitment selectivity    Low 
                        Growth rate                Moderate 
                        Turnover                   High 
                        Decision-making style      Council 
 
Recruitment selectivity indicates the threshold that determines whether an agent 
will begin to contribute or leave the community after the process of enculturation. If the 
enculturation level is greater than the threshold, the agent begins to contribute. Otherwise 
the agent will leave the community. 
Growth rate indicates the number of new individuals entering the community at 
each time interval, i.e. the number is drawn from U(0, Growth Rate). 
 
Turnover indicates the threshold that determines whether or not an agent will leave 
the community. If the motivation level is less than the threshold, the agent will leave the 
community. Otherwise, the agent will stay in the community and continue to make 
contributions to the domain. 
Decision making style involves the process that determines whether to accept or 
reject a contribution.  
3.5.2.1 Exploration-oriented 
Table 3.3 Parameters of exploration-oriented community
Community Type         Recruitment Selectivity   Growth Rate   Turnover   Decision-making Style
Exploration-oriented   High                      Low           Low        Centralized
 
Centralized means that each kene created by an agent is judged by a single evaluator. This
process is shown in Figure 3.9.

[Figure 3.9 flowchart: Begin; a new kene is created; Fitness >= T?; Accept this kene (Y) or Decline this kene (N); End]
Figure 3.9 Flowchart of decision process of exploration-oriented community

Once a new kene is created by an agent, the evaluation process determines whether to accept
or reject it. The evaluation is based on the kene's fitness value, a random floating-point
number drawn from the uniform distribution on [0, 1]. In general, the threshold T for a kene
to be selected is 0.5. If the fitness of the kene is at least 0.5, the kene is retained in
the domain and can be referenced by other agents in the future. Otherwise, the newly created
kene is removed.
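The centralized decision reduces to a single comparison, which the following Java sketch makes explicit. The class name is hypothetical, and the test Fitness >= T follows Figure 3.9; this is an illustration under those assumptions, not the thesis code.

```java
/** Hypothetical single evaluator for the centralized decision style. */
final class CentralizedEvaluator {
    private final double threshold;

    CentralizedEvaluator(double threshold) {
        this.threshold = threshold;
    }

    /** Accepts the kene when its fitness reaches the threshold T (Fitness >= T). */
    boolean accepts(double fitness) {
        return fitness >= threshold;
    }

    public static void main(String[] args) {
        CentralizedEvaluator evaluator = new CentralizedEvaluator(0.5);
        System.out.println(evaluator.accepts(0.72)); // prints true
        System.out.println(evaluator.accepts(0.31)); // prints false
    }
}
```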
3.5.2.2 Utility-oriented 
Table 3.4 Parameters of utility-oriented community
Community Type     Recruitment Selectivity   Growth Rate   Turnover   Decision-making Style
Utility-oriented   Moderate                  High          Moderate   Emergent selection
 
Emergent selection is implemented as follows. Because a new kene cannot have been used by
other individuals at the moment it is created, its evaluation is postponed for a fixed
number of time periods, such as 50 ticks. If the number of references to the kene then
exceeds a threshold, the kene is accepted. Otherwise, the kene is removed from the domain.
This process is shown in Figure 3.10.
T1 is the threshold that determines when a created kene is evaluated, and T2 is the
threshold that gives the minimum number of out links for a kene to be accepted.

In a utility-oriented community, only one evaluator exists. At each time interval, the
evaluator iterates over all the kenes in the domain to find those that have existed in the
domain for exactly the predefined time T1. The evaluator then counts the out links of every
kene that is T1 time periods old. If the count is greater than the predefined value T2, the
kene is retained in the domain so that other agents can use it to create new kenes in the
future. Otherwise, the kene is removed from the domain.
 
 
[Figure 3.10 flowchart: Begin; iterate all the kenes; Age = T1?; compute the number of out links (n); n > T2?; Accept this kene (Y) or Decline this kene (N); End]
Figure 3.10 Flowchart of decision process of utility-oriented community

Note that an out link of a kene is built whenever another kene cites it. The greater the
number of out links of a kene, the more impact the kene has; in other words, the more
important the kene is in the domain.
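A minimal Java sketch of the emergent selection rule is given here. The class, the KeneRecord type and its fields are hypothetical stand-ins introduced for illustration; the only logic taken from the text is that a kene is evaluated when its age equals T1 and is kept only if its number of out links exceeds T2.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of emergent selection; not the thesis code. */
final class EmergentSelection {
    /** Minimal stand-in for a kene: an id, its creation tick and its current number of out links. */
    static final class KeneRecord {
        final String id;
        final int createdAt;
        final int outLinks;

        KeneRecord(String id, int createdAt, int outLinks) {
            this.id = id;
            this.createdAt = createdAt;
            this.outLinks = outLinks;
        }
    }

    /** Ids of kenes that are exactly t1 ticks old at tick now and do not have more than t2 out links. */
    static List<String> declinedAt(List<KeneRecord> domain, int now, int t1, int t2) {
        List<String> declined = new ArrayList<>();
        for (KeneRecord kene : domain) {
            boolean dueForEvaluation = (now - kene.createdAt) == t1;
            if (dueForEvaluation && !(kene.outLinks > t2)) {
                declined.add(kene.id);
            }
        }
        return declined;
    }

    public static void main(String[] args) {
        List<KeneRecord> domain = new ArrayList<>();
        domain.add(new KeneRecord("a", 0, 4));   // due at tick 50, enough out links
        domain.add(new KeneRecord("b", 0, 1));   // due at tick 50, too few out links
        domain.add(new KeneRecord("c", 10, 0));  // not yet due at tick 50
        System.out.println(declinedAt(domain, 50, 50, 2)); // prints [b]
    }
}
```

Kenes that are not yet T1 ticks old are left untouched, which matches the postponed evaluation described in the text.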
3.5.2.3 Service-oriented 
Table 3.5 Parameters of service-oriented community
Community Type     Recruitment Selectivity   Growth Rate   Turnover   Decision-making Style
Service-oriented   Low                       Moderate      High       Council
 
Under the council decision style, whether or not to accept a new kene is determined by
several independent evaluators. If a majority of the evaluators accept the new kene, it is
retained in the domain. Otherwise, the kene is declined. The flow chart in Figure 3.11
describes the process.
In future versions, the evaluators can be members of the community, and the probability of
acceptance can be a function of the path length between the two kinds of agents.
In a service-oriented community, multiple evaluators are in charge of deciding whether to
accept or reject a new kene. When an agent introduces a new kene, each evaluator judges it
independently, following the same process as the evaluator in the exploration-oriented
community. After all the evaluators have finished, the new kene is accepted into the domain
if more than half of the evaluators agree to accept it. Otherwise, the kene is removed.
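The majority vote can be sketched in Java as follows. The class name is hypothetical, and each evaluator is represented only by an acceptance threshold passed in by the caller; how the thresholds are chosen is left open here, so this is a sketch of the voting rule alone.

```java
/** Hypothetical sketch of the council decision style: independent evaluators, majority vote. */
final class CouncilDecision {
    private CouncilDecision() {}

    /**
     * Each evaluator accepts the kene when fitness >= its own threshold.
     * The kene is accepted only if more than half of the evaluators accept it.
     */
    static boolean accepts(double fitness, double[] evaluatorThresholds) {
        int votesFor = 0;
        for (double threshold : evaluatorThresholds) {
            if (fitness >= threshold) {
                votesFor++;
            }
        }
        return 2 * votesFor > evaluatorThresholds.length;
    }

    public static void main(String[] args) {
        System.out.println(accepts(0.6, new double[] {0.5, 0.7, 0.4})); // prints true (2 of 3)
        System.out.println(accepts(0.6, new double[] {0.5, 0.7}));      // prints false (1 of 2)
    }
}
```

A tie is treated as a rejection, because the text requires more than half of the evaluators to agree.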
 
[Figure 3.11 flowchart: Begin; a new kene is created; iterate all the evaluators; Fitness >= N(0,1)?; this evaluator accepts the kene (Y) or declines the kene (N); iteration end?; more than half of the evaluators agree?; Accept this kene (Y) or Decline this kene (N); End]
Figure 3.11 Flow chart of decision type of service-oriented community
 
CHAPTER 4                                                                   
Implementation of Simulation Model 
4.1 Introduction of Repast 
According to the Repast website, Repast is an acronym for the Recursive Porous Agent
Simulation Toolkit, a free and open source agent-based modeling toolkit that simplifies
model creation and use. Repast Simphony provides a rich variety of features, including the
following.
• Models can be developed in pure Java, Groovy, flowcharts, or any mixture of them.
• A pure Java model execution environment includes built-in results logging
and graphing tools that make it easy to change the appearance of agents.
• The context is based on a flexible hierarchy that supports the modeling
and visualization of 2D and 3D environments.
• The discrete event scheduler is fully concurrent and multithreaded.
• All models developed with Repast are object-oriented.
In general, the standard model using Repast is based on contexts and projections 
[34].  
4.1.1 Contexts 
The core data structure in Repast is called a Context, and all agents must be in a
context. From a modeling perspective, the Context represents an abstract population: the
objects in a Context form the population of a model. Although the Context does not itself
provide mechanisms to implement relationships between agents, it is the infrastructure on
which the interactions of the population are defined.
In addition to maintaining the collection of proto-agents, the Context also holds its
own internal state, which can consist of multiple types of data. This provides a way for
the agents to interact with the Context and exchange information. In order to maintain
this state, a Context also has behaviors associated with it.
Contexts are hierarchical: a parent context can have sub contexts. Different sub contexts
hold different internal states, and the same agent can behave differently in different sub
contexts. If an agent is in a sub context, it is also in the parent context. The reverse
does not hold, i.e. an agent in the parent context is not necessarily in any sub context.
4.1.2 Projections 
Projections are data structures used to define relationships between agents within a
context. In practice, Projections are added to a Context to allow the agents to interact
with each other. Projections have a many-to-one relationship with Contexts, which means
each Context can have an arbitrary number of Projections associated with it. In other
words, within each Context, the agents can have more than one type of relationship with
one another.
Frequently used projections include grid, continuous space, network and geography.
Figure 4.1 shows how context, sub context and projection interact.

[Figure 4.1 diagram: a Context containing Sub Context 1, Sub Context 2 and Sub Context 3]
Figure 4.1 Contexts and projections

In Figure 4.1, the context has three sub contexts. Sub context 1 has only a network
projection, which indicates the relationships between agents. Sub context 2 uses a grid
projection, in which every agent occupies one cell identified by a pair of coordinates.
Sub context 3 combines grid and network, so an agent in sub context 3 can have two kinds
of projections associated with it. Finally, note that any agent existing in a sub context
also belongs to the parent context.
4.2 Implementation of Agents 
According to Figure 1.2, there are three major objects: individual, evaluator and kene.
Kenes are created by individuals and evaluated by evaluators, so kenes own state variables
but have no behaviors. In addition, individual and evaluator share a common super class
called basic agent, because they have some behaviors in common, such as moving in the grid.
Figure 4.2 presents the class view of these three major components in the model.
 
 
[Figure 4.2 class diagram. Boxes: Generator, Elaboration, Combination, Creation, Basic Agent, Individual, Evaluator, Domain, Kene, Vector X, Vector Y. Association labels: Configure, Contribute, Evaluate, Include.]
Figure 4.2 Class view of systems model of creativity

The following is the corresponding implementation of these four kinds of objects. 
4.2.1 Basic Agent 
The basic agent does not have any state variables, but it has two behavior methods:
move and isValidPosition. The move method defines how an agent moves in the context at
each time interval. The isValidPosition method checks whether or not a coordinate is
within the range of the current context; it is used when agents move and when new kenes
are located in the context.
Table 4.1 Definition of basic agent

Type          Behaviors
Basic Agent   Move
              isValidPosition
 
 
4.2.2 Individuals 
The three probabilities in Table 4.2 correspond to the frequency of using the creation,
combination and elaboration generators. Reputation represents the level of contribution of
an agent. When a kene created by an agent is accepted by the community, the agent's
reputation goes up; if the kene is rejected, the reputation goes down. Motivation is a
generalized reputation that gives the probability of making contributions. In other words,
the higher an agent's reputation is, the more likely the agent is to provide kenes. The
time at which the agent is created is used to evaluate the result of enculturation: after
a constant amount of time, a new agent is assessed based on his/her enculturation level.
Only those agents who reach a minimum threshold can stay in the community. Otherwise, they
have to leave. The tenure variable indicates whether an agent has passed the enculturation
process successfully. Major represents the sub area an agent is interested in.
The behaviors listed in Table 4.2 form the whole life cycle of an agent, including entry
and enculturation, innovation and turnover.
Table 4.2 Definition of individual

Type         State Variables                             Behaviors
Individual   Probability to use creation generator       Create
             Probability to use combination generator    Combination
             Probability to use elaboration generator    Elaboration
             Reputation                                   Enculturation
             Motivation                                   Entry and turnover
             Time to be created
             Tenure
             Major
4.2.3 Evaluators 
In this model, three kinds of open scientific societies are studied: exploration-oriented,
utility-oriented and service-oriented. Each society has its own decision style, as
discussed in Section 3.5.2. The weight for out links is used to calculate the fitness of a
new kene: the more links a kene has, the more effect it has. The threshold for new kenes
to be accepted is the minimum fitness value. Finally, the evaluator has only one behavior,
evaluation.
Table 4.3 Definition of evaluator

Type        State Variables                          Behaviors
Evaluator   Decision style                           Evaluate
            Weight for out links
            Threshold for new kenes to be accepted

4.2.4 Kenes 
Although kenes are not active agents, they make up the domain, which is one of the three
main components of this model. In addition, the pattern of kenes determines the creativity
of a society to some extent.
Table 4.4 Definition of kene

Type   State Variables   Behaviors
Kene   Length of kene    None
       Vector x
       Vector y
       Fitness

The length of a kene determines its complexity: generally, the longer a kene is, the more
complicated it is. The length of a kene also equals the size of vector x and of vector y,
which together are the basis for calculating the location of the kene. Fitness refers to
the degree to which the kene is suitable for the society.
4.3 Implementation of Projection 
In this model, two kinds of projections are used to represent the relationships between
agents: continuous space and network.
4.3.1 Continuous Space
A continuous space is very similar to a grid. The main difference is that the location of
an agent is represented by floating point coordinates in a continuous space rather than by
integer coordinates as in a grid.
In this model, the code that creates the continuous space is as follows.
ContinuousSpace<Object> space =
    ContinuousSpaceFactoryFinder.createContinuousSpaceFactory(null).createContinuousSpace(
        "Simple Space", context, new RandomCartesianAdder<Object>(),
        new repast.simphony.space.continuous.StickyBorders(), gridWidth, gridHeight);
The main arguments that the user needs to set are the name, the context and the size of
the continuous space. Users can also define their own functions for setting the initial
location of agents.
At each time interval, both individuals and evaluators move in the continuous space. Only
one method of the continuous space needs to be invoked to implement this motion.
moveTo(T object, double... newLocation)
4.3.2 Network 
As discussed in Chapter 2.4.1, there are three kinds of relationships between agents:
the relationship between kenes, the relationship between individuals, and the relationship
between kenes and individuals. These relationships are implemented as networks in Repast.
A network is defined as follows.
NetworkBuilder<Object> builder = new NetworkBuilder<Object>("RelationOfKenes", context, true);
Network<Object> network = builder.buildNetwork();
Since there are three kinds of relationships, three independent networks are needed to
represent them. Besides RelationOfKenes, the other two are RelationOfIndividuals and
RelationBetweenKenesAndIndividuals. In each snippet below, network stands for the
corresponding one of these three networks.
The developer only needs to consider three arguments: name, context and directed. Because
the relationships in this model are directed, the third argument is set to true.
When an individual creates a kene, a relationship is built between them, which indicates
that the individual is the owner of the new kene. The code is as follows.
network.addEdge(individual, kene);
If a new kene is combined from three existing kenes, the new kene builds a relationship
with each of these three kenes. The code is as follows, called once per existing kene.
network.addEdge(oldKene, newKene);
At the same time, a relationship is also built between the owners of the related kenes.
It indicates that an individual cites others' products as references.
network.addEdge(citedIndividual, currentIndividual);
In addition, it is easy to get all the links associated with an agent.
network.getEdges(agent);
Sometimes users may want only the in links or only the out links associated with a
specific agent.
network.getInEdges(agent);
network.getOutEdges(agent);
4.4 Scheduler 
There are basically two ways to work with the Repast Simphony scheduler, which determines
the behavior of agents at each time interval.
4.4.1 Directly Schedule an Action
In this scenario, the user needs to get a schedule and tell it when to run and what to
run. An example of adding an item to the schedule is as follows:
ScheduleParameters params = ScheduleParameters.createRepeating(1, 2);
schedule.schedule(params, agent, "move");
First, a schedule parameter object is created, which defines the start time and the
interval. The example above creates a parameter object for an action that first runs at
time 1 and then runs once every two time units.
Second, the parameter object is passed to the system scheduler along with the object that
owns the method and the name of the method to invoke when the parameters are satisfied.
Here, the method named move in the object agent is called every two time intervals,
starting at time 1.
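To make the meaning of the start and interval arguments concrete, the following plain Java sketch lists the times at which a repeating action fires. It does not use the Repast API; the class and method names are hypothetical and only reproduce the arithmetic start + k * interval.

```java
import java.util.Arrays;

/** Hypothetical illustration of repeating-schedule arithmetic; independent of Repast. */
final class RepeatingScheduleSketch {
    private RepeatingScheduleSketch() {}

    /** Returns the first count firing times of an action that starts at start and repeats every interval. */
    static double[] firstFiringTimes(double start, double interval, int count) {
        double[] times = new double[count];
        for (int k = 0; k < count; k++) {
            times[k] = start + k * interval;
        }
        return times;
    }

    public static void main(String[] args) {
        // start = 1, interval = 2, matching the arguments of createRepeating(1, 2)
        System.out.println(Arrays.toString(firstFiringTimes(1, 2, 4))); // prints [1.0, 3.0, 5.0, 7.0]
    }
}
```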
4.4.2 Schedule with Annotations 
Java 5 introduced several new features, one of which is annotations. In Repast,
annotations are typically used where an action is defined: the annotation tells the
scheduler when and how often to invoke a method. This mechanism is used to define the
schedule in the model of this thesis.
The model has two kinds of agents that act at each time interval, so each of them has a
method named step with an annotation associated with it.
@ScheduledMethod(start = 1, interval = 1)
public void step() {...}
The code above tells the system scheduler that the step method is to be called at every
time interval, starting at time 1.
Note that a method called by the scheduler must be public. Otherwise, a Java runtime
exception is raised.
4.4.3 Global Method 
In this model, it is necessary to update the individuals' ranks based on their reputation,
so that individuals with a higher rank get more opportunities to contribute. A global
method therefore has to be created that runs before all the agents and only once at each
time interval. This method cannot belong to any agent; otherwise it might run as many
times as there are agents. Instead, the global method is defined in the context builder,
for two reasons. First, the context builder has only one handler associated with it.
Second, the context builder is in charge of initializing the continuous space, the
networks and the agents, which places it above all agents.
The code that defines the global method is similar to the direct scheduling approach
discussed in Section 4.4.1. The only difference is the priority argument, which is a
parameter of the class ScheduleParameters.
ScheduleParameters params = ScheduleParameters.createRepeating(1, 1, ScheduleParameters.FIRST_PRIORITY);
schedule.schedule(params, this, "update");
The FIRST_PRIORITY argument guarantees that the global method runs before all the other
scheduled methods.
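What such a global update might do can be sketched in plain Java. The thesis does not list the body of the update method, so the Member type, its fields and the ranking rule (rank 1 for the highest reputation) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a once-per-tick rank update based on reputation. */
final class RankUpdateSketch {
    /** Minimal stand-in for an individual: a name, a reputation and a rank to be filled in. */
    static final class Member {
        final String name;
        final double reputation;
        int rank;

        Member(String name, double reputation) {
            this.name = name;
            this.reputation = reputation;
        }
    }

    /** Assigns rank 1 to the member with the highest reputation, rank 2 to the next, and so on. */
    static void updateRanks(List<Member> members) {
        List<Member> sorted = new ArrayList<>(members);
        sorted.sort(Comparator.comparingDouble((Member m) -> m.reputation).reversed());
        for (int i = 0; i < sorted.size(); i++) {
            sorted.get(i).rank = i + 1;
        }
    }

    public static void main(String[] args) {
        List<Member> members = new ArrayList<>();
        members.add(new Member("a", 0.2));
        members.add(new Member("b", 0.9));
        members.add(new Member("c", 0.5));
        updateRanks(members);
        for (Member m : members) {
            System.out.println(m.name + " -> rank " + m.rank);
        }
    }
}
```

Running the update from a single place, as the text describes for the context builder, means the sort happens once per tick rather than once per agent.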
4.5 Output 
Although the Repast integrated simulation environment provides many useful tool kits that
help developers achieve their own purposes, it does not include general-purpose output
functions. We therefore have to write some code to show useful information after the
simulation ends.
This task is separated into two steps.
First, we must know when the model will stop. Since the run length of the model is defined
in the parameters panel of Repast, we can obtain it with the following code.
endTick = (Integer) RunEnvironment.getInstance().getParameters().getValue("stopTick");
Once the current time is equal to the stop tick, the model can be stopped with the
command below.
RunEnvironment.getInstance().endRun();
Second, we can invoke a method that performs the analysis tasks. In our model, a dialog
pops up to show information such as density and clustering coefficient. This dialog window
has a menu item linking to the Guess software, which can analyze the clustering pattern of
kenes.
 
CHAPTER 5                                                                   
Verification, Validation and Evaluation 
According to the Handbook of Simulation, the evaluation of a simulation model consists of
two levels: verification and validation, where validation includes conceptual validation
and operational validation. Conceptual validation means that the conceptual model is
consistent with the real world. Operational validation refers to a test protocol that
demonstrates that the model outputs meet the requirements of the real world [35].

[Figure 5.1 diagram. Boxes: Real World System, Simulation Model, Simulation Output. Labels: Simulation Modeling, Simulation Programming, Conceptual Validation, Verification, Operational Validation.]
Figure 5.1 Overview of simulation model development [36]

Simulation modeling refers to the activity of deriving the theoretical model from the
real-world system, and simulation programming refers to the activity of deriving the
computer-based representation from the model. Two steps check whether or not the
simulation program reflects the real world truly and fully: verification and validation.
5.1 Verification 
Verification is the process of determining that the implementation of a computer model,
simulation, or federation of models and simulations, and its associated data, accurately
represent the developer's conceptual description and specifications [37]. To achieve this
goal, unit testing and integration testing are used.
5.1.1 Unit Testing 
According to Wikipedia, unit testing in computer programming is a software design and
development method in which the programmer gains confidence that individual units of
source code are fit for use. A unit is the smallest testable part of an application. In
procedural programming a unit may be an individual program, function, procedure, etc.,
while in object-oriented programming the smallest unit is a method, which may belong to a
base/super class, abstract class or derived/child class.
Since our simulation uses an object-oriented programming language, Java, unit testing
assesses the correctness of methods. The simulation model has three main classes: context
builder, individual and evaluator. Each of them has a main entry method, so unit testing
focuses on these three methods, together with the results dialog.
1. Build function in the class of ContextBuilder
The build function builds the whole context of the simulation model, such as the
continuous space and the networks. We check whether the output matches our expectation as
the input arguments vary. For example, we change the size of the continuous space, change
the initial location of agents, and build relationships between different agents. Through
these tests, we gain confidence in the program itself.
2. Step function in the class of Individual
The step function in the class of Individual consists of move, enculturation and
innovation. For the move function, we change the coordinates of the next step to see
whether the function works well. For the enculturation function, we check the
enculturation level when the individual meets different neighbors. For the innovation
function, we change the probabilities of the three kinds of generators to see whether the
corresponding generator is invoked.
3. Step function in the class of Evaluator
The step function in the class of Evaluator has only one role: to decide whether or not to
accept a new kene. We give it many kenes with different fitness values and check whether
the evaluator returns the correct decision.
4. Results Dialog
The results dialog shows useful information after the simulation model ends, and it is
independent of the Repast framework, so we can use JUnit for this test. For each function
of the dialog, we use assertions to check that the return value is right. As for the
layout of the dialog, we insert a main function to inspect its appearance.
In addition, for each of the tests above, stepping through a single function in the
debugger is a good way to track its control flow.
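The generator check described for the innovation function can be sketched as follows. The selection rule shown here (comparing one uniform draw against cumulative probabilities) is an assumption, since the text does not state how the model picks a generator; the class, enum and method names are hypothetical.

```java
/** Hypothetical sketch of choosing a generator from three probabilities; not the thesis code. */
final class GeneratorChoiceSketch {
    enum Generator { CREATION, COMBINATION, ELABORATION }

    private GeneratorChoiceSketch() {}

    /**
     * Picks a generator from a uniform draw u in [0, 1).
     * Assumption: the elaboration probability is 1 - pCreation - pCombination.
     */
    static Generator choose(double u, double pCreation, double pCombination) {
        if (u < pCreation) {
            return Generator.CREATION;
        }
        if (u < pCreation + pCombination) {
            return Generator.COMBINATION;
        }
        return Generator.ELABORATION;
    }

    public static void main(String[] args) {
        // With the creation probability set to 1, every draw must select the creation generator.
        System.out.println(choose(0.99, 1.0, 0.0)); // prints CREATION
        // With the combination probability set to 1, every draw must select combination.
        System.out.println(choose(0.0, 0.0, 1.0));  // prints COMBINATION
    }
}
```

Setting one probability to 1 and the others to 0 gives a deterministic expectation, which is what makes this kind of check usable as a unit test.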
5.1.2 Integration Testing 
Integration testing (sometimes called Integration and Testing, abbreviated I&T) is the
activity [38] of software testing in which individual software modules are combined and
tested as a group. It takes place between unit testing and system testing.
Integration testing takes the modules that have been unit tested, groups them into larger
aggregates, applies the tests defined in an integration test plan to those aggregates, and
delivers as its output the integrated system ready for system testing.
In this scenario, we check the correctness of the simulation model by analyzing its output
data. The core class of this model is Individual, so we pay most attention to it. At each
time interval, the program can write information to a log file, such as the generator
used, whether or not the individual's kene is accepted, and the individual's total number
of contributions. We check two properties. First, the number of contributions of every
individual is a non-decreasing function of time. Second, whenever the number of
contributions increases, the new kene must have been accepted.
Through all the tests above, we gain confidence in the simulation program.
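The two log properties can be checked mechanically. The sketch below assumes a hypothetical in-memory form of the log for one individual, with one row per tick; the Row type and its fields are illustrative and do not reflect the actual log file format.

```java
/** Hypothetical consistency check over the per-tick log rows of one individual. */
final class ContributionLogCheck {
    /** One log row: whether the kene produced at this tick was accepted, and the running total. */
    static final class Row {
        final boolean keneAccepted;
        final int totalContributions;

        Row(boolean keneAccepted, int totalContributions) {
            this.keneAccepted = keneAccepted;
            this.totalContributions = totalContributions;
        }
    }

    private ContributionLogCheck() {}

    /** True when the total never decreases and every increase coincides with an accepted kene. */
    static boolean isConsistent(Row[] rows) {
        for (int i = 1; i < rows.length; i++) {
            int delta = rows[i].totalContributions - rows[i - 1].totalContributions;
            if (delta < 0) {
                return false; // the contribution count must be non-decreasing
            }
            if (delta > 0 && !rows[i].keneAccepted) {
                return false; // the count may only grow when the new kene was accepted
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Row[] good = { new Row(false, 0), new Row(true, 1), new Row(false, 1) };
        Row[] bad = { new Row(false, 0), new Row(false, 1) };
        System.out.println(isConsistent(good)); // prints true
        System.out.println(isConsistent(bad));  // prints false
    }
}
```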
5.2 Validation 
Validation is the process of determining the degree to which a model, simulation, or
federation of models and simulations, and their associated data are accurate
representations of the real world from the perspective of the intended uses [34].
Two regularities are expected of a scientific community. First, the slope of the curve of
the number of kenes over time becomes smaller and smaller as the community matures.
Second, most kenes are created by a small number of individuals.
 
Figure 5.2 The number of kene over the time 
 
 
Figure 5.3 Percentage of kenes 
 
 
In Figure 5.3, the x-axis is the percentage of individuals ordered by rank, and the y-axis
is the corresponding percentage of kenes created by those individuals. The top 20% of
individuals create more than 50% of the kenes. Figure 5.2 and Figure 5.3 show that the
model is consistent with the two expected regularities above.
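The share of kenes created by the top individuals can be computed as in the following sketch. It is illustrative only: the class and method names are hypothetical, individuals are ordered here by their kene counts (the thesis orders them by rank), and the sample counts are made-up inputs rather than simulation results.

```java
import java.util.Arrays;

/** Hypothetical sketch of the "top x% of individuals create y% of kenes" calculation. */
final class ContributionShare {
    private ContributionShare() {}

    /** Fraction of all kenes created by the top topFraction of individuals, ordered by kene count. */
    static double topShare(int[] kenesPerIndividual, double topFraction) {
        int n = kenesPerIndividual.length;
        if (n == 0) {
            return 0.0;
        }
        int[] sorted = kenesPerIndividual.clone();
        Arrays.sort(sorted); // ascending, so the top individuals are at the end
        int topCount = Math.min(n, Math.max(1, (int) Math.round(topFraction * n)));
        long total = 0;
        long top = 0;
        for (int i = 0; i < n; i++) {
            total += sorted[i];
            if (i >= n - topCount) {
                top += sorted[i];
            }
        }
        return total == 0 ? 0.0 : (double) top / total;
    }

    public static void main(String[] args) {
        int[] sample = {40, 20, 10, 10, 5, 5, 4, 3, 2, 1}; // made-up counts summing to 100
        System.out.println("share of top 20%: " + topShare(sample, 0.2));
    }
}
```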
5.3 Experiments 
The motivation for pattern-oriented modeling (POM) is that, for complex systems, a single
pattern observed at a specific scale and hierarchical level is not sufficient to reduce
uncertainty in model structure and parameters [39]. We therefore use the batch method to
obtain mean values. The next section describes the metrics used to measure creativity.
5.3.1 Innovation Metrics 
5.3.1.1 Impact Factor 
In general, Impact Factor (IF) is frequently used as a metric for the importance of a 
sub-domain to its field. Impact Factor has the advantage over raw citation count that it is 
situated in time and accounts for changes in sub-domain importance over time [40]. 
Equation 5.1 represents the impact factor of sub-domain s for a given time frame t, which 
is 100 time steps in the model. 
 
\[
IF(s,t) = \frac{\#\,\text{citations from } D^{t} \text{ to } D_{s}^{t-1} \text{ or } D_{s}^{t-2}}{|D_{s}^{t-1}| + |D_{s}^{t-2}|} \tag{5.1}
\]
Where, $D^{t}$ is the set of kenes in time t, and $D_{s}$ is the set of kenes in sub-domain s.
5.3.1.2 Diffusion 
However, Impact Factor cannot accurately reveal the importance of a specific kene. One of
its drawbacks is that it does not show whether a particular kene has broad or narrow
impact. Does a highly cited kene dominate a prolific sub-field, or does it have broad
appeal and utility across many fields? In this section, we use sub-domain based impact
measures that reveal more than citation count alone.
The metric for evaluating broad-based impact is Diffusion [40], defined for a given
sub-domain s:
 
\[
\mathrm{Diffusion}(s) = H(P_{s}) = -\sum_{s'} P_{s}(s') \log P_{s}(s') \tag{5.2}
\]
\[
\text{Where, } P_{s}(s') = \frac{\#\,\text{citations from } D_{s'} \text{ to } D_{s}}{\#\,\text{citations to } D_{s}}
\]
Where, $D_{s}$ is the kene set for a sub-domain s, and $D_{s'}$ is the kene set for a sub-domain s'.
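Diffusion is the entropy of the distribution of citing sub-domains, which the following Java sketch computes from raw citation counts. The class and method names are hypothetical, the natural logarithm is an assumption (the base of the logarithm is not stated), and terms with zero citations are taken to contribute nothing.

```java
/** Hypothetical sketch of the Diffusion metric as an entropy over citing sub-domains. */
final class DiffusionSketch {
    private DiffusionSketch() {}

    /**
     * citationsFrom[k] is the number of citations that sub-domain s receives from sub-domain k.
     * Returns -sum(p * ln p) with p = citationsFrom[k] / total; empty terms are skipped.
     */
    static double diffusion(int[] citationsFrom) {
        long total = 0;
        for (int c : citationsFrom) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        double entropy = 0.0;
        for (int c : citationsFrom) {
            if (c > 0) {
                double p = (double) c / total;
                entropy -= p * Math.log(p);
            }
        }
        return entropy;
    }

    public static void main(String[] args) {
        System.out.println(diffusion(new int[] {10, 0, 0})); // narrow impact: a single citing sub-domain
        System.out.println(diffusion(new int[] {5, 5}));     // broader impact: two equal citing sub-domains
    }
}
```

A sub-domain cited from only one place has zero diffusion, while citations spread evenly over more sub-domains give a larger value, which is the distinction between narrow and broad impact drawn in the text.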
5.3.1.3 Average Kene Fitness 
The Average Kene Fitness metric evaluates the overall quality of the kenes created by
individuals. It is important to use it together with the total number of accepted kenes
when assessing the innovation of a community. In general, the higher the average kene
fitness and the number of accepted kenes are, the more innovative a community is. If the
total number of accepted kenes is N, the average kene fitness is calculated as follows.
\[
\text{Average Kene Fitness} = \frac{1}{N} \sum_{i=1}^{N} F_{i} \tag{5.3}
\]
Where, $F_{i}$ is the fitness of kene i, which includes the individual fitness and
relational fitness calculated in equation 3.13.
5.3.2 Network Metrics 
5.3.2.1 Density 
The density of the open science community network is defined as the ratio of the number
of edges between individuals to the maximum number of possible edges. It indicates the
cohesiveness of the community: the greater the density is, the higher the degree of
cohesiveness is.
\[
\text{Density} = \frac{\#\,\text{edges}}{\#\,\text{all possible edges}} \tag{5.4}
\]
5.3.2.2 Clustering Coefficient 
The clustering coefficient of a vertex in a graph quantifies how close the vertex and its
neighbors are to being a clique (complete graph) [41].
A graph G = (V, E) formally consists of a set of vertices V and a set of edges E between
them. An edge $e_{ij}$ connects vertex i with vertex j.
The neighborhood $N_{i}$ for a vertex $v_{i}$ is defined as its immediately connected
neighbors as follows:
\[
N_{i} = \{ v_{j} : e_{ij} \in E \lor e_{ji} \in E \} \tag{5.5}
\]
Thus, the clustering coefficient of vertex i for directed graphs is given as
\[
C_{i} = \frac{|\{ e_{jk} \}|}{k_{i}(k_{i}-1)} : v_{j}, v_{k} \in N_{i},\ e_{jk} \in E \tag{5.6}
\]
Where, $k_{i}$ is the total degree of the vertex i, and $|\{ e_{jk} \}|$ is the number of
edges among all the neighbors of vertex i.
The clustering coefficient for the whole system is given by Watts and Strogatz as 
the average of the clustering coefficient for each vertex: 
 
\[
\bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_{i} \tag{5.7}
\]
Where, n is the total number of the vertices in the graph.
The clustering coefficient denotes the degree to which kenes and individuals connect with
each other. Three different clustering coefficients are used, related to kenes,
individuals and all the agents respectively.
The clustering coefficient for the technical network is based only on the network built on
the kenes. The clustering coefficient for the social network is computed on the network of
individuals. The last clustering coefficient treats the total network as a whole.
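The directed clustering coefficient and its average can be sketched in Java on a small adjacency matrix. The class and method names are hypothetical. Two assumptions are made: a neighbor of vertex i is any vertex linked to i in either direction, and $k_{i}$ is taken as the number of such neighbors; vertices with fewer than two neighbors are given a coefficient of 0.

```java
/** Hypothetical sketch of the directed clustering coefficient on an adjacency matrix. */
final class ClusteringSketch {
    private ClusteringSketch() {}

    /** Local coefficient of vertex i: directed edges among its neighbors divided by k * (k - 1). */
    static double local(boolean[][] adj, int i) {
        int n = adj.length;
        int[] neighbors = new int[n];
        int k = 0;
        for (int j = 0; j < n; j++) {
            if (j != i && (adj[i][j] || adj[j][i])) {
                neighbors[k++] = j;
            }
        }
        if (k < 2) {
            return 0.0; // assumption: undefined cases count as 0
        }
        int links = 0;
        for (int a = 0; a < k; a++) {
            for (int b = 0; b < k; b++) {
                if (a != b && adj[neighbors[a]][neighbors[b]]) {
                    links++;
                }
            }
        }
        return (double) links / (k * (k - 1));
    }

    /** Average of the local coefficients over all vertices. */
    static double average(boolean[][] adj) {
        if (adj.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < adj.length; i++) {
            sum += local(adj, i);
        }
        return sum / adj.length;
    }

    public static void main(String[] args) {
        boolean[][] complete = {
            {false, true, true},
            {true, false, true},
            {true, true, false}
        };
        System.out.println(average(complete)); // prints 1.0
    }
}
```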
5.3.2.3 Centrality 
Within network analysis, the measure of centrality of a vertex determines the 
relative importance of a vertex in the graph. In this thesis, degree centrality is used to 
assess the social network. Degree centrality is defined as the number of links associated 
with a node. Then the average degree centrality is calculated as follows. 
 
\[
\bar{C} = \frac{1}{N} \sum_{i=1}^{N} C_{i} \tag{5.8}
\]
Where, N is the total number of vertices, and $C_{i}$ is the degree centrality of vertex i.
Note that there may be multiple links between two individuals, because an individual can
cite kenes created by the same individual several times. In this situation, the weight of
the link between the individuals reflects the number of citations.
5.3.2.4 Proportion of Strong Ties 
The difference between strong ties and weak ties is based on the number of links
between the two vertices of the tie. In general, the more links there are between two
vertices, the stronger the tie is. In this model, strong ties are defined as ties with
two or more links, and ties with exactly one link are defined as weak ties. The proportion
of strong ties of a vertex is equal to the number of its strong ties divided by the total
number of ties associated with it. The average proportion of strong ties over the whole
population of agents is equal to the total number of strong ties over the total number of
ties in the network, as shown in Equation 5.9.
 
\[ R = \frac{\#\text{StrongTies}}{\#\text{StrongTies} + \#\text{WeakTies}} \tag{5.9} \]
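Equation 5.9 can be sketched as follows, assuming each tie is represented simply by its
number of links. The function name, the list representation and the sample values are
illustrative assumptions. The threshold of 2 follows the definition of strong ties given
above.

```python
def strong_tie_proportion(tie_link_counts, strong_threshold=2):
    """Equation 5.9: share of ties whose link count reaches the strong threshold."""
    strong = sum(1 for links in tie_link_counts if links >= strong_threshold)
    weak = sum(1 for links in tie_link_counts if links == 1)
    total = strong + weak
    # Assumption: a network without ties has proportion 0.
    return strong / total if total else 0.0


# Hypothetical network with five ties: three weak (1 link) and two strong.
ties = [1, 1, 3, 2, 1]
proportion = strong_tie_proportion(ties)
```

Here two of the five ties are strong, so the proportion is 0.4.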
5.3.3 Sensitivity Analysis 
Sensitivity analysis (SA) is the study of how the variation (uncertainty) in the 
output of a mathematical model can be apportioned, qualitatively or quantitatively, to 
different sources of variation in the input of a model [42]. 
5.3.3.1 Experiments with Concept Creation Operator 
In this section we present the influence of the creation operator. Figure 5.4 shows the
pattern obtained when the length of the kene is 10.
 
Figure 5.4 The pattern of kenes with the size of 10 
In this graph, there are four clusters, corresponding to the four sub domains available to
a new kene. Each cluster is a square whose side length is 2*10 = 20.
If the length of the kene is set to 20, the pattern of kenes looks like Figure 5.5.
 
Figure 5.5 The pattern of kenes with the size of 20 
From Figure 5.5, we can see that the space covered by kenes depends only on the length of
the kene when only the creation operator is used.
5.3.3.2 Experiments with Combination Operator as the Main Generator 
When the combination operator is the main generator and the length of the kene is 10, the
pattern of kenes looks like Figure 5.6.
 
Figure 5.6 The pattern of kenes with the length of 10 
The combination operator generates a new kene by combining several existing kenes, and the
new kene is located in the middle of these kenes. Therefore, using only the combination
operator cannot expand the occupied area.
Figure 5.6 shows the case in which the combination operator is mainly used. Compared with
using only the creation operator, the basic characteristics of the two situations are
similar. However, the spaces between the four clusters are now occupied by kenes, which
shows that the combination operator builds bridges between different sub domains.
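The claim that combination cannot expand the occupied area can be illustrated with a small
sketch. It assumes that "the middle" of the parent kenes means the arithmetic mean of their
two-dimensional coordinates. This interpretation, the function names and the coordinates
are assumptions for illustration only.

```python
def combine(parents):
    """Place a new kene at the centroid of its parent kenes (assumed 2-D points)."""
    n = len(parents)
    return (sum(x for x, _ in parents) / n, sum(y for _, y in parents) / n)


def bounding_box(points):
    """Smallest axis-aligned rectangle containing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical kenes, one from each of four clusters.
kenes = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0), (20.0, 20.0)]
box_before = bounding_box(kenes)

# Combining two kenes from different clusters lands between them.
new_kene = combine([kenes[0], kenes[3]])
kenes.append(new_kene)
box_after = bounding_box(kenes)
```

The new kene falls at (10.0, 10.0), inside the space between the clusters, and the bounding
box of the occupied area is unchanged. A centroid always lies within the region spanned by
its parents, which is why this operator fills gaps rather than extending coverage.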
5.3.3.3 Experiments with Elaboration as the Main Generator 
When the elaboration operator is mainly used, kenes occupy a larger area than when either
of the other two operators is mainly used.
 
Figure 5.7 The pattern of kenes with mainly using elaboration 
From Figure 5.7 we can conclude that the elaboration operator contributes greatly to
extending the covered space. In other words, elaboration plays an important role in
innovation. However, extensive divergence may mean a lack of coherence, so balance is
needed.
5.3.3.4 Experiments with Equivalent Probability for Three Generators 
This experiment uses the three generators with equal probabilities, i.e. the probability
of using each of them is 0.33333. Figure 5.8 depicts the resulting pattern.
 
Figure 5.8 The pattern of kenes with balance 
 
From Figure 5.8, we can see the scattering degree is moderate compared with other 
experiments. 
5.3.3.5 Experiments with Different Predefined Parameters of Communities 
The characteristics that distinguish the three types of communities from each other are
growth rate, recruitment selectivity, turnover and decision style. Three of them are
parameterized: growth rate, recruitment selectivity and turnover. The experiment analyzes
the impacts of different combinations of these input parameters on network metrics (i.e.
density, clustering coefficient and centrality).

Table 5.1 summarizes the range of parameters for the three kinds of communities. Growth
rate indicates that a random number of agents drawn from U(0, growth rate) enters the
community at each time interval. Recruitment selectivity is the threshold for a new agent
to pass the enculturation process: if the enculturation level is greater than the
threshold, the new agent becomes tenured; otherwise the agent leaves the community.
Turnover is the threshold for an agent to leave the community: if the reputation is less
than the threshold, the tenured agent leaves the community. In addition, the relative
relationship among the three kinds of communities stays unchanged. The growth rate of the
utility-oriented community equals that of the exploration-oriented community plus 2, and
the growth rate of the service-oriented community equals that of the exploration-oriented
community plus 1. The recruitment selectivity of the utility-oriented community equals
that of the exploration-oriented community minus 0.1, and the recruitment selectivity of
the service-oriented community equals that of the exploration-oriented community minus 0.2.
The turnover of the utility-oriented community equals that of the exploration-oriented
community plus 0.1, and the turnover of the service-oriented community equals that of the
exploration-oriented community plus 0.2.
Table 5.1 Range of parameters for communities
Parameter                 Exploration   Utility      Service
Growth Rate               [1.5, 5.5]    [3.5, 7.5]   [2.5, 6.5]
Recruitment Selectivity   [0.3, 0.5]    [0.2, 0.4]   [0.1, 0.3]
Turnover                  [0.1, 0.3]    [0.2, 0.4]   [0.3, 0.5]
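The fixed offsets between the communities can be written down compactly. The sketch below
derives the utility-oriented and service-oriented settings from an exploration-oriented
setting and reproduces the end points of the ranges in Table 5.1. The dictionary layout and
the function name are assumptions for illustration. Values are rounded only to avoid
floating-point noise.

```python
# Offsets relative to the exploration-oriented community, as stated in the text.
OFFSETS = {
    "exploration": {"growth_rate": 0.0, "recruitment_selectivity": 0.0, "turnover": 0.0},
    "utility": {"growth_rate": 2.0, "recruitment_selectivity": -0.1, "turnover": 0.1},
    "service": {"growth_rate": 1.0, "recruitment_selectivity": -0.2, "turnover": 0.2},
}


def community_parameters(exploration):
    """Derive the parameters of all three communities from the exploration values."""
    return {
        community: {
            name: round(exploration[name] + delta, 10)
            for name, delta in offsets.items()
        }
        for community, offsets in OFFSETS.items()
    }


# Lower and upper ends of the exploration-oriented ranges in Table 5.1.
low = community_parameters(
    {"growth_rate": 1.5, "recruitment_selectivity": 0.3, "turnover": 0.1}
)
high = community_parameters(
    {"growth_rate": 5.5, "recruitment_selectivity": 0.5, "turnover": 0.3}
)
```

With these inputs, `low["utility"]` and `high["utility"]` give the utility-oriented ranges
of Table 5.1, and likewise for the service-oriented community.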
 
Through experiments, the impacts of these three predefined parameters are summarized below.
Two outputs are examined, density and centrality, because these two metrics play a very
important role in the potential for creativity.
• Impacts of growth rate
The impacts of growth rate on density are shown in Figure 5.9.
[Figure 5.9 chart: "Impacts of Growth Rate on Density"; x-axis 1.5 to 5.5; y-axis 0 to 0.016; series: Exploration, Utility, Service]
Figure 5.9 Impacts of growth rate on density
From Figure 5.9, we can see that the density of all three kinds of communities
decreases as the growth rate increases.
The impacts of growth rate on centrality are shown in Figure 5.10. The x-axis of the
figure gives the parameter values of the exploration-oriented community. The growth rates
of the utility-oriented and service-oriented communities are derived from that of the
exploration-oriented community using the relationship discussed in the second paragraph
of this section. The same convention applies to all the figures that compare the impacts
of the predefined parameters across communities.
From Figure 5.10, we can see that the centrality of all three kinds of communities
decreases as the growth rate increases.

[Figure 5.10 chart: "Impacts of Growth Rate on Centrality"; x-axis 1.5 to 5.5; y-axis 0 to 5; series: Exploration, Utility, Service]
Figure 5.10 Impacts of growth rate on centrality
• Impacts of recruitment selectivity
The impacts of recruitment selectivity on density are shown in Figure 5.11.
[Figure 5.11 chart: "Impacts of Recruitment on Density"; x-axis 0.3 to 0.5; y-axis 0 to 0.004; series: Exploration, Utility, Service]
Figure 5.11 Impacts of recruitment on density
From Figure 5.11, we can see that the density of all three kinds of communities
increases as recruitment selectivity increases.

The impacts of recruitment selectivity on centrality are shown in Figure 5.12.
[Figure 5.12 chart: "Impacts of Recruitment on Centrality"; x-axis 0.3 to 0.5; y-axis 0 to 3; series: Exploration, Utility, Service]
Figure 5.12 Impacts of recruitment on centrality
From Figure 5.12, we can see that the centrality of the exploration-oriented and
utility-oriented communities increases as recruitment selectivity increases. However, the
centrality of the service-oriented community stays almost unchanged.
• Impacts of turnover
The impacts of turnover on density are shown in Figure 5.13.
From Figure 5.13, we can see that the density of the exploration-oriented community
decreases as turnover increases, whereas the density of the utility-oriented community
increases as turnover increases. The density of the service-oriented community
fluctuates with turnover.

[Figure 5.13 chart: "Impacts of Turnover on Density"; x-axis 0.1 to 0.3; y-axis 0 to 0.003; series: Exploration, Utility, Service]
Figure 5.13 Impacts of turnover on density

The impacts of turnover on centrality are shown in Figure 5.14.
[Figure 5.14 chart: "Impacts of Turnover on Centrality"; x-axis 0.1 to 0.3; y-axis 0 to 2.5; series: Exploration, Utility, Service]
Figure 5.14 Impacts of turnover on centrality

From Figure 5.14, we can see that the centrality of the exploration-oriented and
service-oriented communities decreases as turnover increases. On the other hand, the
centrality of the utility-oriented community increases slightly as turnover increases. An
interesting emergent phenomenon is that the centrality of the utility-oriented community
is higher than that of the exploration-oriented community when the turnover is 0.3.
Because a community whose social network has low density and high centrality is considered
to have more potential for innovation, the model pays particular attention to this
situation. Therefore the growth rate is set to 2.5 and the recruitment selectivity is set
to 0.4. Turnover is set to 0.3 because it leads to the interesting phenomenon discussed in
the paragraph above.
5.3.4 Experiments between Different Types of Communities 
Metrics including kene growth rates, clustering coefficient, diffusion and citation
impacts are used to judge whether or not this simulation model can reflect the activities
of real scientific societies.
When the simulation is completed, a results dialog pops up. This dialog shows the
number of output links associated with a specific individual or kene. Through this
analysis, we can see that kenes published earlier have a higher probability of being chosen
for combination or elaboration. Similarly, an individual with higher motivation produces
more kenes. This is consistent with the finding that the higher the personal rewards
participants perceived for their Linux engagement, the more willing they were to be
involved in Linux-related activities in the future [43]. It is also reflected in a case
study of the development of the Apache web server, which showed that the top 15 programmers
added 88% of the lines of code (LOC) (Mockus et al., 2000). In contrast, the top 15
programmers for the GNOME project were responsible for only 48%, whereas the top 52 persons
were necessary to reach 80%. A clustering of the programmers based on LOC hinted at the
existence of a smaller group of 11 programmers within this larger group, who were still
active [44].
5.3.4.1 Exploration-oriented 
1. Predefined parameters 
The following are the predefined parameters for the exploration-oriented community. The
probabilities of the three operators are equal (i.e. 0.333 each).
Table 5.2 Predefined parameters for exploration-oriented community
Parameter                 Value   Description
Recruitment selectivity   0.4     If the enculturation level is greater than 0.4, the new agent
                                  becomes tenured. Otherwise the agent leaves the community.
Growth rate               2.5     At each time interval, a random number of agents drawn from
                                  U(0, 2.5) enters the community.
Turnover                  0.3     If the reputation is less than 0.3, the tenured agent
                                  leaves the community.
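The three rules in Table 5.2 can be sketched as small predicates. This is only an
illustration: the function names are hypothetical, and the text does not say how the
real-valued draw from U(0, growth rate) becomes a whole number of agents, so rounding to
the nearest integer is an assumption of this sketch.

```python
import random

# Parameter values for the exploration-oriented community (Table 5.2).
EXPLORATION = {"recruitment_selectivity": 0.4, "growth_rate": 2.5, "turnover": 0.3}


def arrivals(growth_rate, rng):
    """Number of agents entering in one time interval.

    Assumption: the U(0, growth_rate) draw is rounded to the nearest integer.
    """
    return round(rng.uniform(0.0, growth_rate))


def becomes_tenured(enculturation_level, recruitment_selectivity):
    """A new agent is tenured only if enculturation exceeds the threshold."""
    return enculturation_level > recruitment_selectivity


def leaves_community(reputation, turnover):
    """A tenured agent leaves when reputation falls below the turnover threshold."""
    return reputation < turnover


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = [arrivals(EXPLORATION["growth_rate"], rng) for _ in range(1000)]
```

The same predicates apply to the other two communities with their own threshold values.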
2. Results 
Table 5.3 includes experimental results such as density, diffusion, centrality and 
clustering coefficients. 
Table 5.3 Results on exploration-oriented community
Metrics                                                Value
Accepted Kenes                                         1387
Average Kene Fitness                                   0.656
Average Diffusion for Kenes                            1.038
Density for Social Network                             0.001
Ratio of Strong Ties to Weak Ties                      0.043
Centrality for Social Network                          0.709
Clustering Coefficient for Social Network              0.092
Clustering Coefficient for Social-Technical Network    0.384
 
Figure 5.15 shows the number of individuals who create a given number of kenes. The x-axis
represents the number of kenes, and the y-axis represents the number of individuals
producing the corresponding number of kenes. From Figure 5.15, we can see that most
individuals generate only a few kenes. In addition, the number of agents decreases as the
number of kenes increases.
 
Figure 5.15 Histogram of the number of agents who create the same number of kenes 
 
Figure 5.16 shows the number of kenes created by each individual, ordered by the rank of
the individuals, where the rank depends on their motivation levels. The x-axis of the plot
is the rank number of the individual and the y-axis is the number of kenes created by that
individual.
 
Figure 5.16 Plot of the number of kenes created by each agent 
 
From Figure 5.16 we can easily see that the core members make much greater
contributions to this community than common members do.
When the GEM (Graph Embedding) layout style is selected, the graph appears as shown in
Figure 5.17.
 
Figure 5.17 Emergent pattern in GEM layout 
Figure 5.17 shows a clustered pattern, where green nodes are kenes and red nodes are
agents. Purple lines represent authorship, i.e. which agent creates which kenes. Blue
lines represent the citation relationships between kenes, and white lines denote the
relationships among agents.
Figure 5.18 shows how the accumulated impact factor changes over time, where one unit on
the x-axis is 100 time ticks. The impact factor is defined in Equation 5.1.
 
Figure 5.18 Impact factor over time on exploration-oriented community 
 
The slope of the impact factor trend stays almost the same over time. This shows that the
degree of attention people pay to this field doesn't change over time.
5.3.4.2 Utility-oriented 
1. Predefined parameters 
The following are the predefined parameters for the utility-oriented community. The
probabilities of the three operators are equal (i.e. 0.333 each).
Table 5.4 Predefined parameters for utility-oriented community
Parameter                 Value   Description
Recruitment selectivity   0.3     If the enculturation level is greater than 0.3, the new agent
                                  becomes tenured. Otherwise the agent leaves the community.
Growth rate               3.5     At each time interval, a random number of agents drawn from
                                  U(0, 3.5) enters the community.
Turnover                  0.4     If the reputation is less than 0.4, the tenured agent
                                  leaves the community.

2. Results 
Table 5.5 records the results for the utility-oriented community. Compared with the
exploration-oriented community, the total knowledge and the degree of clustering are
much lower. The comparison of the three kinds of communities is discussed in
detail in Section 5.3.4.4.
Table 5.5 Results on utility-oriented community with equal probabilities
Metrics                                                Value
Accepted Kenes                                         748
Average Kene Fitness                                   0.829
Average Diffusion for Kenes                            0.542
Density for Social Network                             0.0004
Ratio of Strong Ties to Weak Ties                      0.556
Centrality for Social Network                          0.901
Clustering Coefficient for Social Network              0.057
Clustering Coefficient for Social-Technical Network    0.207
 
Figure 5.19 shows the number of individuals who create a given number of kenes. From this
graph, we can see that there is a huge gap between the individual who creates the most
kenes and the other individuals. In my opinion, a super core member emerges in this
situation.
 
Figure 5.19 Histogram of the number of agents who create the same number of kenes 
Figure 5.20 shows the number of kenes created by each individual, ordered by the rank of
the individuals. It also shows that the No. 1 individual creates many more kenes than any
of the others.

Figure 5.20 Plot of the number of kenes created by each agent
Figure 5.21 shows the GEM layout produced with the Guess software. In this graph there is
one chief member around whom the other agents and kenes are arranged.
 
Figure 5.21 Emergent pattern in GEM layout 
 
Figure 5.22 shows the trend of the accumulated impact factor over time.
 
Figure 5.22 Impact factor over time on utility-oriented community 

In the utility-oriented community with the predefined parameters, the impact factor
decreases dramatically. After just three hundred time intervals, the impact factors of
three sub domains decrease to 0, which means that the research topics in this community
have lost the interest of the public.
The decision style in the utility-oriented community is emergent selection, which means
that the decision to accept a kene is based on the references to that kene: the more
references a kene has, the more likely it is to be accepted. However, a kene cannot be
cited when it has just been created, so the decision for a kene is postponed for a
constant number of ticks, which is 100 in this model.
In addition, the overall reference situation is highly related to the three kinds of
operators: creation, combination and elaboration. Increasing the probability of using
combination and elaboration leads to a corresponding increase in clustering. The
experiment below increases the probabilities of combination and elaboration so that they
sum to 90%, i.e. the probability of creation is 0.1, the probability of combination is
0.45 and the probability of elaboration is 0.45.
Table 5.6 summarizes the simulation results after the probabilities of the three kinds of
generators are changed. From this table, we can see that most of the metrics increase
compared with Table 5.5; only the average diffusion for kenes and the ratio of strong ties
to weak ties decrease. This shows that the probabilities of the generators have great
effects on the utility-oriented community. In other words, the combination and
elaboration operators play an important role in the expansion and sustainment of this
community.
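Selecting a generator according to these probabilities can be sketched with the standard
library. The function name and the tolerance check are assumptions of this sketch, and the
model's actual selection mechanism is not described in the text. The two settings shown
are the equal-probability case and the adjusted case used for Table 5.6.

```python
import random

OPERATORS = ("creation", "combination", "elaboration")

# Settings taken from the text: equal probabilities and the adjusted mix.
BALANCED = {"creation": 0.333, "combination": 0.333, "elaboration": 0.333}
ADJUSTED = {"creation": 0.1, "combination": 0.45, "elaboration": 0.45}


def choose_operator(probabilities, rng):
    """Draw one operator name according to the given probabilities."""
    weights = [probabilities[name] for name in OPERATORS]
    # Tolerance chosen so that 3 * 0.333 is accepted as summing to 1.
    if abs(sum(weights) - 1.0) > 0.01:
        raise ValueError("operator probabilities should sum to 1")
    return rng.choices(OPERATORS, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = [choose_operator(ADJUSTED, rng) for _ in range(10000)]
share_creation = draws.count("creation") / len(draws)
```

Over many draws the share of creation should settle near 0.1, with combination and
elaboration sharing the remaining draws about equally.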
Table 5.6 Results on utility-oriented community with unequal probabilities
Metrics                                                Value
Accepted Kenes                                         1459
Average Kene Fitness                                   0.985
Average Diffusion for Kenes                            0.494
Density for Social Network                             0.001
Ratio of Strong Ties to Weak Ties                      0.457
Centrality for Social Network                          3.061
Clustering Coefficient for Social Network              0.066
Clustering Coefficient for Social-Technical Network    0.23
Figure 5.23 shows the number of individuals who create a given number of kenes. From this
figure, it is still very clear that some chief members emerge in this community.
 
Figure 5.23 Histogram of the number of agents who create the same number of kenes 
 
Figure 5.24 shows the number of kenes created by each individual, ordered by the rank of
the individuals. It also shows that the No. 1 individual creates many more kenes than any
of the others.

Figure 5.24 Plot of the number of kenes created by each agent

Figure 5.25 shows the pattern in the GEM style. In this image there are three
clear core members who make great contributions to the project.

Figure 5.25 Emergent pattern in GEM style 
 
 
Figure 5.26 Impact factor over time on utility-oriented community 
 
 
Figure 5.26 shows the trend of the impact factor over time. Compared with Figure 5.22,
there are still three sub domains that attract the public, although one sub domain has
an impact factor of zero. This suggests that an appropriate selection of generators can
keep the scientific community attractive for a longer time.
5.3.4.3 Service-oriented 
1. Predefined parameters 
The following are the predefined parameters for the service-oriented community. The
probabilities of the three operators are equal (i.e. 0.333 each).
Table 5.7 Predefined parameters for service-oriented community
Parameter                 Value   Description
Recruitment selectivity   0.2     If the enculturation level is greater than 0.2, the new agent
                                  becomes tenured. Otherwise the agent leaves the community.
Growth rate               3.5     At each time interval, a random number of agents drawn from
                                  U(0, 3.5) enters the community.
Turnover                  0.5     If the reputation is less than 0.5, the tenured agent
                                  leaves the community.
2. Results 
Table 5.8 summarizes the simulation results on service-oriented community. 
Table 5.8 Results on service-oriented community
Metrics                                                Value
Accepted Kenes                                         1397
Average Kene Fitness                                   0.69
Average Diffusion for Kenes                            1.038
Density for Social Network                             0.0004
Ratio of Strong Ties to Weak Ties                      0.031
Centrality for Social Network                          0.196
Clustering Coefficient for Social Network              0.084
Clustering Coefficient for Social-Technical Network    0.271

Figure 5.27 shows the number of individuals who create a given number of kenes.
 
Figure 5.27 Histogram of the number of agents who create the same number of kenes 
 
Figure 5.28 shows the number of kenes created by each individual, ordered by the rank of
the individuals.

Figure 5.28 Plot of the number of kenes created by each agent
When the GEM style is selected, the distribution of agents looks like Figure 5.29. From
this image, it is clear that the community has kenes at its center, which differs from
the exploration-oriented community and the utility-oriented community.
 
Figure 5.29 Emergent pattern in GEM style 
 
 
Figure 5.30 Impact factor over time on service-oriented community 
 
Figure 5.30 shows that the accumulated impact factor of the service-oriented community
also decreases over time. However, the slope is larger than that of the utility-oriented
community, which means the service-oriented community remains attractive for longer than
the utility-oriented community.
5.3.4.4 Emergent Social Organization Structures under Alternative Communities
The purpose of this experiment is to compare the communities with respect to innovation
metrics and network metrics, so the experiment is divided into two parts. The first
compares the communities with respect to the number of accepted kenes, diffusion and
average kene fitness. The second compares the communities in terms of density, centrality
and clustering coefficient.

• Comparison of Communities with Respect to Innovation Metrics
Innovation metrics include the number of accepted kenes, diffusion and average kene
fitness.
Firstly, the communities are compared with respect to the number of accepted kenes.
Figure 5.31 shows how the number of accepted kenes varies with the type of community.
From this figure, we can conclude that the exploration-oriented community has a similar
number of accepted kenes to the service-oriented community, and that both have many more
accepted kenes than the utility-oriented community.

[Figure 5.31 chart: "Accepted Kenes"; x-axis: Exploration, Utility, Service; y-axis 0 to 1600]
Figure 5.31 Number of accepted kenes

Secondly, the communities are compared with respect to the average kene fitness.
Figure 5.32 shows how the average kene fitness varies with the type of community. Here
the utility-oriented community has the highest average kene fitness, and the
exploration-oriented community has the lowest. The reason is the difference in
decision-making style between communities. The decision-making style in the
utility-oriented community is emergent selection, under which the evaluation of a kene is
based on the references associated with it. Only kenes with enough references are
retained in the domain. So the kenes in the utility-oriented community have higher
relational fitness and therefore higher average kene fitness.
 
[Figure 5.32 chart: "Average Kene Fitness"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.9]
Figure 5.32 Average kene fitness

Figure 5.33 shows how the average diffusion for kenes changes with the type of community.
The average diffusion of the exploration-oriented community is slightly higher than that
of the service-oriented community, which in turn is much higher than that of the
utility-oriented community. This also means that the influence across sub domains in the
utility-oriented community is lower than in the other two types of communities.

[Figure 5.33 chart: "Average Diffusion for Kenes"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.5]
Figure 5.33 Average diffusion for kenes with different types of communities

• Comparison of Communities with Respect to Social Network Metrics
This section compares the three types of communities with respect to social network
metrics, including the density, centrality and clustering coefficient of the social
network.
Firstly, the communities are compared with respect to the density of the social network.
Figure 5.34 shows how the density of the network of individuals varies with the type of
community. From this figure, we can conclude that the utility-oriented community has a
similar density to the service-oriented community, and that the density of both is much
lower than that of the exploration-oriented community.

[Figure 5.34 chart: "Density"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.0012]
Figure 5.34 Density with different types of communities

Secondly, the communities are compared with respect to the centrality of the social
network. Figure 5.35 shows the centrality for the different types of communities. Here
the service-oriented community has much lower centrality than the utility-oriented and
exploration-oriented communities, which is consistent with their characteristics. The
purpose of a service-oriented community is to provide stable services, so its members
consider co-operation less than members of utility-oriented and exploration-oriented
communities do.

[Figure 5.35 chart: "Centrality"; x-axis: Exploration, Utility, Service; y-axis 0 to 1]
Figure 5.35 Centrality with different types of communities

The way the proportion of strong ties varies with the type of community is similar to
that of centrality. The utility-oriented community has a significantly higher proportion
of strong ties than the other two types of communities, and the proportion of strong ties
in the exploration-oriented community is larger than that in the service-oriented
community. The result is consistent with the definitions of centrality and proportion of
strong ties. Centrality is based on the degree of vertices, i.e. the number of links that
a node has, and the definition of a strong tie is also related to the number of links in
the tie. Comparing Figure 5.36 with Figure 5.35 suggests that centrality and the
proportion of strong ties are correlated.

[Figure 5.36 chart: "Proportion of Strong Ties"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.6]
Figure 5.36 Proportion of strong ties with different communities

Figure 5.37 shows how the clustering coefficient for the social network changes with the
type of community, and Figure 5.38 shows the same for the social-technical network. Both
figures lead to the same conclusions about clustering in the different scientific
communities. The exploration-oriented community has the strongest tendency to form
cliques, the utility-oriented community has the loosest organizational structure, and the
service-oriented community lies between the two.

[Figure 5.37 chart: "Clustering Coefficient for Social Network"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.1]
Figure 5.37 Clustering coefficient for agents with different types of communities

[Figure 5.38 chart: "Clustering Coefficient for Social Technical Network"; x-axis: Exploration, Utility, Service; y-axis 0 to 0.45]
Figure 5.38 Total clustering coefficient with different types of communities

Table 5.9 summarizes all of the experiment results above.
Table 5.9 Experiment results
Metrics                                      Results
Density for Social Network                   Exploration > Service ≈ Utility
Accepted Kenes                               Exploration ≈ Service > Utility
Average Diffusion for Kenes                  Exploration > Service > Utility
Centrality for Social Network                Utility > Exploration > Service
Clustering Coefficient for Social Network    Exploration > Service > Utility
 
It is also worth noting that the vast majority of kenes in the utility-oriented community
are created by very few individuals.
5.3.5 What Distinguishes Innovative Communities? 
The purpose of this experiment is to explore the characteristics that distinguish
innovative scientific communities. The relationship between average kene fitness and
social network metrics within a specific type of community is examined. To analyze this
relationship, a batch run of the simulation model with 100 replications is needed. The
experiment consists of two steps: the first step divides the collected data into three
groups (i.e. low, moderate and high innovation) based on the average kene fitness; the
second step computes the average network metrics for each group.
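The two steps can be sketched as follows. The group boundaries reported in the following
subsections are consistent with three equal-width intervals between the minimum and
maximum average kene fitness, so the sketch assumes equal-width binning. The function
names, the record layout and the four toy replications are hypothetical and are not
simulation results.

```python
LABELS = ("low", "moderate", "high")


def bin_edges(fitness_values, groups=3):
    """Interior boundaries of equal-width intervals between min and max fitness."""
    lowest, highest = min(fitness_values), max(fitness_values)
    width = (highest - lowest) / groups
    return [lowest + k * width for k in range(1, groups)]


def classify(fitness, edges):
    """Step 1: assign a replication to low, moderate or high (half-open bins)."""
    for label, edge in zip(LABELS, edges):
        if fitness < edge:
            return label
    return LABELS[-1]


def average_metrics_by_group(replications, metric_names):
    """Step 2: average each network metric within each fitness group."""
    edges = bin_edges([r["fitness"] for r in replications])
    groups = {label: [] for label in LABELS}
    for r in replications:
        groups[classify(r["fitness"], edges)].append(r)
    return {
        label: {m: sum(r[m] for r in rows) / len(rows) for m in metric_names}
        for label, rows in groups.items()
        if rows
    }


# Boundaries implied by the minimum and maximum reported in Section 5.3.5.1.
exploration_edges = bin_edges([0.630202741, 0.7205807])

# Toy replications with made-up values, only to exercise the grouping.
toy = [
    {"fitness": 0.63, "centrality": 0.70},
    {"fitness": 0.64, "centrality": 0.80},
    {"fitness": 0.675, "centrality": 0.78},
    {"fitness": 0.72, "centrality": 0.79},
]
summary = average_metrics_by_group(toy, ["centrality"])
```

With the exploration-oriented minimum and maximum, `exploration_edges` comes out at about
0.660328727 and 0.690454714, matching the boundaries quoted in Section 5.3.5.1.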
5.3.5.1 Innovation in Exploration-oriented Community 
Across the one hundred replications, the minimum average kene fitness is 0.630202741 and
the maximum is 0.7205807. Replications with average kene fitness in [0.630202741,
0.660328727) are classified as low average kene fitness, those in [0.660328727,
0.690454714) as moderate, and the remaining replications as high. Table 5.10 summarizes
the network metrics grouped by average kene fitness.
Table 5.10 Network metrics of exploration-oriented community grouped by average kene fitness
Metric                      Low        Moderate   High
Density                     0.001278   0.00138    0.001296
Centrality                  0.764141   0.771677   0.794853
Clustering Coefficient      0.089459   0.091548   0.087866
Proportion of Strong Ties   0.043825   0.035967   0.031939
Number of Replications      10         68         22
 
From Table 5.10, we can see that the replications with moderate average kene fitness
make up the majority of the collected data. Figure 5.39 compares the results.
 
[Figure 5.39 chart: "Exploration Oriented Community"; x-axis: Low, Moderate, High; y-axis 0 to 0.9; series: Density, Centrality, Clustering Coefficient, Proportion of Strong Ties]
Figure 5.39 Network metrics of exploration-oriented community grouped by average kene fitness

In Figure 5.39, density and clustering coefficient stay almost the same across the groups
of average kene fitness. However, centrality increases as average kene fitness increases,
which suggests that centrality has a positive effect on the quality of kenes in the
exploration-oriented community. Additionally, the proportion of strong ties decreases as
average kene fitness increases.
5.3.5.2 Innovation in Utility-oriented Community 
Across the one hundred replications for the utility-oriented community, the minimum
average kene fitness is 0.708175474 and the maximum is 0.955391252. Replications with
average kene fitness in [0.708175474, 0.790580734) are classified as low average kene
fitness, those in [0.790580734, 0.872985993) as moderate, and the remaining replications
as high. Table 5.11 summarizes the network metrics grouped by average kene fitness.
Table 5.11 Network metrics of utility-oriented community grouped by average kene fitness
Metric                      Low        Moderate   High
Density                     3.08E-04   4.43E-04   5.61E-04
Centrality                  0.730454   1.030052   1.310602
Clustering Coefficient      0.04008    0.050025   0.054477
Proportion of Strong Ties   0.530242   0.530113   0.565781
Number of Replications      12         37         51
 
From Table 5.11, we can see that the replications with high average kene fitness make up
the majority of the collected data, in contrast to the exploration-oriented community.
Figure 5.40 compares the results.
 
Figure 5.40 Network metrics of utility-oriented community grouped by average kene fitness 
 
In Figure 5.40, clustering coefficient is not significantly different across different 
innovation performance categories. However, density and proportion of strong ties, 
especially centrality, increase with the increase of average kene fitness. It shows 
centrality has positive effects on the quality of kenes in utility-oriented community. 
5.3.5.3 Innovation in Service-oriented Community 
Across the one hundred replications for the service-oriented community, the minimum 
average kene fitness is 0.644617998 and the maximum is 0.728058495. Replications with 
average kene fitness in [0.644617998, 0.672431497) are classified as low average kene 
fitness, those in [0.672431497, 0.700244996) as moderate, and the remaining replications 
as high. Table 5.12 summarizes the network metrics grouped by average kene fitness. 
 
Table 5.12 Network metrics of service-oriented community grouped by average kene fitness 
                            Low        Moderate   High 
Density                     2.97E-04   2.96E-04   2.97E-04 
Centrality                  0.115405   0.113757   0.107917 
Clustering Coefficient      0.052939   0.05424    0.054654 
Proportion of Strong Ties   0.076199   0.07704    0.070723 
Number of Replications      9          69         22 
 
Table 5.12 shows that replications with moderate average kene fitness form the 
majority (69 of 100), which is similar to the exploration-oriented community. Figure 5.41 
compares the results. 
 
Figure 5.41 Network metrics of service-oriented community grouped by average kene fitness 
 
In Figure 5.41, density and clustering coefficient stay almost the same across the 
groups of average kene fitness. However, centrality decreases as average kene fitness 
increases, and the proportion of strong ties is lowest in the high-fitness group. This 
suggests that centrality and the proportion of strong ties have negative effects on the 
quality of kenes in the service-oriented community. 
Across all the experiments above, the common finding is that centrality influences 
the quality of kenes. Additionally, replications with moderate average kene fitness 
outnumber those with low and high average kene fitness in the exploration-oriented and 
service-oriented communities. In contrast, in the utility-oriented community the 
replications with high average kene fitness form the majority. Combined with the fact that 
the utility-oriented community has the widest range of average kene fitness, we can 
conclude that the quality of kenes varies most in the utility-oriented community. 
 
 
 
CHAPTER 6 
Conclusions 
From the preceding discussion, we can draw the following conclusions. 
First, the open science community model provides an existence proof that it is 
possible to use simple local rules to generate higher levels of organization. In particular, it 
shows that the three generators can lead to clusters of kenes and individuals that behave 
according to their independent rules. 
Second, the simulation model is a useful abstraction and simplification of the real 
world, as discussed in the verification, validation and evaluation section. The model 
presented in this thesis therefore provides a basis for further research by varying the 
values of its parameters. 
Third, the simulation model can yield insights into where there might be policy 
leverage in the real world [45]. For example, one might identify which configuration of 
an open science community has more potential to innovate by applying that configuration 
to the simulation model in this thesis. Based on the simulation results, the utility-oriented 
community has the maximum centrality and minimum density of agents, which suggests 
that it has more potential to be creative. 
 
 
Finally, centrality has significant effects on the quality of kenes in all three kinds 
of scientific communities. At the same time, average kene fitness varies most in the 
utility-oriented community. 
6.1 Extension 
The key point of the model proposed in this thesis is the interconnection among 
agents. Thus the model can be used to simulate communities that are organized around 
the dynamics of relationships among members. 
These types of communities are listed as follows [18]. 
Shared Instrument: Its main function is to increase access to a scientific 
instrument. Shared Instrument collaboratories often provide remote access to expensive 
scientific instruments such as telescopes. In this community our model can help identify 
strategies for improving the efficiency with which expensive instruments are used. 
Virtual Community of Practice: It is a network of individuals who share a 
research area and communicate about it online. Virtual Communities may share news of 
professional interest, advice, techniques, or pointers to other resources online. 
Virtual Learning Community: Its main goal is to increase the knowledge of 
participants, not necessarily to conduct original research. 
Distributed Research Center: It functions like a university research center but at 
a distance. It is an attempt to aggregate scientific talent, effort, and resources beyond the 
level of individual researchers. 
 
 
 
For the types of communities above, our model needs only slight modification 
according to the specific characteristics of each type so that it can simulate the activities 
in these communities and analyze the resulting data. 
6.2 Future Research 
The current model contains some simplifications. We can enrich it by adding 
attributes such as a knowledge level, which evolves over time as the agent innovates and 
receives information broadcast by other agents. The higher the level of knowledge, the 
more likely the agent is to provide an appropriate kene. Concomitantly, the aggregated 
knowledge level of all agents in a community reflects the speed of knowledge diffusion. 
 In addition, one possible extension is to divide the single context of the Repast model 
into two sub-contexts, one containing the agents and the other containing the kenes. In the 
agent sub-context, a grid projection is used so that agents can move and communicate 
with neighbors. In the kene sub-context, a network projection is used to represent the 
relationships among kenes, and the location of kenes can be omitted so as to focus on 
cliquishness and clustering analysis. 
 Finally, in the current model the evaluator role is fixed, which means that an agent 
acting as evaluator cannot contribute to the community. In the real world, however, the 
reviewers of a journal, for instance, can also submit their own papers to be reviewed by 
other reviewers. An exchange of roles between evaluators and ordinary members may 
lead to the emergence of other complex phenomena. 
 
 
REFERENCES 
[1] Francisco Javier Llorens-Montes, Víctor J. Garcia-Morales, Antonio J. Verdu-Jover, 
"The influence on personal mastery, organisational learning and performance of the 
level of innovation: adaptive organisation versus innovator organization," 
International Journal of Innovation and Learning, Volume 1, Number 2 / 2004, Pages: 
101-114. 
[2] Faiz Gallouj, "Innovation in the Service Economy: The New Wealth of Nations," 
Edward Elgar, 2002. 
[3] S Gopalakrishnan, "A Review of Innovation Research in Economics, Sociology and 
Technology Management," Omega, Int. J. Mgmt Sci. Vol. 25, No. 1 pp. 15-28, 1997. 
[4] M. Csikszentmihalyi, "Implications of a systems perspective for the study of 
creativity," Handbook of Creativity, pages 3, 1999. 
[5] Thomas B. Ward, Steven M. Smith, and Jyotsna Vaid, "Conceptual Structures and 
Processes in Creative Thought." 
[6] Levent Yilmaz, "Dynamics of Collective Creativity and Open Innovation in 
Scientific Commons Complex Adaptive Systems Perspective," Computer Science 
and Software Engineering, Auburn University. 
[7] Susan A. Mohrman, "The Dynamics of Knowledge Creation: Phase One Assessment 
of the Role and Contribution of the Department of Energy's Nanoscale Science 
Research Centers." University of Southern California, Los Angeles, CA 90089. 
[8] Complex System. From Wikipedia website. 
[9] Gary William Flake, "The Computational Beauty of Nature: Computer Explorations 
of Fractals, Chaos, Complex Systems, and Adaptation," Chapter 9. MIT Press, 2000. 
[10] Eric Bonabeau, "Agent-based modeling: Methods and techniques for simulating 
human systems," Icosystem Corporation, 545 Concord Avenue, Cambridge, MA 
02138. 
 
[11] Michael W. Macy and Robert Willer, "From Factors to Actors: 
Computational Sociology and Agent-Based Modeling," Department of Sociology, 
Cornell University, Ithaca, New York 84153. 
[12] Levent Yilmaz, "On the synergy of conflict and collective creativity in open source 
software communities," Computer Science and Software Engineering, Auburn 
University. 
[13] Nathan Bos, Applied Physics Laboratory, Johns Hopkins, Ann Zimmerman, "From 
Shared Databases to Communities of Practice: A Taxonomy of Collaboratories," 
School of Information, University of Michigan. Journal of Computer-Mediated 
Communication, 12 (2007) 318-338. 
[14] Alexander Hars, "Working for Free? - Motivations of Participating in Open Source 
Projects," IOM Department, Marshall School of Business, University of Southern 
California. 
[15] Audris Mockus, "Two Case Studies of Open Source Software Development: 
Apache and Mozilla," Avaya Labs Research, Carnegie Mellon University. 
[16] Karim R. Lakhani and Robert G Wolf, "Why Hackers Do What They Do: 
Understanding Motivation and Effort in Free/Open Source Software Projects," MIT 
Sloan School of Management, The Boston Consulting Group. 
[17] Barry Smith, "The OBO Foundry: coordinated evolution of ontologies to support 
biomedical data integration," Nature Biotechnology 25, 1251-1255 (2007). 
[18] R. Cowan, N. Jonard, "The dynamics of collective invention," Journal of Economic 
Behavior & Organization, Vol. 52 (2003) 513-532. 
[19] Greg Madey, "Agent-Based Modeling of Open Source using Swarm," Computer 
Science & Engineering, University of Notre Dame. 
[20] Nigel Gilbert, "Agent-based social simulation: dealing with complexity," Centre for 
Research on Social Simulation, University of Surrey, Guildford, UK. 
[21] Herbert A. Simon, "The Architecture of Complexity," Proceedings of the American 
Philosophical Society, 106 (December): 467-482. 1962. 
[22] Jean-Michel Dalle, Paul A. David, "SimCode: Agent-based Simulation Modelling of 
Open-Source Software Development," University Pierre-et-Marie-Curie & 
IMRI-Dauphine and Stanford University & Oxford Internet Institute. 
 
[23] Brian J. L. Berry, L. Douglas Kiel, and Euel Elliott, "Adaptive agents, intelligence, 
and emergent human organization: Capturing complexity through agent-based 
modeling," School of Social Sciences, University of Texas at Dallas, Richardson, TX 
75080. 
[24] Leigh Tesfatsion, "Agent-based computational economics: modeling economies as 
complex adaptive systems," Department of Economics, Iowa State University, Ames, 
IA 50011-1070, USA. 
[25] Richard W. Woodman, "Toward a Theory of Organizational Creativity," Texas A&M 
University. 
[26] Volker Grimm, Uta Berger, "A standard protocol for describing individual-based 
and agent-based models," Department of Okologische Systemanalyse, Permoserstr. 
15, 04318 Leipzig, Germany. 
[27] Robert Tobias and Carole Hofmann, "Evaluation of free Java-libraries for 
social-scientific agent based simulation," Journal of Artificial Societies and Social 
Simulation vol. 7, no. 1. 
[28] Neil Smith, Andrea Capiluppi, Juan Fernández Ramil, "Agent-based Simulation of 
Open Source Evolution," Computing Department, The Open University, Walton Hall, 
Milton Keynes MK7 6AA, U.K. 
[29] Charles Edquist, "Systems of Innovation, Technologies, Institutions and 
organizations," Pages 6-7. 
[30] Franco Malerba, CESPRI, "Sectoral Systems of Innovation and Production," 
Bocconi University, Via Sarfatti 25, 20136 Milan, Italy. 
[31] Nigel Gilbert, "How to Build and Use Agent-Based Models in Social Science," 
Centre for Research on Simulation for the Social Sciences, School of Human 
Sciences, University of Surrey, Guildford, GU2 5XH, UK. 
[32] Robert Axelrod, "The Dissemination of Culture: A Model with Local Convergence 
and Global Polarization," Journal of Conflict Resolution 41 (1997): 203-26. 
[33] John H. Miller and Scott E. Page, "The Standing Ovation Problem," April 12, 2004. 
[34] Turtlebender, Etatara, "Getting started for Repast Simphony," Confluence July 24, 
2008. 
 
[35] Edward J. Rykiel Jr., "Testing ecological models: the meaning of validation," 
Ecological Modeling 90 (1996) 229-244. 
[36] Stephen Vincent, "Input Data Analysis Chapter 3," Compuware Corporation. 1998. 
[37] Department of Defense Documentation of Verification, Validation & Accreditation 
(VV&A) for Models and Simulations, Missile Defense Agency, 2008. 
[38] SoftwareTestingClub.com, 2009, "Is Integration A Phase?", 
http://www.softwaretestingclub.com/forum/topics/is-integration-a-phase 
[39] Volker Grimm, "Pattern-Oriented Modeling of Agent-Based Complex Systems: 
Lessons from Ecology," Science 310, 987 (2005). 
[40] Gideon S. Mann, David Mimno, Andrew McCallum, "Bibliometric Impact Measures 
Leveraging Topic Analysis," Department of Computer Science, University of 
Massachusetts Amherst, Amherst MA 01003. 
[41] D. J. Watts and Steven Strogatz, "Collective dynamics of 'small-world' networks," 
Nature 393: 440-442. (June 1998). 
[42] Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, 
M., and Tarantola, S., "Global Sensitivity Analysis: The Primer," John Wiley & Sons, 
2008. 
[43] Guido Hertel, Sven Niedner, Stefanie Herrmann, Institut fuer Psychologie, 
"Motivation of software developers in Open Source projects: an Internet-based 
survey of contributors to the Linux kernel," University of Kiel, Olshausenstr. 40, 
D-24 098 Kiel, Germany. 
[44] Stefan Koch & Georg Schneider, "Effort, co-operation and co-ordination in an open 
source software project: GNOME," Department of Information Business, Vienna 
University of Economics and Business Administration, Augasse 2-6, A-1090 Vienna, 
Austria. 
[45] Robert Axelrod, "Building New Political Actors: A Model for the Emergence of New 
Political Actors," Artificial Societies: The Computer Simulation of Social Life, 1995, 
19-39. 
 
 
APPENDIX 
Pseudo Code for the Model 
• ContextCreator 
The ContextCreator class builds the context in which all agents reside. It has two 
major parts: the build method, which constructs the context, and a global method that 
runs every time tick. 
public class ContextCreator implements ContextBuilder 
{ 
    public Context build(Context context) 
    { 
        // Read parameters and build the context according to them. 
        int N = readParameters("geneLength"); 
        int gridWidth = readParameters("gridWidth"); 
        int gridHeight = readParameters("gridHeight"); 
        int amplification = readParameters("amplification"); 
        ContinuousSpace space = buildSpace("Simple Grid", gridWidth, gridHeight); 
        // Build the network that represents the relation between kenes. 
        NetworkBuilder("RelationOfKenes", context, true); 
        // Build the network that represents the relation between individuals. 
        NetworkBuilder("RelationOfIndividuals", context, true); 
        // Build the network that represents the relation between individuals and kenes. 
        NetworkBuilder("RelationBetweenKenesAndIndividuals", context, true); 

        // Keep a reference to the context for use in update(). 
        m_context = context; 
        // Register a global method named "update" that runs every time tick. 
        setGlobalMethod("update", 1, 1); 
        return context; 
    } 

    public void update() 
    { 
        Context context = m_context; 
        // Invoke the function that is in charge of entry and turnover. 
        entryAndTurnover(context); 
        // Calculate the rank of every individual, based on reputation. 
        calRanksForIndividuals(); 
        // Check whether the end time has been reached. 
        if (currentTick == endTick) 
        { 
            // Calculate the metrics when the simulation ends. 
            calMetrics(); 
        } 
        // Record history data. 
        recordHistory(currentTick); 
    } 
} 
• Individual 
The Individual class represents a member of the scientific community who makes 
contributions to the community. The class has two major functions: the constructor, and 
a method named "step" that is invoked every time tick. 
public class Individual 
{ 
    // The argument indicates whether the individual is tenured when created. 
    public Individual(boolean tenure) 
    { 
        // Read parameters and assign them to state variables. 
        m_iAmplification = readParameters("amplification"); 
        m_dProbCombination = readParameters("probCombination"); 
        m_dProbCreation = readParameters("probCreation"); 
        m_dProbElaboration = readParameters("probElaboration"); 
        m_dReputation = Math.random(); 
        m_bTenure = tenure; 
        m_iCreatedTick = getCurrentTickCount(); 
        // Assign one of four sub-domains (0 to 3) at random. 
        m_subDomain = (int)(Math.random()*4); 
        m_ID = ++s_ID; 
    } 

    @ScheduledMethod(start = 1, interval = 1) 
    public void step() 
    { 
        // Agents move in the grid randomly. 
        move(); 
        if (m_bTenure) 
        { 
            // Innovate only if the random number does not exceed the motivation. 
            double rand = Math.random(); 
            if (rand > this.m_dProbMotivation) 
            { 
                return; 
            } 
            // Select one of the three generators based on the same random number. 
            if (rand < m_dProbCreation) 
            { 
                create(); 
            } 
            else if (rand < m_dProbCreation + m_dProbCombination) 
            { 
                combination(); 
            } 
            else 
            { 
                elaboration(); 
            } 
        } 
        else 
        { 
            // Untenured individuals undergo enculturation. 
            enculturation(); 
        } 
    } 
} 
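
The generator selection in step() maps a single uniform random number onto cumulative probability thresholds. The following self-contained Java sketch isolates that mapping so it can be run without Repast. It is an illustration under stated assumptions, not the model code: the class GeneratorSelection, the enum Generator, and the method select are hypothetical names, and the probabilities 0.2 and 0.3 used in main are arbitrary illustrative values rather than parameters from the experiments. The motivation check that precedes the selection in the pseudo code is left out here.

```java
import java.util.Random;

/** Hypothetical, library-free illustration of the generator selection in Individual.step(). */
final class GeneratorSelection {
    enum Generator { CREATION, COMBINATION, ELABORATION }

    /**
     * Maps a uniform draw in [0, 1) to a generator using cumulative thresholds:
     * [0, pCreation) selects CREATION,
     * [pCreation, pCreation + pCombination) selects COMBINATION,
     * and any larger draw selects ELABORATION.
     */
    static Generator select(double draw, double pCreation, double pCombination) {
        if (draw < pCreation) {
            return Generator.CREATION;
        }
        if (draw < pCreation + pCombination) {
            return Generator.COMBINATION;
        }
        return Generator.ELABORATION;
    }

    public static void main(String[] args) {
        // Assumed, illustrative probabilities; the seed is arbitrary and only fixes the run.
        double pCreation = 0.2;
        double pCombination = 0.3;
        Random rng = new Random(42L);
        int[] counts = new int[Generator.values().length];
        for (int i = 0; i < 10_000; i++) {
            counts[select(rng.nextDouble(), pCreation, pCombination).ordinal()]++;
        }
        for (Generator g : Generator.values()) {
            System.out.println(g + ": " + counts[g.ordinal()]);
        }
    }
}
```

Because elaboration is chosen whenever the draw falls above both thresholds, its effective probability in this sketch is whatever remains after creation and combination.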
• Evaluator 
The Evaluator class represents the arbitrator of the scientific community, who 
determines whether or not a kene is appropriate to be retained in the domain. The class 
has three major functions: the constructor; a method named "step", which is invoked 
every time tick; and a method named "evaluate", which determines whether a specific 
kene is fit enough to be accepted. 
public class Evaluator 
{ 
    // Constructor 
    public Evaluator() 
    { 
        // Read parameters and assign them to state variables. 
        m_iAmplification = readParameters("amplification"); 
        m_strDecisionType = readParameters("typeCommunity"); 
    } 

    @ScheduledMethod(start = 1, interval = 1) 
    public void step() 
    { 
        move(); 
        // Emergent selection, corresponding to the utility-oriented community. 
        if (m_strDecisionType.equals("Utility")) 
        { 
            Context context = ContextUtils.getContext(this); 
            // Get the iterator over kenes. 
            Iterator<Kene> iter = context.getObjects(Kene.class).iterator(); 
            // Get the network for the relationship of kenes. 
            Network network = (Network) context.getProjection("RelationOfKenes"); 
            int tenureDuration = 100; 
            int iCurrentTick = getCurrentTickCount(); 
            List<Kene> removedKene = new ArrayList<Kene>(); 
            // Iterate over all the kenes. 
            while (iter.hasNext()) 
            { 
                Kene kene = iter.next(); 
                // Consider only kenes whose tenure duration ends at this tick. 
                if (iCurrentTick == kene.getCreatedTick() + tenureDuration) 
                { 
                    // The kene has never been cited (it has no edges in the network). 
                    if (!network.getEdges(kene).iterator().hasNext()) 
                    { 
                        removedKene.add(kene); 
                    } 
                } 
            } 
            // Remove the kenes that have never been cited. 
            for (Kene kene : removedKene) 
            { 
                context.remove(kene); 
            } 
        } 
    } 

    // Determine whether or not a specific kene is qualified. 
    public boolean evaluate(Kene kene) 
    { 
        boolean result = false; 
        Context context = ContextUtils.getContext(kene); 
        if (context == null) 
        { 
            // The kene is not in a context yet: use its individual fitness only. 
            result = kene.getM_individualFitness() >= 0.5; 
        } 
        else 
        { 
            // Use the sum of individual fitness and normalized relational fitness. 
            result = kene.getM_individualFitness() 
                    + calFitness(context, kene)/g_dMaxFitness >= 0.5; 
        } 
        return result; 
    } 
} 
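
The two rules in the Evaluator pseudo code can be exercised outside Repast with plain data. The following Java sketch is a simplified stand-in, not the model code: the class EvaluatorSketch, the record type KeneRecord, and the methods accept and uncitedAtTenureEnd are hypothetical names, and an integer edge count replaces the Repast network query. A relational fitness of zero reproduces the case in which the kene has no context.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, library-free illustration of the two Evaluator rules; not the thesis code. */
final class EvaluatorSketch {
    static final double THRESHOLD = 0.5;     // acceptance threshold, as in evaluate()
    static final int TENURE_DURATION = 100;  // ticks, as in step()

    /** Minimal stand-in for a kene: its creation tick and its number of network edges. */
    static final class KeneRecord {
        final int createdTick;
        final int edgeCount;

        KeneRecord(int createdTick, int edgeCount) {
            this.createdTick = createdTick;
            this.edgeCount = edgeCount;
        }
    }

    /** Accepts a kene when individual fitness plus normalized relational fitness reaches the threshold. */
    static boolean accept(double individualFitness, double relationalFitness, double maxFitness) {
        return individualFitness + relationalFitness / maxFitness >= THRESHOLD;
    }

    /** Returns the kenes whose tenure ends exactly at currentTick and that have no edges. */
    static List<KeneRecord> uncitedAtTenureEnd(List<KeneRecord> kenes, int currentTick) {
        List<KeneRecord> removed = new ArrayList<>();
        for (KeneRecord kene : kenes) {
            boolean tenureEndsNow = currentTick == kene.createdTick + TENURE_DURATION;
            if (tenureEndsNow && kene.edgeCount == 0) {
                removed.add(kene);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // All numeric values below are arbitrary illustrative inputs.
        System.out.println(accept(0.4, 2.0, 10.0));
        List<KeneRecord> kenes = new ArrayList<>();
        kenes.add(new KeneRecord(0, 0));   // tenure ends at tick 100, no edges
        kenes.add(new KeneRecord(0, 3));   // tenure ends at tick 100, has edges
        kenes.add(new KeneRecord(50, 0));  // tenure ends at tick 150
        System.out.println(uncitedAtTenureEnd(kenes, 100).size());
    }
}
```

As in the pseudo code, a kene is checked only at the single tick on which its tenure ends, so a kene that gains no edges after that tick is not revisited.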