the original or other termites in the colony to further transform their environment. Grasse defined stigmergy as: “the stimulation of the workers by the very performances they have achieved” [2]. Stigmergy has been observed in termites, ants, bees, and wasps across a wide range of activities. In a termite colony, a highly complex nest is not the product of nest-building knowledge held by individual termites. It is the result of the collective behavior of large numbers of individual termites performing extraordinarily simple actions in response to their local environment. Termite workers do not communicate directly to coordinate their nest-building actions. Instead, the environment as modified by each termite’s simple actions serves as the coordinating signal.
The state of the nest structure triggers some behaviors, which then modify the nest structure and trigger new behaviors until the construction is complete.

The OSS developer community is a new kind of online software development group in which participants can read, modify, and redistribute software source code without cost. A long-held belief in the software development field is that large-scale software development is a collaborative activity best conducted in a hierarchically structured organization such as a company. However, many OSS systems do not evolve in the same way as these proprietary systems. The distributed and unplanned OSS development model has been shown to be very effective as a software development paradigm and to outperform proprietary software development schemes. Although people are much more intelligent than social insects, open software development uses essentially the same stigmergic mechanism for collaboration. Participants in OSS projects mainly use online discussion forums or threaded email messages as the central way to observe, participate in, and contribute to public discussions of topics of interest to ongoing project participants [3]. Newly developed source code is uploaded to the community website so that it can be scrutinized by the members of the community.
Any bug, error, or missing functionality will be pointed out, which entices community members to address the shortcoming. The concept of stigmergy provides a theory for explaining how disparate, distributed, ad hoc contributions from individuals could lead to the emergence of the largest collaborative enterprises the world has seen.

III. RELATED WORK

In most online hosting environments, project-related actions are logged, and the log information can later be mined to understand the community structure and interaction patterns. This log data provides enormously detailed information for analysis [4, 5]. Crowston et al. [6] proposed a model for effective work practices in OSS development. The model was based largely on an existing model of group effectiveness initially proposed by Hackman [7] in 1986. Smith et al. [8] presented an agent-based OSS simulation model that includes the software modules’ complexity, the software’s fitness for purpose, the motivation of developers, and the role of users in designing requirements. Research on how OSS developers collaborate has treated the OSS movement as a self-organizing system and a collaborative social network [9]. Social network analysis (SNA) has been used to analyze OSS community structure. Actors and their relationships are represented as nodes and links. An actor can be a user or a developer.
Every node i represents an actor within the network; link(i,j) denotes a social tie between actors i and j. However, as we mentioned in the introduction, direct connections between actors are unusual in the OSS community. Actors exchange information indirectly through the forum or email list. Most of the time, an actor does not even know who the recipient of his or her message will be before sending it. Hence, SNA may not be able to explain the forum-based collaboration that characterizes the OSS community.

Elliott [10] argued that collaboration in small groups (roughly 2-25) relies upon social negotiation to evolve and guide its process and creative output. Collaboration in large groups (roughly more than 25) is enabled by stigmergy. Heylighen [11] proposed distinguishing between direct and indirect stigmergy in the OSS community. In OSS development, unfinished jobs serve as direct stigmergy, stimulating other actors to participate in finishing them. Indirect stigmergy can be recognized in forums where bugs or feature requests are posted. These forums are regularly consulted by the developers, which draws their attention to tasks that seem worth performing.
However, neither Elliott’s nor Heylighen’s publications discuss numerical studies or a mathematical model of stigmergy. In this research, we propose a stigmergy-based collaboration model for OSS and use it to produce a simulation that accurately represents the collaboration in an OSS community. The simulation outputs are compared with empirical data retrieved from actual OSS project log information.

IV. AGENT BASED SIMULATION MODEL

Our approach to reproducing the complex environment of the OSS software development community was to develop an agent-based simulation framework using the stigmergy approach. We assume that OSS members already have enough motivation to join the community and make contributions. The model represents how the OSS community collaborates and how each individual developer chooses which forum to join. Our hypothesis is that the collaborations of individual OSS developers and users are stigmergic. The forum posts and email list messages serve as the digital stigmergy. Peer-to-peer communications between individual OSS members do not occur very often.

To simplify the simulation, we assume there are two kinds of agents in the simulation: the developer and the user. The
developer agents voluntarily contribute their effort and time to answering questions from users, developing code, and fixing bugs. The agents do not interact with each other directly. Instead, they exchange information through the forums. The user agents post messages on the forum. A user agent can become a developer agent if it chooses to.

There are two kinds of forums: the public forum and the developer forum. The public forum can be accessed, and messages can be posted to it, by any agent interested in the software project. Most of the time, it serves as the indirect message exchange board between users and developers. Users can post questions about how to use the software, bugs they found while using it, and functions they wish to see included in it. Each problem is represented by one forum thread. Other users and the developers occasionally go through the forums, answer the questions, and get first-hand information about bugs and wish-list functionality.

In this simulation, developers use the Ant Colony Algorithm to choose which forum thread problem to help solve. In this algorithm, each forum thread serves as one potential digital trail toward a different software development direction, and the messages posted in the thread represent the digital pheromones laid down on the trail.
Every time a user or developer posts a new message in this forum thread, a new pheromone is deposited on the trail. The pheromone content of a forum thread is updated and decayed using the following models.

Pheromone update: when a message is posted in a forum thread, the pheromone for this thread is incremented by a constant, $\gamma$. The nominal value of $\gamma$ is one. Equation (1) describes the pheromone update procedure when a message is posted by actor a in thread d at time t.

$$P_d^{t+1} = P_d^{t} + \gamma \tag{1}$$

Pheromone decay: to account for pheromone decay, each thread’s pheromone value is periodically multiplied by the decay factor, $e^{-\tau}$. The decay rate is $\tau \ge 0$. A high decay rate will quickly reduce the amount of remaining pheromone, while a low decay rate will degrade the pheromone value slowly. The nominal pheromone decay interval (or decay period) is one day. Equation (2) describes pheromone decay.

$$P_d^{t+1} = P_d^{t} \cdot e^{-\tau} \tag{2}$$

If no message has been posted in a thread for a long time, the pheromone for this thread decays to a near-zero value. The thread is then removed from the set of threads that a developer may select.

ACO algorithm: The developers and users follow the ACO (Ant Colony Optimization) algorithm [12] from swarm intelligence to choose the thread in which they join and post messages.
According to the pheromone theory, they will most likely join and post messages in the thread that has the highest pheromone content.

Thread selection: Actors randomly choose a thread based on the amount of pheromone present on each forum thread. Equation (3) describes the probability $p_d$ of thread d being chosen.

$$p_d = \frac{(P_d^{t} + K)^{F}}{\sum_{i=1}^{N} (P_i^{t} + K)^{F}} \tag{3}$$

N is the total number of forum threads. The constants F and K are used to tune the forum-choosing behavior of an actor. The value of K determines the sensitivity of the probability calculations to small amounts of pheromone. If K is large, then large amounts of pheromone will have to be present before an appreciable effect is seen in the message posting probability. The nominal value of K is zero. Similarly, F may be used to modulate the differences between pheromone amounts. For example, F > 1 will accentuate differences between threads, while F < 1 will deemphasize them. F = 1 yields a simple normalization. The nominal value of F is two.

Another forum in the simulation is the developer forum. It serves as the internal forum and is used by developers to post their ongoing work. It represents the email list and SVN repository in an actual OSS project. Each developer’s contribution is stimulated by other developers’ ongoing work posted in the developer forum.
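The public-forum pheromone rules in Equations (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used in this work: the function names, the dictionary representation of threads, the value of `TAU`, the pruning threshold, and the uniform fallback when no pheromone is present are all assumptions made for the example. Only the nominal values γ = 1, K = 0, and F = 2 come from the text.

```python
import math
import random

GAMMA = 1.0  # pheromone added per posted message (nominal value from the text)
TAU = 0.1    # decay rate per decay interval (illustrative assumption)
F = 2.0      # exponent in Equation (3) (nominal value from the text)
K = 0.0      # offset in Equation (3) (nominal value from the text)


def deposit(pheromone, thread, gamma=GAMMA):
    """Equation (1): a new post adds gamma to the thread's pheromone."""
    pheromone[thread] = pheromone.get(thread, 0.0) + gamma


def decay(pheromone, tau=TAU):
    """Equation (2): multiply every thread's pheromone by exp(-tau)."""
    factor = math.exp(-tau)
    for thread in pheromone:
        pheromone[thread] *= factor


def prune(pheromone, threshold=1e-3):
    """Drop threads whose pheromone is near zero (threshold is an assumption)."""
    for thread in [d for d, p in pheromone.items() if p < threshold]:
        del pheromone[thread]


def selection_probabilities(pheromone, f=F, k=K):
    """Equation (3): normalized (P_d + K)^F weights over all threads."""
    if not pheromone:
        return {}
    weights = {d: (p + k) ** f for d, p in pheromone.items()}
    total = sum(weights.values())
    if total == 0.0:
        # Assumption: fall back to a uniform choice when no pheromone exists.
        return {d: 1.0 / len(weights) for d in weights}
    return {d: w / total for d, w in weights.items()}


def choose_thread(pheromone, rng=random, f=F, k=K):
    """Randomly pick one thread according to the Equation (3) probabilities."""
    probs = selection_probabilities(pheromone, f, k)
    if not probs:
        raise ValueError("no threads available")
    threads = list(probs)
    return rng.choices(threads, weights=[probs[d] for d in threads], k=1)[0]
```

For example, a thread with three posts and a thread with one post receive selection probabilities of 0.9 and 0.1 under the nominal F = 2 and K = 0, because the weights are 3² = 9 and 1² = 1. With K = 0, a decay step scales all threads by the same factor and therefore leaves these probabilities unchanged until new messages are posted.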
The probability $c_i$ for developer i to continually contribute to a software element development is modeled as the termite mud drop probability in the Grasse stigmergy model and is given by:

$$c_i = n_i \left( \frac{f(i)}{k + f(i)} \right)^{2} \tag{4}$$

$$f(i) = \max\left( 0.0,\ \frac{1}{\sigma^{2}} \sum_{j \in L} \left( 1 - \frac{\delta(i,j)}{\alpha} \right) \right) \tag{5}$$

Here, $\delta(i,j) \in [0,1]$ is the dissimilarity value between the post contributed by developer i and the post contributed by developer j in the email list and SVN repository, $n_i$ is the message length, $\alpha \in [0,1]$ is a data-dependent scaling parameter, and $\sigma^{2}$ is the total number of posted messages in the developer forum during a predefined time period $L \in [1,15]$.
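The contribution probability for the developer forum can be illustrated with a short Python sketch. It assumes the form c_i = n_i · (f(i) / (k + f(i)))² with f(i) = max(0, (1/σ²) Σ_j (1 − δ(i,j)/α)), where δ(i,j) is the dissimilarity between posts, α is the scaling parameter, and σ² is the number of messages in the time window. The function names, the choice to default σ² to the number of dissimilarity values supplied, the example values, and the treatment of n_i as a weight already normalized to [0, 1] are assumptions for illustration only.

```python
def local_similarity(deltas, alpha, sigma_sq):
    """f(i): average of (1 - delta/alpha) over the window, clipped at zero.

    deltas   -- dissimilarity values delta(i, j) in [0, 1], one per post j
    alpha    -- data-dependent scaling parameter, 0 < alpha <= 1
    sigma_sq -- total number of posts in the window
    """
    if alpha <= 0 or sigma_sq <= 0:
        raise ValueError("alpha and sigma_sq must be positive")
    total = sum(1.0 - d / alpha for d in deltas)
    return max(0.0, total / sigma_sq)


def contribution_probability(n_i, deltas, alpha, k, sigma_sq=None):
    """c_i = n_i * (f(i) / (k + f(i)))^2.

    n_i -- message-length weight (assumed here to be normalized to [0, 1])
    k   -- threshold constant (value chosen by the caller; an assumption)
    """
    if sigma_sq is None:
        # Assumption: every post in the window contributes one delta value.
        sigma_sq = len(deltas)
    f_i = local_similarity(deltas, alpha, sigma_sq)
    if k + f_i == 0:
        return 0.0
    return n_i * (f_i / (k + f_i)) ** 2


# Posts similar to developer i's own work keep the probability high,
# while strongly dissimilar posts drive f(i), and hence c_i, to zero.
similar = contribution_probability(1.0, [0.1, 0.2, 0.3], alpha=0.5, k=0.1)
dissimilar = contribution_probability(1.0, [0.9, 1.0], alpha=0.5, k=0.1)
```

Squaring the ratio makes the response sharply nonlinear: when f(i) is well below k the probability stays close to zero, and it approaches n_i only once the surrounding posts are clearly similar to the developer's own contribution.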