Design and Implementation of a 3D Action Puzzle Game
Entwurf und Implementierung eines 3D Action-Puzzle Spiels

Falco Wockenfuß

Bachelor's thesis accompanying "Realtime Techniques for Computer Games"
Supervising tutor: Prof. Dr. Maic Masuch
Trier, 03/06/2009
Acknowledgements

This project would not have been possible without the support and collaboration of the whole Assembler Bay design team. Jörg Meyer and Pascal Leyrat provided the impressive visuals and the catchy soundtrack, and made the idea of Assembler Bay become reality. My highest tribute goes to them and their long-lasting motivation to create this game.

I have to thank my family and friends for their constant support and motivating words. Although I had hardly any spare time for them in the last six months, they kept cheering me up and assisting me wherever they could.

I am also grateful to Prof. Dr. Maic Masuch, who helped to make this paper possible in its current form. Without his courtesy, this paper would be missing some important parts. On the side of communications design, Prof. Dr. Franz Kluge was the supporting pillar who believed in the success of this interdisciplinary project. Last but not least, I have to thank the whole department of computer science and foremost its dean, Prof. Dr. Andres Künkler. Their backing helped to bring this project to the point it is at now. I hope the support of all these people will last until the game is finally complete.
Outline

This paper depicts the process of developing a prototype for a 3D computer game and the underlying game engine. It includes in-depth information about the basics of game design as well as details about the implementation of various game engine features, including a physics engine, modern rendering techniques and advanced shadow mapping algorithms. The elaboration of a game concept and the resulting game design paper is detailed for the game Assembler Bay, which was developed in the scope of this project. Additionally, modern games are analyzed for the requirements and expectations placed on their technologies. The most important of these technologies are included in the game Assembler Bay.
Zusammenfassung (German)

Diese Arbeit befasst sich mit der Entwicklung eines Prototyps für ein digitales 3D-Spiel und der zugrunde liegenden Spielengine. Sie enthält sowohl detaillierte Informationen über die Grundlagen des Spieldesigns als auch Einzelheiten über die Implementierung verschiedener Komponenten der Spielengine. Diese Komponenten umfassen unter anderem eine Physikengine, moderne Rendering-Techniken und moderne Verfahren zur Schattenberechnung mit Hilfe von Texturen (Shadow Mapping). Die Ausarbeitung eines Spielkonzepts und der daraus folgenden Spieldesign-Vorlage sind anhand des Spiels Assembler Bay beschrieben, welches im Rahmen dieses Projekts entwickelt wurde. Zusätzlich werden verschiedene moderne Computerspiele auf Anforderungen und Erwartungen an die verwendeten Technologien analysiert. Die wichtigsten dieser Technologien sind im Spiel Assembler Bay enthalten.
Table of Contents

1 Introduction
   1.1 Motivation – We Want to Make a Game!
   1.2 Task – State of the Art Game Design
   1.3 Planning a Game – From Concept to Implementation
   1.4 Make it Different – Characteristics of a Unique Game
2 Engine Design
   2.1 Structure – Duties of a Game Engine
      2.1.1 Basic Structure of a 3D Game Engine
      2.1.2 Why a Physics Engine is Required in Modern Games
      2.1.3 Artificial Intelligence in Modern Games
      2.1.4 Input Handling for 3D Games
      2.1.5 Content Management and Content Processing
      2.1.6 Game Screen Management and Scene Administration
      2.1.7 Hardware Compatibility
   2.2 Graphics – Rendering Techniques and Lighting
      2.2.1 Transform and Lighting – Rendering Techniques
      2.2.2 Forward Rendering – A Single Pass and Many Possibilities
      2.2.3 Deferred Shading – Multiple Passes for Multiple Lights
      2.2.4 Conclusion – Deferred Rendering Does the Job
   2.3 Physics – Simulation or Feint
      2.3.1 Collision Detection and Collision Response
      2.3.2 Collisions Under Extreme Conditions
3 Draft and Expectations – Assembler Bay
   3.1 Story and Setting – A Game Concept is Born
   3.2 Presentation – The Look of a Game
      3.2.1 Visual Features for Assembler Bay
      3.2.2 Dynamic Soft Shadows – High Realism Through Shadows?
      3.2.3 Stencil Shadow Volumes – Pixel Perfect Hard Shadows
      3.2.4 Shadow Mapping – Simple, Compatible and Somewhat Blurry
      3.2.5 Ambient Occlusion – Realtime or Static?
      3.2.6 Additional Effects for a Unique Look
   3.3 Game Mechanics – Concepts for a 3D Puzzle Game
4 Implementation
   4.1 Game Engine – The Core for Assembler Bay
      4.1.1 Engine Structure – Organization is the Key
      4.1.2 Smooth Skin Animation – Now It Can Walk!
   4.2 Physics Engine – Collisions and Reactions
      4.2.1 Binary Space Partitioning – Divide and Conquer
      4.2.2 Physics Calculations – Let It Bounce!
      4.2.3 Animated Objects – Neither Moving Nor Static
   4.3 Deferred Shading – The Visual Part of QEE
      4.3.1 Surface Formats – Compatibility and Storage
      4.3.2 Geometry-Buffer Attributes – Precision and Performance
      4.3.3 The Final G-Buffer Layout – Performance Takes the Lead
      4.3.4 Implementation of Deferred Lights
      4.3.5 Stencil Buffer Light Volumes – Fewer Pixels, More Performance
   4.4 Dynamic Shadows – Light Comes With Darkness
      4.4.1 Shadow Map Aliasing – One Problem, Many Answers
      4.4.2 Variance Shadow Mapping – Smooth Shadows in Every Resolution
5 Achievement and Conclusion
   5.1 Achievements and Tasks – Comparing Idea and Reality
      5.1.1 Primary Visual Features – Does it Look Good?
      5.1.2 Optional Graphics Effects – Does it Look Better?
      5.1.3 Open Source Game Engine – Is it Really Quick and Easy?
   5.2 Teamwork and Timetables – The Long Way of Game Design
   5.3 Comparison and Conclusion – Is it Really State of the Art?
      5.3.1 Dynamic Soft Shadows – Who is John Carmack?
      5.3.2 Colors and Atmosphere – It is Vivid, but is it Unique?
      5.3.3 Conclusion – Not Perfect but Worth the Effort
List of Literature
Table of Figures

Figure 1.1: Bloom Effect and Ambiance in Prince of Persia: Warrior Within
Figure 1.2: Unique Look of World of Warcraft
Figure 2.1: Part of a tile-based level in Dune II
Figure 2.2: Seamless world of Total Commander
Figure 2.3: Visualization of 4 Preliminary Buffers used for Deferred Shading
Figure 2.4: Screenshot of Killzone 2 showing capabilities of Deferred Shading
Figure 3.1: Vivid Colors in Mirror's Edge
Figure 3.2: Stencil Shadow Volumes in Doom 3, producing hard edges
Figure 3.3: Shadow Map Aliasing (1024x1024 shadow maps), PCF only
Figure 3.4: Comparison of Overexposure and Normal Lighting in Assembler Bay
Figure 3.5: Pre-Rendered Depth of Field and selective Bloom for Assembler Bay
Figure 3.6: Excerpt of a draft for the Assembler Bay manual
Figure 4.1: Class Diagram Showing the Basic Structure of the Game Engine
Figure 4.2: Various Keyframes from a Jump Animation of the Main Character
Figure 4.3: Quad-tree is redundant and very unbalanced when objects are grouped
Figure 4.4: Per-Polygon Collision Detection Using Bounding Box Queries
Figure 4.5: Relevant Surface Formats for DirectX Hardware
Figure 4.6: Geometry-Buffer Configuration for QEE
Figure 4.7: Using Light Volumes and the Stencil Buffer to Exclude Pixels
Figure 4.8: Shadow Map Alignment with Trapezoidal Shadow Maps
Figure 4.9: Fake Penumbra in Variance Shadow Mapping and Solutions
Figure 5.1: Previews of additional Level Designs for future development
Figure 5.2: Comparison of Shadow Mapping Filter Methods
Figure 5.3: Screenshot of Assembler Bay with active MSAA
Figure 5.4: Screenshot of a Physics Simulation in Assembler Bay
Figure 5.5: Close-up Shadow Artifacts in Far Cry 2
Figure 5.6: Magnified Shadow Details in Assembler Bay
Figure 5.7: Assembler Bay Logo
1 Introduction
The first chapter provides a basic look at the distinctive features of computer games compared to other media. The motivation of the author and the goals of this project are detailed in the first two sections. The last section shows concepts for the creation of a unique game, which are used in the design process of this project.
1.1 Motivation – We Want to Make a Game!

Digital entertainment software (games) has been a growing phenomenon ever since the first personal computers became available. But while the first games were quite simple programs with just a few lines of code, developed by individual computer scientists or small groups of them, current console and computer games play in a whole different league. State-of-the-art titles (2008/2009) for the XBOX 360, PS3 or modern gaming PCs require several years of development, and teams with 30 programmers and at least as many designers are not uncommon 1. But the wish to turn your very own ideas into a real game still remains, and high quality standards only raise the expectations.

Modern computer games open a field of creativity and freedom any other medium can hardly achieve. Current graphics hardware allows a range of effects and highly detailed graphics, which can simulate nearly any environment imaginable. Nowadays, real-time graphics produce results that were not achievable even in pre-rendered movies some years ago. Dynamic environments have the ability to immerse the user in a fantastic universe with its own rules. Realistic graphics are one of the main reasons the user accepts the fantastic reality presented to him. The second reason is the range of possibilities of interactive storytelling, which lift the user from his audience position and take him into the middle of the action. Virtual characters convince not only with their looks, but with their actions and reactions. Every action the user takes or does not take has a direct impact on the virtual reality surrounding him, and the consequences his inactivity entails can persuade him to interact with the game.

1 The development costs for a modern game can easily surpass 3 million dollars [http://news.bbc.co.uk/1/hi/technology/4442346.stm]
The motivation for this project was the opportunity to work in a team and produce a result not only for the records, but a game we could enjoy ourselves. We had a concept and many ideas at hand. We did not want to modify an existing game, or reuse any graphics, models or music, but intended to produce an original outcome. The incentive for this project is to let an idea come true as a game.
1.2 Task – State of the Art Game Design

The first and most important goal of this project was the design of a 3D action game called "Assembler Bay". The concept for Assembler Bay is depicted in detail in chapter 3. The game implemented in this project should be able to compare to current top titles for the XBOX 360, PlayStation 3 or the PC. Since the team for the design and implementation of this project consisted of only three members and the available timespan was as short as five months from concept to implementation, the team members had to adapt their specifications to their means. The task is the design of a game reaching the graphics quality of current top titles, utilizing the newest and most promising rendering techniques available. The final implementation shall include an original soundtrack composed for this game, as well as a unique look and story design.
However, the implementation of such high-quality game assets would be very time-intensive, and it would be impossible to meet the quantitative requirements of modern games. A modern 3D action game is expected to deliver several hours of gameplay, including a variety of levels and quests. Although it might have been possible for the team to produce this amount of assets, their quality would be severely worse compared to the design of a few high-quality assets in the same time. The objective for this project was thus specified to cover at least the following features:
The most important feature is one fully playable level, representing the stage for a tutorial and including all major gameplay elements of the final game. In addition, at least two alternative stages are to be designed as a foundation for future levels in the same environment. To preview possible extensions to this game, the team will design concepts for two other realms with a design completely different from the three demo stages. At least one fully playable puzzle for the demo stage has to be designed and implemented, as well as prototypes for at least two other types of puzzles. In addition, the team will sketch at least two more puzzle drafts to give an outlook on the possibilities of game elements in the final game.
These are the objectives for this project with respect to its deadline as a final paper for the Bachelor degree of the author. Furthermore, the whole team is determined to lead this project to a final game. After graduation, the team will go on to implement the features canceled in the concept phase due to time limitations. The game will also be tested and, if possible, released on an online gaming platform.
1.3 Planning a Game – From Concept to Implementation

The next step after outlining the objective of this project is the design of a timing schedule for the whole venture, from concept to implementation. But prior to setting up this schedule, a review of the preconditions and available resources is inevitable. The first step was the decision for a framework, followed by the main design decision: whether an already available game engine would be used and extended, or whether a complete game engine was to be implemented from scratch.
A game meeting the conditions set in the last section can be implemented in various programming languages and frameworks. The two main branches of computer languages suited for game design are object-oriented languages with fully managed code and procedural programming languages. PPLs 2 usually deliver higher performance than OOPLs 3, resulting from less type casting and more direct access to hardware resources. In contrast, OOPLs spend many resources on resource management and type-safe programming. But this usually results in safer code, as the exception handling of an OOPL will usually handle all occurring errors. Another benefit of OOPLs is the modularity and clear structure resulting from a clean object hierarchy. Since object-oriented design is the preferred method wherever the available performance allows for managed code, the author prefers this approach.
The performance bottleneck of modern games is usually the speed and capabilities of the graphics adapter or the size of the available main memory. This allows a wider use of object-oriented languages for game engines. Another factor in the decision for a framework was the motivation to make the game available for a console, preferably the Microsoft XBOX 360. The XNA framework by Microsoft is based on C# and provides the ability to design games which run on both PC and XBOX 360. In addition to these benefits, the author already had programming experience with XNA, which is an excellent tool for object-oriented game design using Microsoft DirectX 9. These facts resulted in the decision to use XNA as the underlying framework for Assembler Bay.

2 PPL stands for Procedural Programming Language
3 OOPL stands for Object-Oriented Programming Language
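The thesis contains no code listings, but the structure XNA imposes on a game is worth illustrating: a game class with overridable hooks for initialization, per-frame logic and per-frame rendering, driven by a central loop. The following is a minimal language-agnostic sketch of that loop pattern in Python; all names are illustrative and do not reproduce XNA's actual API.

```python
import time

class Game:
    """Skeleton of the update/draw loop pattern that frameworks
    like XNA provide via overridable methods (names illustrative)."""

    def __init__(self):
        self.running = True

    def initialize(self):
        pass  # load content, set up the scene

    def update(self, dt):
        pass  # advance game logic and physics by dt seconds

    def draw(self):
        pass  # render the current frame

    def run(self, max_frames=None):
        self.initialize()
        frames = 0
        last = time.perf_counter()
        while self.running:
            now = time.perf_counter()
            dt, last = now - last, now
            self.update(dt)   # logic first ...
            self.draw()       # ... then rendering, once per frame
            frames += 1
            if max_frames is not None and frames >= max_frames:
                self.running = False
        return frames
```

A concrete game subclasses this skeleton and overrides only the hooks it needs, which is the modular structure the engine design in chapter 2 builds on.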
A potential game engine for Assembler Bay should satisfy the following conditions:

• The game engine has to be open source and free of any licensing obligations, because the team wants to hold all rights to the final product resulting from this project.
• The second requirement is high flexibility and modularity, so the engine can be adapted to the special requirements of this game.
• A potential game engine has to require an adequate adaptation period in comparison to the features it provides.
• The required features of a 3D game engine are detailed in section 2.1.
After thorough research on XNA and the available 3D game engines, the author concluded that no available engine fulfills the depicted requirements. Most open source game engines specialize in a single component, such as a highly sophisticated physics engine, but lack various other essential features. The time required to become familiar with an engine and rewrite the parts not suited to the needs of this project would most likely be comparable to the time needed to design a new game engine from scratch. This leads to the second main objective of this project: to design a reusable, flexible and modular 3D game engine with XNA.
After the decisions for a framework and for the implementation of a new game engine had been made, a timing schedule was outlined. The corner points of this schedule were a concept phase, followed by a segment to set up the design for all engine components, together calculated for the first 15% of the schedule. The main phase of the project was the implementation of the game engine, planned to take about 60% of the whole project time. After implementation, the specialization for the game was assigned 15% of the overall time, leaving the last 10% for testing and evaluation. Although the evaluation time was planned very short, evaluation of the game engine was included in the implementation part, and the last step covered only the evaluation of the features original to the game.
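For the five-month timespan stated in section 1.2, these percentage shares translate into rough phase durations. The sketch below assumes roughly 22 working weeks, an illustrative figure chosen only for the arithmetic:

```python
def phase_durations(total_weeks, shares):
    """Convert percentage shares of a schedule into week counts."""
    return {name: round(total_weeks * pct / 100, 1) for name, pct in shares}

schedule = phase_durations(22, [
    ("concept and engine design", 15),
    ("engine implementation", 60),
    ("game specialization", 15),
    ("testing and evaluation", 10),
])
# 60% of 22 weeks is 13.2 weeks for the implementation phase
```

Under this assumption, the concept and design phase gets about three weeks, which explains why the evaluation at the end had to be kept deliberately short.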
1.4 Make it Different – Characteristics of a Unique Game

A total of about 200 4 commercial computer games was published in 2008, most of them in the traditional genres of sports, shooters, RPGs 5 or strategy. Although many games with refreshing and unique gameplay mechanics were developed, the best-selling and most discussed games were traditional games 6 using the newest technologies. This shows that in most cases a visually stunning presentation is much more important for the acceptance of a game than original gameplay elements. The rare occasions where a visually dated game with an original concept becomes popular are games based on freeware titles which became popular beforehand. An example of this phenomenon is World of Goo. But the downside of many praised high-end graphics games is the lack of unique gameplay. Many modern games come without any new ideas, still making sales with their visual presentation. This leads to more and more sequels of popular games and a decreasing number of original games with a high budget 7. Hence the goal for this project is a game which is both visually stunning and original in its gameplay.
The difficult part in designing an original game is to make it unique enough to stand out, but still keep enough standards that the user can easily adapt to the new game mechanics. Although unique gameplay is the main attribute of an original game, the first step towards it is an original look. This can include outstanding character or level design, but mostly means a unique atmosphere. A game has a unique look if the user can immediately distinguish any of its scenes from other games. This can be achieved by special effects, such as a certain color tinting of the whole scene as in Prince of Persia: The Warrior Within (see Figure 1.1 8).

That whole game is covered by a sand-colored bloom effect, which makes the series stand out visually in contrast to other games. Another method to achieve a unique look is an original style of characters and levels, as used in World of Warcraft. That game has a very colorful, comic-style look, but without the use of cel shading or similar effects. The characters are designed to be very striking and burlesque. Although the whole HUD 9 has been copied by many other online RPGs, screenshots of World of Warcraft are still distinctive in their looks (see Figure 1.2 10).
4 Source: [http://www.gamestar.de/]
5 Role-Playing Game
6 Traditional refers to games using frequently employed and long-established game mechanics
7 Source: [http://www.nytimes.com/2005/08/08/technology/08game.html]
8 Cutout screenshot from [http://uk.media.pc.ign.com/media/672/672283/imgs_5.html]
9 Head-Up Display – an acronym adopted from the air force, referring to the information projected directly into the pilot's field of vision. In games it refers to any information displayed as an overlay on the screen, independent of the direction the camera points.
10 Cutout screenshot from [http://www.markeedragon.com/]
Introduction ::: Make it Different – Characteristics of a Unique Game
Figure 1.1: Bloom Effect and Ambiance in Prince of Persia: Warrior Within
Figure 1.2: Unique Look of World of Warcraft
That said, the most significant feature of an original game is an original gameplay concept. Gameplay is a term that merges the game mechanics, controls and main story elements into a single concept of how it feels to play the game. Presenting the user with unprecedented gameplay is the goal of a unique game. But confronting the user with overly complex game mechanics, or game rules he does not understand, has to be avoided. One way to achieve this is to combine a familiar environment using common controls with new game mechanics. A unique game that is easy to understand and provides easy access to original game mechanics – this is what Assembler Bay should ultimately embody.
2 Engine Design
The first section of this chapter will provide a short outline of the history and benefits of game engines and introduce the problems a game engine addresses. This overview of features is followed by a concept for the structure of a puzzle game engine.
The second section will depict the features the graphics component of a game engine has to offer and subsequently compare the two major approaches to rendering, with Deferred Rendering being chosen as the preferred technology for this project.
The third section will explain the tasks a physics engine has to address. These tasks are divided into Collision Detection and Collision Response. Various methods to deal with both are mentioned, and the different requirements of realistic simulations and action games are explained.
2.1 Structure – Duties of a Game Engine
The first digital games in history did not need any game engine. In fact, the very first games did not even need any software: PONG 11, for example, was first implemented solely in hardware. But as soon as the first Amiga personal computers arrived, early gaming code was written in machine language (ref. [BET03]). Following these assembler-coded games, the era of computer games really took off with software development in higher-level languages. But even then games were written line by line, and the only kind of reusable code were I/O loops for keyboard or graphics. The first famous and highly reused game engine was probably the SCUMM 12 interpreter written by LucasArts in 1987, a virtual machine ported to various available platforms. Instead of writing games line by line from scratch for each platform, new point-and-click adventures could be written in a unified scripting language, which was interpreted by the SCUMM interpreter, which in turn managed input, sound and graphics on the target platform. But apart from this, most games were still one hard-coded program without any modular components, mostly due to hardware limitations.
11 PONG is a video game published in 1972 by Atari [ http://www.atarimuseum.com/ ]
12 Script Creation Utility for Maniac Mansion, developed by LucasArts [ http://www.lucasarts.com/ ]
2.1.1 Basic Structure of a 3D Game Engine
The core of every modern 3D computer game is the game engine. The basic idea in creating a game engine is separating the reusable components from the parts unique to one particular game. Reusable components of a first person shooter would be methods calculating physics reactions or processing input from keyboard and mouse. Unique components are pictures, like a splash screen, or the level data, textures and sounds. By convention, all materials in a game which are not implemented in the game code itself, but rather read from external files or imported at runtime, are called assets (ref. [BET03]). The most common game assets are textures, geometry data, sounds and configuration files. The remaining unique parts of a game are methods or classes which are modified for this particular game, or completely new functions which are not common in similar games. There is a range of benefits in separating these three parts, including reusability of the game engine for similar games, or of whole assets for sequels or add-ons. Modern game engines are so complex and highly sophisticated that they have created their own market. Some game companies earn more revenue selling their engines than selling their games. Even parts of a game engine can be profitable enough to sustain a whole company, as HAVOK 13 or SpeedTree 14 demonstrate.
A game engine should provide the following features (ref. [MIC06]):
• Graphics 15
• Physics 16
• Artificial intelligence
• Input handling
• Content import
• Content management
• Game screen management
• Scene administration
• Customization options
• Hardware compatibility
13 HAVOK is an Irish software company whose main product is a physics engine used in various games [ http://www.havok.com/ ]
14 SpeedTree is a software package produced by IDV with the sole purpose of generating realistic 3D trees in games [ http://www.speedtree.com/ ]
15 The graphics component of the game engine will be explained in detail in section 2.2
16 The physics component will be explained in detail in section 2.3
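The separation into reusable components listed above can be illustrated with a minimal sketch. The following Python fragment is purely illustrative; all class and method names are hypothetical and do not reflect the actual structure of the Assembler Bay engine:

```python
# Minimal sketch of a component-based engine core: the engine owns the
# reusable subsystems, while game-specific code and assets live outside.

class Component:
    """Base class for engine subsystems (graphics, physics, input...)."""
    def update(self, dt):
        pass

class PhysicsComponent(Component):
    def __init__(self):
        self.steps = 0
    def update(self, dt):
        self.steps += 1  # here: advance the physics simulation by dt

class GraphicsComponent(Component):
    def __init__(self):
        self.frames = 0
    def update(self, dt):
        self.frames += 1  # here: render the current scene

class Engine:
    """Ticks every registered component once per frame."""
    def __init__(self, components):
        self.components = components
    def run(self, frames, dt=1.0 / 60.0):
        for _ in range(frames):
            for component in self.components:
                component.update(dt)

physics, graphics = PhysicsComponent(), GraphicsComponent()
Engine([physics, graphics]).run(frames=3)
print(physics.steps, graphics.frames)  # 3 3
```

The point of the sketch is that a new game reuses `Engine` and its components unchanged and only supplies its own assets and game-specific classes.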
2.1.2 Why a Physics Engine is Required in Modern Games
A professional game engine has to provide modular and flexible solutions for each of the addressed problems. While graphics quality is the most remarkable feature of a 3D engine, it is really just one aspect of many. But the first impression is always a visual one, and graphics quality is still the first and most discussed topic in 3D gaming. Next generation consoles 17, modern graphics adapters and multi-core CPUs 18 have paved the way for visually stunning games, which set the standards high for today's game developers. While top titles in 2004 19 were the first to present a convincing 3D physics engine, and some even made it a core element of their game, current action titles are expected to have a strong physics engine. Even if the simulation of realistic physics only plays a marginal role in gameplay, the user will easily be dissatisfied with the game experience if objects in the game do not behave physically correctly (ref. [EBE04]).
An example: a visually stunning game set in a tropical forest, where the light shimmers through the leaves in dark greens. The sound effects of birds and distant howls add to the atmosphere and have the user completely immersed in this virtual jungle. The user sights a clearing just a few steps ahead and sets into motion. But he keeps walking on the spot, not moving an inch, because a little twig one inch above the ground blocks his way. He tries to jump, but just cannot pass the little twig, because this way was simply not designed to be passed. The user will immediately be torn out of the fantastic story and complain about the stupid game not letting him do what should be physically possible. Or maybe he even carries some kind of weapon which cannot harm the twig in any way, because the game was designed to only attack enemies with these weapons.
Creating a storyline and game experience that really captivates the user is difficult to achieve, but even a little glitch in the physics calculations can easily destroy this atmosphere.
17 The consoles Microsoft XBOX 360, Nintendo Wii and Sony PlayStation 3 are referred to as "Next generation consoles" or "NextGen consoles"
18 CPU, meaning Central Processing Unit, will in the scope of this dissertation always refer to the main processing unit in a personal computer
19 Top titles 2004 refers to top-rated games released for the PC in 2004, based on ratings of international gaming magazines [ http://www.gamerankings.com/ ], including Half-Life 2, Unreal Tournament 2004, Rome: Total War, Doom 3, Far Cry, Warhammer 40,000: Dawn of War, Counter-Strike: Condition Zero
2.1.3 Artificial Intelligence in Modern Games
Of virtually equal importance in modern games is a good AI 20, because just as bad as a barrel hanging in midair and making unexpected noises is an enemy running in circles instead of attacking, or a strategy game where a unit just cannot find a way to the location it should move to. The AI of a 3D game has many tasks to coordinate and decisions to make. The most sophisticated AI nowadays can be found in strategy games, where the user can choose to play against the computer and wants to be challenged. Although computers have finally caught up to the best international chess players, and even home computers can beat most chess amateurs with very short thinking times, the requirements of an AI in a real time strategy game lie on a whole different level 21.
One basic difference is the available calculation time and power 22 in realtime applications. While a turn-based chess game poses very little load on the running system if two players play against each other, a modern real time strategy game might already be short on calculation power even if no strategy AI is active at all. The hardware is already busy calculating physics, graphics effects, path finding and game mechanics. This leaves only a small amount of calculation time for a strategic AI facing a number of possible turns far outweighing any board game on the planet. This is the second major difference between classic board games and digital strategy games – the number of rules and possibilities. A board game always has a very limited number of positions where a unit can stand. There are (8x8) 64 fields in classic chess, and even the Japanese Go 23, as one of the biggest board games, has only (19x19) 361 possible positions for a token. In contrast, even one of the oldest digital strategy games, Dune II 24, already had a game field about ten times as big and a variety of different units (see Figure 2.1 25). And many modern strategy games play on seemingly undivided game fields, where a unit's position is not limited to a certain grid of positions, but stored in high-precision floating point variables. Total Commander is most likely the real time strategy game with the biggest seamless levels and most units on the battlefield, and a good example of how the complexity of real time strategy games has evolved (see Figure 2.2).
20 Artificial Intelligence, meaning the ability of computer games to perform seemingly intelligent tasks (like path finding or reacting like a human)
21 The chess computer Deep Blue beats most chess players [ http://www.research.ibm.com/deepblue ]
22 Calculation power refers to the CPU load a certain task can occupy, equivalent to the number of CPU cycles used by this task
23 Go is a Japanese strategy game played on a field with 19 horizontal and 19 vertical lines, making up 361 crossings where tokens can be placed
24 Dune II was released in 1992 for DOS, developed by Westwood Studios. It was one of the first games comparable to modern strategy games, utilizing economy, tactics and strategic warfare in realtime
25 Screenshot found on [ http://www.mobygames.com/ ]
Figure 2.1: Part of a tile-based level in Dune II
Figure 2.2: Seamless world of Total Commander
The AI in first person shooters is no less demanding, because many modern shooters aim for high realism. High realism means the enemies are human in most games, and the player expects them to act like human beings and not like brainless monsters. This implies not only good path finding and tactical fighting, but also acting based on emotions and character, for example enemies fleeing if they do not have any chance of winning (ref. [SUP07]). However, this also means the level of AI required highly depends on the particular game being planned. Games with computer-controlled enemies or allies which represent intelligent beings (like humans) need to be equipped with a highly capable AI, controlling their movements and even communication with the player. Conversely, games without intelligent entities besides the player need little to no AI at all. Examples of this are multiplayer-only first person shooters without computer-controlled players. In this case state-of-the-art graphics and a good physics engine are expected, but no AI is required at all, because all acting entities in the game are controlled by users. Other examples are arcade or puzzle games without acting adversaries.
The concept for Assembler Bay is a puzzle game 26, which implies little to no AI will be required in the final game. This leads to the design decision that an AI component will not be included in the primary engine structure.
2.1.4 Input Handling for 3D Games
Input handling is another important component, one that contributes greatly to the game experience. How a game feels, meaning how the virtual character and the environment react to the player, is one of the most important aspects of the audiovisual experience called gameplay. The challenge for a good input component in 3D action games is to let the virtual character act according to what the user wants him to do, as fast as possible. Overly complex or unintuitive input controls can easily frustrate the user and lead to a high difficulty. Challenges resulting from unintuitive controls or long input latency will in most cases be a severe design failure. Even though there are some games using purposely counterintuitive controls as part of their gameplay, most other games try to avoid any difficulty resulting from the user not knowing certain controls. Typical representatives of counterintuitive controls are fighting games 27 for consoles, which challenge the user to learn complex button combinations.
26 The concept and planning of Assembler Bay will be detailed in section 3.1
Another design decision is the choice of input hardware. Apart from highly specialized input devices like steering wheels, drum pads, guitars, rifle imitations and data gloves, the main input devices for modern games are gamepad, keyboard and mouse. Habit is the most important factor to respect in input design, because most users do not want to learn new complex controls, but rather play the game with the controls they are used to. This can be witnessed in how almost every game ported from a console to the PC, or vice versa, adapts to the common controls and input devices of the target platform. The easiest way to address this problem is to support a wide spectrum of different input devices, as well as customization options for the user. Nonetheless, many games are restricted to certain input devices and even have certain unchangeable button assignments 28. Although full support of any input device or configuration should always be preferred, it is easier and more straightforward to accept certain constraints in the available configuration options. Certain control standards have evolved over the last decade in popular genres such as first person shooters, racing simulations or real time strategy games (ref. [BOW06]). An example is the use of the mouse in first person shooters on the PC. Although the mouse movements could also be used to move the character around in the level or to open certain menus, almost every FPS 29 on the PC predefines the mouse to control the view direction of the character. Available configuration options may include mouse sensitivity, inverting the mouse y-axis or mouse acceleration, and frequently the mouse buttons can be assigned any function. But the mouse movements are fixed to control the view direction.
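A fully customizable binding layer of the kind described above can be sketched as a simple mapping from physical inputs to abstract game actions. The following Python fragment is illustrative only; the binding names and actions are hypothetical:

```python
# Sketch of a remappable input layer: raw device events are translated
# into abstract game actions, so any button assignment can be configured.

DEFAULT_BINDINGS = {
    "mouse_left": "fire",
    "key_w": "move_forward",
    "key_space": "jump",
}

class InputMapper:
    def __init__(self, bindings=None):
        self.bindings = dict(bindings or DEFAULT_BINDINGS)

    def rebind(self, physical, action):
        """User customization: assign any action to a button or key."""
        self.bindings[physical] = action

    def translate(self, events):
        """Turn this frame's raw device events into game actions."""
        return [self.bindings[e] for e in events if e in self.bindings]

mapper = InputMapper()
mapper.rebind("mouse_right", "grab_block")  # hypothetical puzzle action
actions = mapper.translate(["key_w", "mouse_right", "key_unbound"])
print(actions)  # ['move_forward', 'grab_block']
```

Because game code only ever sees abstract actions, the same logic works unchanged for keyboard, mouse and gamepad input.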
The concept of Assembler Bay includes two target platforms, the PC and the XBOX 360. Consequently, the input component of the underlying game engine has to support the native input devices of both platforms: the XBOX 360 gamepad for the console, and mouse and keyboard for the PC. Although the PC also supports a range of various gamepads and joysticks, support for these requires a high level of hardware compatibility and is optional, since most PC users only use the readily available mouse and keyboard. Since Assembler Bay will be a 3D action puzzle game introducing new gameplay elements 30 which cannot be found in major genres, there are no established control standards to adopt. Hence the input system will be fully customizable, and user tests will determine an intuitive standard configuration.
27 Fighting games, also called "beat 'em ups", are games where players have to fight each other in close combat, usually one on one in a close-up side view
28 In many games functional keys like a pause or menu button or the escape key are not changeable
29 FPS is used as an abbreviation for First Person Shooter in the scope of this dissertation
30 The gameplay elements and design decisions for Assembler Bay will be depicted in section 3.1
2.1.5 Content Management and Content Processing
Content management is a problem only barely noticed by the user, in the form of loading times and the size of a game installation. But a game engine has to administer a whole database of game assets like sounds, textures 31, level data, 3D objects and lighting information. Whereas a game with small levels using only a few textures and sounds can simply load all assets for the whole level into memory 32, a game with huge outdoor levels or many detailed textures can easily exhaust memory trying to load all assets at once. Games with huge outdoor levels usually separate their levels into tiles. If the player moves close to a certain tile, all assets in this tile are loaded into memory, and assets of tiles out of range for the player are discarded. If the content manager is designed well and the hardware has enough power, the loading of new tiles can be done in the background without the user even noticing (ref. [BET03]). The concept for Assembler Bay, however, uses small indoor levels requiring only content management between levels.
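The tile-based streaming scheme described above can be sketched as follows. This is an illustrative Python fragment, not engine code; the tile size and load radius are arbitrary example values, and a real content manager would load the returned tiles asynchronously in the background:

```python
# Sketch of tile-based asset streaming: tiles near the player stay
# resident in memory, tiles out of range are discarded.

TILE_SIZE = 100.0   # world units per tile (illustrative value)
LOAD_RADIUS = 1     # keep the 3x3 block of tiles around the player

def tile_of(x, z):
    """Map a world position to its tile coordinates."""
    return (int(x // TILE_SIZE), int(z // TILE_SIZE))

def update_streaming(player_pos, loaded):
    """Return the tiles that should be resident, plus the sets the
    caller must load and unload to get there."""
    cx, cz = tile_of(*player_pos)
    wanted = {(cx + dx, cz + dz)
              for dx in range(-LOAD_RADIUS, LOAD_RADIUS + 1)
              for dz in range(-LOAD_RADIUS, LOAD_RADIUS + 1)}
    to_load = wanted - loaded
    to_unload = loaded - wanted
    return wanted, to_load, to_unload

loaded, new, old = update_streaming((250.0, 250.0), set())
print(len(loaded), len(new), len(old))  # 9 9 0
```

On the next call with the updated `loaded` set, only the tiles newly entering the radius appear in `to_load`, which is what keeps the background streaming cheap.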
Content processing is also part of a game engine, because game assets are usually created in arbitrary formats, which differ from the internal format the game engine uses for assets. The two possibilities to handle content import are at build time or at runtime. If the content processing is done at build time, all assets (e.g. textures in JPEG format) are converted into an internal game engine format and saved using serialization. The advantage of this is shortened loading time, resulting from less content processing at runtime. On the other hand, all content for the game needs to be converted using the build time converter of the engine, which is usually not included in the final release. In this case users are unable to add their own content to the game without prior access to the content processor. Runtime content processing can easily import any content provided by the user, but requires analysis and processing of any content loaded at runtime. Most games use build time content processing for all available assets, but still provide runtime content import for individual assets supplied by the user (like his own pictures as an ingame logo).
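The difference between the two import paths can be sketched like this. The Python fragment is purely illustrative; the "internal format" is stood in for by a dictionary, and the function names are hypothetical:

```python
# Sketch: build-time processing converts source assets once and ships
# the results; runtime import runs the same conversion on demand.

def process_texture(name, source_format):
    """Stand-in for a converter producing the engine's internal format."""
    return {"name": name, "from": source_format, "format": "internal"}

def build_time_pipeline(source_assets):
    """Run once at build time; only the converted results are shipped."""
    return [process_texture(name, fmt) for name, fmt in source_assets]

def runtime_import(name, source_format):
    """Run at load time, e.g. for user-provided pictures."""
    return process_texture(name, source_format)

shipped = build_time_pipeline([("wall", "jpeg"), ("floor", "png")])
user_logo = runtime_import("logo", "bmp")   # user content added later
print(len(shipped), user_logo["format"])    # 2 internal
```

The hybrid approach most games use corresponds to shipping `shipped` while still linking `runtime_import` into the release for user content.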
2.1.6 Game Screen Management and Scene Administration
Game screen management is the component designated to menu navigation; it handles the various game states, commonly beginning with a start screen followed by the main menu. The game screen management component has to ensure the content manager loads and disposes of all assets for a certain screen at the right time. Decisions about which screens are updated and displayed on various events are also part of game screen management, which is the fundamental component of a game engine structure.
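One common way to organize these game states is a stack of screens, where the topmost screen is the active one. A minimal illustrative sketch in Python, with hypothetical names:

```python
# Sketch of a screen stack: pushing a screen makes it active and loads
# its assets; popping it disposes them and reveals the screen below.

class Screen:
    def __init__(self, name):
        self.name = name
    def load_content(self):    # would ask the content manager for assets
        pass
    def unload_content(self):  # would dispose this screen's assets
        pass

class ScreenManager:
    def __init__(self):
        self.stack = []
    def push(self, screen):
        screen.load_content()
        self.stack.append(screen)
    def pop(self):
        screen = self.stack.pop()
        screen.unload_content()
        return screen
    def active(self):
        return self.stack[-1].name if self.stack else None

mgr = ScreenManager()
mgr.push(Screen("start_screen"))
mgr.push(Screen("main_menu"))
print(mgr.active())  # main_menu
mgr.pop()
print(mgr.active())  # start_screen
```

The stack shape also answers the "which screens are updated and displayed" question naturally: for example, an in-game pause menu pushed on top can leave the gameplay screen drawn but frozen beneath it.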
Scene administration is a topic that becomes more crucial as scenes in 3D games grow more complex. A scene in a 3D game is the virtual space enclosing the entirety of all loaded objects at a moment in time. It includes not only visible or movable objects in a level, but also the level architecture itself, all players or other characters, lights and cameras 33. This implies a scene in a 3D game is very dynamic, as many objects including the characters can move around, and new objects or characters can be instantiated or removed at any time. The scene administration component keeps track of all these objects and removes them if they are no longer needed or their lifetime has expired. The component also keeps track of the positions and relations of the objects to each other. A good scene administration can have a great impact on graphics performance by using a scene graph for early frustum culling 34, and is mandatory for a high-performance physics engine 35.
31 Textures meaning 2D or 3D bitmap textures (including sprites, normal maps, diffuse textures...)
32 Memory refers to PC main memory and memory of the graphics adapter
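The object lifetime tracking described here can be sketched as follows. This is an illustrative Python fragment with hypothetical names; a real scene administration would additionally maintain a spatial hierarchy such as a scene graph for culling and physics queries:

```python
# Sketch of scene administration: the scene tracks all objects and
# removes those whose lifetime has expired.

class SceneObject:
    def __init__(self, name, lifetime=None):
        self.name = name
        self.lifetime = lifetime  # None = lives until removed explicitly
    def expired(self):
        return self.lifetime is not None and self.lifetime <= 0.0
    def update(self, dt):
        if self.lifetime is not None:
            self.lifetime -= dt

class Scene:
    def __init__(self):
        self.objects = []
    def add(self, obj):
        self.objects.append(obj)
    def update(self, dt):
        for obj in self.objects:
            obj.update(dt)
        # drop expired objects so they stop consuming resources
        self.objects = [o for o in self.objects if not o.expired()]

scene = Scene()
scene.add(SceneObject("player"))
scene.add(SceneObject("explosion", lifetime=0.5))
scene.update(1.0)  # the explosion's lifetime expires this frame
print([o.name for o in scene.objects])  # ['player']
```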
2.1.7 Hardware Compatibility
The goal of the game engine for Assembler Bay is to be a reusable engine for many projects, but the level of hardware compatibility professional modern PC games offer cannot be achieved within the scope of this project. This does not affect the target platform XBOX 360, since the engine will be designed to work with the special requirements of this console. But exhaustive compatibility with PC hardware or even operating systems would require multiple implementations of many parts of the engine. A professional commercial engine is expected to run on virtually any system on the market that has at least the computing power to display the game at an adequate frame rate. The game engine developed in this project will try to maintain compatibility with many hardware configurations, but it will most likely only support a certain subset of operating systems and modern hardware.
2.2 Graphics – Rendering Techniques and Lighting
As depicted in the last section, the graphics component is the most prominent component of a 3D game engine. The render pipeline is the central part of 3D visualization and combines several steps to create the final image. The starting point is always raw 3D position data, consisting of a list of 3D vectors representing points in 3D space and an index buffer 36 containing the relations of these points. These buffers are created by the content importer from a 3D model file. The points representing the model are multiplied with matrices to transform them according to the absolute position of the model in the virtual world of the game, and then multiplied by view and projection matrices generated by a virtual viewport 37 to project the points onto the screen. After being projected, the points represent 2D coordinates on the screen, where they are connected into triangles using the index buffer data, and these triangles are filled with the colors calculated from textures and lighting (ref. [MIC06]). Most of these calculations are done on the GPU 38, relieving the CPU for physics and AI calculations.
33 Cameras refer to the viewpoints from which the scene can be rendered
34 Frustum culling is a rendering technique to omit objects which cannot be seen by the camera
35 A high-performance physics engine uses 3D space partitioning techniques to find objects which can interact, and to omit objects which cannot influence each other at this point in time
36 The index buffer contains information about which points in the position buffer make up a polygon; one point can be part of multiple polygons
37 The virtual viewport is the virtual representation of a camera, based on a view and a projection matrix
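The transformation chain just described (world, then view, then projection, then the perspective divide) can be traced for a single point. The following Python sketch is illustrative: it uses plain nested lists instead of a real math library, and the projection matrix follows one common depth-range convention (mapping near..far to 0..1) among several:

```python
# Sketch of the vertex transformation chain: a model-space point is
# transformed by world, view and projection matrices, then divided by w.
# Matrices are 4x4 row-major lists of lists; values are illustrative.

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

# A simple perspective projection (90 degree FOV, aspect ratio 1,
# near = 1, far = 100), mapping depth into [0, 1].
near, far = 1.0, 100.0
projection = [[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, far / (far - near), -far * near / (far - near)],
              [0, 0, 1, 0]]

point = [0.0, 0.0, 0.0, 1.0]        # model-space origin
world = translation(0.0, 0.0, 5.0)  # model placed 5 units ahead
view = translation(0.0, 0.0, 0.0)   # camera at the world origin

p = mat_vec(world, point)           # model space -> world space
p = mat_vec(view, p)                # world space -> camera space
p = mat_vec(projection, p)          # camera space -> clip space
ndc = [p[0] / p[3], p[1] / p[3], p[2] / p[3]]  # perspective divide
print(ndc)  # x = y = 0 (screen center), z ~ 0.808 within [0, 1]
```

A point exactly on the near plane would land at depth 0 and one on the far plane at depth 1, which is what the depth buffer comparison relies on.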
2.2.1 Transform and Lighting – Rendering Techniques
T&L 39 was the first attempt to offload 3D calculations to the graphics adapter and has since evolved into Vertex and Pixel Shader 3.0, supported by the newest generation of graphics adapters. Vertex shaders process the points and texture coordinates of a model and can manipulate them before they are sent on to the pixel shader. Pixel shaders process each pixel drawn by a polygon and can manipulate its final color. Together with modern rendering techniques, shaders allow games to have a unique look by using specific lighting calculations.
Prior to each rendering technique, certain culling and sorting algorithms may be applied to the<br />
scene to sort out irrelevant parts of the geometry 40 . One very basic technique among these is<br />
view frustum culling based on bounding volumes 41 . In a complex scene, many parts are not visible<br />
in the viewport image, because they are either behind the camera, outside the view angle, or too<br />
far away 42 . Processing the geometry of these objects in the rendering call would be wasted<br />
effort, because they will not appear in the final image. The camera view frustum can be<br />
represented by a geometric volume, and intersection tests against the bounding spheres of the<br />
objects in the scene decide whether an object can be visible in the final scene.<br />
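The bounding-sphere frustum test described above can be sketched as follows. This is an illustrative reimplementation, not code from the thesis engine; all names are hypothetical. The frustum is modeled as six inward-facing planes, and a sphere is culled as soon as it lies entirely outside any single plane:

```python
def sphere_in_frustum(center, radius, planes):
    """center: (x, y, z); planes: list of (nx, ny, nz, d) with the normal
    pointing into the frustum, so n . p + d >= 0 holds for inside points."""
    for nx, ny, nz, d in planes:
        # Signed distance of the sphere center to this plane.
        dist = nx * center[0] + ny * center[1] + nz * center[2] + d
        if dist < -radius:      # wholly outside one plane -> cull the object
            return False
    return True                 # inside or intersecting all six planes
```

A sphere that merely intersects a plane is conservatively kept, which is the desired behavior for culling.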
Traditional Forward Rendering consists of a single rendering pass, in which the polygons of all<br />
visible objects are drawn and the vertex and pixel shaders compute lighting and other surface<br />
effects. Modern graphics hardware, however, has made a different rendering approach possible.<br />
Deferred Rendering is a technique which has only recently become interesting, owing to the<br />
capabilities of modern graphics adapters and the growing complexity of scenes in modern 3D games.<br />
The following subsections will first introduce traditional Forward Rendering and depict some of<br />
its benefits and drawbacks. Afterwards, Deferred Rendering will be explained and compared to<br />
Forward Rendering, concluding with the decision for the rendering technique used in this project.<br />
38 GPU means Graphics Processing Unit, in contrast to the CPU on the main board <strong>of</strong> the PC<br />
39 Transform <strong>and</strong> Lighting was first introduced by NVIDIA Corporation with the GeForce 256 in 1999<br />
40 Geometry which does not affect the output image <strong>of</strong> a particular rendering pass<br />
41 A bounding volume is a simplified geometric shape surrounding the whole object geometry<br />
42 For depth buffer precision and increased performance, a virtual viewport has a near and a far clipping<br />
plane; all objects farther from the camera than the far clipping plane are not drawn
2.2.2 Forward Rendering – A Single Pass <strong>and</strong> Many Possibilities<br />
Forward Rendering processes all relevant geometry sequentially, either in arbitrary order or<br />
sorted by a scene graph for improved performance. If transparent meshes exist in the scene, they<br />
have to be processed after the solid geometry and sorted back to front. Since every object is<br />
drawn onto a single final bitmap, objects can only be drawn on top of each other. If a transparent<br />
object is rendered too early by accident, solid geometry farther from the camera which is occluded<br />
by the transparent surface can no longer be drawn behind this transparent surface.<br />
The object geometry is processed by a vertex shader, which applies the WVP 43 transformations and<br />
can also apply effects such as bone-based skinning or other effects changing the geometry of an<br />
object. Lighting can also be done in the vertex shader, but calculating it in the pixel shader<br />
yields improved quality, since the lighting information is then not simply interpolated over the<br />
surface but calculated for each pixel. The lighting calculation uses certain values of the lights<br />
in the scene, commonly including the position of the light source, its direction and its color.<br />
This information, together with the position, normal orientation and surface attributes of the<br />
processed object, determines the final color of each pixel filled by the object. Most lighting<br />
calculations are based on the Phong model or on Blinn-Phong, a slightly simplified version of<br />
Phong lighting with improved performance (ref. [EBE01]).<br />
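The Blinn-Phong model can be sketched for a single light as follows. This is an illustrative reimplementation, not the thesis engine's shader code; ambient and attenuation terms are omitted for brevity:

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def blinn_phong(n, l, v, light_color, diffuse_color, spec_color, shininess):
    """Blinn-Phong shading for one light.
    n: surface normal, l: direction to the light, v: direction to the
    viewer (all unit vectors). Returns an RGB tuple."""
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, l)))
    # The half vector replaces Phong's reflection vector -- exactly the
    # simplification that makes Blinn-Phong cheaper than Phong.
    h = normalize(tuple(a + b for a, b in zip(l, v)))
    n_dot_h = max(0.0, sum(a * b for a, b in zip(n, h)))
    spec = n_dot_h ** shininess
    return tuple(lc * (dc * n_dot_l + sc * spec)
                 for lc, dc, sc in zip(light_color, diffuse_color, spec_color))
```

In a real pixel shader this runs once per pixel, with `n`, `l` and `v` interpolated or reconstructed per fragment.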
Since lighting calculations require not only different values changing over time, but in many<br />
cases completely different formulas, several variants have to be implemented for the various<br />
materials and light types in the scene. If there are multiple lights in a scene, the lighting<br />
calculations have to be performed for every light affecting an object, which leads to one drawback<br />
of Forward Rendering. Although shader units are freely programmable, they have many limitations<br />
regarding the complexity of shader code. Shader Model 2.0, for instance, offered a maximum of 64<br />
arithmetic instruction slots and a maximum of 32 constant value registers. Dynamic branching,<br />
function calls and loops are not supported and can only partially be emulated (ref. [DIE04]). As a<br />
result of these limitations, Shader Model 2.0 supports a maximum of three light sources affecting<br />
the same object. Another consequence of the missing dynamic branching is that a unique shader is<br />
required for each material and light type combination with different lighting formulas. If a scene<br />
has 3 different light types and 2 different materials, with a maximum of 3 lights affecting an<br />
object, full support for all combinations results in a total of 54 44 different shaders which have<br />
to be implemented and updated whenever something in the lighting model or the materials changes.<br />
43 WVP = World View Projection denotes the transformation matrices resulting from the scale, rotation and<br />
position of an object in the scene, together with the view and projection matrices of the active viewport
44 Two materials can each be lit by 3 lights at once, and each light can be one of 3 types: 2 · 3³ = 54
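The count from footnote 44 generalizes to materials · light_types^max_lights. A toy calculation (a hypothetical helper, not taken from the thesis):

```python
def forward_shader_count(materials, light_types, max_lights):
    """Number of distinct precompiled shaders needed for Shader Model 2.0
    style forward rendering, where every combination of a material with a
    full set of typed lights needs its own shader (no dynamic branching)."""
    return materials * light_types ** max_lights
```

With the thesis example of 2 materials, 3 light types and 3 simultaneous lights this yields 54; adding just one more light type or material makes the number grow multiplicatively, which illustrates why this approach scales poorly.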
Although Shader Model 3.0 solves some of these problems, providing support for dynamic branching<br />
and up to 512 arithmetic instruction slots, it is still limited to a maximum of 8 lights affecting<br />
an object at once (ref. [HAR04]). All material shaders can be unified in a single meta shader<br />
using a branch for each material, but the problem of a huge number of different shaders for all<br />
light and material combinations remains. Moreover, Shader Model 3.0 is only supported by the<br />
latest generation of graphics adapters, while the majority of graphics adapters only support<br />
Shader Model 2.0 or even lower.<br />
To improve performance and deal with complex scenes containing multiple light sources, a complex<br />
scene graph has to be maintained. To avoid expensive lighting operations on geometry which is<br />
occluded by other solid geometry in the final image, all solid meshes in the scene can be sorted<br />
front to back with respect to the camera view. This way the closest objects are rendered first,<br />
and if objects processed later would be rendered behind them, the depth test fails and no shader<br />
operations have to be executed. To cope with light counts greater than the maximum number of<br />
lights supported, the lights can be sorted for each mesh by distance and intensity. With this<br />
sorting, only the most relevant lights for each object are processed by the shader. Scenes with a<br />
great number of dynamic lights are usually outdoor night scenes, where a whole scenery can easily<br />
contain more than 20 lights. But most of these lights only affect a small area of the whole scene,<br />
and a single object is probably not affected by more than 3 of them.<br />
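Selecting the most relevant lights per mesh might look like the sketch below. The scoring formula intensity / (1 + distance²) is an assumption chosen for illustration, not the engine's actual heuristic:

```python
def pick_relevant_lights(mesh_pos, lights, max_lights=3):
    """Choose the lights with the greatest influence on a mesh.
    lights: list of dicts with 'pos' and 'intensity' keys. The score
    combines intensity with squared distance, so a weak faraway light
    loses against a strong nearby one."""
    def score(light):
        d2 = sum((a - b) ** 2 for a, b in zip(light['pos'], mesh_pos))
        return light['intensity'] / (1.0 + d2)
    # Highest score first, truncated to the shader's light limit.
    return sorted(lights, key=score, reverse=True)[:max_lights]
```

The sort runs per mesh per frame, which is cheap compared to the shader work it saves.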
2.2.3 Deferred Shading – Multiple Passes for Multiple Lights<br />
Deferred Shading is a fundamentally different approach to rendering. It addresses the limitations<br />
of Shader Model 2.0 and Forward Rendering with a render design that separates lighting from object<br />
geometry. Using Deferred Shading, the whole scene geometry is first rendered to multiple render<br />
targets. A render target is a bitmap just like the final output image, but render targets can<br />
store not only colors but also the position, normal direction or other surface parameters of the<br />
underlying object 45 . No surface calculations are done in this first pass of Deferred Rendering,<br />
and neither are any lighting computations or other surface effects. However, the first pass does<br />
include all effects changing the geometry of an object, such as skinning or displacement mapping,<br />
using only vertex shaders. The render targets used in this process usually store all information<br />
needed for lighting and sometimes for special post-processing effects like motion blur.<br />
In the next step, the render targets are used as input textures for the lighting calculations. All<br />
scene lights are applied per pixel to the geometry buffer, filling the final scene with light step<br />
by step. Since each light is rendered in a single shader pass, each light only needs one shader,<br />
which can access all required information already formatted 46 .<br />
45 Some <strong>of</strong> these parameters require the support <strong>of</strong> Floating Point Render Targets.<br />
46 The geometry buffer pass can already perform light-independent calculations and store the surface<br />
parameters as normalized values, as needed for the lighting calculations
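The two phases of Deferred Shading can be mimicked on the CPU with a toy sketch. Per-pixel dictionaries stand in for the render targets, and all names are hypothetical:

```python
def geometry_pass(fragments):
    """First phase: store surface attributes per pixel, with no lighting.
    fragments: {pixel: {'albedo': ..., ...}} -- in a real engine the
    closest surface wins via a depth test, omitted in this toy version."""
    return fragments  # stands in for the multiple render targets

def lighting_pass(gbuffer, lights, shade):
    """Second phase: one pass per light, accumulated additively, so the
    light count is bounded only by fill rate, not by shader limits."""
    frame = {px: (0.0, 0.0, 0.0) for px in gbuffer}
    for light in lights:
        for px, surf in gbuffer.items():
            r, g, b = frame[px]
            lr, lg, lb = shade(surf, light)
            frame[px] = (r + lr, g + lg, b + lb)
    return frame
```

Note that geometry complexity only matters in `geometry_pass`, while `lighting_pass` touches pixels, never polygons, which is exactly the separation described above.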
Figure 2.3: Visualization <strong>of</strong> 4 Preliminary Buffers used for Deferred Shading<br />
The contents of the geometry buffer can be visualized (see Figure 2.3). Using a discrete rendering<br />
pass for each light in the scene, the limitations of Shader Model 2.0 can be bypassed, as each<br />
light has its own shader, containing only the lighting calculations for this particular light. As<br />
a result, the only limitation on the number of lights is the available performance of the target<br />
system. Another benefit of Deferred Rendering is the per-pixel calculation of lighting. Since all<br />
effects are only applied to the pixels making up the final scene, each light will only affect<br />
pixels which are visible on the final screen, even without any front-to-back sorting. Complex<br />
geometry is only relevant in the geometry part of Deferred Shading and can be treated completely<br />
separately from any lighting or post-processing effects. Another side effect of the per-pixel<br />
lighting in screen space is that the calculation time of a light depends directly on the screen<br />
space it covers 47 .<br />
These benefits are offset by some drawbacks of Deferred Rendering. The principal drawback is the<br />
limitation of the geometry buffer, which only offers a single set of surface parameters for each<br />
pixel of the final image, whereas transparent objects occupy the same pixels as the objects<br />
partially occluded by them. As a result, Deferred Rendering can only compute lighting and effects<br />
for either the transparent surface or the surface lying behind it. Since both surfaces require<br />
lighting, a second pass for transparent objects is inevitable. One option is to use a multi-layer<br />
geometry buffer, which can store multiple sets of surface parameters for each pixel. This<br />
information can then be used by multiple lighting passes to generate the final screen pixel.<br />
Another option is<br />
47 This is a desired effect, since small lights very far from the camera should only use minimal processing<br />
power, even if the light covers very complex geometry that only fills a few pixels of the screen
to render a whole second Deferred Shading pass, drawing the transparent surfaces on top of the<br />
solid geometry processed in the first pass. But this would be very expensive, because it would<br />
almost double the rendering time of a single frame.<br />
The third and most widely used method to deal with transparent geometry exploits the fact that<br />
transparent surfaces do not need lighting as precise as solid geometry, since they are<br />
translucent. All solid geometry in the scene is rendered using Deferred Shading, and the<br />
transparent surfaces are drawn using Forward Rendering. Since transparent surfaces are not as<br />
prominent as solid geometry, only the main lights of a scene need to be considered when lighting<br />
them. This allows the use of Shader Model 2.0 for a scene with a nearly unlimited number of lights<br />
affecting all solid geometry, and the three most important lights also affecting the transparent<br />
surfaces.<br />
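The hybrid scheme just described can be summarized as a frame-loop sketch. `deferred` and `forward` are hypothetical stand-ins for the two render paths, not real engine functions:

```python
def render_frame(solids, transparents, lights, deferred, forward):
    """Hybrid pipeline: all solid geometry goes through Deferred Shading
    with every light; transparent surfaces are forward-rendered back to
    front with only the 3 most important lights (the Shader Model 2.0
    limit mentioned in the text)."""
    frame = deferred(solids, lights)
    main_lights = sorted(lights, key=lambda l: l['intensity'], reverse=True)[:3]
    # Back to front, so nearer transparent surfaces blend over farther ones.
    for surface in sorted(transparents, key=lambda s: s['depth'], reverse=True):
        frame = forward(frame, surface, main_lights)
    return frame
```

Picking the "main" lights by intensity alone is a simplification; distance to the transparent surface could be factored in as well.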
2.2.4 Conclusion – Deferred Rendering Does the Job<br />
Considering the benefits and drawbacks of both rendering techniques, the author chose Deferred<br />
Rendering for Assembler Bay. The game concept is designed to run on current hardware, but Shader<br />
Model 3.0 is only supported by the latest generation of graphics adapters, and even state-of-the-art<br />
titles released in 2009 still support Shader Model 2.0. Since Assembler Bay aims to be a visually<br />
stunning game, the limitations of Forward Rendering with Shader Model 2.0 are too severe. Another<br />
motive is the use of new technologies in the game engine for this project. Whereas Forward<br />
Rendering is a traditional rendering technique used by almost every game, Deferred Rendering has<br />
only recently been introduced to the market by a few top titles, each setting milestones in game<br />
graphics. One of these titles is Killzone 2, released in 2009 for Sony's PlayStation 3 (Figure 2.4 48 ).<br />
Figure 2.4: Screenshot <strong>of</strong> Killzone 2 showing capabilities <strong>of</strong> Deferred Shading<br />
48 Screenshot found on [http://www.killzone.com/kz/media.psml]
2.3 Physics – Simulation or Feint<br />
The importance of a working Physics Engine in every modern 3D game with action elements has been<br />
explained in section 2.1.2. This section will depict the variety of tasks a Physics Engine has to<br />
solve and the problems arising from them. Without a Physics Engine, a character could float like a<br />
ghost: no gravity would pull him down, and he could simply fly through walls and solid objects. A<br />
Physics Engine has two primary tasks – Collision Detection and Collision Response. The Collision<br />
Detection component uses various methods to decide whether two objects intersect, or will<br />
intersect in the next frame if they keep moving. The Collision Response highly depends on the<br />
participating objects: while a massive stone should fall to the ground, breaking things under its<br />
weight, a rubber object should deform and bounce. Characters and other living entities are a<br />
special case, because they react to collisions by moving their own bodies, as long as they are alive.<br />
2.3.1 Collision Detection <strong>and</strong> Collision Response<br />
The straightforward method to check whether two objects intersect would be to test each polygon of<br />
one object against each polygon of the other object for intersection. But with the high polygon<br />
counts in current games, it is not uncommon to have about 500,000 polygons in a scene. This would<br />
result in about 250 billion polygon intersection checks per frame; even a 5 GHz CPU would still<br />
need several minutes for a single frame 49 . To improve performance, many objects can be excluded<br />
from Collision Detection using information about the scene (e.g. static geometry can not collide<br />
with other static geometry, because it is not moving). The second step to reduce calculation costs<br />
is the use of bounding volumes, which can either rule out collision detection for a whole object,<br />
if its bounding volume does not intersect any other bounding volume, or simplify collision<br />
detection by replacing the actual geometry with the bounding volume in the physics calculations.<br />
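A broad-phase sketch combining both reduction steps (skipping static pairs, then testing bounding spheres) might look like this. It is purely illustrative; the object layout is hypothetical:

```python
import itertools

def candidate_pairs(objects):
    """Broad phase: cheap filters before any polygon-level test.
    objects: list of dicts with 'name', 'pos', 'radius' and 'static'."""
    pairs = []
    for a, b in itertools.combinations(objects, 2):
        if a['static'] and b['static']:
            continue               # static geometry never collides with itself
        dist2 = sum((p - q) ** 2 for p, q in zip(a['pos'], b['pos']))
        r = a['radius'] + b['radius']
        if dist2 <= r * r:         # bounding spheres overlap -> narrow phase
            pairs.append((a['name'], b['name']))
    return pairs
```

Only the few surviving pairs would then be handed to the expensive narrow-phase test.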
Another method is binary space partitioning, where a huge number of polygons or objects can be<br />
omitted from intersection tests, because they are in distinct parts of the scene and too far away<br />
from each other to intersect. Using binary space partitioning, polygon-exact collision detection<br />
can be performed with high performance (see [EBE04]). This is desirable, because the use of<br />
collision meshes requires additional work in level design, as collision geometry has to be created<br />
for every object in a level. Although professional games can afford high expenses in level design<br />
and additional work, the concept for the game engine designed in this project is an engine which<br />
can easily integrate new game assets without special requirements.<br />
49 Even if an intersection test were possible in as few as 10 CPU cycles, the computation would still need<br />
about 250 billion × 10 cycles / 5 GHz = 500 seconds
The second task of a Physics Engine is Collision Response. This includes objects bouncing off each<br />
other, objects lying on the ground and objects falling over edges. The easy part of Collision<br />
Response is the calculation of the velocities of colliding objects using the laws of conservation<br />
of momentum and energy. More complex is the influence of angular velocity on the collision and the<br />
resulting torque. A Physics Engine for a game does not always need to be correct; in most cases it<br />
only has to make things look the way the user expects them to behave. Games which require real<br />
physics simulations are usually sports games, including golf simulations, racing or soccer games.<br />
Physics Engines for these kinds of games need to simulate much more sophisticated physical<br />
phenomena like drag, friction or the Magnus force, and numerical methods are required to simulate<br />
such complex phenomena. The most widely used method is the fourth-order Runge-Kutta method, a<br />
member of the Runge-Kutta family of numerical methods for solving ordinary differential equations.<br />
However, such advanced effects are not required in games using the Physics Engine merely as an<br />
optional component.<br />
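For reference, one step of the fourth-order Runge-Kutta method for y' = f(t, y) looks like this. It is the generic textbook form, not code from the thesis engine:

```python
def rk4_step(f, t, y, h):
    """One fourth-order Runge-Kutta step for y' = f(t, y) with step size h.
    Four slope samples are blended with the classic 1-2-2-1 weights."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
```

In a sports game, `f` would encode the combined forces (gravity, drag, Magnus) acting on the ball, and `y` would bundle its position and velocity.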
Action games or first-person shooters usually only require a Physics Engine which looks<br />
believable. This can be achieved using simplified equations which represent the physical phenomena<br />
with the most influence, namely gravity, friction and conservation of momentum. To accomplish a<br />
believable behavior of different objects with these simplified equations, many tests and much<br />
fine-tuning are required, because the values used are purely empirical 50 .<br />
2.3.2 Collisions Under Extreme Conditions<br />
Extreme conditions regarding collision handling emerge when objects collide at very high or very<br />
low velocities, or in special environments such as underwater. High-velocity collisions are<br />
critical, because a game runs in discrete cycles. Collision Detection is calculated every frame,<br />
but very high velocities can cause an object to cover a large distance in a single frame,<br />
especially when the distance it travels in one frame is larger than the diameter of the object<br />
itself. The object can then be on one side of a wall in one frame and on the other side in the<br />
next frame, without intersecting the wall in either of these two frames. To cope with this<br />
problem, the physics component computes collision handling each frame in advance of the movement<br />
of the object, using not the actual position but the forecast position the object will reach if it<br />
keeps its velocity. If the velocity of an object is too high, the object can run through a loop of<br />
Collision Detections, moving just a part of the whole distance each time, until it either collides<br />
or Collision Detection has been computed for the whole range.<br />
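The substepping loop described above can be sketched in one dimension as follows. Names are hypothetical, and `collides_at` abstracts the actual Collision Detection:

```python
import math

def move_with_substeps(pos, velocity, dt, diameter, collides_at):
    """Advance a fast object without tunneling: the frame's displacement
    is split so that no substep exceeds the object's own diameter.
    collides_at(p) -> True if position p lies inside solid geometry."""
    distance = abs(velocity * dt)
    steps = max(1, math.ceil(distance / diameter))
    step = velocity * dt / steps
    for _ in range(steps):
        nxt = pos + step
        if collides_at(nxt):
            return pos, True       # stop just before penetrating
        pos = nxt
    return pos, False
```

A naive single-step update with the same inputs would jump straight past a thin wall; the substeps catch the collision at the cost of extra tests only for fast objects.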
Another problem is objects getting stuck between other objects or in gaps in the level geometry.<br />
If an object collides with a wall, its velocity is adapted in a way that repels it from the wall,<br />
but if it immediately collides with an opposite wall, it may get<br />
50 Although the formulas used are usually based on physics equations, the values for mass, coefficient of<br />
restitution or moment of inertia are just guesses for objects in a 3D game
stuck inside the wall. Recursive Collision Detection would have to run a second cycle after the<br />
first Collision Response, to check whether the object will collide with another object before it<br />
is moved. But if an object gets stuck between two walls, this recursion would be infinite, and<br />
even a reasonable recursion limit would still have a severe impact on performance. A reasonable<br />
way to deal with this problem is to halt the movement of an object completely if it collides with<br />
two opposite surfaces at once and both of them exceed the colliding object in weight. This leads<br />
to another problem: the preservation of momentum for whole groups of objects. In reality, the<br />
lowest crate of a high stack of crates can not be moved easily, because the weight of all crates<br />
on top of it, connected to it by static friction, adds to its own weight. But simulating this kind<br />
of mutual influence of objects is a complex task, which should only be implemented if a correct<br />
simulation of physics is a core element of the game. To feint the effect of static friction<br />
resulting from gravity, each object can be assigned a list of objects which add their mass to its<br />
own, and this accumulated mass can be used in Collision Response. Each object which collides with<br />
another object below its own center of gravity adds itself to its partner's list for as long as<br />
the collision lasts. Although this feint does not capture all effects of static friction, it is<br />
sufficient.<br />
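The mass-list feint might be sketched like this (hypothetical names, dictionary objects for brevity):

```python
def effective_mass(obj):
    """Own mass plus the mass of everything resting on top, recursively --
    the feint for static friction in stacks described above."""
    return obj['mass'] + sum(effective_mass(o) for o in obj['supports'])

def link_support(lower, upper, contact_below_center):
    """While the upper object's contact lies below its own center of
    gravity, it adds its weight to the supporting object's list."""
    if contact_below_center and upper not in lower['supports']:
        lower['supports'].append(upper)
```

Collision Response would then query `effective_mass` instead of the raw mass, so the bottom crate of a stack resists being pushed with the weight of the whole stack.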
The second problem is very low velocity collisions in conjunction with gravity. In our everyday<br />
environment it is natural that objects fall to the ground and lie still if no external force is<br />
applied to them. But lying on the ground is in fact a permanent collision of an object accelerated<br />
by gravity, and naturally a large part of the surface of a resting object touches the ground. A<br />
permanent collision results in positive Collision Detection for all polygons touching the floor,<br />
in every frame. This produces the need for many Collision Responses, as each colliding polygon has<br />
an influence on the movement of an object. Because most objects will be lying still on the ground<br />
most of the time, these permanent collisions will have a severe impact on performance. To cope<br />
with this, an object can be labeled static as soon as its velocity falls below a certain minimum<br />
while it collides with the ground. But this creates a new problem, because an object can also be<br />
colliding with another object in midair. If both objects are standing relatively still, it can<br />
happen that both objects get stuck on each other and are both labeled static, remaining afloat in<br />
midair. Again, many tests and much fine-tuning are required to get an acceptable result.<br />
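One plausible refinement of the static-labeling rule, sketched below, avoids the midair deadlock by requiring a contact with an already static object before an object may sleep. This is a guess at a fix for illustration, not the thesis implementation, and the threshold value is arbitrary:

```python
SLEEP_SPEED = 0.01   # empirical threshold, needs per-game tuning

def update_sleep_state(obj, speed, touching):
    """Mark an object static when it is nearly at rest AND at least one
    contact partner is already static (the ground counts as static).
    Requiring a static support keeps two slow midair objects from
    freezing each other in place."""
    if speed < SLEEP_SPEED and any(t['static'] for t in touching):
        obj['static'] = True
    elif speed >= SLEEP_SPEED:
        obj['static'] = False    # wake up when pushed
    return obj['static']
```

Sleeping objects would be skipped by Collision Response until something fast enough touches them again.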
The goal of the physics component designed in this project is a compromise between performance and<br />
a realistic look. Although it is acceptable if objects do not behave in a physically correct way,<br />
it is important that the user does not notice this while playing the game.
3 Draft <strong>and</strong> Expectations – Assembler Bay<br />
This chapter will depict the design process <strong>of</strong> the game Assembler Bay. The first section<br />
will introduce various ideas for gameplay, story <strong>and</strong> setting <strong>of</strong> Assembler Bay <strong>and</strong> detail<br />
the process <strong>of</strong> brainstorming <strong>and</strong> sorting out ideas.<br />
The second section will explain why certain ideas were modified or discarded<br />
due to limitations <strong>and</strong> features <strong>of</strong> the underlying game engine. It will detail the high dependency<br />
<strong>of</strong> a digital game on its engine.<br />
3.1 Story <strong>and</strong> Setting – A <strong>Game</strong> Concept is Born<br />
The first idea for Assembler Bay came from a brainstorming session for creative game ideas. It is<br />
a 3D puzzle game, set in a science fiction environment. Its interesting basic idea is a game in<br />
which the user can puzzle together complex objects like a space ship, abstract figures, tools or<br />
weapons in free 3D space. The first game idea included neither any means of collecting the parts<br />
nor an explicit main character or a particular story. But the strong point of the concept is the<br />
idea of a unique gameplay, because the central part of Assembler Bay is a 3D puzzle. No game with<br />
a similar gameplay concept is currently available, which makes Assembler Bay original in its<br />
fundamentals.<br />
The crucial point for the success of this project is the implementation of this genuine idea in a<br />
state-of-the-art video game. The basic idea of a puzzle game could also have been realized without<br />
any surrounding story or main character; the game Tetris shows that a good game needs neither a<br />
captivating storyline nor a great presentation to be successful. But the team decided at the very<br />
beginning to create a game with a continuous story and a real main character with emotions and his<br />
own history. Although a story-based game requires a lot more work, it is a whole different<br />
experience for the user, because he is able to identify with the protagonist.<br />
For the successful creation of a game, a sophisticated concept covering details of the gameplay as<br />
well as the intended design is indispensable, unless the game is designed by a single person. The<br />
first step for the whole team, or at least all leading game designers, is to sit together and<br />
envision their ideas for the final game. These visions will naturally differ from person to<br />
person, which means a compromise has to be found in the form of a detailed paper, including all<br />
features and design specifications for the final
game. The first concept session for Assembler Bay brought up more questions than answers, because<br />
the visions of the three team members differed greatly. If a concept session starts with too many<br />
differences, it has to be subdivided.<br />
A game concept can be divided in four main categories:<br />
• Setting<br />
• Presentation<br />
• <strong>Game</strong> Mechanics<br />
• Story<br />
The setting of a game is the basic situation the game represents. It includes the fundamentals of<br />
the virtual reality in which the game is situated. A setting can be expressed in just a single<br />
sentence or even just a few drafts. Although it may seem that the three other categories already<br />
cover the setting, as part of the presentation and the story, it is important to agree on a<br />
certain setting before the remaining points are discussed. Because the setting of a game limits<br />
the possible game mechanics and even features of the presentation, it is the first point to be<br />
handled. The different ideas of the team members were unified into the setting of Assembler Bay:<br />
Assembler Bay is a 3D Puzzle Game, taking place in a fantastic future<br />
virtual reality. The main character is represented by his virtual alter ego<br />
in the computer world. His mission is the recovery of the virtual network,<br />
which is threatened by a superior entity. To do so, the protagonist has to<br />
reconstruct several parts of the virtual reality which were destroyed by<br />
the hostile entity, and subsequently defeat the antagonist.<br />
Although many details of story and gameplay are not defined in this outline, it is a good setting<br />
for a game. For a setting to be good, it has to be accepted by all team members and must not<br />
specify details which will certainly be subject to change afterwards. An early agreement on a<br />
certain setting allows the team to spend their energy on the right topics. If a setting is too<br />
vague, whole concepts will be developed in vain, because they do not share a common base. This<br />
setting provides enough key points to start the design of concepts for game mechanics,<br />
presentation and story elements.<br />
Since the protagonist is a virtual character in a computer-generated environment, he can have exceptional abilities. The whole environment can be designed in any imaginable shape, since it does not represent a continuous world but parts of a damaged virtual network. These are key points for the game mechanics, even though it was not even necessary to decide whether the protagonist is male, female or human at all. The same is true for the antagonist. Although his influence on the game and the virtual reality is stated, no details about his nature are determined. The antagonist could be a human, but he could also be a program or a group of various programs. The antagonist could be hostile, or just accidentally causing problems without having any influence on the game mechanics. This setting leaves enough story points open for later discussion, but sets a concrete base for the development of game mechanics and presentation.
3.2 Presentation – The Look of a Game
Although many points were still very vague in the first idea for Assembler Bay, including the name itself, the target ambiance of the game was already determined. Assembler Bay will take place in a clean science-fiction environment with very clear and shining colors. The main color in the first drafts is white, with strong color accents in the level architecture. An obvious inspiration is the game Mirror's Edge by Electronic Arts, which impresses with vivid colors and various effects. One of these effects is HDR Rendering 51, which gives every scene a maximum of contrast and saturation. Another is a subtle bloom effect on the brightest tints, further increasing the feeling of shining colors. The combination of these effects together with strong color accents creates a unique look (see Figure 3.1 52).
Figure 3.1: Vivid Colors in Mirror's Edge
51 HDRR = High Dynamic Range Rendering, a rendering technique using a higher range of possible colors than the output display naturally supports. This results in higher contrast and more detail.
52 Picture taken from [http://on-mirrors-edge.com/incgalleryview.php?file=14.jpg]
Although Mirror's Edge serves as a guide for Assembler Bay, this project does not try to copy its look, but rather to evolve its own look inspired by it. One key difference for Assembler Bay is lighting, because the game takes place in a virtual environment, which does not need to obey natural laws. The concept for Assembler Bay suggests a discrete level architecture, where each level is situated in its own environment and transfer between levels is restricted, because the levels represent separated parts of a virtual network. A virtual reality can be equipped with any number of different light sources, while a game in a real outdoor environment needs to simulate the sky. This includes sunlight, the moon and, depending on the game, even a realistic day-and-night cycle. Indoor scenes are still bound to visible light sources, such as light bulbs or neon lamps (see [MIC06]). If a scene is simply lit, without the presence of a traceable light source, the game will look unrealistic. Assembler Bay, however, aims for a look between realism and fiction: the virtual reality is reminiscent of the real world in obeying some laws of reality, but on the other hand some features of the world in Assembler Bay will always look unrealistic and remind of the uncertainty and the threats this virtual world bears.
The second key difference between Assembler Bay and realistic games is level architecture. A level in Assembler Bay can use any imaginable geometry, without needing to be physically possible. Furthermore, discontinuous levels allow for great differences between levels, whereas a coherent world needs a coherent representation. The presentation of Assembler Bay is designed to create the ambiance of a virtual reality reminiscent of a high-technology science-fiction world. The key to this look is the design of a detailed and clear level architecture. But the level design alone is not enough to create a unique look. The rendering component of the game engine greatly contributes to the final look of the game. The resulting ambiance or style of a game is a composition of object design, texturing and rendering.
3.2.1 Visual Features for Assembler Bay
The concept phase of a game includes setting up requirements for the game engine. The requirements for the rendering component have to be adapted to the game, since a high-quality graphics engine is specialized for certain circumstances. Many different rendering techniques for various effects and situations exist, but there was neither the need nor the time to implement all promising available techniques. The team created a list of effects and techniques which would improve the image quality of Assembler Bay and contribute to the aspired look. This list is ordered by importance and will be processed and implemented from top to bottom.
Desirable Graphic Features for Assembler Bay:
• High Quality Dynamic Lighting
• Dynamic Soft Shadows
• Ambient Occlusion
• Overexposure
• Multi Sample Anti Aliasing (optional)
• Bloom Effect (optional)
• Depth of Field (optional)
• HDR Post Processing (optional)
The first required feature is high-quality dynamic lighting, which goes beyond the standard lighting supplied by DirectX 9 graphics adapters. The lighting model for Assembler Bay should include various types of lights in particular shapes. These light types and their respective parameters are modeled on the lighting model provided by the 3D CAD 53 software the team used – 3ds Max by Autodesk. An implementation using the same parameters as specified in the 3D modeling software has one big advantage: the designed geometry can be completely lit in the 3D modeling software, which allows for instant rendering of lit scenes. The lighting information can then be transferred to the game and provide exactly the lighting the architect of the level intended. High-quality dynamic lighting also includes the features of Deferred Shading detailed in section 2.3.3.
3.2.2 Dynamic Soft Shadows – High Realism Through Shadows?
Dynamic shadows are a feature expected in every modern 3D engine. Because Ray Tracing and Radiosity are still far from being applicable in real-time applications, two major approaches for shadow generation have become established in recent years. Shadow generation in games has to balance performance and visual appearance, while realism does not matter to the user as long as the result looks believable. These two major approaches will be discussed in the following sections, since the shadowing algorithm has a great influence on the final look of a game. The first approach is Stencil Shadow Volumes.
53 3D Computer Aided Design – software for 3D modeling and animation
3.2.3 Stencil Shadow Volumes – Pixel-Perfect Hard Shadows
The basic idea behind Stencil Shadow Volumes is to generate a polygonal shadow volume from the object mesh itself. This shadow volume represents the volume in which light is occluded by the object. It is rendered to the Stencil Buffer 54 in a way similar to the rendering of light volumes in Deferred Shading (see section 4.3.5). Once the Stencil Buffer holds information about which parts of the scene are in shadow, the light can be applied to all pixels not marked in the Stencil Buffer. The benefit of Stencil Shadow Volumes is resolution independence: all shadows will always have pixel-perfect sharp edges, because polygons are used for shadow determination (Figure 3.2).
Figure 3.2: Stencil Shadow Volumes in Doom 3, producing hard edges
However, the complex part of producing Stencil Shadow Volumes is the generation of the shadow volume geometry. To generate a shadow volume, the boundary between lit and shadowed polygons with respect to a certain light source has to be found. This can be done by separating the polygons into front-facing and back-facing polygons as seen from the center of the light source. The edges between those polygons are extruded away from the light source towards infinity. This extrusion really needs to be done to infinity, because any static value could be too small if the light source is very close to an object. For objects with complex geometry the generation of shadow volumes can be very time-consuming, as many polygons have to be drawn. Although an approach using the Depth Buffer for geometry-independent shadow volume generation exists, it is not very practical, as it introduces aliasing and does not scale well to complex objects.
54 The Stencil Buffer is an auxiliary rendering buffer that can store additional per-pixel information
Although used in some recent games (e.g. Doom 3), Stencil Shadow Volumes have some serious drawbacks. One drawback is the performance for complex geometry, as the shadow volumes will have many small edges stretched over the whole screen but filling only a few pixels, which produces high drawing costs. Another downside is the perfectly sharp shadow edges, which seem very unnatural if a game uses ambient occlusion or other soft-shadowing effects.
But the most important reason this project will not use Stencil Shadow Volumes in an open-source community engine is the strict requirements on mesh geometry. Because the shadow volumes are created from object polygons, all objects are required to be completely closed meshes. A single open polygon or unconnected edge could crash the whole game, or at least produce severe artifacts. Professional studios can invest a great amount of time into the quality of 3D models, but community games are usually developed as a hobby and use models which cannot guarantee these strict requirements.
3.2.4 Shadow Mapping – Simple, Compatible and Somewhat Blurry
The second technique, which is used in most games, is Shadow Mapping. The idea behind Shadow Mapping is that every shadowed pixel in a scene needs to be occluded by something, i.e. something is blocking the light rays from reaching this pixel. If a pixel is occluded, it is not visible from the view of the light, as another pixel is closer to the light at this particular point. Since the Depth Buffer is designed to store the distance of pixels to a viewport, Shadow Mapping reuses this available technology. First the whole scene is rendered to a Depth Buffer from the light viewport 55. This Depth Buffer is used in the lighting draw call, where every pixel drawn from the camera viewport is projected into the light viewport and compared against the stored value in the Depth Buffer. If the stored value is smaller than the calculated depth value of the pixel, the pixel is shadowed. Otherwise the pixel is directly lit by the light source and lighting can be applied as usual.
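The per-pixel comparison described above can be summarized in a few lines. The Python sketch below is illustrative only — in practice this runs in a pixel shader; the projection callback and the depth bias (a common remedy against self-shadowing caused by limited depth precision) are assumptions of this sketch:

```python
def shadow_map_test(depth_map, light_proj, world_pos, bias=1e-3):
    """Decide whether a camera pixel at world_pos lies in shadow.

    depth_map  : 2D list, the scene rendered from the light viewport,
                 storing the depth of the closest surface per texel
    light_proj : callback projecting a world position to (u, v, depth)
                 in the light viewport (an assumption of this sketch)
    bias       : small offset to suppress self-shadowing from the
                 limited precision of the Depth Buffer
    """
    u, v, depth = light_proj(world_pos)
    stored = depth_map[v][u]      # closest surface the light can see here
    return depth - bias > stored  # True -> something occludes this pixel
```

Everything else in a shadow-mapping renderer is bookkeeping around this single comparison: rendering the depth map, projecting camera pixels into light space, and filtering the binary result.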
However, Shadow Mapping has one severe drawback in contrast to Stencil Shadow Volumes: aliasing. Because the shadow map is a bitmap, it has limited resolution, which results in visible pixel borders if one shadow map pixel is projected onto multiple pixels of the final image. This aliasing does not occur with Stencil Shadow Volumes, because the Stencil Buffer used is mapped 1:1 to the final image. Another drawback is the limited precision of the Depth Buffer, which leads to self-shadowing or discontinuous shadow boundaries, because the difference in the depth values of two surfaces very close to each other can be smaller than the smallest step in the Depth Buffer. The aliasing problem of Shadow Mapping can be seen in Figure 3.3.
55 The light viewport is a virtual camera which can see everything directly illuminated by this light
Figure 3.3: Shadow Map Aliasing (1024x1024 shadow maps, PCF only)
Many drawbacks of Shadow Mapping can, however, be addressed by using more sophisticated algorithms. The comparison and implementation of these algorithms will be detailed in section 4.4, where Variance Shadow Mapping is implemented for Assembler Bay.
3.2.5 Ambient Occlusion – Realtime or Static?
Ambient Occlusion is another shading technique, which simulates diffuse reflections and occlusion of surfaces. The basis for Ambient Occlusion is the fact that even objects which are not directly lit by a light receive indirect lighting from diffuse reflections off the surrounding surfaces. On the other hand, surfaces which are close to each other block the light for each other, darkening both. Ambient Occlusion creates a strong depth profile for scenes, even with only few light sources. Using this effect, scenes can have a convincing spatial look even without the use of any textures or different colors.
However, since Ambient Occlusion depends on the relation of many surfaces to each other, the cost of dynamic real-time Ambient Occlusion is very high. Various methods for the simulation of Ambient Occlusion exist, but none of them is really suited for a real-time application on current hardware. NVIDIA released a paper on Screen Space Ambient Occlusion in September 2008, describing a method of faking Global Illumination by using the depth buffer (ref. [BUN05]). The depth buffer contains a height map of the part of the scene visible to the camera. Although SSAO 56 misses a great part of the scene which is not visible to the camera, it still produces convincing results and is used in titles like S.T.A.L.K.E.R. or Assassin's Creed.
56 Screen Space Ambient Occlusion uses the Depth and Normal Buffer of the final image for its calculations rather than the whole scene geometry.
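The screen-space idea can be illustrated with a toy version operating on a plain 2D depth image: for each pixel, nearby depth samples that are closer to the camera than the pixel itself count as occluders. The sketch below is not the published algorithm — the sample count, the random sampling pattern and the falloff constant are all assumptions for illustration:

```python
import random

def ssao(depth, x, y, radius=2, samples=8, strength=4.0):
    """Toy screen-space ambient occlusion on a depth image given as a
    row-major list of lists (larger value = further from the camera).
    Returns an accessibility factor: 1.0 = fully open, towards 0.0 = occluded."""
    h, w = len(depth), len(depth[0])
    center = depth[y][x]
    occlusion = 0.0
    for _ in range(samples):
        dx = random.randint(-radius, radius)
        dy = random.randint(-radius, radius)
        sx = min(max(x + dx, 0), w - 1)   # clamp samples to the image
        sy = min(max(y + dy, 0), h - 1)
        diff = center - depth[sy][sx]     # positive -> neighbor is closer, occludes
        if diff > 0:
            occlusion += min(diff * strength, 1.0)
    return max(0.0, 1.0 - occlusion / samples)
```

A pixel on a flat surface keeps full accessibility, while a pixel at the bottom of a pit is darkened — exactly the depth profile described above, computed only from what the camera sees.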
Still, even this approximated effect has a severe impact on performance, which is why the team decided to use pre-rendered Static Ambient Occlusion. Static Ambient Occlusion is calculated in the 3D modeling software beforehand, utilizing Global Illumination and Radiosity. The resulting light maps of these calculations are written to textures and used in the final game as a basis for the diffuse color textures. This method requires no calculations in the final game and produces great results for static geometry. However, since the lighting textures are static in the final game, moving objects will not have dynamic Ambient Occlusion, leading to wrong lighting for these objects in the game. But Ambient Occlusion is a very subtle effect, having the biggest impact on self-shadowing surfaces or the occlusion caused by very big objects. Small moving objects have only little effect as occluders and will hardly decrease visual quality.
3.2.6 Additional Effects for a Unique Look
Another important effect for the desired look of Assembler Bay is Overexposure, which adds ambient lighting to the scene, resulting in shining colors. The basic idea of Overexposure is to let all colors shine with maximum intensity regardless of how much light actually reaches the surface. Without other effects this would result in a completely flat picture, lacking any sense of depth. But in combination with Ambient Occlusion, specular highlights and dynamic shadows a unique look can be achieved. The whole scene has a bright and illuminated look including bright specular reflections on glossy surfaces. On the other hand, rifts or occluded areas are darker and all objects cast strong shadows on their surroundings. The effect of Overexposure can be seen in Figure 3.4.
Figure 3.4: Comparison of Overexposure and Normal Lighting in Assembler Bay
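The combination of effects behind this look can be sketched as a per-pixel composition: the surface color is shown at full intensity regardless of incoming light, then darkened by ambient occlusion and cast shadows and brightened by specular highlights. The following Python fragment is a simplified illustration of the idea only — the 0.5 shadow weight and the additive specular term are assumptions, not the actual shader used in Assembler Bay:

```python
def overexposed_shade(albedo, ambient_occlusion, in_shadow, specular):
    """Illustrative per-channel combination for an overexposure look.

    albedo            : RGB tuple in [0, 1], shown at full intensity
    ambient_occlusion : accessibility factor in [0, 1] from the AO pass
    in_shadow         : result of the dynamic shadow test for this pixel
    specular          : additive highlight contribution in [0, 1]
    """
    shadow_factor = 0.5 if in_shadow else 1.0   # assumed darkening weight
    return tuple(
        min(1.0, c * ambient_occlusion * shadow_factor + specular)
        for c in albedo
    )
```

Depth perception in this model comes entirely from the AO, shadow and specular terms, since the albedo itself carries no lighting information.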
The left side of Figure 3.4 shows traditional lighting of the scene, the right side shows only colors with Ambient Occlusion, and the center part shows the result of Overexposure with shadows and highlights. Since time and resources for this project are limited, all effects which are not essential for the look of the game are labeled optional. These effects would be beneficial for the final visuals, but are not required as essential features.
The first optional effect is Multi Sample Anti Aliasing (MSAA), which improves visual quality by interpolating edges from multiple samples per pixel in the final image. MSAA is natively supported by modern graphics adapters and hence available in most games. But MSAA has limitations if special rendering techniques or effects are used which utilize multiple render targets or floating-point textures. These include Deferred Shading, HDRR and certain methods of motion blur or Depth of Field. Since Assembler Bay uses Deferred Shading, which is incompatible with MSAA, an implementation would only be possible with certain changes to the render architecture.
The next optional feature is a Bloom effect. Bloom is a post-processing effect, which is addressed in section 1.4. It operates solely on 2D color data and can be implemented independently of the rendering technique used. Bloom usually works on certain colored or exceptionally bright parts of an image and blurs these colors, resulting in a bleeding of these shades over neighboring objects. Bloom simulates a natural effect: for example, if light shines into a dark room through a small window, the light will bleed over the window border and appear brighter (ref. [MIC06]). Depth of Field is another post-processing effect, but in contrast to Bloom it requires spatial information in addition to the pixel color. Depth of Field simulates the limited depth range within which a camera or the human eye can see sharply at any one time. Everything outside of this depth range appears blurred. Depth of Field uses a certain predefined or calculated depth range for the viewport and blurs every part of the image which is closer or further away.
Figure 3.5: Pre-Rendered Depth of Field and selective Bloom for Assembler Bay
Depth of Field can concentrate the user's attention on a certain space or object in the scene and create a feeling of great distances. Figure 3.5 shows a pre-rendered simulation of Depth of Field in addition to a selective Bloom filter. A selective Bloom filter is a special case of the Bloom effect: instead of extracting the affected regions with a brightness filter, it uses scene information, like a certain object id rendered into an auxiliary buffer.
3.3 Game Mechanics – Concepts for a 3D Puzzle Game
The basic concept of Assembler Bay is a 3D puzzle game, but a much more detailed concept of game mechanics is needed for the final paper 57. Since a game is not limited to its primary game element, the possible game elements for a puzzle game can be divided into three sections.
These three sections make up the final game experience:
• Primary Puzzle Mechanics
• Supplemental Elements
• Secondary Game Mechanics
Primary Puzzle Mechanics are the core gameplay of a puzzle game. If a game is designed without any storyline or additional features, the primary game mechanics will be sufficient for the whole game. In Assembler Bay, Primary Puzzle Mechanics are required for the user to interact with the 3D puzzle. Assembling complex objects in free 3D space is a difficult task, with six degrees of freedom for the position and alignment of a tile. It would be very difficult for the user to start with a tile floating freely in 3D space and attach other parts to it. In addition to the complex task of finding the right part to attach, the user would also have to watch the remaining free space and adjust the tiles to fit exactly. The team decided that Assembler Bay should be an action-oriented game and consequently the Puzzle Mechanics should be as intuitive as possible. Assembler Bay will have time-limited puzzle stages, where the user has to react fast and attach the right tiles in rapid succession.
A speed-oriented gameplay requires a less complex puzzle structure. The team decided on a hierarchical puzzle structure, where each puzzle includes a single center part as a starting point for assembly. This center part is placed automatically in the right spot and the user has to attach the remaining parts to it. The puzzle is not limited to a single layer: while independent parts can be attached to the center part in any sequence, other tiles require the prior attachment of certain parts. Any attached tile can also provide additional connectors for subsequent tiles, which can only be attached after their parent tile has been attached. To limit the degrees of freedom and the precision required of the user, each connection between two parts is represented by two compatible connectors. This way the user has a discrete number of possible positions where a puzzle tile can be attached, leaving only the rotation around the connector as a stepless degree of freedom. Another idea is the introduction of particular patterns on the connectors, which would identify the correct orientation of a tile. However, this idea is labeled optional, because tests are needed to verify whether it improves gameplay.
57 Final paper refers to the final game design document, including all guidelines for the game
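The hierarchical dependency between tiles can be sketched as a small data structure: a tile can only be attached once its parent tile is in place, and tiles without a parent connect directly to the automatically placed center part. Class and method names below are illustrative, not the actual game code:

```python
class PuzzleTile:
    """Sketch of the hierarchical puzzle structure described above."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # None for tiles connecting to the center part
        self.attached = False

    def can_attach(self):
        # independent tiles may be attached in any sequence; dependent
        # tiles become available only after their parent is attached
        return self.parent is None or self.parent.attached

    def attach(self):
        if not self.can_attach():
            raise ValueError(f"{self.name}: parent tile not attached yet")
        self.attached = True
```

In this model the set of currently attachable tiles is simply all unattached tiles whose `can_attach()` is true, which is what a time-limited stage would present to the player at any moment.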
The Primary Puzzle Mechanics are detailed enough with these facts, but Assembler Bay aims to be a game with a continuous story and an elaborate main character. A simple row of puzzle stages can neither tell a dramatic story nor include an individual protagonist, because a storyline needs continuous interaction with the character. This leads to the requirement of a continuous level design, which allows the user to identify with his character and spend time with him.
One way to integrate a story into a puzzle game is the implementation of cinematic sequences between stages, which tell a story using cutscenes, pictures or in-game animations. Another way is the design of freely accessible worlds around the puzzle stages. If the player can walk around and interact freely with the environment, he will spend more time in direct control of the protagonist. This form of free interaction helps the user identify with the main character and promotes the level geometry from simple background scenery to an active element of gameplay. The story can be experienced in direct interaction with the environment or other characters.
Supplemental Elements are the parts of the game mechanics which have a direct connection to the core element of gameplay but are not included in the Primary Game Mechanics. There are several Supplemental Elements in Assembler Bay. The first is the collection of the puzzle tiles, which are scattered throughout the level. The protagonist has to master several jump-and-run passages and gather all tiles at the puzzle stage. The collection part includes smaller challenges, where certain obstacles have to be cleared to reach a tile. Examples of these obstacles are closed doors, lowered bridges or inactive lifts, which have to be activated. An excerpt of the draft for the Assembler Bay manual, concerning interaction with the environment using collected puzzle tiles, can be seen in Figure 3.6. The user can use collected tiles to activate unreachable buttons.
Figure 3.6: Excerpt of a draft for the Assembler Bay manual
Secondary Game Mechanics are additional gameplay features which are not immediately related to the Primary Puzzle Mechanics. Since the team wants to concentrate on the core gameplay, they are only optional in the design of Assembler Bay. As an additional reward for clearing a level, the user may be able to access an action game sequence, where he uses an assembled object to reach the next level. This action sequence should be a simple and very fast game of reaction, in contrast to the calm atmosphere of assembling the puzzle. This could be a high-speed flying sequence with an assembled glider through a canyon, or a shoot-out with an assembled battle robot. But these elements are only optional, since they run contrary to the idea of a unique gameplay and may frustrate certain users who dislike action games and prefer Assembler Bay precisely because of its lack of fighting sequences.
4 Implementation
This chapter will detail the main parts of the implementation of the features mentioned in the previous chapters. The first section will describe the implementation of the game engine as the base for Assembler Bay. Included in the game engine is a physics engine, which provides a good and robust simulation of physics for the game.
The second and third sections will describe the implementation of the graphics engine for Assembler Bay and detail the implementation of its two main features – Deferred Shading and Shadow Mapping.
4.1 Game Engine – The Core for Assembler Bay
The required features of a game engine have been depicted in section 2.1. The subsequent sections detail the implementation of these features in the form of an open source game engine. The engine developed in this project is titled QuicknEasy Engine (QEE). The programming language used in this project is C#, utilizing the XNA libraries. The Microsoft XNA Framework is a game development framework extending the .NET software platform and supporting DirectX9 or DirectX10. The use of these widely distributed technologies is preferable, because most hardware supports the interfaces provided by DirectX9 and many hardware-specific issues can be ignored.

Although the game was originally planned primarily for the XBOX 360, the platform used for development is a PC running Microsoft Windows XP. This decision is based on the fact that development on the XBOX 360 entails several complications: Microsoft restricts development on the XBOX 360 through particular licensing obligations, and costly accessories and special licenses would have to be acquired, which would have been a hindrance for this project. The subsequent sections detail certain aspects of the implementation which represent important milestones in the development of this project. Many less important details of the implementation are not explained in depth, as the whole source code of the engine is publicly available.
4.1.1 Engine Structure – Organization Is the Key
The implementation of the mentioned features for a game engine requires a coherent design of classes and their relations. A class diagram with the basic engine structure of QEE 58 can be seen in Figure 4.1. It includes all important classes which provide basic functionality for the final game engine. The classes are connected with three different types of arrows to symbolize single references, multiple references and inheritance. The diagram is hierarchically arranged to provide an overview of the control mechanisms between the various classes.
Each class in the diagram is designed to control all classes beneath it. Control is an abstract design concept, implemented as initialization, regular update calls and disposal of the controlled classes. Organizing a game engine hierarchically has the advantage that every class only needs to manage a particular subset of classes. If, instead, all classes were registered as independent game components updated and drawn by the framework, an additional structure would be needed to manage groups of entities. In contrast, a hierarchical engine provides easy methods to pause or reactivate a whole group of entities via its controller.
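The control hierarchy described above can be sketched as follows. This is a minimal illustration only; the class `EngineNode` and its members are invented for this example and are not the actual QEE classes:

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch of hierarchical control: every node updates the nodes
// registered underneath it, but only while it is active, so deactivating a
// controller pauses its whole subtree at once.
public class EngineNode
{
    public bool Active = true;           // controls this node and its subtree
    public int UpdateCount;              // how often this node has been updated
    private readonly List<EngineNode> children = new List<EngineNode>();

    public void Add(EngineNode child) { children.Add(child); }

    public void Update(float elapsedSeconds)
    {
        if (!Active) return;             // an inactive controller stops the call here
        UpdateCount++;
        foreach (EngineNode child in children)
            child.Update(elapsedSeconds);  // forward the call down the hierarchy
    }
}
```

Pausing a Screen in this scheme implicitly pauses every Layer and Scene it controls, without any extra bookkeeping of component groups.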
Figure 4.1: Class Diagram Showing the Basic Structure of the Game Engine
The basic class in this structure is the QEEGame class, which holds references to several instances of the Screen class. A Screen represents a certain state of the game, like the main menu or the ingame view. The QEEGame class also provides a reference to the InputManager and the preferred DirectX graphics device. It can be accessed by any class in the game via a static attribute referencing the sole active instance.
Each Screen contains a BindingSet, which is registered with the InputManager and represents a list of key bindings that are active as long as the owning Screen is active. The InputManager stores a list of all BindingSets and calls the stored callback method of any active key binding when the mapped event occurs. A Screen contains multiple Layers, each representing a graphical viewport of a scene. Every Layer contains a graphics component which draws all objects relevant for this Layer in a Scene. All update and draw calls by the game engine are forwarded through the hierarchy as long as all members are set active respectively visible.
58 QEE = QuicknEasy Engine, the game engine developed for Assembler Bay
The Scene class is the third component, besides the InputManager and the Screen, that is directly updated by the engine. A Scene controls a collection of Entities which can interact with each other. It represents the collection of all objects in a level, or in the whole game world, which can possibly interact with each other. A Scene can be drawn by multiple Layers, but each object can only be present in a single Scene at any point in time. 2D and 3D objects can be present in the same Scene, for example HUD elements and the ingame character. The renderer of each Layer decides which Entities in the collection are drawn on that particular Layer.
An Entity can represent any object in a scene, regardless of whether it is visible or interacts with other objects. Two basic classes are derived from the Entity class: Entity2D and Entity3D. Although most objects can be assigned to one of these two classes, it is possible to add Entities to a scene which are only derived from the base class Entity. These are objects which can neither be assigned to a certain owning Entity, nor represent an object in 2D or 3D space. In contrast, Entity2D represents objects in the 2D screen space of the final image. The class includes attributes for position, scale and orientation of the object, as well as its velocity and angular velocity. This class is used for HUD elements, menus and on-screen messages.
Entity3D represents an object in the 3D space of the game world. It provides the 3D counterparts to the attributes of an Entity2D object, as well as a relative velocity and a relative angular velocity. These attributes specify velocities relative to the local world orientation of the object. Although not required for some objects in 3D space, they are mandatory, as they simplify interaction with most objects and only produce minimal calculation overhead.

Objects derived from Entity3D are the main character, lights, puzzle tiles, the camera, collision geometry and the level architecture. Because all attributes of Entity2D and Entity3D are designed to be distinct, it is possible to derive a class from both; such an object can provide a 2D and a 3D representation for the same Entity. Each Entity3D includes a list of Updater3D objects, which are controlled by the Entity and provide modular features for their owner. To control the interaction of all Entities participating in physical reactions, the CollisionManager class is updated by the Scene. It holds references to all physics-activated Entities in the scene and informs their Physics Updaters of any collisions with other objects before their positions are changed.
4.1.2 Smooth Skin Animation – Now It Can Walk!
The basics of animation are the same for television and video games: a sequence of frames with slight variations between them, displayed at 24 fps or more, is perceived as motion. Most modern computer displays feature a refresh rate of 60 Hz, which implies that 60 fps is the desired frame rate for every modern game. Although higher frame rates are possible, the animation may even deteriorate if the game runs at a frame rate that does not divide evenly into the display's refresh rate, since an irregular number of frames has to be skipped. While this aspect of smooth animation depends on performance and on the renderer and framework architecture, modern animations introduce different problems. Since the frame rate of an animation is very high and not predetermined beforehand, animations are specified via keyframes. In 3D animation, a keyframe specifies the exact position 59 of all vertices or bones at a certain point in time. To get the state of the object at a moment not specified by a keyframe, the neighboring keyframes have to be interpolated to a satisfying result.
Figure 4.2: Various Keyframes from a Jump Animation of the Main Character
This blending results in much lower storage costs for 3D animations, since only a few keyframes need to be stored instead of the whole frame sequence. Three different types of 3D animation are used in modern games. The first is Morph Target Animation, which is the most fundamental type. A Morph Target Animation specifies the position of each vertex in a mesh for every keyframe. The positions are interpolated between keyframes and result in a smooth animation. Morph Target Animations can be used for almost every type of animation, because each vertex can move independently. The downside of this high flexibility is the high bandwidth needed for storage and transfer of the vertex data: if a character model with 100,000 vertices is animated, the positions of all those vertices have to be recalculated and sent to the rendering pipeline every frame.
59 If bone animation is used, the orientation is the most important feature saved for a certain time
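As a rough illustrative estimate (assuming uncompressed positions of three 4-byte floats per vertex and a 60 fps target, figures not taken from the original text), the position data alone already amounts to

```latex
100\,000 \text{ vertices} \times 12\,\text{B} \times 60\,\text{fps} \approx 72\,\text{MB/s}
```

before normals, texture coordinates or any further vertex attributes are counted.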
Because Morph Target Animations have this high bandwidth usage, Bone Animations are needed for objects with a high polygon count. Bone Animations are separated into two categories: Simple Bone Animations and Weighted Bone Animations.

A Simple Bone Animation attaches all meshes contained by an object to individual bones. A bone is an abstract component which is only represented by a transformation matrix, but can be visualized using geometry. If a bone is moved, rotated or scaled, the attached meshes are transformed accordingly. If a mesh only needs to be transformed with linear transformations, Simple Bone Animation only needs to transfer a single transformation matrix to the graphics adapter. More complex animations are possible using Weighted Bone Animation, where each vertex in the mesh is influenced by one or more bones with different strengths. Only the updated bone transformation matrices have to be transferred to the graphics adapter, so smooth skin animation can be achieved using minimal bandwidth and only a few matrix interpolations for the animated bones.
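The blending performed by Weighted Bone Animation can be written as a weighted sum over the bone transformation matrices: for a vertex $v$ influenced by $n$ bones with matrices $M_i$ and weights $w_i$,

```latex
v' = \sum_{i=1}^{n} w_i \, M_i \, v , \qquad \sum_{i=1}^{n} w_i = 1
```

so only the few matrices $M_i$ change per frame, while the weights are static vertex attributes.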
Because Bone Animations are not suited for complex varying movements like facial expressions or fluid simulations, a third type of 3D animation covers these cases. Procedural Animations save animation data in an individual format specialized for a particular type of animation; facial expressions, for example, can be saved as the combination of various individual mimic movements. Assembler Bay uses only Bone Animation for all types of 3D animation. Bone Animation requires smooth interpolation, not only between neighboring keyframes, but also between different animations.
The main character can perform a variety of different actions, each requiring its own animation, and smooth transitions are required between all consecutive animations. This could be realized by adding transitional animations for every possible combination of subsequent animations, but that would result in a high number of animations which can just as well be calculated at runtime: the smooth interpolation between neighboring keyframes can be adapted to interpolate between subsequent animations. The interpolated transformation matrices have to be calculated for every bone and sent to the graphics adapter.
Every keyframe contains transformation matrices only for bones which move in this particular frame. These matrices have to be blended with the matrices of the following keyframe. Linear interpolation between transformation matrices, however, results in unwanted artifacts, as blending two matrices containing rotational and translational components may introduce deformation 60. To deal with this problem, the matrices are decomposed into scale, rotation and translation, and each component is interpolated individually. Additionally, the interpolation between animations is done in a way that lets both animations keep playing while they are gradually blended.
60 Deformation refers to non-uniform scaling or unwanted scaling of certain parts of an object.
// For all bones attached to a mesh
for (int i = 0; i < BoneCount; i++)
{
    // The blendamount is a value between 0 and 1 indicating
    // the relative time position between the two frames
    if (!nextframe.HasTransform(i) || blendamount == 0)
    {
        // If the next frame does not specify a new matrix, keep the active matrix
        boneTransforms[i] = lastKeyTransforms[i];
    }
    else if (blendamount == 1)
    {
        boneTransforms[i] = nextframe.GetTransform(i);
    }
    else
    {
        Vector3 scale, scale2, trans, trans2;
        Quaternion rot, rot2;
        // Both matrices are decomposed into their scale|rotation|translation parts
        lastKeyTransforms[i].Decompose(out scale, out rot, out trans);
        nextframe.GetTransform(i).Decompose(out scale2, out rot2, out trans2);
        // All parts are interpolated individually and composed to the final matrix
        boneTransforms[i] =
            Matrix.CreateScale(Vector3.SmoothStep(scale, scale2, blendamount)) *
            Matrix.CreateFromQuaternion(Quaternion.Slerp(rot, rot2, blendamount)) *
            Matrix.CreateTranslation(Vector3.SmoothStep(trans, trans2, blendamount));
    }
}
Code Segment 4.1: SkinnedModel.AnimationPlayer : line 178-198 (commented)
4.2 Physics Engine – Collisions and Reactions
As depicted in section 2.3.1, the first important step for every physics engine is collision detection. This can be done via various bounding volumes or per polygon, and needs 3D space partitioning algorithms for complex scenes. Different methods of space partitioning are available, and each suits a certain need. The key to finding the right space partitioning for a game is to analyze the level architecture and the possible locations of objects relative to each other. A space flight game, for example, takes place in a widespread space where only a few objects fly around – collision detection is only needed when two objects are somewhat close to each other. Jump and Run games, on the other hand, are usually quite one-dimensional in their level design, which means a single axis is sufficient for space partitioning. The deciding factor is the distribution of objects in the level.

Most 3D games need at least two axes for space partitioning, because the level spreads across a big landscape while the vertical position of objects usually stays within a certain smaller range. Space partitioning can be applied to the objects in a scene, but also to the polygons in a mesh. Which methods are required depends on the game, and in most cases both are used independently.
4.2.1 Binary Space Partitioning – Divide and Conquer
Since Assembler Bay is designed for small indoor levels, global space partitioning would not be of much use. The whole level geometry is a single entity in the game, which results in all objects being inside the collision range of the level at all times. Instead of implementing a complex space partitioning for objects, all moving objects are tested for being within range of another object's bounding sphere. The complex part is the per polygon collision detection if the first test is positive. The available possibilities include oct-trees and kd-trees 61. An oct-tree separates the space into 8 cuboid volumes of the same size. Each of these volumes containing more than one polygon is again divided into 8 cuboids. Although the generation of an oct-tree is very simple, it is not very efficient, because the space is evenly divided by size, regardless of how many polygons are in each part. Another downside is the possibility of many polygons lying in more than one cuboid, which makes the tree redundant. An example of the drawbacks of oct-trees can be seen on the basis of a quad-tree in Figure 4.3.
Figure 4.3: A quad-tree becomes redundant and very unbalanced when objects are grouped
In contrast to oct-trees, kd-trees are a bit more complex to generate, but can guarantee an optimally balanced tree for any arrangement of objects. A kd-tree sorts vectors by one of their dimensions, depending on the level of the tree; QEE uses 3-dimensional binary trees for space partitioning. Each level of the tree splits the remaining volume in two, leaving half of the remaining points on either side. This split plane is normal to the X-axis at the root of the kd-tree. Mathematically, all vectors are sorted by their X-coordinate values, the vector with the median X-value is attached to the root, the lower half of the remaining vectors is forwarded to the left child node and the other half to the right child node. On the next level the procedure is repeated for the Y-coordinates, followed by the Z-coordinates on the third level. The fourth level again uses the X-coordinate, repeating the cycle until an empty node is reached and the vector can be inserted.
61 kd-tree stands for a k-dimensional tree (in most cases a 3-dimensional binary tree)
If the kd-tree is generated or updated at runtime, sorting all objects for each node in the tree is too expensive 62. This results in the use of an either unbalanced or static kd-tree. Since Assembler Bay uses Simple Bone Animation for most objects, a static kd-tree can be generated at build time for every mesh. The only object with Weighted Bone Animation is the main character, which cannot use a kd-tree for per polygon collision detection, since the positions of its vertices change with the bone animation. However, per polygon collision detection for a moving character is not desirable anyway, since jump and run elements work much better with collision geometry. Another drawback of the runtime generation of BSP-trees for bone-animated meshes is the calculation of the resulting vertex positions: space partitioning operations are usually calculated on the CPU, whereas the skinning of meshes using Weighted Bone Animation is calculated solely on the GPU, saving many CPU cycles. The implementation of the build time generation of a kd-tree can be seen below.
// index and count specify the part of the array which is relevant for this branch
public static KDTree CreateFromPolyArray(Poly3[] Polys, int index, int count, int depth)
{
    // xyz indicates the coordinate which is used on this level of the tree
    int xyz = depth % 3;

    // If only a single node remains, recursion ends and the leaf is returned
    if (count == 1) return new KDTree(Polys[index], depth);

    // The relevant part of the array is sorted by the active coordinate
    if (xyz == 0) Array.Sort(Polys, index, count, PolyCompareX);
    else if (xyz == 1) Array.Sort(Polys, index, count, PolyCompareY);
    else Array.Sort(Polys, index, count, PolyCompareZ);

    int center = index + count / 2;
    int odd = count % 2 - 1;
    KDTree tree = new KDTree(Polys[center], depth);

    /// 17 lines of code skipped ///

    // Left and right branches are recursively generated if needed
    tree.LeftChild = CreateFromPolys(Polys, index, count / 2, depth + 1);
    if (count > 2)
    {
        tree.RightChild =
            CreateFromPolys(Polys, center + 1, count / 2 + odd, depth + 1);
    }
    return tree;
}
Code Segment 4.2: KDModelPipeLine.KDTree : line 47-77 (commented)
62 Calculation complexity: sorting ∈ O(n·log n) ⇒ repeated for each node: O(n²·log n)
Using this space partitioning, collisions can be detected by bounding box queries on the kd-trees of both objects. The bounding box of the first object is used in a volume query on the kd-tree of the second object, which returns a bounding box of all polygons of the second object that intersect the bounding box of the first object. Using this smaller volume, the query is repeated on the first object, resulting in an even smaller bounding box. The remaining bounding volume is used in a query on both kd-trees to get all polygons which possibly intersect. If at any step of this process a query returns null, no collision can occur and the path returns without further processing. The process of narrowing down bounding volumes is visualized in Figure 4.4.
Figure 4.4: Per Polygon Collision Detection Using Bounding Box Queries
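The volume queries used here can be sketched as a recursive walk over the kd-tree. This is an illustrative implementation; `KdNode` and its members are simplified stand-ins for the QEE classes, storing one representative point per node instead of polygons:

```csharp
using System;
using System.Collections.Generic;

// Illustrative axis-aligned box query on a kd-tree: a subtree is only visited
// if the query box reaches across the split plane stored at the current node.
public class KdNode
{
    public double[] Point;   // 3D position stored at this node
    public int Axis;         // split axis: 0 = X, 1 = Y, 2 = Z (depth % 3)
    public KdNode Left, Right;

    public void Query(double[] min, double[] max, List<double[]> hits)
    {
        bool inside = true;
        for (int d = 0; d < 3; d++)
            if (Point[d] < min[d] || Point[d] > max[d]) inside = false;
        if (inside) hits.Add(Point);

        // Points with a smaller split coordinate live in the left subtree,
        // so it only needs visiting if the box extends below the split plane.
        if (Left != null && min[Axis] <= Point[Axis]) Left.Query(min, max, hits);
        if (Right != null && max[Axis] >= Point[Axis]) Right.Query(min, max, hits);
    }
}
```

Whole subtrees on the far side of a split plane are pruned without ever touching their contents, which is what makes the repeated narrowing-down queries cheap.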
4.2.2 Physics Calculations – Let It Bounce!
If collision detection returns positive, the Physics Updater components of both colliding objects are called by the Collision Manager. The Physics Updater of each object handles the reactions resulting from the collision. If two objects intersect with more than one polygon, the Physics Updaters are called multiple times, once per collision, to produce an adequate reaction. Most objects in the game use the same Physics Updater: they are assigned a mass and a coefficient of restitution, and their reactions to collisions are calculated using simple Newtonian physics. Exceptions from this general handling are static level geometry, which does not possess a Physics Updater and is treated as immovable by other objects, and the character, which reacts differently to collisions, since it represents an acting entity.
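The Newtonian reaction can be illustrated along the contact normal in one dimension. This is a generic textbook formulation sketched for this text, not the literal QEE code; `e` is the combined coefficient of restitution:

```csharp
using System;

// Illustrative impulse-based collision response along the contact normal:
// the approach velocity is reversed and scaled by the coefficient of
// restitution e, distributing the impulse according to the two masses.
public static class ContactSolver
{
    public static void Resolve(ref double v1, ref double v2,
                               double m1, double m2, double e)
    {
        double approach = v1 - v2;
        if (approach <= 0) return;                       // already separating
        double impulse = (1 + e) * approach / (1 / m1 + 1 / m2);
        v1 -= impulse / m1;
        v2 += impulse / m2;
    }
}
```

With e = 1, two equal masses exchange their velocities; with e = 0 they move on together, which matches the intuitive behavior of bouncy versus sticky materials described in footnote 63.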
A special case is the handling of very low velocity collisions, which require special attention because they can severely deteriorate performance. Most objects in the game have a coefficient of restitution below one 63 and become slower over time. After a short while most objects lie on the floor, accelerated by gravity and immediately repelled by the floor with very small velocity. This results in a high number of collision handling calls without much movement. To improve performance, these objects are detected by their low velocities and the normal of the colliding surface (which has to face down if the object is lying on the floor). These objects are marked and gradually moved away from the colliding surface with low velocity until they no longer touch the floor.
// The partner has to be marked as static
if (CPartner.Bounds3D.Moving == Boundings3D.MovingModes.Still
    // The surface normals have to point up respectively down
    && Vector3.Dot(CNormal, Vector3.Up) > 0.8f
    && Vector3.Dot(CenterDist, Vector3.Up) > 0
    // The velocity and angular velocity have to be very low already
    && Owner.Speed3D.Length() < (Bounce / 0.4f) + 0.01f
    && Math.Abs(Math.Acos(Owner.Rotation3D.W)) < Bounce * 10 + 0.5f)
{
    Owner.RotationSpeed3D = Quaternion.Identity;
    Owner.Speed3D = Vector3.Zero;
    // The object is moved slightly away from its collision partner
    Owner.Position3D += CNormal * radius / 1000;
    Marked = true;
}
Code Segment 4.3: QuicknEasyEngine.PhysicsUpdater3D : line 68-76 (commented)
Although this code fixes the performance problems and other glitches resulting from low velocity collisions, it introduces some new problems. One of these is the balance of objects: even though an object is barely moving and colliding with the ground at certain points, this information is not sufficient to label the object static. A cube could touch the ground with only one edge, in which case the anticipated reaction would be for the cube to fall over and land on one side. To achieve this effect, the positions of all occurring collisions for an object are saved in a list and forwarded to the Physics Updater. The position of these points relative to the object's center of gravity is calculated and compared. If a plane can be found passing through the center of gravity with all collision points on one side of this plane, the object is not lying stable. In this case the object receives torque in the direction of the missing collision points.
63 The coefficient of restitution is a number indicating how much an object is repelled by another object; in reality this number lies between 0 and 1 (e.g. a chunk of butter would have a small value, as it sticks to the floor when dropped, whereas a rubber ball would have a high value)
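A reduced 2D version of the stability test just described might look as follows. This is illustrative only; the actual test works with a plane through the 3D center of gravity, and the class and method names are invented for this example:

```csharp
// Illustrative 2D stability test: an object resting on the ground tips over
// if all contact points lie on one side of the vertical line through its
// center of gravity, i.e. the contacts do not bracket the center.
public static class StabilityTest
{
    public static bool IsStable(double centerX, double[] contactXs)
    {
        bool left = false, right = false;
        foreach (double x in contactXs)
        {
            if (x <= centerX) left = true;   // support on or left of the center
            if (x >= centerX) right = true;  // support on or right of the center
        }
        return left && right;                // bracketed -> no tipping torque
    }
}
```

A cube resting on one edge has all its contacts on one side of the center line, so the test reports it unstable and torque toward the unsupported side is applied.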
4.2.3 Animated Objects – Neither Moving Nor Static
The second problem introduced by the marking of static objects is caused by animations. Since a big entity like the whole level geometry can be partially animated, marking the whole object as static can result in objects floating in midair or passing through each other, because the Collision Manager ignores static objects. To deal with this, the whole level geometry could be separated into individual entities for all meshes, but this would increase scene complexity and make debugging the game more difficult. The approach taken for QEE is the assignment of a Marker Array to each animated object. The Marker Array saves the animation state of each mesh in the object. This introduces a new object state for physics in addition to moving and static: animated. If an object is animated, some of its meshes can be treated as static, while others have to be processed as moving. The Animation Updater of an object saves this information in the Marker Array, which is evaluated by the Collision Manager.
Because performance can still break down under particular circumstances, an additional measure is implemented in the Collision Manager. If the number of polygons remaining after the final tree queries is too high, a certain number of polygons is skipped in collision handling. Because the list of polygons resulting from a kd-tree query is localized 64, skipping adjacent polygons will most likely only skip redundant collision handling. For most polygons in a mesh, collision handling of neighboring polygons produces a similar effect, so if only one of these polygons is processed, the resulting reaction will not differ much.
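The skipping heuristic can be sketched as a simple stride over the localized query result (a hypothetical illustration; the threshold, names and helper are assumptions, not the QEE values):

```csharp
// Hypothetical sketch of the polygon-skipping heuristic. Because the
// kd-tree query result is spatially localized, striding over it mostly
// drops neighbors that would produce near-identical collision responses.
const int MaxPolygons = 256; // assumed budget, not the actual QEE value

void HandleCollisions(List<Polygon> queryResult, RigidBody body)
{
    // Process every polygon while under budget, otherwise every n-th one.
    int stride = Math.Max(1, queryResult.Count / MaxPolygons);
    for (int i = 0; i < queryResult.Count; i += stride)
    {
        ResolveCollision(body, queryResult[i]);
    }
}
```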
4.3 Deferred Shading – The Visual Part of QEE

As depicted in section 2.2.3, Assembler Bay tries to create a unique look using Deferred Shading as a basis. The first important decision to take when choosing Deferred Shading is the format and distribution of attributes in the Geometry-Buffer. The Geometry-Buffer usually consists of multiple Render Targets, which store all necessary information for lighting and post processing. The most simple way would be the storage of all attributes, in the needed format, in a single off-screen surface, which would have to support a range of different values. However, current Direct3D compatible hardware is limited to particular surface formats. Consequently, the G-Buffer 65 has to be simulated by multiple off-screen Render Targets. Since many of the possible surface formats are not supported by various graphics adapters, it is best to choose only from the most widely supported formats (ref. [VAL07]).
64 In a list received from a kd-tree query, polygons which are spatially close are also close in the list
65 Geometry-Buffer
Implementation ::: Deferred Shading – The Visual Part of QEE 47

4.3.1 Surface Formats – Compatibility and Storage
When using multiple Render Targets naively, the scene would have to be rendered multiple times, once to each surface. This would result in a severe loss of performance, since the scene geometry would have to be processed multiple times, although the whole G-Buffer uses the same geometry calculations. To avoid this, most modern graphics adapters support drawing to multiple Render Targets in a single draw call. Although these surfaces may have different formats, all of them are required to have the same bit-depth. The usual format for textures is 32 bit Color, but newer hardware also supports rendering to 64 or 128 bit Render Targets. A maximum of 4 simultaneous Render Targets is the limit for many graphics adapters. Since 128 bit textures are still a new feature with poor performance, the available space in the G-Buffer is either four 32 bit or four 64 bit Render Targets, where the 32 bit Render Targets operate at higher performance.

The relevant Render Target formats for the PC and the XBOX 360 are listed in Figure 4.5. Relevant are all formats supported by both PC and XBOX 360 that provide a surface format which is neither included in nor analogous to another surface format. These surface formats can be combined to store all required attributes in the G-Buffer. 66
Surface Format | bit-depth | Red Channel           | Green Channel         | Blue Channel          | Alpha Channel
Color          | 32 bit    | 8-bit unsigned        | 8-bit unsigned        | 8-bit unsigned        | 8-bit unsigned
Single         | 32 bit    | 32-bit IEEE format (single channel)
HalfVector2    | 32 bit    | 16-bit floating point | 16-bit floating point |                       |
10-10-10-2     | 32 bit    | 10-bit float. point   | 10-bit float. point   | 10-bit float. point   | 2-bit
Vector2        | 64 bit    | 32-bit IEEE format    | 32-bit IEEE format    |                       |
HalfVector4    | 64 bit    | 16-bit float.         | 16-bit float.         | 16-bit float.         | 16-bit float.

Figure 4.5: Relevant Surface Formats for DirectX Hardware
These register formats provide a different range of possible values, depending on their bit-depth. An 8-bit unsigned normalized integer maps 2^8 = 256 possible values onto the range from 0 to 1, resulting in a rather poor precision, which is nevertheless sufficient for colors. The 32-bit IEEE format provides storage for a full float value, of the same type which is used for storage in the underlying programming language C#. The 16-bit floating point register maps 2^16 = 65,536 values onto the range from 0 to 1, which provides excellent precision for normalized values like surface normal coordinates. The 10-bit floating point register maps 2^10 = 1024 values onto the range from 0 to 1, which provides a compromise between 8 and 16-bit registers, allowing the storage of e.g. 3 surface normal components in a single surface. The 2-bit register maps 4 values onto the range from 0 to 1, which is only useful for certain flags like the type of a surface.

66 Information on the XBOX 360 Render Targets from [ http://msdn.micros<strong>of</strong>t.com/en-us/library/ ]
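To make the precision trade-off concrete, the rounding behavior of an 8-bit unsigned normalized register can be sketched in a few lines of standalone C# (an illustration only, not engine code):

```csharp
// Illustration of 8-bit unsigned normalized storage as used by the
// Color surface format: 256 levels mapped onto the range [0, 1].
static float RoundTripUnorm8(float value)
{
    byte stored = (byte)Math.Round(value * 255.0f); // what the register keeps
    return stored / 255.0f;                         // what the shader reads back
}
```

A value of 0.7071, for example, is stored as 180 and read back as roughly 0.70588, an error that is invisible for colors but can cause banding when such registers hold normal components on large curved surfaces.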
Multiple attributes can be calculated in the first draw call and saved in the G-Buffer. Some attributes are required for lighting calculations, while others are optional and only used in post processing effects or complex material shaders. Since the available capacity of the Render Targets is limited, only important attributes will be saved. The attributes can be stored in different formats; the world position of an object, for example, can also be recalculated from its depth and screen space position. All these attributes are stored per pixel at the position of the corresponding color pixel in the final image.
4.3.2 Geometry-Buffer Attributes – Precision and Performance

The following list includes all important attributes for QEE, together with the storage requirements for each type. Each type of storage has benefits and drawbacks in the fields of performance and precision. Performance depends on the number of operations needed to transform an attribute into the format required for calculations. The world position can be saved as the depth of an object, but this value has to be projected back into space using its screen position and the camera matrices. If the world position is stored as a three component vector holding the world coordinates, no extra calculations have to be done.

Precision is the other factor. A surface normal vector can be stored in three 8-bit registers, but it has to be normalized to the 0 to 1 range, leaving only 128 different values for the normal direction on a particular axis. This can result in visible lines on the surface of big curved objects, as the available colors allow for more precision than the quantized normal directions can provide. Using a 16-bit register for each component of the normal vector results in optimal precision and improved image quality, but it requires twice as much storage capacity. The attributes marked with an asterisk in the following list are required for basic Blinn lighting calculations.
A list of attributes for storage in the G-Buffer:
• Diffuse Color* (3x 8-bit or 3x 16-bit for HDRR)
• World Position* (32-bit depth value or 3x 32-bit vector)
• Normal Direction* (3x 8-bit, 3x 10-bit or 3x 32-bit)
• Specular Intensity* (2-bit, 8-bit, 10-bit or 16-bit)
• Specular Power* (2-bit, 8-bit, 10-bit or 16-bit)
• Glow Color (2-bit, 8-bit, 3x 8-bit or 3x 32-bit color)
• 2D Velocity (2x 8-bit or 2x 16-bit)
• Bloom Color (2-bit, 8-bit, 3x 8-bit or 3x 32-bit color)
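The 8-bit storage of the normal direction works by remapping each component from [-1, 1] into the [0, 1] range of the register before writing, and inverting the mapping on read (the read side of exactly this remapping appears in the Sunlight shader excerpt in section 4.3.4; the output variable name here is illustrative):

```hlsl
// Geometry pass: remap a normalized surface normal from [-1, 1] into
// the [0, 1] range of the 8-bit unsigned registers.
output.Normal.xyz = Normal * 0.5f + 0.5f;

// Lighting pass: invert the remapping to recover the normal.
float3 Normal = 2.0f * normalData.xyz - 1;
```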
Since the graphic features which require the last three attributes are optional in this project, the Geometry-Buffer will be designed for the five required attributes. The color of an object can always be stored in three 8-bit registers in this project, since no HDR textures are used. The world position is stored in a single depth channel, since the only alternative with acceptable quality would require three whole 32-bit Render Targets or at least two 64-bit Render Targets. The additional processing time resulting from recalculating the world coordinates is much lower than the cost of using 64-bit Render Targets.
Specular Intensity is a number in the range of 0 to 1, representing how strongly a surface reflects the light. Specular Power is a positive integer representing the collimation of this reflected light. Both require no more precision than an 8-bit register each. The normal direction has higher requirements, but the artifacts resulting from the usage of 8-bit registers for normals are barely noticeable in the game. Spending 64-bit Render Targets on all other attributes would result in severe processing overhead.
4.3.3 The Final G-Buffer Layout – Performance Takes the Lead

The author decided in favor of higher performance and compatibility, using the layout in Figure 4.6. This buffer stores all required values in three 32-bit Render Targets, which provide the best performance. The two surface types Color and Single are the two basic and most widely supported Render Target formats, since images are usually rendered to a Color surface using a 32-bit Single or 24-bit depth buffer. The fourth register is an optional surface for further development of this project and will support additional effects. This fourth Render Target is not yet implemented, to save memory bandwidth.
#  Format  Red        Green      Blue       Alpha
1  Color   Diffuse R  Diffuse G  Diffuse B  S. Intensity
2  Color   Normal X   Normal Y   Normal Z   S. Power
3  Single  32-bit Depth Value
4  Color   Glow R     Glow G     Glow B     Bloom

Figure 4.6: Geometry-Buffer Configuration for QEE
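In XNA, the framework QEE is built on, a G-Buffer with this layout could be allocated and bound roughly as follows (a hedged sketch against the XNA 3.x API; variable names and the width/height parameters are illustrative, not the actual engine code):

```csharp
// Sketch: allocating the three implemented G-Buffer surfaces.
RenderTarget2D diffuseRT = new RenderTarget2D(device, width, height, 1,
    SurfaceFormat.Color);   // RT1: Diffuse RGB + Specular Intensity
RenderTarget2D normalRT = new RenderTarget2D(device, width, height, 1,
    SurfaceFormat.Color);   // RT2: Normal XYZ + Specular Power
RenderTarget2D depthRT = new RenderTarget2D(device, width, height, 1,
    SurfaceFormat.Single);  // RT3: 32-bit depth value

// Bind all three targets, so a single draw call fills the whole G-Buffer.
device.SetRenderTarget(0, diffuseRT);
device.SetRenderTarget(1, normalRT);
device.SetRenderTarget(2, depthRT);
```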
The Deferred Renderer renders the whole solid scene geometry to these three buffers in a single draw call. The second draw call renders the remaining transparent geometry to the Color Buffer only, using the active depth buffer. This second draw call uses traditional forward shading, providing lighting from the three most important lights for transparent geometry. The Geometry-Buffer is now filled with all required information, and all lighting calculations can be applied per pixel like any post processing effect.
A drawback of this technique is the inability to use MSAA 67 with multiple Render Targets on DirectX 9 hardware. MSAA is a feature which considerably improves image quality and is supported by most major titles. However, even some commercial state of the art titles lack support for MSAA, because they use Deferred Shading. This is a downside of Deferred Shading, as many users with expensive hardware demand the improved image quality of MSAA. One option would be the use of a single Render Target for each draw call, resulting in three draw calls for the G-Buffer creation. A compromise is to outsource only the rendering of the Color Buffer into a separate draw call. This results in degraded performance, because the scene geometry has to be processed a second time. But since MSAA always has an impact on performance, being an optional feature for better hardware, this can be accepted.
4.3.4 Implementation of Deferred Lights

A light in a scene drawn with Deferred Shading is just a special type of per pixel post processing effect. All parameters required to light a surface are stored in the G-Buffer at the position of each pixel. The resulting colors of each light are stored in an Accumulation Buffer, which gathers the shading of all lights and combines it with the Color Buffer in a final full screen pass. The Accumulation Buffer is needed because the summed brightness values of all lights could surpass the maximum value which can be saved in a buffer, turning the pixel completely white. This effect is wanted for specular highlights, but not for the diffuse component of the lights. For this reason the Accumulation Buffer uses two surfaces to save separate values for diffuse and specular light. In the final draw, the diffuse component is added to the ambient light in the scene and clamped to the range [0;1]. This value is multiplied with the color read from the Color Buffer, and the value from the Specular Buffer is added on top.
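The final combine step described above could look roughly like this in HLSL (a hedged sketch; the sampler and variable names are assumptions, not taken from the QEE shaders):

```hlsl
// Sketch of the final combine pass: clamp diffuse + ambient, modulate
// with the albedo from the Color Buffer, then add unclamped specular.
float4 Combine(float2 texCoord : TEXCOORD0) : COLOR0
{
    float3 albedo   = tex2D(colorSampler, texCoord).rgb;
    float3 diffuse  = tex2D(diffuseAccumulationSampler, texCoord).rgb;
    float3 specular = tex2D(specularAccumulationSampler, texCoord).rgb;

    // Diffuse light plus ambient is clamped to [0;1], specular is not,
    // so only highlights are allowed to blow out towards white.
    float3 lit = saturate(diffuse + AmbientColor) * albedo + specular;
    return float4(lit, 1.0f);
}
```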
The lighting calculations in the Pixel Shader of each light type are processed like standard Blinn-Phong lighting. The only difference is the source of the parameters, which are unpacked from the Geometry-Buffer instead of being read from the Vertex Buffer. Most lights have additional parameters like Near Attenuation, Far Attenuation, Hot-Spot and Falloff. The effect of these parameters, as well as the effect of the shadow calculations, is applied to the resulting intensities. These intensities are finally multiplied with the light color and saved in the Accumulation Buffer. Each light type has an individual Pixel Shader for these calculations. An excerpt of the Sunlight Pixel Shader featuring all of the mentioned details is shown below:
67 Multi Sample Anti-Aliasing – A technique which improves image quality by using multiple samples for each pixel
PixelShaderOutput LightMap(VertexShaderOutput input)
{
    PixelShaderOutput output = (PixelShaderOutput)0;

    // Unpack Normals
    float4 normalData = tex2D(normalSampler, input.TexCoord);
    float3 Normal = 2.0f * normalData.xyz - 1;

    // Unpack Specular Power & Intensity
    float specularIntensity = tex2D(colorSampler, input.TexCoord).w;
    float specularPower = normalData.w * 255;

    // Unpack Depth and recalculate World Position from Depth
    float depthVal = tex2D(depthSampler, input.TexCoord).r;
    float4 position;
    position.x = input.TexCoord.x * 2.0f - 1;
    position.y = -(input.TexCoord.y * 2.0f - 1);
    position.z = depthVal;
    position.w = 1.0f;
    position = mul(position, InvertedProjection);
    position /= position.w;

    // Calculate Diffuse Intensity from Light- and Normal Direction
    float diffuse = saturate(dot(-LightDirection, Normal));

    // Calculate Phong components per-pixel
    float3 reflectionVector = normalize(reflect(LightDirection, Normal));
    float3 directionToCamera = normalize(CameraPosition - position.xyz);
    float specular = specularIntensity
        * pow(saturate(dot(reflectionVector, directionToCamera)), specularPower);

    if (ShadowsActive)
    {
        /// Skipped: 23 lines of code for shadow mapping
    }

    // Diffuse Lightcolor and Specular Color are saved in the Accumulation Buffer
    output.Color = float4(LightColor * saturate(diffuse) * lightscale, 0);
    output.Specular = float4(LightColor * saturate(specular) * lightscale, 0);
    return output;
}

Code Segment 4.4: QuicknEasyEngine.Content.Shader.SunLight.fx : line 101-177 (commented)
One requirement for the graphics component of QEE is compatibility with Pixel Shader 2.0, because hardware limited to this revision benefits the most from the features of Deferred Shading, and it is supported by the widest range of graphics adapters and software. To deal with the limitations of Pixel Shader 2.0, the more complex light sources had to use the less complex Blinn-Phong lighting model, and for Shadow Mapping all components calculating shadows had to be sourced out into individual shaders. Although this results in the need for more draw calls per frame, the overall performance becomes scalable by the number and size of active light sources and the shadow resolution.
4.3.5 Stencil Buffer Light Volumes – Fewer Pixels, More Performance
The performance of rendering lights can be further improved by narrowing down the affected area for each light. Because a light is applied as a Pixel Shader post processing effect, performance can be increased by processing a smaller number of pixels. Each light source can be represented by a bounding volume, which encloses every part of the scene that is directly lit by this light source. A spotlight can be represented by a pyramid or a cone, depending on the projected shape, whereas a directional light can be enclosed by a cylinder or cuboid. If the bounding volume of a light source is rendered using the same viewport as the scene geometry, only potentially lit pixels will be processed by the Pixel Shader. But many processed pixels are still not affected by the light, because they are either in front of or behind the bounding volume. The Depth Buffer and the Stencil Buffer can be used to exclude these pixels when drawing the bounding volume. The Depth Buffer holds the information about the distance of every pixel to the camera. The only pixels affected by the light source are those which lie inside the bounding volume. These are the pixels whose depth value is greater than that of the front-facing 68 and smaller than that of the back-facing polygon of the bounding volume at that position.
The Stencil Buffer can be used to mark pixels when drawing polygons. To affect only pixels which are inside the bounding volume, two draw calls are needed. The Stencil Buffer is cleared to zero beforehand. The first draw call is a stencil-write-only call, drawing only the front-facing polygons. The depth value of every rasterized pixel is compared to its corresponding value stored in the Depth Buffer. If the stored value is lower than the new value, the Stencil Buffer is set to one at this position. The second draw call uses the Pixel Shader of the light to write to the Accumulation Buffer, but each pixel is tested against its corresponding values stored in the Depth and Stencil Buffer. Only pixels with a stencil value of one and a new depth value smaller than the stored one are processed by the Pixel Shader. This process is visualized in Figure 4.7.
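The two stencil passes could be configured with the XNA 3.x render states roughly as follows (a hedged sketch of the technique described above; DrawLightVolumeFrontFaces and DrawLightVolumeBackFaces are hypothetical helpers standing in for the engine's volume drawing code):

```csharp
RenderState rs = device.RenderState;

// Pass 1: stencil-write-only draw of the front-facing volume polygons.
// Where the depth test fails (the stored scene depth is lower than the
// new value), the stencil value is set to one.
rs.StencilEnable = true;
rs.ColorWriteChannels = ColorWriteChannels.None; // no color output
rs.ReferenceStencil = 1;
rs.StencilFunction = CompareFunction.Always;
rs.StencilDepthBufferFail = StencilOperation.Replace;
rs.StencilPass = StencilOperation.Keep;
DrawLightVolumeFrontFaces();

// Pass 2: run the light's Pixel Shader, but only where the stencil
// value equals one and the new depth value passes the depth test.
rs.ColorWriteChannels = ColorWriteChannels.All;
rs.StencilFunction = CompareFunction.Equal;
rs.StencilPass = StencilOperation.Keep;
DrawLightVolumeBackFaces();
```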
Using bounding volumes for light rendering reduces fill-rate costs 69 immensely. Because the size of the light volume in the final image is proportional to the number of pixels processed for this light, small lights only incur small drawing costs. This means many small lights (e.g. in an outdoor night scene) will require about the same amount of processing time as one full screen light like the sun. This is a huge difference to traditional forward rendering: the design team of Assembler Bay will be free to place a great number of small lights in a scene to create atmospheric lighting, which would not be possible without Deferred Shading.

But using the bounding volume of the light as a way to improve performance only works if these bounding volumes are scaled properly when placed in the level.
68 Front-facing polygons have surface normals which point towards the camera
69 Fill-rate costs refer to the time needed by the Pixel Shader units to draw a certain amount of pixels
Figure 4.7: Using Light Volumes and the Stencil Buffer to Exclude Pixels

The far plane of each light should be set as small as possible, to reduce overdraw and to avoid the light volume being clipped by the camera far plane.
4.4 Dynamic Shadows – Light Comes With Darkness

As depicted in section 3.2.4, dynamic shadows are a required feature for every modern 3D-Engine. Assembler Bay will use Shadow Mapping in favor of Stencil Shadow Volumes, because Shadow Maps are independent of geometry and scale flawlessly to any kind of scene geometry. Since Shadow Mapping renders the scene to a bitmap, it is not only independent of object geometry, but it also natively supports alpha textures. Even semi-transparent shadows can be applied using Shadow Mapping, but with restrictions similar to those of Deferred Shading. Because a single bitmap is used to store depth information, only a single object can cast a shadow along a straight line from the light source. If a transparent object is in front of another occluder, only the transparent shadow or the solid shadow can be cast on both objects, since they are represented by the same pixels in the shadow map. Because the technologies used by Shadow Mapping are similar to Deferred Shading, the two require similar hardware capabilities and hence profit from the same features and improvements.
Despite this, Shadow Mapping introduces new problems which severely deteriorate image quality if not addressed. Depth Buffer imprecision is one problem, because it can lead to artifacts at touching surfaces. But the main hurdle of Shadow Mapping is aliasing, which occurs in any variant of the algorithm, since the shadow map is always represented by a bitmap. Various approaches to deal with aliasing and limited depth buffer precision exist. However, shadow map aliasing is clearly noticeable in every frame and severely deteriorates image quality, whereas depth buffer imprecision only leads to small artifacts, which are only visible on certain objects. Consequently, the primary target of the shadow mapper implemented in QEE will be to deal with shadow map aliasing.
The first and most obvious step to reduce aliasing is to increase the resolution of the shadow map bitmap. But even a maximum resolution of 2048x2048, which is the texture size limit for many graphics adapters, will result in severe aliasing for light sources which affect a wide area, like an outdoor sunlight. To increase the number of available pixels beyond this limitation, multiple shadow maps can be used, but multiple maps consume a huge amount of memory and require several draw calls. The source of aliasing is the projection of a single shadow map pixel onto multiple pixels of the final image; if these could be mapped 1:1, the shadow map could have the same resolution as the screen and no aliasing would occur at all. However, the shadow map needs to be aligned with the light viewport, which makes a direct projection to the screen plane impossible.
4.4.1 Shadow Map Aliasing – One Problem, Many Answers

Although a direct projection is not available, the basic idea of adapting the shadow map to the camera viewport is the source of many different approaches to improved Shadow Mapping. Other approaches try to enhance shadow quality by introducing soft shadow edges and simulating penumbrae 70. The author tested and reviewed a range of different papers and dissertations on improved Shadow Mapping variants. The following list provides an overview of available techniques to improve Shadow Mapping; the entries marked with an asterisk are partially implemented in Assembler Bay.
Technologies reviewed in the scope of Shadow Mapping for QEE:
• Trapezoidal Shadow Maps* (ref. [MAR04])
• Adaptive Shadow Projection* (ref. [NEA05])
• Variance Shadow Maps* (ref. [DON05])
• Cascaded Shadow Maps (ref. [NEA05])
• Shadow Map Percentage Closer Filtering (ref. [BUN04])
• Shadow Silhouette Maps (ref. [SEN03])
• Penumbra Faking with Smoothies (ref. [CHA03])
• Light Space Perspective Shadow Maps (ref. [WIM05])
• Layered Variance Shadow Maps (ref. [LAU08])
• Dual Paraboloid Point-Light Shadows (ref. [HAY08])
70 The penumbra is the blurred shadow edge resulting from scattered light
Although a variant of Shadow Silhouette Maps and Shadow Map Percentage Closer Filtering were completely implemented during the development of Assembler Bay, they will not be detailed in this dissertation, because they were discarded in favor of techniques providing better image quality for the desired look of Assembler Bay. The details of all techniques can be accessed via the referenced papers; this paper will only cover the implementation of the algorithms which were at least partially implemented for the final game.
Trapezoidal Shadow Mapping tries to align the shadow map with the camera's bounding frustum to use a maximum of the available resolution for visible parts of the final image. Big light sources, like the sun in outdoor scenes, will most likely illuminate a much greater part of the scene than is actually visible to the camera. Calculating shadows for these parts wastes processing time and shadow map capacity. The camera's bounding frustum is a trapezoid, which is projected to an irregular octagon in the 2D view space of the light source. Every pixel which is not part of this shape represents a point in 3D-space which is not visible to the camera. Minimizing these wasted pixels is the first step of Trapezoidal Shadow Mapping (see Figure 4.8-1).
The easiest way to increase the used area of the shadow map is linear scaling. The whole bounding shape can be enclosed in a rectangle, which provides translation and scaling factors. However, since the camera may look in any direction, the bounding shape could face diagonally, and almost half of the shadow map would be wasted in many cases. But the bounding shape can be rotated so that it always faces upwards along the Y-Axis (see Figure 4.8-2). Now the only wasted parts of the shadow map are the small gaps resulting from the trapezoid shape of the projected bounding frustum. Minimizing these parts is more complex, since each transformation needs to be represented by a transformation matrix, so that all of them can be multiplied with the projection matrix of the light viewport. The only available transformations to change the trapezoid shape are scaling, rotations and shearing, which also allows non-linear scaling.
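The linear fitting step (enclosing rectangle, translation and scaling) can be sketched with the XNA math types as follows (a hedged illustration; the method name and the [-1, 1] target range convention are assumptions, not taken from the QEE source):

```csharp
// Sketch: enclose the projected bounding shape in a rectangle and build
// a matrix that stretches it over the full shadow map range.
Matrix FitToShadowMap(Vector2[] boundingShape)
{
    Vector2 min = new Vector2(float.MaxValue);
    Vector2 max = new Vector2(float.MinValue);
    foreach (Vector2 p in boundingShape)
    {
        min = Vector2.Min(min, p);
        max = Vector2.Max(max, p);
    }

    // Translate the rectangle center to the origin, then scale it so
    // the shape fills the [-1, 1] range of the light projection space.
    Vector2 center = (min + max) * 0.5f;
    Vector2 extent = (max - min) * 0.5f;
    return Matrix.CreateTranslation(-center.X, -center.Y, 0)
         * Matrix.CreateScale(1.0f / extent.X, 1.0f / extent.Y, 1);
}
```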
Trapezoidal Shadow Maps use concatenated applications of these matrices not only to transform the trapezoid into a square, but also to distribute the resolution non-linearly (see Figure 4.8-3). Since objects close to the camera require a higher level of detail, shadow maps too need a higher level of detail on surfaces close to the camera. Objects which are very far away are only represented by a few pixels in the final image; high resolution shadows would thus be wasted on them. Trapezoidal Shadow Maps provide a projection matrix which distributes points close to the Near Plane of the camera over a great vertical range in the shadow map. Since a perspective viewpoint has a smaller Near Plane than Far Plane, the points are also spread over a wider horizontal range when the trapezoid is transformed to a unit square.
Figure 4.8: Shadow Map Alignment with Trapezoidal Shadow Maps
The shadow mapper of QEE utilizes certain steps of Trapezoidal Shadow Maps, but not the whole algorithm. Although it greatly improves image quality, the Trapezoidal Shadow Mapping algorithm spends a great deal of processing power on the calculation of the non-linear transform matrices. Since shadow quality is only slightly improved after the rotation step, QEE uses a different amount of adaptation for different light sources. All light sources except the Sunlight utilize only translation and scaling, since most of the time the camera frustum and the light volume will only partially intersect, and rotation would require extra calculations to deal with the partially visible frustum. The Sunlight, however, utilizes rotation, since the sun always lights the complete scene and the whole camera frustum is therefore always completely included in the shadow map. To improve the quality of Sunlight shadows further, the points are distributed using a quadratic scaling for the Y-Axis. This results in a non-linear scaling which is comparable in effect and quality to the non-linear scaling of Trapezoidal Shadow Maps.
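The quadratic Y-Axis redistribution can be sketched as follows. This is an illustrative assumption: the text does not give the exact remapping function, so `quadratic_y` is just one quadratic curve with the described behaviour (points near the camera, i.e. small normalized Y, receive a larger share of the shadow map):

```python
def quadratic_y(y: float) -> float:
    """Remap a normalized shadow-map Y coordinate (0 = near plane,
    1 = far plane) so near points receive more texels.
    The curve is steep near 0 and flat near 1; the concrete
    polynomial is an assumption, not QEE's documented formula."""
    return 1.0 - (1.0 - y) ** 2
```

Any monotone remap with this shape trades far-plane resolution for near-plane resolution, which is the stated goal of the step.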
Another improvement is achieved by moving the Far Plane of the camera frustum as close to the viewer as possible. QEE does this by reading the whole Depth Buffer after the Geometry-Buffer is filled. The greatest value smaller than 1 is the most distant point still visible, and the Far Plane is adapted to lie exactly at that point. This technique increases shadow resolution usage to a reasonable amount, but the shadows still suffer from minor aliasing. Even percentage closer filtering, which can be performed by the graphics adapter, cannot generate smooth shadows at a medium resolution for big light sources. A desired effect would be the simulation of soft penumbrae.
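The Far Plane tightening step can be sketched in a few lines; a minimal illustration assuming a normalized depth buffer (values below 1.0 mark drawn pixels, 1.0 marks empty ones) and a standard perspective depth mapping, which may differ in detail from QEE's actual conversion:

```python
def tightened_far_plane(depth_buffer, near, far):
    """Scan the depth buffer for the greatest value smaller than 1.0
    (the most distant point still visible) and convert it back to a
    view-space distance to use as the new Far Plane."""
    max_d = max((d for d in depth_buffer if d < 1.0), default=0.0)
    if max_d == 0.0:
        return near  # nothing visible: collapse to the Near Plane
    # Invert the standard perspective depth mapping (an assumption
    # about the projection; QEE's exact formula is not given).
    return (near * far) / (far - max_d * (far - near))
```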
4.4.2 Variance Shadow Mapping – Smooth Shadows in Every Resolution
Various techniques to smooth shadow maps by simulating penumbrae have become available for real time applications in recent years. Because shadow mapping is the preferred shadow generation approach in most modern games, the development of these techniques has been pushed recently. Most of the methods either use additionally created geometry around object silhouettes to cast penumbra wedges 71 or modify the generated shadow maps with image processing algorithms. The generation of additional geometry suffers from the same drawbacks as Shadow Volumes, which are not acceptable for this project. Image processing algorithms generate good results too, but are unstable and subject to severe artifacts under certain conditions. Algorithms of both categories have been reviewed by the author, but both require a large amount of specialization for compatibility, which is not acceptable for QEE.
Another approach is Variance Shadow Mapping, a purely statistical method of generating penumbrae. The basic idea of VSM 72 is to use a stochastic measure to analyze the distribution of the depth values surrounding a pixel. If the pixel is near a shadow silhouette, the surrounding depth values will vary between the two depths of the shadow caster and shadow receiver. On the other hand, if the pixel is completely in the umbra 73 of a shadow, all surrounding pixels will show the same depth value and the variance will be close to zero. VSM is applied in two steps. The first step is the generation of a twofold shadow map, which stores not only the distance, but also its square. This shadow map is processed by a Gaussian Blur filter, whose kernel size provides a parameter for the size of the penumbrae.
The second step is the shadow test itself. Instead of just comparing the depth value of the processed pixel to the one stored in the shadow map, the variance of the neighboring pixels can be calculated from the stored value depthsquare and the stored value depth:

Var = depthsquare − depth² + bias

The bias is a value to account for depth buffer imprecision. The variance is clamped between zero and one, but has to be mapped to a value proportional to the distance of the shaded pixel, using the value newdepth of the processed pixel:

shadow = Var ÷ (Var + (newdepth − depth)²) 74
71 A penumbra wedge is the gradient between light and darkness at a shadow edge
72 VSM = Variance Shadow Mapping
73 The umbra is the core shadow of an object; it is not affected by any direct illumination of the light
74 Original formula from [DON05]
Starting from this standard theory of VSM, the look of shadows can be improved further. The author tested various values and calculations to get better image quality and changed the formula to use the square of the result as the actual shadow value. This formula for VSM results in much smoother visuals and almost no noticeable gap between umbra and penumbra, which is visible in standard VSM.
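The complete per-pixel test, including the author's squaring modification, can be sketched as follows; a minimal illustration in which the moments, the bias and the Chebyshev-style bound follow the formulas above (the concrete default bias value is an assumption):

```python
def vsm_shadow(depth, depth_sq, newdepth, bias=1e-4, square_result=True):
    """Variance Shadow Mapping test for one shaded pixel.

    depth, depth_sq: Gaussian-blurred shadow map moments (distance and
    squared distance as seen from the light).
    newdepth: depth of the processed pixel from the light's view.
    Returns 1.0 for fully lit, values toward 0.0 for shadowed.
    """
    if newdepth <= depth:
        return 1.0  # pixel lies in front of the stored occluder
    var = depth_sq - depth * depth + bias
    var = min(max(var, 0.0), 1.0)  # clamp the variance to [0, 1]
    # shadow = Var / (Var + (newdepth - depth)^2)
    p = var / (var + (newdepth - depth) ** 2)
    # The author's modification: squaring darkens the penumbra and
    # removes the visible gap between umbra and penumbra.
    return p * p if square_result else p
```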
However, VSM has one big downside, which is introduced with higher depth complexity of the scene. Although VSM generates perfectly smooth shadows for a single shadow caster and multiple receivers, a problem arises when two shadow casters overlap, because the variance of the surrounding pixels is high not only at shadow edges but at any discontinuity in the shadow map. If a small object is closer to the light source than a large object which casts a shadow on a surface below both of them, a wrong penumbra occurs: the small object causes a high variance at its edge, leaving a bright silhouette inside a complete shadow (see Figure 4.9-1). This problem is addressed by different papers, which propose the use of Layered Shadow Maps or complex depth tests. But the author decided to try an original way of dealing with this problem, inspired by discussions about this topic.
The problem of identifying wrong penumbrae can be reduced to the problem of deciding whether a pixel is in the umbra. This can be decided by revisiting all surrounding pixels and checking whether they are shaded. However, checking all surrounding pixels for every potentially shaded pixel in the image would cost too much calculation time and could not be handled in a single Pixel Shader 2.0-compatible shader. But this value can also be calculated beforehand, utilizing the Gaussian Blur shader, which already processes the surrounding pixels of each pixel to recalculate its value. This is exactly what needs to be done for the absolute shadow value. A second shadow map is generated, in which the maximum of all surrounding pixels is saved for each pixel. The maximum depth value represents the pixel farthest away from the light source. If this value is still smaller than the depth value of a processed pixel in the final image, the pixel is completely occluded from the light and can be rendered in shadow, regardless of the calculated variance. The result of this step can be seen in Figure 4.9-2.
However, using the maximum shadow value to darken the umbra results in new artifacts at the border between different shadow casters. Since the silhouettes of both casters intersect at this point, the high variance produces a very bright penumbra. This penumbra becomes a fake penumbra beyond a certain point, where the pixels should be completely unlit, producing slightly noticeable aliased artifacts at these critical locations. To deal with these artifacts, another shadow map is introduced. But since the original shadow map already stores two values (depth and square of depth), a fourth register is available anyway.
Figure 4.9: Fake Penumbra in Variance Shadow Mapping and Solutions
Using this register, a second version of the maximum-depth shadow map is generated with a smaller neighborhood of surrounding pixels. This maximum-depth map with smaller range is used to identify the umbra, whereas the wider ranged first maximum-depth map is used to identify the silhouette of the penumbra. Combining these two, any pixels at the silhouette of the penumbra are darkened by squaring their shadow value 75, whereas pixels in the umbra are set to zero. The result of this technique can be seen in Figure 4.9-3.
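One way to sketch the combination of the VSM result with the two maximum-depth maps is shown below. The text does not spell out the exact order of the two tests, so the ordering here is an assumption, chosen so both branches can actually occur (the wide-kernel maximum is never smaller than the small-kernel one):

```python
def neighborhood_max(shadow_map, width, height, x, y, radius):
    """Per-pixel maximum depth over a square neighborhood - the value
    precomputed into the extra render target described above."""
    best = 0.0
    for j in range(max(0, y - radius), min(height, y + radius + 1)):
        for i in range(max(0, x - radius), min(width, x + radius + 1)):
            best = max(best, shadow_map[j * width + i])
    return best

def combined_shadow(vsm_value, newdepth, max_small, max_wide):
    """Combine the VSM result with the two maximum-depth values
    (max_wide >= max_small always holds). Assumed ordering: if even
    the wide neighborhood lies in front of the pixel, it is in the
    umbra; if only the small one does, it sits in the silhouette band
    and the fake penumbra is darkened by squaring."""
    if newdepth > max_wide:
        return 0.0
    if newdepth > max_small:
        return vsm_value * vsm_value
    return vsm_value
```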
These steps minimize the appearance of artifacts, although umbra artifacts can not be completely avoided using simple shadow mapping. This approach produces results comparable to Layered Shadow Mapping, but uses only one additional Render Target and no additional passes. Because the shadowing process is always a trade-off between performance and visual quality, VSM seems to be the best compromise.
75 Since the shadow value is clamped between 0 and 1, squaring will result in darker pixels
Achievement and Conclusion ::: Achievement and Conclusion 60
5 Achievement and Conclusion
This final chapter will give a detailed review and rating of the project outcome. The first section will present the achievements of this project and compare them to the intended tasks. Accomplished tasks will be measured against the previously set targets for quality and performance, while uncompleted tasks will be analyzed critically.
The second section will revolve around the whole process of completing this project. It will point out certain hurdles in the development of a game engine, as well as the pros and cons of working in an interdisciplinary team on a game design project.
The third section will compare the result of this project against commercial games and try to rate its level of quality. The chapter will close with a conclusion of the whole project.
5.1 Achievements and Tasks – Comparing Idea and Reality
The goals for this project have been detailed mainly in sections 1.2 and 3.2.1. They include certain requirements for the technical features of the game engine, the overall look of the game, as well as the extent of included content. As anticipated in the concept phase of Assembler Bay, the ultimate objective of designing a complete game could not be achieved in the scope of this project. However, the design of a playable prototype was possible, which fulfills most of the required tasks.
The foremost objective for Assembler Bay was the implementation of a fully playable demo level offering all important features of the final game and being expandable into a tutorial level. This objective is for the most part achieved by the prototype of Assembler Bay, but it is not a fully playable demo level. Although all gameplay elements for Assembler Bay have been independently implemented in the prototype, they are not connected into one playable game experience. The user can freely move around the level, jump on platforms and search for tiles, but he can not collect these parts and start puzzling right away. The puzzle mode is activated by an additional key, which switches to an arbitrary camera. In this view, puzzles can be assembled like in the final game, although optional features like a time limit for puzzling and a fixed boundary for no-gravity movements still have to be implemented. The character, however, is fully animated and jump&run sections in the level can be mastered with challenging difficulty.
Achievement and Conclusion ::: Achievements and Tasks – Comparing Idea and Reality 61
The complete prototype level is partially animated, fully textured and illuminated with a range of dynamic lights. Included in the level is a complete puzzle, which can be used in the final game without further changes. The tasks for Assembler Bay also included two additional levels, which can provide an overview of the variety the final game will offer. These two levels are designed as 3D sceneries and are not currently included in the prototype, because their level architecture is not adapted to the game mechanics. Each level includes a puzzle designed for the architecture and look of the individual level (see Figure 5.1). But like the levels, these puzzles are not adapted to the gameplay mechanisms and offer only a preview of the possibilities of future development. The task of implementing two additional levels and puzzles has been partially completed, as both levels are accessible in the game, but are not complete yet.
Figure 5.1: Previews of additional Level Designs for future development
In addition to these 3D modeled level designs, various concepts for levels, puzzles and challenges for the final game are available in the design handbook. These level design drafts include scribbles for tropical, desert and organic settings, which provide a guide for future level designs. The designated tasks for the amount of content in this project have been mostly accomplished. Although the available content is still far from a professional game demo, the base for a variety of levels has been set.
5.1.1 Primary Visual Features – Does it Look Good?
The goal to create a prototype which can compare to professional games includes state of the art visuals. In section 3.2.1 the author provides a list of desired graphics features for Assembler Bay. This section will review which of these features have been implemented, and what quality could be achieved on these points. The most fundamental graphics feature in Assembler Bay is High Quality Dynamic Lighting, which is implemented using Deferred Shading. The visual quality of the used lighting model is equal to per-pixel Phong Lighting, but it surpasses the capabilities of normal
lighting models. The use of Deferred Shading allows Assembler Bay to use a huge number of dynamic lights. The implementation of this task exceeded expectations in both quality and performance.
The second desired graphics feature is the rendering of dynamic soft shadows, which are implemented using a combination of Variance Shadow Mapping, Adaptive Shadow Mapping and parts of Trapezoidal Shadow Mapping. The final shadow quality in Assembler Bay is better than expected. While normal shadow algorithms suffer from aliasing and many artifacts, the shadow mapping implemented in Assembler Bay has only a single noticeable downside: the cast shadow of the character does not connect to his feet if the shadow map resolution is too low. This is a problem of depth buffer imprecision, which could be addressed by implementing special processing for the character's cast shadows, but this would reduce the consistency and reusability of the shadow model. The performance is good enough to use high resolution shadow maps up to 2048x2048 pixels. However, Variance Shadow Mapping provides astonishing quality even for shadow map resolutions as low as 256x256 pixels, which would otherwise result in unbearable quality (see Figure 5.2). The task of dynamic soft shadows is a full success.
Figure 5.2: Comparison of Shadow Mapping Filter Methods
The third desired visual feature is Ambient Occlusion, which is implemented by baking lightmaps into the level textures. Although real time dynamic Ambient Occlusion would have been an option for Assembler Bay, a static approach is preferable. Dynamic Ambient Occlusion can be simulated by various techniques, which deliver quite good results, but still can not match the quality of pre-rendered lightmaps. The small improvement in visual quality which can be achieved by the use of dynamic Ambient Occlusion techniques is negligible in contrast to the vast amount of processing power it consumes. The
static Ambient Occlusion, which is integrated into the diffuse textures, produces a good sense of depth for normal lighting models. The specialized overexposure lighting model used in Assembler Bay enhances this effect further, producing the unique look as a combination of vivid colors and a great sense of depth.
Overexposure is the fourth desired visual effect for Assembler Bay. It is not an established graphics effect but a technique developed in the scope of this project. Although the final effect was only implemented in the last stage of visual design in Assembler Bay, the desired look it should provide was already a firm idea inside the team. The overexposure effect is achieved by using an Ambient Light of maximum brightness, darkened only through Ambient Occlusion and cast shadows. This combination of sterile, clean brightness in every part of the level and dynamic cast shadows produces a surreal look which is unique to Assembler Bay. Overexposure is the main effect generating an original look for Assembler Bay, the other parts being the abstract level design and the distinctive texture style.
5.1.2 Optional Graphics Effects – Does it Look Better?
MSAA 76 was the first optional effect for Assembler Bay, since it greatly improves image quality but is still only an optional feature in some modern games. Because Deferred Shading is not natively compatible with MSAA, the implementation required certain changes and adjustments to the process of Deferred Shading. The task was a partial success, supporting a combination of MSAA and Deferred Shading on NVIDIA graphics adapters. The optional task of MSAA can be seen as a full success, depending on the final XBOX 360 compatibility with this feature.
Figure 5.3: Screenshot of Assembler Bay with active MSAA
76 Multi Sample Anti Aliasing
The remaining optional effects were not implemented in this project. However, the base for these effects has been laid by the foundation for post processing effects and the extensibility of the Deferred Shading component. An additional Geometry-Buffer Render Target is still available for a highly specialized bloom effect, whereas Depth of Field can use the Depth Surface of the G-Buffer, which is already parsed for certain depth values in the process of Shadow Mapping.
5.1.3 Open Source Game Engine – Is it Really Quick and Easy?
As depicted in section 1.2, the second main objective of this project is the development of an Open Source XNA engine, which can be used by the community to develop state of the art titles for the PC and the XBOX 360. Although the current QuicknEasy Engine is still incomplete, it already offers a combination of features which can not be found in any other available Open Source XNA engine. This section will provide an overview of various features of QEE and analyze their reusability and functionality for a redistributable game engine.
The game engine structure is intended to provide all commonly required features of a game engine in the simplest manner possible. The class structure has a shallow hierarchy, so it is easy for developers to understand the dependencies and functions of each component in the engine. Most relevant public methods are declared virtual, which makes it possible to derive new classes using these methods as an interface. This approach is preferred by the author, because a huge number of interfaces would degrade performance and only confuse developers for classes which are used unmodified. Since all parts of QEE are implemented by the author, using only publicly available sources as inspiration, the whole game engine can be used in any product without licensing obligations.
The physics engine included in QEE provides an easy interface for the implementation of new Physics Updaters, which can control the way an object reacts to collisions and gravity. The standard classes provide easy object management for the simulation of physics, which will suffice for most 3D games. Although the physics engine may be more than Assembler Bay strictly requires, it offers the flexibility of implementing physically based challenges. Any further game developed using QEE can use the physics engine without any changes to simulate a believably reacting environment.
The performance of the physics engine is also better than expected. The physics engine was tested with 100 instances of an object, each using polygon exact collision detection and consisting of 2,000 polygons (see Figure 5.4). This setting still produced acceptable results with an average frame rate of 60 frames per second on modern hardware. The real time collision handling for such an amount of shapes and polygons is more than a physics engine normally has to deal with, which makes the performance acceptable.
Figure 5.4: Screenshot of a Physics Simulation in Assembler Bay
The QuicknEasy Engine is a complete game engine developed in this project. Although many improvements and much documentation are still required before the project can be released to a developer community, all the fundamentals are implemented on a level fulfilling the expectations.
5.2 Teamwork and Timetables – The Long Way of Game Design
Assembler Bay is an interdisciplinary collaboration to make the idea of a real 3D game become reality. Mastering a project like this not only requires a strict timetable and thorough planning, but foremost a perfect cooperation between all team members. The team for Assembler Bay consisted of three members: two members from communication design, responsible for the audiovisual design, and the author, the only computer scientist. The development of a game can not be achieved without close collaboration, including regular team sessions, because all team members have to adapt to the specifications resulting from other parts of development. The implementation of a game engine requires regular testing and verification of all components, which requires game assets to be available and reliable. Although many tests can also be run with dummy objects or test objects from sources other than the original design team of the project, 3D game design introduces various particular problems.
A high hurdle on the track of game development is the import and export of game assets. Although many standards have been established in everyday public media like music, documents and pictures, other areas still lack comprehensive standards. These areas include 3D geometry, 3D animation, sound interchange, compressed textures and lighting information, which are the most fundamental assets for 3D game design.
Achievement and Conclusion ::: Teamwork and Timetables – The Long Way of Game Design 66
Because all assets from these areas are designed in arbitrary formats, custom exporters and importers have to be utilized to make a transfer of these assets into the game possible. Professional engines usually provide their own specialized exporters to convert assets into a format suitable for their custom content pipelines. But this is not suitable for a reusable game engine, unless the exporters are compatible with the most widely used modeling programs and receive regular updates. This project uses an intermediate approach, combining an Open Source model exporter with a self developed exporter for meta information like lights and animation meta data. The whole meta information is stored in XML format, which is parsed at runtime to provide a maximum of flexibility and transparency. The meta information files are designed to be easily readable. A level designer can understand and change the values, watching the direct influence in the game, without the need to rebuild anything.
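As an illustration of such a runtime-parsed meta file, the following sketch shows what reading light definitions could look like. The XML schema and the attribute names here are entirely hypothetical — the actual QEE format is not documented in this text:

```python
import xml.etree.ElementTree as ET

# Hypothetical example of the kind of human-readable meta information
# described above; not QEE's real schema.
level_meta = """
<level name="demo">
  <light type="point" x="4.0" y="2.5" z="-1.0" radius="6.0" />
  <light type="sun" x="0.0" y="10.0" z="0.0" radius="0.0" />
</level>
"""

def parse_lights(xml_text):
    """Parse light definitions at runtime, in the spirit of QEE's
    XML meta files (element and attribute names are assumptions)."""
    root = ET.fromstring(xml_text)
    lights = []
    for node in root.findall("light"):
        lights.append({
            "type": node.get("type"),
            "position": tuple(float(node.get(k)) for k in ("x", "y", "z")),
            "radius": float(node.get("radius")),
        })
    return lights
```

A designer can edit such a file in any text editor and see the change in the running game, which is the transparency the text describes.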
The overall cooperation in this project went perfectly, as all team members met on a regular basis. The transfer of game assets into the game engine is now running smoothly, without the need for any code changes or particular adaptation. The whole level geometry, as well as the puzzle objects, are imported and processed by the game engine, resulting in a fully functional puzzle generated only from meta information provided in the form of naming conventions for the puzzle parts. This modularity makes it possible to design puzzles as well as levels completely independently of the game engine and transfer them using the provided tools.
5.3 Comparison and Conclusion – Is it Really State of the Art?
The ambitious aim of this project is the design of a 3D game prototype which can be called state of the art, meaning it is comparable to current professional gaming titles. This section will analyze the visual quality, performance and features of Assembler Bay and compare them to professional games released in the last months 77. Subsequently, a review and the conclusion of the project will be provided in the last subsection.
5.3.1 Dynamic Soft Shadows – Who is John Carmack?
Dynamic soft shadows are a desired feature for every top title, which even modern games can not provide in optimal quality (see Figure 5.5 78). The goal of Assembler Bay is to reach the shadow quality of modern state of the art titles like Far Cry 2. Because Assembler Bay uses Variance Shadow Mapping, it features much smoother shadows, but at the cost of shadow precision, as small structures will not be visible when blurred (see Figure 5.6). Nevertheless, Assembler Bay can be compared in terms of performance and visual quality, because the final look of a game is the deciding factor.
77 Referring to the timespan of September 2008 – January 2009
78 Cutout from a screenshot of Far Cry 2 with Shadow Quality Setting on high
Achievement and Conclusion ::: Comparison and Conclusion – Is it Really State of the Art? 67
Figure 5.5: Close up Shadow Artifacts in Far Cry 2
Figure 5.6: Magnified Shadow Details in Assembler Bay
In a direct comparison of shadow quality, Assembler Bay hardly shows any noticeable artifacts, while Far Cry 2 contains several scenes in which jagged shadow edges stand out and deteriorate image quality. Variance Shadow Mapping buys this smooth look at the cost of lost shadow detail, but the overall impression is more appealing to the eye.
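The shadow test behind this trade-off can be sketched in a few lines. Variance Shadow Mapping stores the first two depth moments per texel and, after blurring them, bounds the lit fraction with Chebyshev's inequality. The following is a minimal CPU-side sketch of that test; the function name and the `min_variance` clamp are illustrative assumptions, not code taken from the QuicknEasy Engine:

```python
def vsm_shadow_factor(moments, receiver_depth, min_variance=1e-4):
    """Chebyshev upper bound as used by Variance Shadow Mapping.

    moments: (E[d], E[d^2]) sampled from the blurred shadow map.
    Returns an estimate in [0, 1] of how much light reaches the receiver.
    """
    mean, mean_sq = moments
    if receiver_depth <= mean:
        return 1.0  # receiver lies in front of the average occluder: fully lit
    # Clamping the variance suppresses numerical noise, but blurring the
    # moments beforehand is also what erases small occluders -- the
    # precision loss discussed above.
    variance = max(mean_sq - mean * mean, min_variance)
    d = receiver_depth - mean
    return variance / (variance + d * d)
```

Because the bound is computed from filtered moments rather than a hard depth comparison, the penumbrae come out smooth, which is exactly why thin structures can vanish from the shadow.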
5.3.2 Colors and Atmosphere – It is Vivid, but is it Unique?
Assembler Bay aims to impress with a unique look that differs from the atmosphere of most games. To achieve this, the common lighting model was modified to create a bright and clean look. This comes at the cost of realism and lighting detail, which may be interpreted as the absence of a sophisticated lighting model. The game, however, takes place in a virtual environment that is supposed to look like a perfectly clean science-fiction world: everything is fully lit and all colors shine with maximum intensity. This design idea is similar to the visuals of Mirror's Edge, but while Mirror's Edge emphasizes the contrast between the strong colors in the sunlight and the dirty look of the indoor stages, Assembler Bay features a uniformly overbright world.
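One way such a modification of the common diffuse model can look is to lift the Lambert term so that no surface ever falls below a high ambient floor. This is a sketch under assumptions — the thesis does not state the exact formula used in Assembler Bay, and the function name and `ambient_floor` value are illustrative:

```python
def bright_diffuse(albedo, n_dot_l, ambient_floor=0.6):
    """Lambert diffuse remapped from [0, 1] into [ambient_floor, 1].

    albedo: (r, g, b) surface color, each channel in [0, 1].
    n_dot_l: cosine between the surface normal and the light direction.
    A conventional model would scale by max(n_dot_l, 0) directly; raising
    the floor trades lighting detail for a uniformly bright, clean look.
    """
    lit = ambient_floor + (1.0 - ambient_floor) * max(n_dot_l, 0.0)
    return tuple(channel * lit for channel in albedo)
```

Even a surface facing completely away from the light keeps 60 % of its albedo, which produces the "everything is fully lit" impression at the cost of shading cues.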
The look of Assembler Bay is already unique through its bright colors and soft cast shadows, and can be improved further with additional effects such as depth of field and bloom. Even without these optional effects, however, the task of creating an original atmosphere for this project succeeded. Assembler Bay appears very different from any commercial game, yet produces images that match most of them in quality and lighting accents. Although the level details in this project are not as sophisticated as the assets in commercial games, the game design of Assembler Bay allows an environment with few decals. Since the absence of these decals and certain details fits the world of Assembler Bay, it is not perceived as a shortcoming.
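Bloom, one of the optional effects mentioned above, is conceptually simple: extract the pixels above a brightness threshold, blur them, and add the blurred glow back onto the image. The sketch below is an illustrative CPU version; the threshold, the repeated box blur, and the intensity are assumptions, and a real implementation would run as a GPU post-process:

```python
import numpy as np

def bloom(image, threshold=0.8, blur_passes=4, intensity=0.5):
    """Naive bloom on an HxWx3 float image with values in [0, 1]."""
    # Bright pass: keep only the energy above the threshold.
    bright = np.clip(image - threshold, 0.0, None)
    # A repeated 5-tap box blur (cross-shaped neighborhood) roughly
    # approximates a Gaussian and spreads the glow outwards.
    for _ in range(blur_passes):
        bright = (np.roll(bright, 1, axis=0) + np.roll(bright, -1, axis=0) +
                  np.roll(bright, 1, axis=1) + np.roll(bright, -1, axis=1) +
                  bright) / 5.0
    # Additive recombination, clamped back into displayable range.
    return np.clip(image + intensity * bright, 0.0, 1.0)
```

In an overbright world like Assembler Bay's, such an effect mainly softens highlights into a glow rather than revealing new detail, which is why it remains optional.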
5.3.3 Conclusion – Not Perfect but Worth the Effort
It was a long road to the result of this project: a playable prototype of a state-of-the-art 3D game based on a self-developed, sophisticated game engine. Although the aims were set high at the beginning, all of the primary goals were achieved and even some optional features were implemented in the engine. Throughout the project the team more than once doubted whether the aims were set too high, but the implementation process proved time to be the foremost limitation. Many optional features, effects, and performance improvements are possible in the current state of the game engine. These tasks will be handled in the further development of Assembler Bay and the QuicknEasy Engine, because an even longer road awaits Assembler Bay beyond this prototype. The team is confident it will find more members to advance this project and evolve it into a final game that realizes all parts of the original concept for Assembler Bay.
Although the prototype developed in this project is not comparable to a professional game demo, the underlying technologies can compete with commercial games. Because professional teams usually consist of a great number of members and take years to develop a game including its engine, building a 3D game from concept to prototype with only three team members in six months seemed hardly possible at the beginning. Because the implementation of this prototype was a success, the further development of Assembler Bay seems assured. The team is highly motivated to finalize Assembler Bay and is supported in this effort by the University of Applied Sciences Trier.
Reviewing the whole development period, the original timetables for this project proved far too optimistic for some features. Although the original timetables included a buffer of six weeks, particular problems and new requirements reduced this buffer to zero. Based on this project's experience, a timetable for a game should reserve at least one third of the available time as buffer for unexpected problems and delays. While some tasks were underestimated in the timetable design, many problems resulted from hardware failures, driver bugs, or missing documentation for highly specialized features. Such problems cannot be avoided and require a reasonable amount of extra time in any project that uses the newest technologies.
Ultimately this project can be seen as a huge success, because it shows that the development of a capable 3D engine using the newest technologies is possible even for small private developer teams. The cooperation of developer communities and the lasting motivation to turn game concepts into reality make it possible to design games that can compete with modern commercial games.
Figure 5.7: Assembler Bay Logo