Game Engines Selection Framework for High-Fidelity Serious Applications

International Journal of Interactive Worlds


Panagiotis Petridis1, Ian Dunwell1, David Panzoli2, Sylvester Arnab1, Aristidis Protopsaltis1, Maurice Hendrix1 and Sara de Freitas1

1Serious Games Institute, Coventry University Technology Park, Coventry, West Midlands, UK

2IRIT-UT1C, Université Toulouse I Capitole, France

Volume 2012 (2012), Article ID 418638, International Journal of Interactive Worlds, 19 pages, DOI: 10.5171/2012.418638

Published date: 4 April 2012

Copyright © 2012 Panagiotis Petridis, Ian Dunwell, David Panzoli, Sylvester Arnab, Aristidis Protopsaltis, Maurice Hendrix and Sara de Freitas. This is an open access article distributed under the Creative Commons Attribution License unported 3.0, which permits unrestricted use, distribution, and reproduction in any medium, provided that original work is properly cited.


Serious games represent the state-of-the-art in the convergence of electronic gaming technologies with instructional design principles and pedagogies. Despite the value of high-fidelity content in engaging learners and providing realistic training environments, building games which deliver high levels of visual and functional realism is a complex, time-consuming and expensive process. Therefore, commercial game engines, which provide a development environment and resources to more rapidly create high-fidelity virtual worlds, are increasingly used for serious as well as for entertainment applications. To this end, the authors propose a new framework for the selection of game engines for serious applications and set out five elements for the analysis of engines, in order to create a benchmarking approach to the validation of game engine selection. Selection criteria for game engines and the choice of platform for serious games are substantially different from those for entertainment games, as serious games have very different objectives, emphases and technical requirements. In particular, the convergence of training simulators with serious games, made possible by increasing hardware rendering capacity, is enabling the creation of high-fidelity serious games, which challenge existing instructional approaches. This paper overviews several game engines that are suitable for high-fidelity serious games, using the proposed framework.

Keywords: High Fidelity, Serious Games, Game Engines, Visualization


The potential for using game engines for serious games has been recognized for years (Seung Seok Noh 2006). In 1997, the National Research Council cited some of the objectives of multi-player games for military simulation: “The underlying technologies that support these objectives address similar requirements: networking, low-cost graphics hardware, human modeling, and computer generated characters.” (National Research Council 1997). The vision of using games for simulation has been realized in a number of military training-oriented games, including America’s Army and Full Spectrum Command (Wray 2004). As a result, there has been a trend towards the development of more complex serious games, which are informed by both pedagogic and game elements. Whilst it is true that the technical state-of-the-art in serious games mirrors that of leisure games (Anderson 2009), the technical requirements of serious games are frequently more diverse and wide-ranging than those of their entertainment counterparts. Serious game developers frequently resort to bespoke and proprietary development due to their unique requirements, such as the use of Blitz Games’ in-house engine for developing the serious game Triage Trainer (Jarvis 2009). Such examples of bespoke development, and the limited use of off-the-shelf game engines for serious applications, highlight the difficulties that exist for engine designers seeking to understand and comprehensively support the needs of instructional design.

The convergence of high fidelity simulators with serious games represents an area with increasing potential to fulfil cognitive learning requirements with a high degree of efficacy, whilst leveraging the advantages of game-based content to motivate and engage users. Modern games are frequently developed on game engines, which can be deployed on personal computers, game consoles, pocket PCs and mobile devices.  The popularity of video games, especially among younger people, results in them frequently being perceived as an ideal medium for instructional programmes aimed at hard-to-reach demographics (Malone 1987); however, studies have also shown this demographic responds poorly to low-fidelity games (de Freitas 2009).

The value and impact of realism on learning purely through simulation is well-documented (de Freitas 2006; Jarvis 2009). However, when game elements are introduced, although they have been shown to improve learning in comparison to pure simulation (Mautone 2008), the relationship between simulation and game is less simple to define. Visual fidelity, whilst being a common objective in simulations, is considerably less valued in gaming; Jarvis and de Freitas (Jarvis 2009) demonstrate excessive fidelity to be detrimental to learning, and the approaches put forward by learning theorists such as Vygotsky avoid creating a facsimile of reality in favour of internally-consistent abstractions. Visual fidelity is best linked to learning requirements; more is not necessarily better, particularly from a cognitivist perspective, where cognitive overload is a concern (Warburton 2008). Therefore, the ability to constrain and control fidelity can be seen as desirable, particularly with respect to composable content.

Hence, whilst visual fidelity is one component of an immersive environment, serious games often seek to use abstraction or non-realistic visual elements. Knez and Niedenthal (Knez 2008) present an interesting factor for consideration through a study linking changes in the affect of users to lighting within a virtual scene. Given that the affect of learners has a noted impact on the efficacy of learning (Bryan 1996), the capacity of a game engine to handle sophisticated lighting approaches is of concern. A growing trend in game engines to transfer lighting calculations to the GPU has led to significant variation between engines in their support for shader models and, in turn, in their capacity to perform sophisticated lighting and shading effects.

In the next section, the researchers consider in more depth the motivation behind the creation of high-fidelity serious games. Through an analysis of background literature, the researchers are able to highlight common pedagogic elements and hence recognise the subsequent technical implications.


As noted above, the technical state-of-the-art in serious games mirrors that of leisure games (Anderson 2009); however, the technical requirements of serious games are frequently more diverse and wide-ranging than those of their entertainment counterparts. Serious game developers frequently resort to bespoke and proprietary development due to their unique requirements, for example Playgen (Playgen 2010) and PIXELearning (PIXELearning 2010), and difficulties exist for game engine developers in accurately understanding and supporting the needs of instructional design.

Although many serious games have limited visual interactivity, immersion and fidelity, there is an increasing motivation to create serious games that support situative (social and peer-driven) and experiential pedagogies, partly because behaviourist approaches have been shown to be limited (e.g. people learn to play the game rather than address the learning requirements), whilst cognitive approaches struggle to impart deeper learning in the areas of affect and motivation (Egenfeldt-Nielsen 2005). Furthermore, recent work by Mautone (Mautone 2008) demonstrates enhanced learning when introducing game elements to a standard flight simulator. Consequently, the re-evaluation of simulator approaches to incorporate game and game-like elements places an increasing demand on serious game developers to deliver high-fidelity solutions. Given this motivation to create immersive, high-fidelity serious games, an obvious development choice is to utilise game engines, which provide ‘out of the box’ support for state-of-the-art desktop GPU rendering and physics. In the remainder of this section, the researchers discuss a range of key considerations when selecting the technology behind serious games: supporting effective pedagogy, the learner and their context.

Fritsch et al. (Fritsch 2004) compared different game engines for large-scale visualization of outdoor environments, focusing mainly on the issues of composability. Similarly, Shiratuddin compared various game engines for visualizing large architectural scenes, focusing mainly on accessibility and the availability of game engines (Shiratuddin 2007). Following the analysis of all these potential factors, the methodology used here is defined to include the following areas: audiovisual fidelity, functional fidelity, composability, availability and accessibility, networking and heterogeneity.

Visual Fidelity

High fidelity in serious games is typically seen as desirable where process knowledge learnt within the game must be transferred to real-world situations; the closer the similarity between real and virtual space, the more effective the learning transfer is likely to be. Although a link between learning transfer and verisimilitude of learning activity has been observed, particularly in training contexts (Park 2005; Janet L. Grady 2008; Davidovitch 2009), this link does not necessarily hold true in all game-based learning scenarios; Jarvis and de Freitas (Jarvis 2009) suggest that the level of fidelity required must be mapped onto learning objectives. Furthermore, an over-emphasis upon visual fidelity can mask the complexities of producing verifiable and replicable learning activities and experiences.

Serious games need to engage the learner, and immersion through high-fidelity content is one mechanism through which such engagement can be achieved. The concept of immersion is a common one in serious games, although the components that constitute an immersive experience can be more difficult to define. The capacity to immerse learners is a significant consideration, although the means for achieving this immersion can be diverse, from highly visual content to less technical approaches such as narrative immersion (Mott 2006). Robertson et al. (Robertson 1997) looked specifically at the relationship between user and standard desktop PC as an interface for virtual reality, and compared it to head-tracked systems. Robertson claims “immersion should not be equated with the use of head-mounted displays: mental and emotional immersion does take place, independent of visual or perceptual immersion”, an opinion reinforced by Csikszentmihalyi and Kubey (Csikszentmihalyi 1981). Thus, a further division of the concept of immersion into psychological and perceptual levels is identified. At both levels, the role is consistently one of ‘drawing in’ the user, such that they experience a perceptual shift between simply viewing the screen and existing within the environment. Breaks in consistency, such as those induced by low frame rate or discontinuities in world content, are shown to have a significant negative impact on immersion (Csikszentmihalyi 1981).

With respect to all three of these aspects, there are a number of dimensions in which fidelity must be considered. At a high level, there are many aspects of a game that can be represented with differing levels of fidelity: the narrative, the depth of visual and auditory content, the interaction medium and the behaviour of characters and objects within the game world. Whilst all these concerns must be reflected upon when designing a serious game, in terms of engine selection a clear distinction exists between affordances for audiovisual and functional fidelity. It is possible to engineer a world which appears realistic but does not behave in a realistic fashion. However, increasing audiovisual fidelity often implies increased functional fidelity: as a virtual room fills out with furniture to become more visually realistic, players start expecting the furniture to function as it would in the real world. For example, placing a virtual telephone on top of a desk can bring with it a host of potential questions from users expecting to be able to dial out.

The concept of immersion is a common one in serious games, although its definition, and specifically the components that create an immersive experience, can be more difficult to pin down. The capacity to immerse learners is a significant consideration, although the means for achieving this immersion can be diverse: from visual fidelity to functional fidelity, as well as less technical approaches such as narrative immersion. A simple out-of-place texture or inappropriate sound can have catastrophic effects on believability, as indicated by metrics of immersion such as the performance indicators and cognitive surveys applied by Pausch et al. (Pausch 1997).


Unlike leisure games, target demographics for serious games are often non-game players, with little interest in technology or knowledge of user interfaces. Furthermore, the developers of serious games may be instructional designers wishing to explore a new medium, rather than traditional game developers seeking to develop instructional content. Therefore, the capacity of the engine to support both developers and users with limited expertise is of relevance. Previous research (Jarvis 2009) has shown that conventional keyboard and mouse interaction in a world with multiple degrees of freedom can prove initially overwhelming for non-gamers. As such, serious games can often require interfaces that deviate from those common to entertainment games, and seek to simplify interactions based upon an understanding of learning requirements.


As early as 1995, Zyda (Zyda 1995) identified three fundamental challenges in multiuser virtual environment design, which he defines as composability, scalability and heterogeneity. These remain substantial challenges within current virtual environment research. With respect to serious games, heterogeneity is of particular concern, since target demographics are frequently ‘non-gamer’, and thus engines capable of deployment across a wide range of hardware and software platforms are significantly advantageous.


Serious games often seek to model real world locales and situations, or adapt real world data for use in games with minimal development overheads. Composability in this context is used to describe both the reusability of content created within a game engine, and also its capability to import and use data from common or proprietary sources.

Technology evolution often requires the recreation of game content in ever-increasing levels of visual and functional fidelity. If the scene consists of many high-fidelity objects at various distances, it may be possible to adopt a level-of-detail approach (Engel) and use less complex geometry, or even dummy objects (Akenine-Moller T. 2008), to approximate distant objects (Sander 2006). Alternatively, if only a small sub-section of the world or object is in sight at any one time, it may be possible to hold only these visible parts in memory and then replace them as new parts come into view by applying some form of spatial partitioning (Crassin 2009). Another issue that the designer or developer has to consider arises when attempting to import a high-fidelity model into a game engine which may not support the geometry or texture formats within the model, or which may require specific optimisations, such as polygon reduction, for the model to be rendered in real time.
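The distance-based level-of-detail idea described above can be illustrated with a minimal Python sketch. The `LodLevel` type, the mesh names, the triangle counts and the distance thresholds are all hypothetical values chosen for illustration; real engines typically use more elaborate criteria, such as projected screen-space size.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LodLevel:
    """One representation of an object (hypothetical structure)."""
    max_distance: float   # use this mesh up to this camera distance
    mesh_name: str        # illustrative asset name, not a real file
    triangle_count: int


def select_lod(levels: Sequence[LodLevel], distance: float) -> Optional[LodLevel]:
    """Return the most detailed level whose range covers `distance`.

    Returns None when the object is beyond every threshold, in which
    case the caller may skip drawing it altogether.
    """
    for level in sorted(levels, key=lambda lv: lv.max_distance):
        if distance <= level.max_distance:
            return level
    return None


# Assumed example data: a detailed mesh, a reduced mesh and a "dummy"
# billboard standing in for the object at long range.
LEVELS = [
    LodLevel(10.0, "statue_high", 20000),
    LodLevel(50.0, "statue_medium", 4000),
    LodLevel(200.0, "statue_billboard", 2),
]
```

Calling `select_lod(LEVELS, 120.0)` would pick the billboard, so a distant object costs two triangles rather than twenty thousand.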

Two principal solutions exist. The first is to address the problem on a large scale through an algorithmic approach that seeks to provide automated conversion between formats and import the model as a whole into the game engine. The second is to select specific components of the model for conversion and progressively convert and integrate them into the game engine by hand. However, algorithms intended to support the first approach face a number of challenges. Firstly, decomposition of a mesh into multiple levels-of-detail is difficult to optimise, since the perceived visual fidelity of the resultant lower-resolution meshes is linked to the perception characteristics of the user; hence a solution is not only mathematically complex, but must also consider how humans perceive and integrate features of a three-dimensional scene (Treisman 1980). Therefore, developers are commonly forced into the second approach, a task that becomes necessary when conversion tools are inadequate or the original data format has insufficient information for the game engine. This is often the case when models are developed without adequate information on materials, textures, bump maps or levels-of-detail.

Secondly, occlusion culling is only worthwhile if information on the visibility of polygons is computationally less expensive to obtain than rendering them. Early game engines such as the Quake engine sought to obtain visibility data prior to run time by pre-processing a visibility matrix and applying binary partitioning of the virtual space. Thirdly, mesh decomposition must be performed in concert with analysis and scaling of texture level of detail to provide a satisfactory visual outcome.
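As a rough illustration of pre-processed visibility, the following Python sketch looks up a visibility matrix at run time instead of testing each polygon. It is a simplification under stated assumptions, not the Quake engine's actual implementation: the three-cell world, the matrix values and the object names are invented, and the binary space partitioning step that would assign the camera to a cell is omitted.

```python
from typing import Dict, List

# Assumed to be computed offline: VISIBLE[i][j] is True when any part of
# cell j can be seen from somewhere inside cell i.
VISIBLE: List[List[bool]] = [
    [True, True, False],   # from cell 0, cells 0 and 1 are visible
    [True, True, True],    # from cell 1, everything is visible
    [False, True, True],   # from cell 2, cells 1 and 2 are visible
]

# Hypothetical scene content, grouped by the cell that contains it.
OBJECTS_IN_CELL: Dict[int, List[str]] = {
    0: ["desk", "phone"],
    1: ["corridor_lamp"],
    2: ["stretcher"],
}


def potentially_visible_objects(camera_cell: int) -> List[str]:
    """Collect objects in every cell marked visible from the camera's cell.

    The run-time cost is one row lookup, which is why the approach pays
    off only when that lookup is cheaper than drawing the hidden cells.
    """
    names: List[str] = []
    for cell, can_see in enumerate(VISIBLE[camera_cell]):
        if can_see:
            names.extend(OBJECTS_IN_CELL.get(cell, []))
    return names
```

With the camera in cell 0, the stretcher in cell 2 is never submitted for rendering, at the price of storing the matrix and recomputing it whenever the static geometry changes.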


Multi-user elements are often specified at early-stage designs of serious games, since they often affect the nature of the game and its role within a training programme as a whole. User interaction within the game itself is often used to address the difficulties in automating the behaviour of non-player characters in a believable and coherent fashion. In this context, instructors play the role of virtual characters in order to converse and interact with learners in a realistic and adaptive manner. Whilst this can be an effective way of creating believable virtual scenarios, it suffers from limited scalability due to the availability of instructors and practical limits on pupil-tutor ratios.

Support for larger-scale communities and social elements is gaining increasing recognition within the leisure gaming community as a mechanism for increasing uptake and long-term play. As an example, the recent Guitar Hero [35] game wraps a cognitively simple task within a socio-culturally motivating setting, and has consequently proved to be highly successful commercially. Within serious games, social elements often take the form of online communities, and the convergence of games with social networking technologies remains an area of interest.

Summary of Requirements for Serious Games

As seen in the preceding sections, the key elements are fidelity, consistency and support for tools for creating immersion and flow. Fidelity can be further subdivided into visual and functional fidelity. Consistency relates to technical features such as the need to load between areas or stream geometry, and whether the engine is standalone or web based. Finally, corresponding features for tools creating immersion and flow include game scripting tools, especially when the narrative is non-linear (i.e. implementing some form of artificial intelligence or artificial life). This ties in with immersion, but the term immersion is avoided within the framework due to the current lack of consensus on its definition (Slater 1993; Slater 2003). Instead, the researchers focus on the elements that contribute to immersive experiences.

Additional technical elements include heterogeneity (on which platforms the engine can be deployed; what its hardware requirements are; whether it can scale automatically), accessibility (support for non-standard interfaces and devices, as well as for standardised interfaces, e.g. WASD) and multiuser support. The latter is beneficial since, in the absence of sophisticated AI, human instructors often play a role in virtual learning experiences; similarly, socio-cultural elements can be key motivators, as mentioned above.

Overview of Modern Game Engines

Modern game engines combine several technologies from computer science, such as graphics, artificial intelligence, network programming, languages and algorithms. They are robust and extensively tested in terms of usability and performance (Lepouras G. 2005), work on off-the-shelf systems (Robillard G. 2003) and can be easily disseminated, for example via online communities (Burkhard C. Wünsche 2005). A game engine is any tool or collection of tools that creates an abstraction of hardware and/or software for the purpose of simplifying common game development tasks. Many computer game developers support modification of their games by releasing tools, for example level editors to modify the game environment and tools to edit the game behaviour. This allows the reuse of the underlying game engine technology, including 3D rendering, 2D drawing, sound, user input and world physics/dynamics (Lewis M. 2002). For example, users can create new levels, maps and characters and add them to the game, known as partial conversion, or they can create entirely new games by altering the game engine source, known as total conversion (Trenholme 2008). Modern game engines have a modular structure, so that they can be reused in different games (Jacobson J. 2003; Trenholme 2008). Analysis of a range of current game engines suggests the following modules are common: the Graphics Module, the Physics Module, the Collision Detection Module, the I/O Module, the Sound Module, the AI Module and the Network Module. The Graphics Module is responsible for the generation of the 2D/3D graphics in the environment, including libraries for texture mapping, shadowing, lighting and shader effects. The Physics Module ensures that objects behave according to physical laws; for example, objects fall under gravity and glass breaks. The Collision Detection Module is used to ensure that certain actions will occur when two objects collide. Furthermore, the I/O Module is responsible for the input and output devices which can be integrated into the 3D engine.
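The modular structure described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the architecture of any particular engine: the class names, the dictionary-based world state and the fixed update order are all hypothetical. It shows only two of the listed modules, physics and collision detection, sharing a common interface so that either could be swapped or reused.

```python
from typing import Any, Dict, List

World = Dict[str, Any]  # assumed minimal world state: bodies plus an event log


class Module:
    """Common interface every engine module implements (hypothetical)."""

    def update(self, world: World, dt: float) -> None:
        raise NotImplementedError


class PhysicsModule(Module):
    GRAVITY = -9.81  # metres per second squared, acting on the y axis

    def update(self, world: World, dt: float) -> None:
        # Semi-implicit Euler step: update velocity first, then position.
        for body in world["bodies"]:
            body["vy"] += self.GRAVITY * dt
            body["y"] += body["vy"] * dt


class CollisionModule(Module):
    def update(self, world: World, dt: float) -> None:
        # Trigger an action when a body reaches the ground plane (y = 0).
        # The event is logged on every step in which contact is detected.
        for body in world["bodies"]:
            if body["y"] <= 0.0:
                body["y"] = 0.0
                body["vy"] = 0.0
                world["events"].append(("hit_ground", body["id"]))


class Engine:
    """Runs each registered module once per simulation step, in order."""

    def __init__(self, modules: List[Module]) -> None:
        self.modules = list(modules)

    def step(self, world: World, dt: float) -> None:
        for module in self.modules:
            module.update(world, dt)


world: World = {"bodies": [{"id": "glass", "y": 1.0, "vy": 0.0}], "events": []}
engine = Engine([PhysicsModule(), CollisionModule()])
for _ in range(100):          # simulate two seconds at 50 steps per second
    engine.step(world, 0.02)
```

Because the engine depends only on the `Module` interface, a different game could keep the physics module and replace the collision response, which is the kind of reuse the modular design is meant to allow.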

One of the most important elements in the creation of serious games is the visual representation of these environments. Although serious games have design goals that are different from those of pure entertainment video games, they can still make use of the wide variety of graphical features and effects that have been developed in recent years. Most game engines provide support for texture mapping, shadowing, lighting and shader effects in their graphics module. Additionally, modern engines frequently include a selection of such effects, which can include more traditional image processing, such as colour correction, film-grain, glow or edge-enhancement, as well as techniques that require additional scene information, such as depth of field and motion blur (Akenine-Moller T. 2008).

An Artificial Intelligence module is often used to create objects or “Non-Player Characters” (NPCs) that are able to interact “intelligently” with the player. An important aspect in the creation of realistic scenes is to create intelligent behaviours for the inhabitants of the virtual world, which is achieved using Artificial Intelligence (AI) techniques. However, it is important to understand that when the researchers refer to the AI of virtual entities in game engines, it is not truly AI, at least not in the conventional sense (McCarthy 2007). The techniques applied to computer games are usually a mixture of AI-related methods whose main concern is the creation of a believable illusion of intelligence (Scott 2002); that is, the behaviour of virtual entities only needs to be believable enough to convey the presence of intelligence and to immerse the human participant in the virtual world. The Network Module is responsible for the multiplayer implementation of the game. As a result, players can cooperate in exploring an area or exchange opinions about certain aspects of a virtual environment, while being located in different areas of the world.

The I/O Module provides support for different input/output devices. This module provides tools that allow the user to communicate and interact with the game environment. Most game engines provide support for standard input devices such as joysticks, gamepads and keyboards. Technological improvements and cost reductions in computing power, display and sensor technology have resulted in the widespread use of 3D input devices (Fröhlich 2000; Petridis 2005; Mourkoussis 2006). Devices such as the Nintendo Wii and PlayStation controllers provide six degrees of freedom interaction that could enhance the user's interaction with the environment and increase the immersion of the user.

With such a broad definition, what is referred to as a game engine can vary among developers, and where a game engine ends and a game begins is not always a clearly defined line (Trenholme 2008). Game engines should be distinguished from graphics engines that come only with rendering capabilities, and also from Software Development Kits (SDKs) that aid game development. The reason for this differentiation is that graphics engines impose limitations as to what and how things can be included in a game, whilst SDKs are much more flexible but with a narrower focus. For example, Gamebryo is a very flexible proprietary renderer but has no collision detection or physics capabilities, unlike Havok, which is solely a physics engine. Similar middleware includes Criterion’s RenderWare and SpeedTree.

A Framework for Game Engine Selection

In short,  key elements are defined arising from the background as fidelity subdivided into visual and functional fidelity, consistency, (technical features in this area would include need to load between areas or stream geometry, whether the