Scriptless Testing for an Industrial 3D Sandbox Game

Fernando Pastor Ricós¹ᵃ, Beatriz Marín¹ᵇ, Tanja Vos¹,²ᶜ, Joseph Davidson³ and Karel Hovorka³
¹ Universitat Politècnica de València, València, Spain
² Open Universiteit, The Netherlands
³ GoodAI, Prague, Czechia

Keywords:

Computer Game Testing, Autonomous Agents, Scriptless Testing, Exploratory Testing, Automated Testing.

Abstract:

Computer games have reached unprecedented importance, exceeding two billion users in the early 2020s. Human game testers bring invaluable expertise to evaluate complex games like 3D sandbox games. However, the sheer scale and diversity of game content constrain their ability to explore all scenarios manually. Recognizing the significance and inherent complexity of game testing, our research aims to investigate new automated testing approaches. To achieve this goal, we have integrated scriptless testing into the industrial game Space Engineers, enabling an automated approach to explore and test sandbox game scenarios. Our approach involves the development of a Space Engineers-plugin, leveraging the Intelligent Verification and Validation for Extended Reality-Based Systems (IV4XR) framework and extending the capabilities of the open-source scriptless testing tool TESTAR. Through this research, we unveil the potential of a scriptless agent to explore 3D sandbox game scenarios autonomously. Results demonstrate the effectiveness of an autonomous scriptless agent in achieving spatial coverage when exploring and (dis)covering elements within the 3D sandbox game.

1 INTRODUCTION

Computer games are dynamic and interactive systems designed to immerse and entertain users in captivating virtual environments. In the early 2020s, the video game industry surpassed 2 billion players worldwide, generating revenue of 120 billion dollars. Substantial further growth is anticipated (Cooper, 2021).

Game testing predominantly relies on game testers, who invest significant manual effort and time verifying that user interactions within virtual scenarios yield the intended outcomes (Politowski et al., 2021). However, as industrial games grow in complexity, companies face the inherent limitations of human game testers' efforts. Automated approaches are needed to complement testers' manual efforts (Pascarella et al., 2018). Yet, because games surpass traditional software in complexity (Santos et al., 2018), the field of game systems lacks standardized frameworks to facilitate test automation.

Sandbox 3D games, such as the industrial game Space Engineers developed by the companies Keen Software House and GoodAI, emphasize the freedom and creativity of users in virtual scenarios. Players are given a wide range of tools and resources to shape the game scenarios according to their preferences and playstyle. The testing team of Space Engineers comprises ten game testers who excel in assessing functionality to create, destroy, modify, or interact with in-game objects, verify visual aspects, and manage game scenarios. Nevertheless, although the testers perform numerous manual tests daily, the extensive range of in-game elements limits the time they have for exploring and testing unforeseen scenarios.

ᵃ https://orcid.org/0000-0002-5790-193X
ᵇ https://orcid.org/0000-0001-8025-0023
ᶜ https://orcid.org/0000-0002-6003-9113

This study evaluates the scriptless testing technique with the industrial sandbox game Space Engineers. Scriptless testing automatically generates test sequences at run-time to explore the System Under Test (SUT) by selecting and executing the available actions in the discovered states (Pastor Ricós, 2022). While this approach appears well-suited for 3D sandbox games, existing scriptless testing tools are primarily designed for desktop, web, and mobile applications. Adapting these techniques for game testing requires addressing distinctive 3D game features, such as precise position and orientation data for character movements and properties of interactive elements.

To bridge the gap between scriptless testing tools and technologies capable of discerning a game's states, we leverage the Intelligent Verification/Validation for Extended Reality Based Systems (IV4XR) framework (Prada et al., 2020).

Pastor Ricós, F., Marín, B., Vos, T., Davidson, J. and Hovorka, K.
Scriptless Testing for an Industrial 3D Sandbox Game.
DOI: 10.5220/0012599400003687
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 19th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2024), pages 51-62
ISBN: 978-989-758-696-5; ISSN: 2184-4895

In this paper, we extend the previous experiences

of using the IV4XR framework (Prasetya et al., 2022).

The research contributions are:

1. Advancements in Scriptless Game System Testing. This research provides insights into the components of the IV4XR framework and of scriptless testing tools, contributing to the landscape of scriptless and game testing methodologies.
2. Empirical Evaluation with an Industrial Game. Through empirical evaluation, this study demonstrates that equipping autonomous scriptless agents with sophisticated decision-making algorithms yields effective spatial coverage when testing the industrial Space Engineers game system.

These contributions are valuable for researchers and game development practitioners since they showcase how integrating autonomous exploratory agents can enable automated navigation and game testing.

The paper is structured as follows: Section 2 presents related work. Section 3 outlines the Space Engineers game and testing challenges. Section 4 details the integration within the IV4XR framework. Section 5 describes the extension of the scriptless testing tool TESTAR for games. Section 6 presents the empirical evaluation, and Section 7 concludes.

2 RELATED WORK

Unlike desktop (Pezze et al., 2018), web (García et al., 2020), and mobile (Kong et al., 2018) applications, complex 3D game systems have no widely adopted automation frameworks or tools suitable for implementing automated testing approaches.

The ICARUS framework (Pfau et al., 2017) trains Reinforcement Learning (RL) agents to complete a 2D linear adventure game. In (Rani et al., 2023), an RLBGameTester model was used to detect screen errors in a 2D Atari game. However, 2D games have simpler visuals and mechanics than 3D games. The absence of the third dimension eliminates complexities such as physics intended to simulate real-world scenarios.

Various RL approaches have been researched with diverse demo or gym-training 3D games. In (Gordillo et al., 2021), curiosity-driven RL agents move and jump in a self-crafted 3D game map to enhance spatial coverage and detect areas where players get stuck. For gym-training semi-realistic games like ViZDoom (Kempka et al., 2016), the study in (Ariyurek et al., 2022) uses RL to train agents that simulate different personas to discover alternative play-style trajectory paths. Similarly, in (Sestini et al., 2022), a curiosity and imitation RL approach is used to train agents that explore game areas while uncovering collision bugs and glitches. For Unreal Engine sample games, a pixel-based agent called Inspector (Liu et al., 2022) is employed to explore the game space using curiosity-based RL, resulting in the detection of two potential bugs.

Meanwhile, initial studies are exploring approaches to test open-source Virtual Reality (VR) Unity projects. The VRTest framework (Wang, 2022) streamlines the integration of various testing techniques using rotation, movement, and click trigger events. In (de Andrade et al., 2023), metamorphic tests and RL are used to identify collision and camera faults when moving the game character. Nevertheless, both approaches require further work to support a wider range of event types and to be evaluated with SUTs not based on Unity.

In contrast to the previously mentioned studies, Space Engineers is a complex industrial 3D sandbox game on the market that involves a wide range of functional blocks and items with diverse properties.

The RiverGame framework (Paduraru et al., 2022) uses various AI techniques to test visual, physical, and sound game aspects. The accuracy of the visual AI techniques has been validated with demo and open-source games, and the detection rate of the voice testing approach with an industrial game. In contrast, our research evaluates the effectiveness of exploring an industrial 3D sandbox game while assessing the functional aspects of in-game objects.

The Wuji framework (Zheng et al., 2019) uses evolutionary Deep RL to improve state exploration while accomplishing missions in 2D and 3D online combat games. Its effectiveness was demonstrated by detecting real bugs injected from previous versions and uncovering 3 new bugs. However, although the Wuji classes are available as open source, the project lacks the documentation necessary for seamless integration with other game systems and has had no activity since June 8, 2020.

In summary, some studies explored scriptless techniques to train RL agents for playtesting demo or gym-training games. Scriptless RL agents have been applied to real games in a few cases. Still, there is a lack of standardized open-source frameworks able to effectively identify the states of complex 3D games, which is essential to streamline test automation processes. Thus, our proposal goes beyond the state of the art in two key aspects: (i) by leveraging the IV4XR open-source framework, we establish a connection to the game environment for robust observation of internal game objects and execution of actions that control the agent using game functions; (ii) we integrate an autonomous scriptless testing agent within a real 3D sandbox game that employs intelligent run-time algorithms to simulate the exploration experiences of real players and does not require pre-executing training iterations to learn how to play the game.

ENASE 2024 - 19th International Conference on Evaluation of Novel Approaches to Software Engineering

3 SPACE ENGINEERS

Space Engineers is a sandbox 3D game developed by the companies Keen Software House and GoodAI. The game is coded in C#, started its alpha release in 2013, transitioned to beta in 2016, and was officially released in 2019. In 2022, Space Engineers had an average of 5,000 players with peaks of more than 9,000 concurrent players. Between 2013 and mid-2023, the game evolved through nearly 596 game build changes. The game is available on Steam and console platforms and has sold around 5 million units.

The Space Engineers game simulates realistic open-world 3D scenarios. As is typical of sandbox games, there is no specific objective that finishes the game. Users can explore planets in space, build the space constructions they envision, play challenging survival scenarios, or collaborate and compete with other players.

3.1 Game Mechanics

In Space Engineers, every game object has a position and orientation in a three-dimensional world, together with properties that indicate the object's name, velocity, and unique identifier inside the game environment.

The astronaut is a playable game character that allows users to be part of and interact with the game environment. The astronaut has various characteristics, such as energy, hydrogen, oxygen, and health, and capabilities like flying using the jet-pack. Because the game recreates an open space world, the astronaut can move, fly, and rotate in 3D scenarios.

The game has atomic Block objects with properties representing attributes like type, integrity, volumetric physics, mass, inertia, and velocity. These blocks can be categorized into functional and structural blocks. Functional blocks are capable of executing a task. For instance, power blocks produce energy, and life support blocks such as the medical room block restore the character's characteristics. Structural blocks do not execute a task on their own but are used to build constructions. For example, armor structural blocks are used to build the floor and walls of space stations.

Constructions are known as Grid objects. A grid can be as simple as a set of structural blocks that constitute the floor of a space station or a complex engine that extends the task capabilities of functional blocks. For example, a medical room that restores the astronaut's health can be connected with an O2/H2 generator to additionally restore the astronaut's oxygen.

https://www.spaceengineersgame.com/

To construct blocks or sustain functional tasks, the astronaut needs so-called Items, such as: Tools, used to interact with blocks and game mines; Ores, mined from planets or asteroids using drill tools; Materials, refined from ores into useful ingots; and Components, crafted from materials and required to construct blocks.

Figure 1 shows a Space Engineers scenario with a functional medical room connected to a functional O2/H2 generator via structural conveyor blocks. The O2/H2 generator can refine ice ores and supply oxygen to the connected medical room. This allows the astronaut to restore oxygen when interacting with the medical panel. However, if the integrity of the O2/H2 generator is less than 80%, the ice ores cannot be refined, and the oxygen will not be supplied.

Figure 1: Space Engineers game scenario.
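The dependency between generator integrity and oxygen supply can be phrased as a small check of the kind a functional oracle might encode. The following is a minimal Python sketch under stated assumptions: the `Generator` record, its fields, and the requirement that some ice ore is available are hypothetical illustrations and are not taken from the game code or the plugin; only the 80% threshold comes from the text.

```python
from dataclasses import dataclass

# Threshold from the text: below 80% integrity the generator cannot refine ice.
MIN_INTEGRITY_TO_REFINE = 0.80


@dataclass
class Generator:
    """Hypothetical stand-in for an O2/H2 generator block."""
    integrity: float                  # fraction in [0, 1]
    ice_ore: int                      # units of ice available (assumed requirement)
    connected_to_medical_room: bool   # linked via conveyor blocks


def supplies_oxygen(gen: Generator) -> bool:
    """Return True if the connected medical room can receive oxygen."""
    can_refine = gen.integrity >= MIN_INTEGRITY_TO_REFINE and gen.ice_ore > 0
    return can_refine and gen.connected_to_medical_room
```

A test agent could evaluate such a predicate after grinding the generator to confirm that oxygen stops being restored at the medical panel.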

Space Engineers scenarios can be launched in creative and survival modes. Creative mode makes the astronaut invulnerable: the health, oxygen, hydrogen, and energy statistics do not decrease when resources are spent. Moreover, the astronaut can build blocks without having the correct components in the inventory. In contrast, in survival mode, oxygen decreases over time, energy reduces with activity, hydrogen is spent when the jet-pack is used, and the astronaut can die if all health points are lost.

The variety of blocks and items, the diverse constructions that can be built with them, the scenario game modes, and the 3D open-world movements make Space Engineers a highly complex game to test.

3.2 Development & Manual Testing Cycle

A typical Space Engineers game development cycle takes about 3 or 4 months, depending on the extent of game changes. This cycle involves two primary teams: developers and testers. The developers design and implement new game features and fix bugs reported by the testers during the release of new game versions. As developers finalize these changes, they open Jira tickets (Fisher et al., 2013) to point testers to the features that require testing.

In turn, testers process developers' Jira tickets to verify the functionality of new or updated game features and validate potential bug fixes. Additionally, they assist the game community in verifying and documenting possible problems that users encounter when crafting specific scenarios. The testing team comprises ten members whose areas of expertise include: console port (testing features on consoles); scene and world creation (ensuring scenarios can be created, saved, and loaded); surround sound (checking sound volume in scenarios); and player support (simplifying the reproduction steps of bugs reported by community users to assist developers in resolving issues).

Testers' proficiency in understanding game mechanics from players' perspectives is crucial to accurately reproduce scenarios during sandbox game feature validation. As a result, manual testing remains essential to ensure comprehensive game testing. However, testers lack sufficient time to manually test the extensive and diverse combinations of entity interactions in the game. Manually exploring blocks and items to reach spatial coverage can be prohibitively costly and time-consuming. In light of these challenges, it becomes relevant to investigate scriptless testing as an automated solution.

3.3 Scriptless Testing

Traditional testing approaches, such as manual testing, focus on evaluating specific scenarios by interacting with and verifying the expected behaviors of SUT elements. However, these approaches may not capture the full range of interactions and emergent behaviors that can arise during gameplay. Scriptless testing techniques do not rely on explicit test case instructions. Instead, they employ algorithms to dynamically generate non-sequential actions during run-time, enabling them to explore and discover SUT objects autonomously. This introduces randomness and variability, which helps to complement traditional testing by uncovering unexpected issues and performing unanticipated combinations of interactions (Jansen et al., 2022; Bons et al., 2023).

Existing scriptless testing tools rely on different Graphical User Interface (GUI) technologies to detect the interactable elements. For example, TESTAR (Vos et al., 2021; Jansen et al., 2022) uses UIAutomation, WebDriver, Java Access Bridge, and Appium; Murphy (Aho et al., 2014) relies on UIAutomation and image recognition; GUI Driver (Aho et al., 2011) uses the Jemmy Java library; Crawljax/ATUSA (Mesbah and Van Deursen, 2009) and Webmate (Dallmeier et al., 2012) use WebDriver; GUITAR (Nguyen et al., 2014) uses Java Accessibility, WebDriver, and UNO Accessibility; AUGUSTO (Mariani et al., 2018) and AutoBlackTest (Mariani et al., 2011) use IBM Functional Tester and Selenium.

For traditional GUI applications, the aforementioned technologies suffice to identify the GUI elements that can be interacted with through keyboard and mouse inputs. However, these technologies fall short of the requirements of games. Game environments demand additional information to accurately identify their states, such as positional or orientation vectors for movements or properties associated with the objects being interacted with. Consequently, before employing scriptless testing for games, it is imperative to establish a connection between this testing approach and technologies capable of detecting the interactive elements within a game. We used the IV4XR framework described below to establish this connection.

4 SPACE ENGINEERS IN THE iv4XR FRAMEWORK

IV4XR is a Java framework with a plugin architecture that provides a set of interfaces that can be implemented to connect, get information, and interact with game objects. The Entity interface represents the existing game objects and their properties. The Environment interface allows connecting with game scenarios, observing the defined entities, and defining the actions that can be executed in the game. The agent can be any automated software testing tool that uses the environment to connect with the game and takes the role of a playable entity to observe the game entities, execute game actions, and apply test oracles.
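The division of roles between Entity, Environment, and agent can be illustrated with a small sketch. The real IV4XR interfaces are written in Java; the Python classes below are a hypothetical analogue, and every method name, signature, and the toy `grind` action are assumptions made only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Entity:
    """A game object and its observable properties."""
    entity_id: str
    properties: Dict[str, float] = field(default_factory=dict)


class Environment(ABC):
    """Connection to a game: observe entities and execute actions."""

    @abstractmethod
    def observe(self) -> List[Entity]: ...

    @abstractmethod
    def execute(self, action: str, target_id: str) -> None: ...


class InMemoryEnvironment(Environment):
    """Toy environment used only to exercise the interfaces."""

    def __init__(self, entities: List[Entity]) -> None:
        self._entities = {e.entity_id: e for e in entities}

    def observe(self) -> List[Entity]:
        return list(self._entities.values())

    def execute(self, action: str, target_id: str) -> None:
        # The only action this toy game knows: grinding lowers integrity.
        if action != "grind":
            raise ValueError(f"unknown action: {action}")
        entity = self._entities[target_id]
        entity.properties["integrity"] = max(0.0, entity.properties["integrity"] - 0.5)


class Agent:
    """A testing tool that observes and applies oracles through the environment."""

    def __init__(self, env: Environment) -> None:
        self.env = env

    def violations(self, oracle: Callable[[Entity], bool]) -> List[str]:
        """Return the ids of observed entities for which the oracle fails."""
        return [e.entity_id for e in self.env.observe() if not oracle(e)]
```

The point of the design is that the agent depends only on the abstract `Environment`, so the same testing tool could be attached to a different game by supplying another implementation.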

These IV4XR interfaces streamline the development of game plugins. The development of the IV4XR Space Engineers-plugin enables access to the internal data and functions of the game. In Space Engineers, the agent takes the role of the astronaut.

The Space Engineers-plugin consists of server and client components. The server-side is implemented in C#, has 8462 lines of code (LOC), and allows the connection with the game by defining the properties and controller functions of game objects. The client-side is written in Kotlin to ensure interoperability between the C# game and the Java IV4XR framework, has a size of 17671 LOC, and provides classes that grant access to game data and logical functions like navigation for the agents. Figure 2 shows an overview of the plugin classes, which are discussed below.

https://github.com/iv4xr-project/iv4xr-se-plugin

[Figure 2 diagram: through the Space Engineers iv4XR plugin, the agent can Observe (obtain the entities' information) and perform an Action (move, rotate, or use a tool object). Classes shown: Pose (Position, OrientationForward, OrientationUp); Entity (Name, Velocity); Block (Type, Integrity, Size, Functional, …); CubeGrid (List<Block> 1..n, Mass, Parked); Character (Health, Oxygen, SuitEnergy, …); Observation(range) (Character, List<CubeGrid> 0..n); ItemsController (SetToolbar, Equip, …); CharacterController (MovementDirection, RotationDirection, TurnOnJetpack, BeginUsingTool, …); BlockPlacer (PlaceBlock, …).]

Figure 2: Space Engineers-plugin overview.

4.1 Space Engineers-Plugin Entities

In the Space Engineers environment, each game object is represented as an Entity that stands in a specific Pose. The Pose denotes the object's position and orientation within the game, and the Entity properties indicate each object's identifier, name, and velocity.

Block extends Entity with properties that indicate the block's type, integrity, and size, together with an attribute that indicates if the block is of the functional category (e.g., power block or medical room). CubeGrid contains the list of blocks that compose a grid (e.g., a spaceship grid is composed of cockpit, thruster, and power blocks), and properties representing the grid mass and whether the grid is parked (e.g., a spaceship is parked or is being controlled). The Character entity extends Entity properties with the astronaut's characteristics of health, oxygen, energy, etc.
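The entity hierarchy described above maps naturally onto a few record types. The sketch below mirrors the property names mentioned in the text and in Figure 2, but it is a hypothetical Python rendering: the actual plugin classes are written in Kotlin and C#, and the field types (for example, representing size as a 3D vector) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pose:
    position: Vec3
    orientation_forward: Vec3
    orientation_up: Vec3


@dataclass
class Entity:
    entity_id: str
    name: str
    velocity: Vec3
    pose: Pose


@dataclass
class Block(Entity):
    block_type: str
    integrity: float
    size: Vec3          # assumed representation
    functional: bool    # True for e.g. power blocks or medical rooms


@dataclass
class CubeGrid(Entity):
    blocks: List[Block] = field(default_factory=list)
    mass: float = 0.0
    parked: bool = False


@dataclass
class Character(Entity):
    health: float = 1.0
    oxygen: float = 1.0
    suit_energy: float = 1.0


def functional_blocks(grid: CubeGrid) -> List[Block]:
    """Return the blocks of a grid that belong to the functional category."""
    return [b for b in grid.blocks if b.functional]
```

A helper such as `functional_blocks` (also hypothetical) shows how an agent might filter a grid for the blocks worth interacting with.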

4.2 Space Engineers-Plugin Observation

The agent connects with the Character through the Space Engineers-plugin to Observe the Space Engineers environment. The Character is always present. The presence of CubeGrid and Block entities in an observation depends on a configurable observation range of the agent and its distance from the game objects. Figure 3 shows how the observation range, a 3D sphere, works in the Space Engineers environment. The agent observes itself, the main platform grid, and one spaceship grid. As we have explained, the grids are composed of a set of block entities. The spaceship grid is composed of a cockpit block, thruster block, power block, etc. The platform grid is composed of a group of structural ArmorBlock entities representing the floor of the scenario, together with functional MedicalBlock and PowerBlock entities.
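The spherical observation range amounts to a distance filter around the agent. The following minimal sketch assumes that an entity counts as observed when its position lies within the configured radius; the function name and the dictionary-based input are hypothetical, and the plugin may apply a different inclusion rule (for instance, based on grid bounds).

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def observe_in_range(agent_pos: Vec3, entities: Dict[str, Vec3], radius: float) -> List[str]:
    """Return the names of entities located inside the observation sphere."""
    return sorted(
        name
        for name, pos in entities.items()
        if math.dist(agent_pos, pos) <= radius  # Euclidean distance in 3D
    )
```

With the agent at (0, 0, 5) and a radius of 20, a platform at the origin and a ship 10 units away would be observed, while a grid 100 units away would not.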

4.3 Space Engineers-Plugin Actions

The Space Engineers-plugin allows the agent to control the Character to interact with the game. The plugin controllers call internal game functions to move or rotate the character, turn on/off the jet-pack, equip/unequip an item, place a block, etc. To invoke these controls, the agent executes commands.

Observing the entities' data within the game environment and executing commands by invoking game controls make the IV4XR Space Engineers-plugin a more robust approach than keyboard and mouse inputs or visual recognition tools, which lack access to the internal game data. However, individual commands are insufficient to accomplish even simple tasks such as grinding a block. It is necessary to group sequences of commands into actions. For example, an action to grind a block is composed of the commands: find the grinder tool, equip the grinder, aim at the block, start using the grinder, and stop using the grinder.
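The grouping of commands into an action can be sketched as an ordered list of callables. The controller below only records the calls it receives; its method names are hypothetical and are not the plugin's API, although they loosely echo the controller functions listed in Figure 2.

```python
from typing import Callable, List, Tuple

# A command is a label plus a zero-argument callable that performs it.
Command = Tuple[str, Callable[[], None]]


class RecordingController:
    """Hypothetical controller that records calls instead of driving a game."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def find_tool(self, tool: str) -> None:
        self.calls.append(f"find {tool}")

    def equip(self, tool: str) -> None:
        self.calls.append(f"equip {tool}")

    def aim_at(self, block_id: str) -> None:
        self.calls.append(f"aim {block_id}")

    def begin_using_tool(self) -> None:
        self.calls.append("begin use")

    def end_using_tool(self) -> None:
        self.calls.append("end use")


def grind_block_action(ctrl: RecordingController, block_id: str) -> List[Command]:
    """Compose the five commands named in the text into one grind action."""
    return [
        ("find grinder", lambda: ctrl.find_tool("grinder")),
        ("equip grinder", lambda: ctrl.equip("grinder")),
        ("aim at block", lambda: ctrl.aim_at(block_id)),
        ("start grinding", ctrl.begin_using_tool),
        ("stop grinding", ctrl.end_using_tool),
    ]


def execute(action: List[Command]) -> None:
    """Run the commands of an action in order."""
    for _label, run in action:
        run()
```

Keeping the action as data makes it possible to log, replay, or reorder the individual commands when a test sequence has to be reproduced.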

4.4 Space Engineers-Plugin Navigation

Moving the agent within the game is a complex task, as it requires the agent to perceive which positions are obstructed or walkable in order to determine a path of positions that reaches the desired entity. In the IV4XR framework, this functionality is known as navigation.

Some game engine platforms, such as Unity, can automatically build a navigation mesh that contains the walkable positions of the game by using the virtual objects' geometry. Pathfinding algorithms can then optimize the traversal of these navigable mesh nodes to reach a desired position (Cui and Shi, 2011).

https://docs.unity3d.com/es/Manual/Navigation.html

[Figure 3 diagram: the agent (Character) inside its spherical observation range observes the platform grid (ArmorBlock, MedicalBlock, PowerBlock, …) and a spaceship grid (CockpitBlock, ThrusterBlock, PowerBlock, …). In the resulting TESTAR State tree, the SUT contains the agent (Position, Orientation, Oxygen, Hydrogen, Jetpack, etc.), the platform and the ship (Position, Mass, Name, Parked, etc.), and their blocks such as power, medical, and gravity (Position, Orientation, Integrity, Size, Functional, etc.).]

Figure 3: Space Engineers observed environment by the agent.

The IV4XR framework provides an A* pathfinding algorithm to efficiently find the best path between two nodes within a navigation mesh. Nonetheless, the initial version of the Space Engineers-plugin could not create a default navigation mesh (Prasetya et al., 2022). Instead, it constructed a navigation graph on-the-fly using the geometry information of observed entities. This approach has drawbacks: it incurs a significant time cost after each game exploration movement and is not robust in three-dimensional space, where the agent could dynamically change its orientation.

To enhance game navigation, the Space Engineers team introduced automatic graph calculation for each CubeGrid entity. This calculation generates a list of positions the agent could reach without obstructions. In Figure 4, the agent observes the navigable positions, allowing it to create actions with a path of movement commands to reach the interactive entities.

[Figure 4 image: a game scene with the labels MedicalBlock, PowerBlock, CockpitBlock, and Free Spaces to Build.]

Figure 4: Navigable actions to reach interactive entities.
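To make the navigation step concrete, the sketch below runs A* over a set of unobstructed positions. It is a simplified illustration and not the IV4XR implementation: positions are reduced to 2D integer cells, moves are restricted to the four axis-aligned neighbours with unit cost, and the Manhattan distance serves as the heuristic. All of these choices are assumptions.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]


def a_star(walkable: Set[Cell], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Return a shortest path of walkable cells from start to goal, or None."""
    if start not in walkable or goal not in walkable:
        return None

    def heuristic(cell: Cell) -> int:
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier: List[Tuple[int, int, Cell]] = [(heuristic(start), 0, start)]
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    cost: Dict[Cell, int] = {start: 0}

    while frontier:
        _, g, current = heapq.heappop(frontier)
        if g > cost[current]:
            continue  # stale queue entry, a cheaper route was already found
        if current == goal:
            path: List[Cell] = []
            node: Optional[Cell] = current
            while node is not None:  # walk back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in walkable:
                continue
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

In this picture, the list of reachable positions computed per CubeGrid plays the role of `walkable`, and a returned path would be translated into a sequence of movement commands.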

5 TESTAR FOR GAME TESTING

Once the IV4XR Space Engineers-plugin resolves the technical prerequisites necessary for detecting the state, this capability must be integrated into a scriptless automation tool. We select the TESTAR tool (Vos et al., 2021) because its integration with the IV4XR framework had already been initiated (Pastor Ricós, 2022; Prasetya et al., 2022), making it a fitting choice to continue the integration efforts.

TESTAR is an open-source tool for scriptless GUI testing that automatically obtains the state of desktop, web, and mobile applications, derives and executes GUI interactions such as click, type, or drag, and applies oracles to check if the system responds correctly. To be able to interact with the Space Engineers game, TESTAR has been extended to integrate the IV4XR Space Engineers-plugin from Figure 2.

TESTAR launches the Space Engineers game as a Windows executable, then connects to it using the Space Engineers-plugin and loads the desired scenario. Subsequently, it starts a cyclic flow that can generate multiple test sequences of various actions until a STOP condition is met (e.g., a maximum number of actions has been performed). The operational flow steps of the TESTAR agent are shown in Figure 5:

• First, it observes the entities that constitute the game state.
• Second, it derives all the available actions that can be executed for each entity.
• Third, based on the available derived actions, it decides what to do next by selecting one of the derived actions using an Action Selection Mechanism (ASM).
• Fourth, it executes the selected action and applies a series of oracles to verify the robustness of the system and the functional aspects of the game entities.

TESTAR has a Java class called protocol that

contains the methods corresponding to each of these

four steps. A tester can, for example, change the ASM

by plugging a different one into that Java protocol.
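The four-step cycle and the pluggable ASM can be sketched as a small loop. This is an illustrative Python analogue and not TESTAR's Java protocol: the `ToyGame` class, the single `grind` action, and the oracle are hypothetical, chosen only so that the loop is runnable.

```python
import random
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]        # block name -> integrity in [0, 1]
Action = Tuple[str, str]        # (verb, block name)
ASM = Callable[[List[Action], random.Random], Action]


class ToyGame:
    """Hypothetical SUT: two blocks that lose integrity when ground."""

    def __init__(self) -> None:
        self.blocks: State = {"MedicalBlock": 1.0, "PowerBlock": 1.0}

    def observe(self) -> State:
        return dict(self.blocks)

    def grind(self, name: str) -> None:
        self.blocks[name] = max(0.0, self.blocks[name] - 0.25)


def derive_actions(state: State) -> List[Action]:
    """One grind action per block that still has integrity left."""
    return [("grind", name) for name, integrity in state.items() if integrity > 0.0]


def random_asm(actions: List[Action], rng: random.Random) -> Action:
    return rng.choice(actions)


def integrity_oracle(before: State, after: State, action: Action) -> bool:
    """Grinding a block must strictly decrease its integrity."""
    return after[action[1]] < before[action[1]]


def run(game: ToyGame, asm: ASM, max_actions: int, seed: int = 0) -> Tuple[int, List[Action]]:
    rng = random.Random(seed)
    failures: List[Action] = []
    executed = 0
    while executed < max_actions:            # STOP condition
        state = game.observe()               # 1. observe the state
        actions = derive_actions(state)      # 2. derive available actions
        if not actions:
            break
        action = asm(actions, rng)           # 3. select with the ASM
        game.grind(action[1])                # 4. execute ...
        executed += 1
        if not integrity_oracle(state, game.observe(), action):
            failures.append(action)          # ... and apply the oracle
    return executed, failures
```

Because the ASM is passed in as a function, replacing `random_asm` with another strategy, for example `lambda actions, rng: actions[0]`, changes the exploration behaviour without touching the loop.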

5.1 TESTAR Agent: Game State

The TESTAR agent employs the Space Engineers-plugin to actively Observe all the game entities that reside in the observation range area. Each game entity contains a set of the aforementioned properties, such as position and orientation for all entities; health and oxygen for the Character entity; and type and integrity for Block entities. Together, these observed entities and their properties constitute the game state.


[Figure 5 diagram: the Scriptless Exploratory Test Agent connects to the game through the iv4XR Framework Core and the se-plugin and cycles through four steps. A. Observe all available entities (state), e.g., MedicalBlock, PowerBlock, and CockpitBlock with position, size, integrity, etc., plus the unobstructed floor spaces. B. Derive all available actions: navigate to entities and interact (reach + grinder, reach + welder, and reach + heal, shoot, or enter, depending on the block) and navigate to explore positions. C. Action Selection Mechanism: decide what to do next by selecting one of all possible actions (9 interaction actions and exploration positions 1 to 30). D. Execute the action and apply game oracles: verify entity properties (integrity increases or decreases correctly) and astronaut properties (health, oxygen, and energy increase correctly; jetpack properties are adequate; construction with materials works correctly). Finally, generate results.]

Figure 5: TESTAR operational flow with Space Engineers.

5.2 TESTAR Agent: Derived Actions and Navigation

Depending on the type and other entity properties, the TESTAR agent derives all the available actions that can be executed for each entity. Examples are grinding or welding blocks other than the structural ArmorBlock entities, or interacting with functional blocks such as MedicalBlock or CockpitBlock to restore oxygen.

A distinctive difference between testing traditional GUI software and games is the need to reach the desired entity before interacting with it, as well as to explore through states where no functional blocks exist in the observed area to potentially discover new game entities. While the Space Engineers-plugin's navigation graph and the A* pathfinding algorithm provided by the IV4XR framework facilitate finding the optimal movement path between initial and destination nodes, a decision-making step is still needed at the top level to determine which actions to derive and select during the exploration process.

To reach the Block it wants to interact with, the TESTAR agent uses the navigation capabilities of the Space Engineers-plugin to observe the unobstructed floor spaces and calculates whether there is a navigable path of positions that leads to the Block. If so, TESTAR derives an action that navigates the path, rotates to aim at the Block, and interacts with the Block using a Tool. If the Block is not reachable, TESTAR does not derive any interaction action for it.
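The reachability check described above can be illustrated with a small grid-based A* search. This is only a sketch under simplifying assumptions: the floor is modelled as a set of 4-connected 2D cells, and the names `find_path` and `free` are hypothetical. It is not the IV4XR or Space Engineers-plugin implementation, which works on the plugin's own navigation graph.

```python
import heapq


def find_path(free, start, goal):
    """A* over 4-connected grid cells.

    `free` is a set of unobstructed (x, y) cells (an assumption of this sketch).
    Returns the list of cells from start to goal, or None if goal is unreachable.
    """
    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    if start not in free or goal not in free:
        return None

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale heap entry, a cheaper route was already found
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in free:
                continue  # wall or outside the observed floor
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


# A 3x3 floor with a two-cell wall that forces a detour.
floor = {(x, y) for x in range(3) for y in range(3)} - {(1, 0), (1, 1)}
detour = find_path(floor, (0, 0), (2, 0))
blocked = find_path(floor - {(1, 2)}, (0, 0), (2, 0))
```

When `find_path` returns `None`, the agent in this sketch would simply not derive an interaction action for the corresponding Block.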

To potentially discover new game entities, the TESTAR agent derives not only actions that interact with observed reachable Blocks but also actions that explore unobstructed positions. To do this, TESTAR's protocol has been extended so that, after deriving all available interaction actions with reachable Blocks, it also derives all available exploration actions to unobstructed positions.
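A minimal sketch of this two-step derivation is shown below. It assumes a 2D grid of floor cells and a block being reachable when at least one adjacent floor cell is reachable. The `Action` class, the `INTERACTIONS` table (whose entries mirror the labels in Figure 5), and `derive_actions` are hypothetical names for illustration and do not correspond to TESTAR's Java protocol.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str       # "interact" or "explore"
    target: tuple   # grid cell of the block or of the position to explore
    tool: str = ""  # interaction name; empty for exploration actions


# Hypothetical mapping from block type to interactions, after Figure 5.
INTERACTIONS = {
    "MedicalBlock": ("grinder", "welder", "heal"),
    "PowerBlock": ("grinder", "welder", "shoot"),
    "CockpitBlock": ("grinder", "welder", "enter"),
}


def neighbours(cell):
    x, y = cell
    return ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))


def reachable_cells(free, start):
    """Breadth-first search over unobstructed cells starting at the agent."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in neighbours(queue.popleft()):
            if nxt in free and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def derive_actions(free, agent, blocks):
    """Derive interaction actions first, then exploration actions."""
    reach = reachable_cells(free, agent)
    actions = []
    for cell, block_type in blocks.items():
        # Unreachable blocks produce no interaction actions at all.
        if any(n in reach for n in neighbours(cell)):
            for tool in INTERACTIONS.get(block_type, ()):
                actions.append(Action("interact", cell, tool))
    for cell in sorted(reach - {agent}):
        actions.append(Action("explore", cell))
    return actions


corridor = {(0, 0), (1, 0), (2, 0)}
observed = {(3, 0): "MedicalBlock", (5, 5): "PowerBlock"}
derived = derive_actions(corridor, (0, 0), observed)
```

In this toy corridor, only the MedicalBlock is adjacent to a reachable cell, so the isolated PowerBlock contributes no actions.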

5.3 TESTAR Agent: Action Selection Mechanism

After deriving all available actions, the TESTAR agent uses, by default, a random ASM to decide which action to execute next. Although random ASMs have proven practical for traditional software (Vos et al., 2021), exploring 3D sandbox games requires research into more sophisticated ASMs.

Let us consider the example in Figure 5. First, the TESTAR agent observes 3 functional blocks (Cockpit, Medical, and Power) and derives 3 different navigate-and-interact actions for each block, which yields a total of 9 navigate-and-interact actions with functional blocks. Second, because there are 30 unobstructed positions in the observation area (e.g., imagine there are 30 purple dots), the TESTAR agent derives another 30 available exploration actions.

A random ASM therefore has less than a 25% probability of selecting one of the 9 available interaction actions out of the 39 total actions, and more than a 75% probability of selecting an exploration action. Moreover, within the set of available exploration actions, selecting remote areas that remain unexplored can help the TESTAR agent discover new entities. To enhance the exploration of unexplored areas, we have developed the so-called Interactive Explorer ASM depicted in Algorithm 1.
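For reference, the two percentages follow directly from the action counts in the example, assuming the random ASM picks uniformly among the 39 derived actions:

```latex
P(\text{interaction}) = \frac{9}{9 + 30} = \frac{9}{39} \approx 0.231,
\qquad
P(\text{exploration}) = \frac{30}{39} \approx 0.769
```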

Algorithm 1: Interactive Explorer ASM.

Data: interacted ▷ List of the interacted entities
Data: explored ▷ Area of explored positions
Data: actions ▷ All available state-actions
1 if actions contains entities that were not interacted then
2     nearEntity ← nearestEntity(actions);
3     a ← select to navigate and interact with the nearEntity;
4     save nearEntity as interacted;
5 else if actions contains positions that were not explored then
6     remotePos ← remotePosition(actions);
7     a ← select to navigate to explore the remotePos;
8     save remotePos as explored;
9 else
10     a ← random selection from all actions;
11 end
12 return a ▷ Return the selected action

The Interactive Explorer ASM tracks a list of interacted entities and an area containing the explored positions. First, the ASM checks whether the set of available actions contains an action that interacts with a non-interacted entity (line 1). In that case, because there can be several non-interacted entities, it prioritizes the entity nearest to the agent (nearEntity) (line 2). The ASM then selects the action that navigates to and interacts with the nearEntity (line 3), saves this nearEntity as interacted so that it is not prioritized in subsequent iterations (line 4), and finally returns the selected action (line 12).

Second, the ASM checks whether the set of available actions contains an action that explores a position outside the explored area (line 5). If so, because there can be several unexplored positions, it prioritizes the position that is remote from the agent's position (remotePos) (line 6). The ASM then selects the action that navigates to and explores the remotePos (line 7) and adds the position to the explored area so that other unexplored positions are favored in subsequent iterations (line 8). Finally, the ASM returns the selected action (line 12). If the actions contain neither a non-interacted entity nor a non-explored position (line 9), the ASM selects an action at random (line 10) and returns it (line 12).
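The three-tier priority of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading, not the authors' Java code: it assumes Euclidean distance, interprets the remote position as the one farthest from the agent, and simplifies the explored area to a set of exact positions. The `Action` fields and the `select` method are hypothetical names.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str        # "interact" or "explore"
    target_id: str   # entity identifier or position label
    position: tuple  # (x, y) location of the entity or position


class InteractiveExplorerASM:
    def __init__(self, rng=None):
        self.interacted = set()  # identifiers of entities already interacted with
        self.explored = set()    # positions already chosen for exploration
        self.rng = rng or random.Random()

    def select(self, actions, agent_pos):
        # Lines 1-4: prefer the nearest entity that was not interacted with yet.
        fresh = [a for a in actions
                 if a.kind == "interact" and a.target_id not in self.interacted]
        if fresh:
            chosen = min(fresh, key=lambda a: math.dist(agent_pos, a.position))
            self.interacted.add(chosen.target_id)
            return chosen
        # Lines 5-8: otherwise prefer the most remote unexplored position.
        unexplored = [a for a in actions
                      if a.kind == "explore" and a.position not in self.explored]
        if unexplored:
            chosen = max(unexplored, key=lambda a: math.dist(agent_pos, a.position))
            self.explored.add(chosen.position)
            return chosen
        # Lines 9-10: fall back to a uniformly random action.
        return self.rng.choice(actions)


asm = InteractiveExplorerASM(random.Random(0))
available = [
    Action("interact", "medical", (5, 0)),
    Action("interact", "power", (2, 0)),
    Action("explore", "p1", (1, 0)),
    Action("explore", "p2", (9, 0)),
]
picks = [asm.select(available, (0, 0)).target_id for _ in range(4)]
```

Calling `select` repeatedly with the same action set first exhausts the entities (nearest first), then the positions (farthest first), and only then falls back to random selection.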

Different ASMs can be configured in the Java protocol of TESTAR. This way, game testers can adjust the agent's decision-making to the requirements of different Space Engineers scenarios or testing objectives.

5.4 TESTAR Agent: Oracles

TESTAR integrates generic oracles intended to verify the robustness of the SUT: they detect whether the process has crashed or hung, or whether the state elements or debugging logs contain suspicious exception messages. Although these generic oracles are a good starting point for automated scriptless testing, for Space Engineers it is of paramount importance to also test the functional aspects of the game entities.

Examples of such oracles are checking that the integrity of all blocks decreases after grinding or shooting and increases after welding; that the agent's health, oxygen, hydrogen, and energy are restored when interacting with medical rooms or cockpits; or that the jet-pack and the dampeners are not enabled automatically, without player activation, after entering a cockpit or medical room or interacting with a ladder.
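The block-integrity oracle can be sketched as a comparison of an entity's state before and after an action. This is a hypothetical illustration: `BlockState`, `check_integrity_oracle`, and the tool names are assumptions and not TESTAR or IV4XR identifiers, and the sketch ignores edge cases such as welding a block that is already at full integrity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockState:
    block_id: str
    integrity: float


def check_integrity_oracle(before, after, tool):
    """Return None when the oracle passes, or a failure message otherwise."""
    if tool in ("grinder", "shoot"):
        passed = after.integrity < before.integrity
        expected = "decrease"
    elif tool == "welder":
        passed = after.integrity > before.integrity
        expected = "increase"
    else:
        raise ValueError(f"no integrity oracle defined for tool {tool!r}")
    if passed:
        return None
    return (f"block {before.block_id}: expected integrity to {expected} after "
            f"{tool}, but it went from {before.integrity} to {after.integrity}")


ok_verdict = check_integrity_oracle(
    BlockState("b1", 100.0), BlockState("b1", 80.0), "grinder")
bad_verdict = check_integrity_oracle(
    BlockState("b1", 100.0), BlockState("b1", 100.0), "grinder")
```

A non-`None` verdict would be reported as a functional failure alongside the generic crash and log oracles.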

These oracles have been studied by Prasetya et al. (2022). In this paper, we apply oracles that validate the integrity of blocks, but we mainly focus on evaluating the effectiveness of the ASMs' exploration.

6 SCRIPTLESS GAME TESTING EVALUATION

To assess the efficacy of scriptless testing for exploring the Space Engineers game, we evaluate the potential benefits of investing time and effort in developing ASMs with more sophisticated exploration techniques. To this end, we quantitatively measure the spatial coverage of discovered and interacted entities and of navigated positions within a randomly generated scenario. To guide our study, we have formulated a research question and a null hypothesis:

RQ: How effective is spatial exploration in the Space Engineers game when using different TESTAR ASMs?

H0: The Interactive Explorer ASM is not more effective than a random ASM.

We designed a controlled experiment based on Wohlin's guidelines (Wohlin et al., 2012) and a methodological framework specifically built to evaluate software testing techniques (Vos et al., 2012).

The experiment consists of running the default random ASM and the Interactive Explorer ASM from Algorithm 1, which makes more intelligent decisions by prioritizing the interaction with newly discovered blocks and the exploration of remote unexplored areas. Each trial measures the spatial coverage of discovered Space Engineers blocks and floor positions.

6.1 Space Engineers Generated Scenario

A randomly generated Space Engineers scenario was used to ensure that the evaluated ASMs were not biased. The scenario consists of a 100x100 map with 8157 navigable positions and obstructive walls that TESTAR must navigate around to reach interactive blocks. The interactive blocks are placed randomly, following a uniform distribution, in various reachable parts of the map. From the 62 blocks specified by the company as fundamental for manual testing, we chose 16 types of 1x1 blocks that were allowed to be placed during the random scenario creation. Gravity blocks are also included to simulate gravity, resulting in 313 functional blocks of 17 different types.

6.2 Independent Variables

To focus on the exploratory capabilities of the ASMs and prevent the agent from dying, we load the generated scenario in creative mode.

The TESTAR agent can use diverse tools and weapons to test the integrity of blocks. However, since this study focuses on spatial exploration, we applied the blocking principle (Wohlin et al., 2012) to limit the TESTAR agent's interactions to two: using a grinder tool, which verifies that the integrity of functional blocks decreases, and shooting one bullet at gravity blocks, which reduces their integrity without destroying their gravitational functionality. We also limited the observation range through the Space Engineers-plugin to encourage exploring and discovering new blocks.

6.3 Dependent Variables

To answer our research question, we measured the number of discovered and interacted blocks and the number of observed and walked positions. The Space Engineers game stores the scenario data in local XML files. This data contains information about the floor positions and existing blocks, and we compare it with real-time observations made during the exploration process to obtain the spatial coverage. Using this data, we can generate a 2D map highlighting the covered space.
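As a rough sketch of how such a coverage figure can be computed, the function below takes the set of items that exist in the scenario (for example, floor positions already parsed from the scenario files) and the set of items covered during a run, and returns the covered percentage. The function name and the toy data are assumptions for illustration; parsing the Space Engineers XML files is out of scope here.

```python
def coverage_percent(covered, existing):
    """Percentage of `existing` items that also appear in `covered`.

    Items observed at run time that do not exist in the scenario data
    are ignored, so the result never exceeds 100.
    """
    existing = set(existing)
    if not existing:
        return 0.0
    return 100.0 * len(existing & set(covered)) / len(existing)


# Toy scenario with four floor positions, two of which were walked.
scenario_positions = {(0, 0), (0, 1), (1, 0), (1, 1)}
walked_positions = {(0, 0), (0, 1), (9, 9)}
walked_coverage = coverage_percent(walked_positions, scenario_positions)
```

The same function applies unchanged to observed positions, observed blocks, and interacted blocks, since each metric is a ratio of covered items to existing items.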

Figure 6 shows an example of one exploratory sequence of 500 actions in the experimental map. Regarding navigable positions, yellow squares represent floor positions, red squares obstructive walls, blue circles observed areas, and green dots walked positions. Regarding interactive blocks, magenta dots represent blocks that were not observed, pink dots blocks that were observed but not interacted with, and orange dots blocks that were interacted with.

Figure 6: Space Engineers spatial coverage map.

6.4 Design of the Experiment

We evaluate the random and the Interactive Explorer ASMs by executing an exploration of 500 actions on the generated scenario. We repeated the exploration process 30 times for each ASM. For each new execution, we reload the same initial conditions on the same Windows machine with 8 CPU cores and 16 GB of RAM. We obtain independent spatial coverage metrics for each execution and accumulative spatial coverage metrics over the 30 executions of each ASM.
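The difference between independent and accumulative metrics can be made concrete with a small sketch. It assumes each run is summarized as the set of items it covered, and that accumulative coverage is the union of those sets over consecutive runs; `accumulative_coverage` is a hypothetical helper, not part of the replication package.

```python
def accumulative_coverage(runs, existing):
    """Return the cumulative coverage percentage after each run.

    `runs` is a list of sets (items covered in each independent run) and
    `existing` is the set of items present in the scenario.
    """
    existing = set(existing)
    if not existing:
        raise ValueError("the scenario must contain at least one item")
    seen = set()
    curve = []
    for covered in runs:
        seen |= set(covered) & existing  # items count once, however often covered
        curve.append(100.0 * len(seen) / len(existing))
    return curve


# Three toy runs over a scenario with four blocks.
blocks = {"b1", "b2", "b3", "b4"}
run_results = [{"b1", "b2"}, {"b2", "b3"}, {"b3"}]
curve = accumulative_coverage(run_results, blocks)
```

Each individual run here covers at most 50% of the blocks, while the accumulated curve grows to 75% and then flattens once later runs stop contributing new items.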

6.5 Results

We first present the spatial coverage achieved in the 30 independent runs. Next, we compare the accumulative spatial coverage of the two ASMs. Finally, we use the Wilcoxon test to determine whether there is a significant difference between the ASMs. The experiments were performed in Space Engineers v201.14. The replication package can be found at https://doi.org/10.5281/zenodo.10683676.

Figure 7 shows the results for the observed and interacted blocks. Each line represents one of the 30 independent runs. The random ASM achieved a coverage ranging from 8% to 30% for observed blocks and from 3% to 8% for total interacted blocks. In contrast, the Interactive Explorer ASM achieved a coverage ranging from 54% to 82% for observed blocks and from 52% to 77% for total interacted blocks.

Figure 7: Observed and interacted blocks coverage.

Figure 8 shows the results for the observed and walked floor positions. The random ASM achieved a coverage ranging from 7% to 30% for observed positions and from 4% to 12% for walked positions. In comparison, the Interactive Explorer ASM achieved a coverage ranging from 54% to 77% for observed positions and from 21% to 24% for walked positions.

Figure 8: Observed and walked ﬂoor positions coverage.

Figure 9 compares the accumulative spatial coverage achieved by both ASMs over the 30 exploratory runs. The Interactive Explorer ASM achieves over 95% of observed and interacted blocks and of observed floor positions at around 2000 executed actions, corresponding to the combination of 4 independent executions. At the end of the 30 exploratory runs, it reached 88% of walked positions. In contrast, the random ASM requires over 5000 actions to observe 50% of the existing blocks and floor positions and over 12000 actions to interact with 50% of the blocks, and it achieves only about 54% of walked floor positions at the end of the 30 runs. Because the blocks were distributed uniformly at random when creating the experimental map, the curves for observed blocks and observed positions grew similarly during the exploration.

Figure 9: Accumulative spatial coverage comparison.

The Interactive Explorer ASM outperforms the random ASM by prioritizing interaction with newly observed blocks and calculating efficient routes to unexplored floor areas. Table 1 shows the Wilcoxon test results used to verify that there is a significant difference between the two ASMs. We extracted values from the 30 different runs after executing 100, 300, and 500 actions, which means we test for a significant difference at 3 different moments of the exploratory process. For the observed and interacted blocks and for the observed and walked positions, the Wilcoxon test yields a p-value of less than 0.05, indicating that the Interactive Explorer ASM is statistically superior to the random ASM. This allows us to reject H0 and confirm that investing time and effort in developing intelligent ASMs benefits TESTAR's exploration effectiveness.

Table 1: Wilcoxon test p-values (significant difference).

Executed actions   Observed Blocks   Interacted Blocks   Observed Positions   Walked Positions
100 actions        p=1.730e-06       p=1.718e-06         p=1.734e-06          p=1.733e-06
300 actions        p=1.732e-06       p=1.734e-06         p=1.734e-06          p=1.732e-06
500 actions        p=1.730e-06       p=1.729e-06         p=1.734e-06          p=1.734e-06
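To make the test concrete, the sketch below implements a two-sided Wilcoxon signed-rank test with the normal approximation, using only the Python standard library. The paper does not state which Wilcoxon variant or tool was used, so the paired signed-rank form, the lack of a continuity correction, and the function name are assumptions of this sketch. The coverage values are synthetic and do not come from the replication package.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped, tied absolute differences receive
    average ranks, and the variance includes the usual tie correction.
    Returns (W_plus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        group = j - i + 1
        tie_term += group ** 3 - group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, p_value


# Synthetic coverage percentages for 30 paired runs (not the paper's data).
explorer_runs = [60.0 + i for i in range(30)]
random_runs = [10.0 + 0.5 * i for i in range(30)]
w_plus, p_value = wilcoxon_signed_rank(explorer_runs, random_runs)
```

With 30 pairs that all differ in the same direction and have no ties, this approximation yields a p-value of roughly 1.73e-06, which is the smallest value it can produce for 30 pairs.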

6.6 Threats to Validity

We discuss some threats to validity according to

(Wohlin et al., 2012; Ralph and Tempero, 2018).


Construct Validity. For the exploratory evaluation, we use the information from the Space Engineers scenario to design the concept of spatial coverage. Then, we use this data to measure the effectiveness of the random and Interactive Explorer ASMs. Although this spatial coverage is a self-designed benchmark, the metrics come from the Space Engineers game's data.

Content Validity. For the exploratory evaluation, the spatial coverage measures the existing blocks and floor positions over a 2D scenario space. Still, there are various types of blocks, and the game environment allows 3D motions. Although more sophisticated spatial coverage metrics can be researched in the future, the obtained 2D metrics allow us to measure the effectiveness of the exploratory ASMs. Moreover, while we did not encounter any bugs related to the integrity of the functional blocks used, it is important to emphasize that our solution effectively covered the scenario space. The lack of failure detection may be attributed to the random distribution of the test scenario or the absence of issues in the types of blocks utilized.

Internal Validity. For the exploratory evaluation, we launched the Space Engineers scenario in creative mode to prevent the astronaut from dying and to provide enough ammo items to perform the shooting actions.

External Validity. The empirical study was conducted on the highly complex Space Engineers game. Even though we demonstrated that the TESTAR agent has exploratory capabilities to navigate and test Space Engineers automatically, we consider this to be a first step toward scriptless test automation for games. Moreover, to facilitate the generalization of our results, we use the architectural analogy (Wieringa and Daneva, 2015), since we carefully describe the components of the case and the corresponding interactions, such as the game system and the scriptless tool with the corresponding configuration.

Conclusion Validity. Due to the degree of randomness in the action selection of the exploratory ASMs, we cannot assume a normal distribution in the experiments (Arcuri and Briand, 2011). To address this, we repeated the exploration 30 times and applied non-parametric Wilcoxon statistical tests to the results.

7 CONCLUSION

3D sandbox computer games, such as Space Engineers, are complex games characterized by a multitude of in-game features and entities. Manual testing of these games poses challenges due to time and resource constraints, especially when exploring and testing unforeseen scenarios or large combinations of gameplay interactions.

In this paper, we have showcased the automated scriptless exploration of an industrial 3D sandbox game using TESTAR and IV4XR. This work shows the value of implementing TESTAR's ASMs as reusable artifacts at a high abstraction level. These ASMs prove to be effective in enhancing game navigation and testing capabilities by guiding the agents toward specific areas of the game. Our research showcases the advantages of using an intelligent ASM as a powerful tool for optimizing spatial coverage. Implementing different ASMs makes it possible to direct the agents toward specific parts of the game to achieve more comprehensive coverage and uncover potential issues that might otherwise have been overlooked.

This paper demonstrates that, with a dedicated navigation layer, an autonomous scriptless agent can effectively reach and test game entities, and that it is possible to automate the exploration of scenarios without training the agent to play the game.

We consider the integration of the TESTAR agent a first step toward the inclusion of intelligent scriptless testing agents for games. In future research, we plan to use high-level artifacts like ASMs to promote the reusability and maintainability of the testing framework. We plan to study whether ASMs can be adapted to different scenarios, reducing the effort required to configure the testing environment. Finally, we will continue the research with further experiments to extrapolate our results to other games on the market.

ACKNOWLEDGEMENTS

This work has been partially funded by the iv4XR

H2020 project and the ENACTEST project.

REFERENCES

Aho, P., Menz, N., Räty, T., and Schieferdecker, I. (2011). Automated java gui modeling for model-based testing purposes. In 2011 8th ITNG, pages 268–273. IEEE.

Aho, P., Suarez, M., Kanstrén, T., and Memon, A. (2014). Murphy tools: Utilizing extracted gui models for industrial software testing. In IEEE 7th ICST Workshops, pages 343–348.

Arcuri, A. and Briand, L. (2011). A practical guide for using statistical tests to assess randomized algorithms in software engineering. In 33rd ICSE, pages 1–10. ACM.

Ariyurek, S., Surer, E., and Betin-Can, A. (2022). Playtesting: What is beyond personas. IEEE Transactions on Games, pages 1–1.

Bons, A., Marín, B., Aho, P., and Vos, T. E. (2023). Scripted and scriptless gui testing for web applications: An industrial case. Information and Software Technology, 158:107172.

Cooper, K. M. (2021). Software Engineering Perspectives in Computer Game Development. CRC Press.

Cui, X. and Shi, H. (2011). A*-based pathfinding in modern computer games. International Journal of Computer Science and Network Security, 11(1):125–130.

Dallmeier, V., Burger, M., Orth, T., and Zeller, A. (2012). Webmate: a tool for testing web 2.0 applications. In Workshop on JavaScript Tools, pages 11–15.

de Andrade, S. A., Nunes, F. L., and Delamaro, M. E. (2023). Exploiting deep reinforcement learning and metamorphic testing to automatically test virtual reality applications. STVR, 33(8):e1863.

Fisher, J., Koning, D., and Ludwigsen, A. (2013). Utilizing atlassian jira for large-scale software development management. Technical report, LLNL, Livermore, CA (United States).

García, B., Gallego, M., Gortázar, F., and Munoz-Organero, M. (2020). A survey of the selenium ecosystem. Electronics, 9(7):1067.

Gordillo, C., Bergdahl, J., Tollmar, K., and Gisslén, L. (2021). Improving playtesting coverage via curiosity driven reinforcement learning agents. In Conference on Games (CoG), pages 1–8. IEEE.

Jansen, T., Ricós, F. P., Luo, Y., van der Vlist, K., van Dalen, R., Aho, P., and Vos, T. E. (2022). Scriptless gui testing on mobile applications. In 22nd QRS, pages 1103–1112. IEEE.

Kempka, M., Wydmuch, M., Runc, G., Toczek, J., and Jaśkowski, W. (2016). Vizdoom: A doom-based ai research platform for visual reinforcement learning. In Symposium on computational intelligence and games (CIG), pages 1–8. IEEE.

Kong, P., Li, L., Gao, J., Liu, K., Bissyandé, T. F., and Klein, J. (2018). Automated testing of android apps: A systematic literature review. IEEE Transactions on Reliability, 68(1):45–66.

Liu, G., Cai, M., Zhao, L., Qin, T., Brown, A., Bischoff, J., and Liu, T.-Y. (2022). Inspector: Pixel-based automated game testing via exploration, detection, and investigation. In CoG, pages 237–244. IEEE.

Mariani, L., Pezzè, M., Riganelli, O., and Santoro, M. (2011). Autoblacktest: A tool for automatic black-box testing. In 33rd ICSE, pages 1013–1015. ACM.

Mariani, L., Pezzè, M., and Zuddas, D. (2018). Augusto: Exploiting popular functionalities for the generation of semantic GUI tests with oracles. In 40th ICSE, pages 280–290. ACM.

Mesbah, A. and Van Deursen, A. (2009). Invariant-based automatic testing of ajax user interfaces. In 31st ICSE, pages 210–220. IEEE.

Nguyen, B. N., Robbins, B., Banerjee, I., and Memon, A. (2014). Guitar: an innovative tool for automated testing of gui-driven software. Automated software engineering, 21:65–105.

Paduraru, C., Paduraru, M., and Stefanescu, A. (2022). Rivergame-a game testing tool using artificial intelligence. In 15th ICST, pages 422–432. IEEE.

Pascarella, L., Palomba, F., Di Penta, M., and Bacchelli, A. (2018). How is video game development different from software development in open source? In 15th MSR, pages 392–402.

Pastor Ricós, F. (2022). Scriptless testing for extended reality systems. In 16th RCIS, pages 786–794. Springer.

Pezze, M., Rondena, P., and Zuddas, D. (2018). Automatic gui testing of desktop applications: an empirical assessment of the state of the art. In ISSTA/ECOOP 2018 Workshops, pages 54–62.

Pfau, J., Smeddinck, J. D., and Malaka, R. (2017). Automated game testing with icarus: Intelligent completion of adventure riddles via unsupervised solving. In CHI PLAY'17 Extended Abstracts, pages 153–164.

Politowski, C., Petrillo, F., and Guéhéneuc, Y.-G. (2021). A survey of video game testing. In 2nd AST, pages 90–99. IEEE.

Prada, R., Prasetya, I., Kifetew, F., Dignum, F., Vos, T. E., Lander, J., Donnart, J.-y., Kazmierowski, A., Davidson, J., and Fernandes, P. M. (2020). Agent-based testing of extended reality systems. In 13th ICST, pages 414–417. IEEE.

Prasetya, I., Pastor Ricós, F., Kifetew, F. M., Prandi, D., Shirzadehhajimahmood, S., Vos, T. E., Paska, P., Hovorka, K., Ferdous, R., Susi, A., et al. (2022). An agent-based approach to automated game testing: an experience report. In 13th A-TEST Workshop, pages 1–8.

Ralph, P. and Tempero, E. (2018). Construct validity in software engineering research and software metrics. In 22nd EASE, pages 13–23. ACM.

Rani, G., Pandey, U., Wagde, A. A., and Dhaka, V. S. (2023). A deep reinforcement learning technique for bug detection in video games. International Journal of Information Technology, 15(1):355–367.

Santos, R. E., Magalhães, C. V., Capretz, L. F., Correia-Neto, J. S., da Silva, F. Q., and Saher, A. (2018). Computer games are serious business and so is their quality: particularities of software testing in game development from the perspective of practitioners. In 12th ESEM, pages 1–10. ACM/IEEE.

Sestini, A., Gisslén, L., Bergdahl, J., Tollmar, K., and Bagdanov, A. D. (2022). Automated gameplay testing and validation with curiosity-conditioned proximal trajectories. IEEE Transactions on Games.

Vos, T., Aho, P., Pastor Ricos, F., Rodriguez-Valdes, O., and Mulders, A. (2021). testar–scriptless testing through graphical user interface. STVR, 31(3):e1771.

Vos, T. E., Marín, B., Escalona, M. J., and Marchetto, A. (2012). A methodological framework for evaluating software testing techniques and tools. In 12th QSIC, pages 230–239. IEEE.

Wang, X. (2022). Vrtest: an extensible framework for automatic testing of virtual reality scenes. In ACM/IEEE 44th ICSE Companion, pages 232–236.

Wieringa, R. and Daneva, M. (2015). Six strategies for generalizing software engineering theories. Science of computer programming, 101:136–152.

Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., and Wesslén, A. (2012). Experimentation in software engineering. Springer.

Zheng, Y., Xie, X., Su, T., Ma, L., Hao, J., Meng, Z., Liu, Y., Shen, R., Chen, Y., and Fan, C. (2019). Wuji: Automatic online combat game testing using evolutionary deep reinforcement learning. In 34th ASE, pages 772–784. IEEE.

ENASE 2024 - 19th International Conference on Evaluation of Novel Approaches to Software Engineering