GOTE: An Edge Computing Architecture for Mobile Gaming

Gabriel Robaina and Adriano Fiorese

Graduate Program in Applied Computing - PPGCAP, Santa Catarina State University - UDESC, Joinville,


Keywords:

Edge Computing, Mobile Gaming, Real-Time Video Streaming.

Abstract:

The mobile games market has grown in relevance compared to traditional gaming platforms. The standard architecture for these games requires game logic and graphics to be processed on the device’s own hardware. Alternatively, cloud-based architectures for remote gaming on smartphones present high game input delay at a high cost for the service provider. This limits the variety and complexity of games that target these platforms and constrains user QoE. To address that limitation, this work proposes the Gaming On The Edge (GOTE) architecture, which aims to enable complex games to be played on smartphones by leveraging edge computing infrastructure for graphics processing and content distribution. A proof of concept of the GOTE architecture is developed and tested using WebRTC with an RTP streaming pipeline that exploits NVENC to achieve low-latency video encoding. Experimental results show that the GOTE architecture is a viable alternative to cloud-based remote gaming on smartphones, with the advantage of lower video and game input latency. An open source implementation of the architecture is provided in order to assist further research in this area.

1 INTRODUCTION

The mobile games market has grown to be the largest when compared to traditional platforms like PC and consoles. In 2021, tablet and smartphone games added up to a revenue of 96 billion dollars, 52% of market share, with a trend of growth for the following years (Newzoo, 2021). The standard architecture for these games requires the processing of game logic and graphics using the device’s own hardware. This limits the variety and complexity of games that target these platforms, since mobile devices have fewer hardware capabilities than specialized gaming PCs or consoles (Messaoudi et al., 2017).

One possible solution is to process game logic and graphics on cloud infrastructure instead of the player’s mobile device. Companies like Sony, NVIDIA and Paperspace have been providing cloud gaming services (Lin et al., 2019). In the cloud gaming architecture, the player interactions are sent to a cloud server and a rendered game scene is sent back as a video stream (Messaoudi et al., 2017). Although scalable, the cloud approach requires the game code to be offloaded to one or multiple cloud servers, making the architecture susceptible to high player input delay on poor network conditions, which leads to low Quality of Experience (QoE). Besides, this model imposes non-trivial infrastructure costs on the gaming service provider, since most of the computational workload is performed by the computing provider (Cai et al., 2016). Recently, Google announced the shutdown of its cloud gaming platform Stadia (Google, 2022).

https://orcid.org/0000-0003-1140-0002

Edge computing (EC) is a paradigm that takes the processing of data to the edge of the network instead of a centralized cloud. Dedicated edge infrastructure, such as cloudlets (Lin et al., 2019), or devices like routers and mobile phones can exchange workloads and achieve low-latency communication inside a local network while still being able to send post-processed data to the cloud if needed (Liu et al., 2019). These edge nodes can make use of virtualization to host the execution of code offloaded from external applications, like mobile games (Zhang et al., 2019). This strategy brings sophisticated applications to mobile users while extending battery lifetime, since most of the computational load is performed at the edge of the network (Mach and Becvar, 2017). Therefore, the problem addressed by this work is how to enable games to be played on mobile devices using computing resources that are more likely to be found in infrastructure services.

Robaina, G. and Fiorese, A.

GOTE: An Edge Computing Architecture for Mobile Gaming.

DOI: 10.5220/0011962000003467

In Proceedings of the 25th International Conference on Enterprise Information Systems (ICEIS 2023) - Volume 1, pages 721-730

ISBN: 978-989-758-648-4; ISSN: 2184-4992

 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

To accomplish that, this work proposes an Edge Computing (EC) architecture for mobile gaming named Gaming On The Edge (GOTE), leveraging the proximity between edge nodes and mobile devices to achieve low input delay. Similarly to the cloud gaming architecture, this alternative relieves the player’s mobile device of game logic and rendering by offloading the related code to a nearby edge node, enabling sophisticated games to be played on smartphones and increasing user QoE.

In this sense, this work provides the following contributions: 1) An edge-based remote gaming architecture that aims to enable resource intensive games to be played on mobile devices; 2) A video streaming pipeline that achieves low-latency game video feedback in the edge context, without any instrumentation of game code; 3) An open source proof of concept implementation of the architecture’s core in order to assist further research in this area.

This work is organized as follows. Section 2 presents the background concepts needed to understand the proposed work. Section 3 presents and discusses previous work that took the remote gaming approach to the edge computing context. Section 4 defines the GOTE architecture. Section 5 describes the experiments performed with the GOTE architecture and their results, which are discussed in Section 6. Section 7 concludes this paper and proposes future work.

2 BACKGROUND

This section introduces concepts and tools used in the system architecture and implementation. The Web Real-Time Communication (WebRTC) standard is used in the GOTE architecture for establishing communication between the player client and the rendering server, while the GStreamer framework is responsible for the media pipeline that enables real-time streaming of the game scenes. Also, the built pipeline leverages the hardware-based NVIDIA Encoder (NVENC) to achieve low-latency video feedback.

2.1 WebRTC

WebRTC is a standard that provides Application Programming Interfaces (APIs) that enable real-time Peer-to-Peer (P2P) communication in HTML5 browsers, and it is commonly used for web conferencing. It also enables Real-time Transport Protocol (RTP) streams to be displayed on a page with an HTML5 video tag (Loreto and Romano, 2014). RTP takes advantage of the User Datagram Protocol (UDP) instead of the Transmission Control Protocol (TCP) on the transport layer in order to achieve low latency communication and high data throughput. For the peers to connect, they must first go through the signaling process, in which each peer shares information about supported media types, codecs and related configuration by means of the Session Description Protocol (SDP). Also, reachability information of each peer, such as public Internet Protocol (IP) addresses, is collected from a Session Traversal Utilities for Network Address Translation (STUN) server and shared through the Interactive Connectivity Establishment (ICE) technique (RFC5245) (Rosenberg, 2010). Traditionally, all communication and information exchange in this process is mediated by a dedicated signaling server (Loreto and Romano, 2014). In GOTE the streaming server also provides signaling functionalities and acts as a mediator of this process. WebRTC is advantageous for the GOTE architecture since it provides a standard for establishing a streaming session between the rendering server and the smartphone client. It also enables the game stream to be easily displayed on smartphones by leveraging the WebRTC API available in modern browsers.

https://github.com/gpr-indevelopment/gote-game-server-2

Fig. 1 presents the WebRTC session sequence dia-

gram used in GOTE for the rendering server. The sig-

naling and rendering modules are components of the

GOTE rendering server application. First, both the

player client and the rendering module communicate

with the signaling module in order to retrieve a com-

mon session identiﬁer. Then, the player client creates

an SDP offer based on the media type it is able to

play. In parallel, the rendering module creates an SDP

offer based on the media it can transmit. Next, the

ICE candidates are retrieved from the STUN servers

by both the player client and the rendering module.

These candidates carry the available methods and ad-

dresses the peers can use to communicate with each

other. Finally, the signaling module mediates the ex-

change of ICE candidates and SDP offers between the

player client and the rendering module. The rendering

module can start the video stream once this exchange

has ﬁnished.
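The mediation role of the signaling module can be illustrated with a minimal in-memory sketch. It is not the GOTE implementation: the class name, the role labels ("player", "renderer") and the message payloads are hypothetical, and a real deployment would exchange these messages over WebSockets instead of in-process lists.

```python
from collections import defaultdict

ROLES = ("player", "renderer")  # hypothetical labels for the two peers


class SignalingModule:
    """Relays SDP and ICE messages between the two peers of one session."""

    def __init__(self):
        self._next_id = 1
        # session id -> {role: messages waiting to be read by that role}
        self._inbox = defaultdict(lambda: {role: [] for role in ROLES})

    def create_session(self):
        """Hand out the common session identifier both peers will use."""
        session_id = f"session-{self._next_id}"
        self._next_id += 1
        return session_id

    def send(self, session_id, sender, kind, payload):
        """Queue a message for the peer that is not the sender."""
        if sender not in ROLES:
            raise ValueError(f"unknown role: {sender}")
        receiver = "renderer" if sender == "player" else "player"
        self._inbox[session_id][receiver].append((kind, payload))

    def receive(self, session_id, role):
        """Return and clear every message waiting for the given role."""
        messages = self._inbox[session_id][role]
        self._inbox[session_id][role] = []
        return messages


def run_handshake():
    """Both peers publish an SDP description and one ICE candidate."""
    signaling = SignalingModule()
    session_id = signaling.create_session()
    signaling.send(session_id, "player", "sdp", "media the player can play")
    signaling.send(session_id, "renderer", "sdp", "media the renderer can send")
    # Placeholder candidates using documentation-only IP addresses.
    signaling.send(session_id, "player", "ice", "203.0.113.10:50000")
    signaling.send(session_id, "renderer", "ice", "198.51.100.7:40000")
    return (signaling.receive(session_id, "player"),
            signaling.receive(session_id, "renderer"))
```

Calling `run_handshake()` returns the messages each peer would receive from the other, which is the point at which the rendering module could start the video stream.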

2.2 STUN

The existence of different network topologies can in-

crease the complexity of the connection establishment

between peers in the WebRTC environment. For ex-

ample, peers can be in different private networks re-

lying on the Network Address Translation protocol

(NAT) for being reachable over the internet through

public IP addresses. GOTE leverages the STUN pro-

tocol for enabling the connection between the render-

ing server and the player client on a wider range of

network topologies.

STUN is a protocol used in WebRTC’s (Rosen-

berg, 2010) signaling process for collecting public IP

address information from the peers (Loreto and Ro-

mano, 2014). The peers can request their public IP ad-

dresses from a centralized cloud STUN server, creat-

ing a NAT binding on each peer’s router. This binding

maps a public IP and port to an IP in the private net-

work, enabling peers to be reachable over the Internet.

STUN can also be used to maintain NAT bindings via

periodic connectivity checks (Rosenberg, 2008). Fi-

nally, the peers can exchange their public IP addresses

through ICE as part of WebRTC’s signaling process.
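As an illustration of what a peer sends to a STUN server, the sketch below builds the 20-byte header of a STUN Binding Request following the message layout of RFC 5389 (message type, message length, magic cookie, 96-bit transaction identifier). The function names are hypothetical and nothing is sent over the network; it only shows the message structure, not how GOTE or a browser performs the request.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001   # message type of a Binding Request
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id=None):
    """Return a Binding Request with no attributes (header only)."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # type (2 bytes), attribute length (2 bytes, zero here), cookie (4 bytes)
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id


def parse_header(message):
    """Split a STUN header into (type, length, cookie, transaction id)."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    return msg_type, length, cookie, message[8:20]
```

The server's response would carry the public address and port observed for the request, which is the reachability information the peers later exchange through ICE.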

2.3 GStreamer

GStreamer is a framework for streaming media appli-

cations. It enables multimedia pipelines to be built

with a broad variety of input and output format and

sources. A GStreamer pipeline consists of elements

that are interconnected in order to take multimedia

data from a source to an output (GStreamer, 2021).

Video encoding must be executed by one element of

the media pipeline in accordance with the allowed

media formats informed by the destination peer at the

session agreement. In the GOTE case, since RTP

is supported by the client’s web browser, the video

stream acts as the output of the media pipeline.

2.4 Hardware Encoding

Software encoding involves video encoding using CPU resources. It is capable of achieving high video quality at a speed that varies with the architecture and performance of the CPU. It has been widely used on Internet media. However, it is not suitable for real-time video streaming, since software encoders can take up to several hours to compress a short video in high definition using recent codecs (Kufa and Kratochvil, 2017).

Figure 1: GOTE WebRTC session sequence diagram.

In contrast, hardware encoding uses a dedicated

GPU for video encoding tasks with higher perfor-

mance. The encoding speed from hardware encoding

can be up to ten times higher when compared to con-

ventional software encoding (Kufa and Kratochvil,

2017), making it suitable for real-time video stream-

ing. NVENC is NVIDIA’s hardware accelerated en-

coder. It is independent of the graphics performance

of the GPU, and during encoding, the graphics engine

and CPU are free for other tasks (NVIDIA, 2021). In

GOTE’s architecture, NVENC accelerates encoding

for a H.264 video stream.

3 RELATED WORK

The Games@Large project (Nave et al., 2008) aimed

to research, develop and implement an architecture

for remote execution of games using code ofﬂoad-

ing to local servers. This architecture’s use cases in-

clude hotels, cruise ships and Internet cafes. Instead

of streaming the game scenes as a video back to the

player’s device, this approach requires the scenes to

be rendered locally using the mobile device’s hard-

ware resources (Eisert and Fechteler, 2007). This

was achieved by capturing the commands sent by the

game logic to the related graphics API and redirect-

ing it to the player’s mobile device for rendering.

The tests showed low frame-rates for mobile devices,

ranging from 7 FPS on a business strategy game to 18

FPS on a casual game with an average of ≈ 0.34 Mb/s

sent over the local network.

The EdgeGame project (Zhang et al., 2019) pro-

posed an EC based architecture for mobile gaming

and built a prototype that ofﬂoads the processing of

game logic and rendering to edge nodes using virtu-

alization. The game scenes are then sent back as a

video stream to the mobile user using the WebRTC

standard. A congestion control algorithm is used for

dynamically adjusting the rate at which data is trans-

ferred based on network conditions (Jansen et al.,

2018). This standard ﬁts the mobile gaming use

case since it provides adaptability on unstable net-

works and real-time communication of player input

and game video stream. In EdgeGame the player can

locate available edge nodes by sending requests to

a centralized data center, that is also responsible for

managing user accounts and providing login services.

The network delays measured with the EC approach were significantly lower (16.2 ms) than those of a cloud-based one (44.2 ms). Also, the user’s QoE on EdgeGame was 20% higher when compared to a cloud-based alternative.

The RenderLink project (Oros and Bâcu, 2020)

also adopts the approach of game code ofﬂoading to

the edge while sending a video stream back to the

client using WebRTC. Instead of using a dedicated

edge node for rendering, RenderLink proposes a peer-

to-peer (P2P) strategy that leverages idle user devices

for this task. In a commercial implementation, users

that expose their hardware resources and cooperate

in the network may be rewarded with virtual cur-

rency. Still, the P2P approach limits the complexity of

the games rendered based on the hardware resources

available in the network. The tests performed with

RenderLink project showed an average frame rate of

55.65 frames per second (FPS) on 720p over a wired

connection with standard deviation of 13.48. It was

noted that most implementations of WebRTC begin

streaming with low quality and gradually ramp up to a

stable condition, that accommodates bandwidth con-

straints, after around 1 minute and 20 seconds. This

characteristic of the implementations may hinder the

Quality of Service (QoS) during that time period.

3.1 Considerations About the Related

Work

EdgeGame (Zhang et al., 2019) and RenderLink

(Oros and Bâcu, 2020) took similar approaches to

service discovery, leveraging a centralized data cen-

ter for starting a game session. Even so, a mobile

client can discover a Local Area Network (LAN) ren-

dering server through Simple Service Discovery Pro-

tocol (SSDP) (Donoho et al., 2020) or Service Loca-

tion Protocol (SLP) (Day et al., 1999), which decou-

ples the client from any centralized server since they

do not need to know about each other before starting

the communication to create a game session. Also,

the usage of LAN protocols enables discovery to be

performed without Internet connection. In the case

of WebRTC, this is only possible if a local signaling

server exists.
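To make the LAN discovery idea concrete, the sketch below builds an SSDP search request (an HTTP-formatted `M-SEARCH` message that is normally sent over UDP multicast to 239.255.255.250:1900). It only constructs the message bytes. The search target used in the usage note is a hypothetical identifier for a rendering server; the related works and GOTE do not define one.

```python
SSDP_MULTICAST_ADDRESS = "239.255.255.250"
SSDP_PORT = 1900


def build_msearch(search_target, max_wait_seconds=2):
    """Return the bytes of an SSDP M-SEARCH discovery request."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_MULTICAST_ADDRESS}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {max_wait_seconds}",   # upper bound for the responders' delay
        f"ST: {search_target}",      # which kind of service is searched for
        "",
        "",                          # the message ends with an empty line
    ]
    return "\r\n".join(lines).encode("ascii")
```

For example, `build_msearch("urn:example:service:rendering-server:1")` yields a request that any compliant service on the LAN could answer with its own address, without the client knowing a centralized server beforehand.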

In general, the usage of RTP showed promising

results on other projects that took the game scene

streaming approach. This protocol was used by

EdgeGame and RenderLink in conjunction with the

WebRTC APIs, while having the frame rate stabiliza-

tion time as a drawback. The frame rate and resolu-

tion results of the local rendering approach presented

by the Games@Large (Nave et al., 2008) project were

surpassed by the WebRTC video streaming initiatives

(Oros and Bâcu, 2020). Also, the local rendering of

graphics is not ideal since it increases the amount of

data being transferred to the device, and limits the

complexity of games to the player’s device hardware

capabilities.

Still, for the video streaming approach it is noted

that the QoE is directly affected by the frame rate at

the rendering source (Oros and Bâcu, 2020). So, the

rendering server and the client should have at least

equivalent hardware resources in order for the of-

ﬂoading to be worthwhile. Using a peer mobile de-

vice as a rendering server, as done by RenderLink,

limits the complexity of games that can be streamed

based on the hardware resources of the peers (Oros

and Bâcu, 2020). On the other hand, the dedicated

edge server strategy of EdgeGame is more expensive,

but enables complex games to be streamed to mobile

clients such as smartphones (Zhang et al., 2019).

This work takes the WebRTC approach for

streaming game scenes from a rendering server to the

player’s mobile client by means of the RTP usage,

as suggested by EdgeGame and RenderLink, while

constructing a media pipeline that takes advantage of

hardware encoding for improving performance and

QoE gain. Also, virtualization is used as a platform

for instantiating and managing rendering servers us-

ing a developed orchestrator. None of the related

works provided enough implementation details for re-

producibility. Table 1 presents a comparison between

the GOTE proposal and analyzed literature that took

the remote gaming approach to the edge computing

context.

4 ARCHITECTURE OVERVIEW

GOTE architecture is based on direct communication

between a rendering server and a player smartphone

client. Game input from the player is sent through

WebSockets to the server, which renders the game scene and

streams video back to the client using RTP. Such

server is deployed on a virtual node (VN) of a desk-

top PC or some other edge device with graphics pro-

cessing capabilities that is able to start the requested

game and a media pipeline for the RTP stream. This

pipeline must be efﬁcient enough to stream at frame

rates and resolutions that maximize the player’s QoE.

The smartphone client has a mobile application with

a game module, responsible for displaying the game

stream and transmitting game controller input, and

a discovery module, that enables the discovery of a

compliant orchestrator reachable by the client. The

orchestrator component is responsible for instantiat-

ing and managing VNs while forwarding game inputs

from the game module to the corresponding rendering

server. Fig. 2 presents an overview of the architecture.

Table 1: Comparison between GOTE and related work.

Aspect                          Games@Large     RenderLink      EdgeGame        GOTE
Rendering                       Player device   Edge server     Edge server     Edge server
Multiplayer                     X               ✓               ✓               X
WebRTC                          X               ✓               ✓               ✓
RTP                             ✓               ✓               ✓               ✓
Hardware encoding               X               Unknown         Unknown         ✓
Instrumentation of game code    ✓               ✓               Unknown         X
Provided implementation         X               X               X               ✓

In this solution each VN is responsible for the game

session of one player client, and the RTP stream is dis-

played on the game module in a web browser, since it

is compliant with the WebRTC standard.

As described in Section 2.1, the usage of WebRTC requires a signaling server that acts as a mediator for establishing the connection between the peers. Although the GOTE rendering server acts as a mediator during the signaling process, a cloud STUN server is still necessary for acquiring reachability information from the peers. This architecture relies on cloud STUN servers that are accessible by both the client and the rendering server. The communication with cloud servers only happens during signaling and does not impact the gameplay QoE.

Figure 2: Overview of the GOTE architecture. [Diagram components: STUN server; player smartphone running the mobile app with a discovery module and a game module (web browser); Hyper-V host with VN1 to VN4 and the orchestrator; links for game input (WebSocket) and the RTP stream.]

The VNs are provisioned by the orchestrator com-

ponent when a gaming session request is received

from the player client, as shown on Fig. 3. VNs are re-

sponsible for running the game processes and stream-

ing video data to the clients while sharing hardware

resources from the host. GPU resources are shared

via GPU passthrough, available from Windows Server

2016 onwards (Microsoft, 2022).

Thus, when a client requests a gaming session from the orchestrator, it verifies whether there are enough resources to instantiate a new VN to serve that player. If so, a rendering server application is deployed on the VN. This application is responsible for starting the game and video streaming processes. The media pipeline is implemented using GStreamer while leveraging NVENC hardware encoding from NVIDIA GPUs. Fig. 4 presents the proposed VN architecture.

Figure 3: Flowchart of the VN and rendering server provisioning process. [Flowchart: the player client requests a gaming session; the orchestrator validates VN provisioning viability for the requested game; without available resources it responds with the error status "not available"; otherwise it provisions a new VN with a rendering server, which starts the game process, the streaming pipeline and the WebRTC session.]

Figure 4: VN architecture. [Diagram: memory, CPU and NVIDIA GPU under the Hyper-V virtualization platform; a Windows virtual machine (VM) hosting the GOTE rendering server application, the GStreamer pipeline and the game process; arrows labeled "Starts" (twice) and "Reads"; output RTP stream.]
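The provisioning decision described above can be sketched as follows. This is an illustration only: the resource model (memory and GPU slots), the requirement values and all names are hypothetical, since the paper does not specify how the orchestrator accounts for resources.

```python
from dataclasses import dataclass, field

# Hypothetical per-game resource needs; the paper does not list any.
GAME_REQUIREMENTS = {"example-game": {"memory_gb": 8, "gpu_slots": 1}}

# Steps the rendering server performs once its VN is provisioned.
RENDERING_SERVER_STEPS = (
    "start game process",
    "start streaming pipeline",
    "start WebRTC session",
)


@dataclass
class VnHost:
    """Free resources of the machine that hosts the virtual nodes."""
    free_memory_gb: int
    free_gpu_slots: int
    nodes: list = field(default_factory=list)


def request_gaming_session(host, game):
    """Provision a VN for the game, or report that it is not available."""
    need = GAME_REQUIREMENTS.get(game)
    if (need is None
            or host.free_memory_gb < need["memory_gb"]
            or host.free_gpu_slots < need["gpu_slots"]):
        return {"status": "not available"}
    # Reserve the resources and record the new virtual node.
    host.free_memory_gb -= need["memory_gb"]
    host.free_gpu_slots -= need["gpu_slots"]
    node_id = f"VN{len(host.nodes) + 1}"
    host.nodes.append(
        {"id": node_id, "game": game, "steps": list(RENDERING_SERVER_STEPS)}
    )
    return {"status": "ready", "node": node_id}
```

With a host that has room for a single session, the first request returns a ready VN and a second request returns the "not available" status.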

Fig. 5 presents the GStreamer pipeline assembled

for the system. The dxgiscreencapsrc element is re-

sponsible for capturing RGBA (red, green, blue and

alpha) data of the game screen at a rate of 60 FPS. The

following element, nvh264enc, uses the NVENC en-

coder API to encode the video stream with the H.264

compression. Then, the rtph264pay packages the

H.264 encoded video stream into the payload of the

RTP packets. Finally, GStreamer makes the stream

available for WebRTC connections on webrtcbin.

Figure 5: GStreamer pipeline. [dxgiscreencapsrc → nvh264enc → rtph264pay → webrtcbin (RTP stream)]
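The element chain above can be written as a textual pipeline description, in which GStreamer elements are joined with `!`. The helper below is a hedged sketch that only assembles such a string. The four element names come from the text, while the `bitrate` and `preset` property names and the `low-latency-hq` value are assumptions that should be checked with `gst-inspect-1.0 nvh264enc`. A working pipeline may also need format conversion and caps filters that are not shown here.

```python
def build_pipeline_description(bitrate, preset="low-latency-hq",
                               webrtc_name="gote"):
    """Join the four pipeline elements into one description string.

    bitrate: encoder target bitrate, in the unit expected by the encoder
             element (assumed property name: bitrate).
    preset:  encoder preset (assumed property name and value).
    webrtc_name: hypothetical name given to the webrtcbin element.
    """
    stages = [
        "dxgiscreencapsrc",                              # screen capture
        f"nvh264enc bitrate={bitrate} preset={preset}",  # NVENC H.264 encode
        "rtph264pay",                                    # RTP packetization
        f"webrtcbin name={webrtc_name}",                 # WebRTC endpoint
    ]
    return " ! ".join(stages)
```

Keeping the description as data makes it easy to log the exact pipeline used in an experiment or to vary the encoder settings between runs.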

WebSockets are used for sending game input com-

mands to the VNs and for all communication related

to signaling due to its bidirectional capabilities.

A practical example of the GOTE architecture can

be divided into two phases. First, at the game and

stream provisioning phase, the player (user) initiates

its interaction with the architecture by means of a

mobile application. This application is responsible

for discovering and communicating with a local or-

chestrator in order to establish a game session. The

orchestrator discovery is performed by the discovery

module of the mobile application by means of a cen-

tralized cloud server, or a service discovery protocol

such as SSDP or SLP, for example. Then, the player

can request a game to the orchestrator based on a cat-

alog of games installed on the VN and available for

remote play. After receiving the game request, the or-

chestrator will perform the VN provisioning to a VN

host, as presented on Fig. 3. The orchestrator returns

an error if there are not enough hardware resources

for the requested game. If the provision is success-

ful, the game module of the mobile application opens

the web browser in order to begin the WebRTC ses-

sion establishment, as described on Fig. 1, making the

game video stream available and allowing the user to

start playing. This establishes the beginning of the

remote gameplay phase. At this point, every player

controller input, along with a session identiﬁer, will

be sent to the orchestrator for forwarding to the corre-

sponding VN responsible for hosting the current game

session. Fig. 6 presents the interactions between the

components on an example with successful VN pro-

visioning.
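The forwarding step of the remote gameplay phase can be sketched as a lookup from session identifier to the VN that hosts the session. The class and field names are hypothetical, and plain lists stand in for the WebSocket connections to each VN.

```python
class InputRouter:
    """Forwards controller input to the VN that hosts each game session."""

    def __init__(self):
        # session id -> queue of inputs delivered to that session's VN
        self._queues = {}

    def open_session(self, session_id):
        """Register a session and return the queue its VN reads from."""
        self._queues[session_id] = []
        return self._queues[session_id]

    def forward(self, session_id, command, sent_at_ms):
        """Deliver one input; return False when the session is unknown."""
        queue = self._queues.get(session_id)
        if queue is None:
            return False
        # The client timestamp travels with the command so that the
        # receiving side could compute an input delay if desired.
        queue.append({"command": command, "sent_at_ms": sent_at_ms})
        return True
```

Carrying the session identifier with every input lets one orchestrator serve several VNs without the client needing to know which VN hosts its game.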

5 EXPERIMENTS AND RESULTS

Experiments were conducted to evaluate the GOTE architecture on scenarios of increasing complexity. The "Local" scenario consists of running the player client and rendering server on the same physical hardware in order to establish a baseline performance for the architecture. The "Wireless LAN" scenario consists of running the player client on a smartphone and the rendering server on a desktop PC that share the same 5 GHz wireless LAN, emulating a high performance edge computing environment. The last scenario, labeled "4G", consists of streaming game scenes from a desktop PC rendering server to a smartphone player client over 4G, emulating a more realistic edge computing scenario.

The performance metrics chosen to evaluate the

GOTE architecture comprising video streaming are

jitter, packet loss, bitrate, frames dropped and sent per

second to the player’s client device. Also, the game

input delay (GID) was measured once every second as the time

difference between player interaction and command arrival

on the rendering server. Jitter, packet

loss and frames dropped are metrics directly related to

the stability of connection and data transmission be-

tween the rendering server and the player client. In

addition, the amount of video data being transmitted

over the network, and its variation over time, is rep-

resented by the bitrate data while GID data was used

to assess how different network scenarios impacted

game real-time response.
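The paper does not state how jitter was computed. A common definition, and the one assumed in the sketch below, is the RTP interarrival jitter estimator of RFC 3550, which smooths the variation in packet transit time with a gain of 1/16. Function and parameter names are hypothetical.

```python
def interarrival_jitter(send_times_ms, arrival_times_ms):
    """Running RFC 3550 jitter estimate after the last packet, in ms."""
    if len(send_times_ms) != len(arrival_times_ms):
        raise ValueError("both lists must have one entry per packet")
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Difference between the spacing at the receiver and at the sender.
        d = ((arrival_times_ms[i] - arrival_times_ms[i - 1])
             - (send_times_ms[i] - send_times_ms[i - 1]))
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
    return jitter
```

If every packet experiences the same network delay the estimate stays at zero, while a single packet arriving 16 ms later than expected raises it by 1 ms.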

A Windows 10 PC with an i5-9400F 2.90GHz processor, 16GB RAM and NVIDIA GeForce RTX 2060 video card hosted the rendering server on all experiments, along with a Redmi M2101K7AI smartphone with 6GB RAM as the player client. The chosen web browser was Google Chrome 96.0.4664.45 running on Android 11. A public cloud STUN server was used during signaling to establish the connection between the peers. The NVENC accelerated H.264 encoder component of the GStreamer pipeline was set to a constant bitrate of 500 kBps with the low latency preset, as recommended by NVIDIA for game-streaming use cases (NVIDIA, 2022). The screen capturing component used a source resolution of 1280x720 pixels (720p) at 60 frames per second. Graphics test 1 and 2 from 3DMark’s Time Spy benchmark were transmitted from the rendering server on all experiments. The benchmarks lasted for 150 seconds in total, with a loading screen at the start and in between tests.

Figure 6: GOTE components interaction diagram. [Swimlanes: Player, Mobile application, Orchestrator, VN Host; phases: game and stream provisioning, remote gameplay.]

Even though real remote gaming scenarios in-

volve video streaming and game command communi-

cation simultaneously, it is unlikely that the trafﬁc in-

volved in sending lightweight user input commands to

the server would deeply inﬂuence the stream results.

Therefore, GID data was collected on independent ex-

periments for all three network scenarios. These ex-

periments ran over a 90 seconds time window, which

is sufﬁcient for capturing the impact of the different

network scenarios in game input feedback and remote

gaming experience. Fig. 7 and Table 2 present the re-

sults from all experiments.

These metrics were collected for one rendering server streaming to one player client, and hence the orchestration component was not used during the experimentation. Also, the time for connection establishment related to VN provisioning and signaling was not taken into account during the experiments.

6 DISCUSSION

Frame rate is a feature of video motion that is particularly important for the player experience. Although 60 FPS is desired, it is known that some variation between 30 and 60 FPS does not impact gaming QoE significantly (Zadtootaghaj et al., 2018). All experiments showed stable frame rates close to the 60 FPS target, averaging between 54.43 and 56.17 FPS (Table 2). However, the Local and Wireless LAN scenarios showed the highest standard deviations, of 5.68 and 7.61 frames per second, respectively. A significant part of these deviations was due to frame rate drops at the start of the streaming and close to the 90 seconds mark, at the transitions between the loading screens and the beginning of the benchmarks. There are also peaks in jitter for all experiments at the same time windows.

The abrupt transition from the loading screens to the benchmarks may have created a delay in the compression step of the pipeline due to motion compensation, which is a technique that predicts future frames based on camera motion and objects in neighbor frames of the video (Chen et al., 2001). Since H.264 uses motion compensation (ITU, 2021), it impacted frame rate both at the beginning and at the 90 seconds mark. In a real gaming scenario, this delay in compression is more likely to occur in cinematic oriented games, where camera cuts are frequent, and less likely to occur in strategy games, for example, where motion compensation can take advantage of fewer image changes from one frame to the next. The motion compensation impact can be seen in the bitrate data between the 60 and 90 seconds marks, during the transmission of a loading screen, in which the only motion region in the video is the loading bar and the bitrate falls close to zero.

Figure 7: Experimental results. Panels: (a) Frame rate; (b) Jitter.

The 4G experiment had 4 packets lost during the

experiment. This is critical for the RTP stream since

UDP has no recovery mechanism for handling lost

packets. Also, this metric has signiﬁcant impact on

the perceptual quality of the video stream (Pande

et al., 2013). Still, the 4 packets lost on the 140 sec-

onds mark had no impact on the stability of the video,

as seen in the frame rate and bitrate data.

The Local experiment had 6 frames dropped dur-

ing the 150 seconds time window. Running the

rendering server and the browser client locally may

have impacted performance due to the competition for

hardware resources, consequently favouring packet

drops. Still, no scenario was signiﬁcantly impacted

by frame drops along the experiments.

Table 2: Average and standard deviation results for the performance metrics. Values are shown as average (standard deviation).

Metric      Unit   Local             Wireless LAN      4G
Frame rate  FPS    56.17 (5.68)      54.47 (7.61)      54.43 (3.17)
Jitter      ms     12.45 (6.41)      13.33 (2.52)      10.04 (2.53)
Bitrate     kbps   318.54 (138.64)   344.56 (146.57)   343.29 (148.27)
GID         ms     0.80 (2.23)       6.85 (7.63)       14.78 (8.50)

Jitter values under 100 ms do not damage player QoE even for multiplayer action shooting games (Amin et al., 2013). This metric remained under 100 ms in all experiments, even at the transitions to and from the loading screen, meaning that the buffer used by WebRTC was efficient at sequencing packets for the video stream, and no component in the media pipeline created significant delays in frame delivery.
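
For reference, WebRTC statistics define the jitter of a received RTP stream as the interarrival jitter of RFC 3550, section 6.4.1. Assuming the jitter metric follows that definition, the sketch below reproduces the running estimate for send and arrival times already expressed in milliseconds. The function name and the sample timestamps are illustrative and are not measurements from these experiments.

```python
from typing import Sequence


def interarrival_jitter(send_ms: Sequence[float], recv_ms: Sequence[float]) -> float:
    """Running interarrival jitter estimate from RFC 3550, section 6.4.1."""
    if len(send_ms) != len(recv_ms):
        raise ValueError("send and receive sequences must have the same length")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # D(i-1, i): difference in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        # Exponential smoothing with gain 1/16, as specified by the RFC.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


JITTER_QOE_LIMIT_MS = 100.0  # threshold cited in the text (Amin et al., 2013)

# Hypothetical packets sent every 20 ms; the third one arrives 10 ms late.
send_times = [0.0, 20.0, 40.0]
recv_times = [5.0, 25.0, 55.0]
estimate = interarrival_jitter(send_times, recv_times)
within_limit = estimate < JITTER_QOE_LIMIT_MS
print(f"jitter estimate: {estimate:.3f} ms, within limit: {within_limit}")
```

Because of the 1/16 smoothing gain, a single late packet moves the estimate only slightly, so a sustained value near 100 ms would indicate persistent delivery problems rather than an isolated delay.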

ICEIS 2023 - 25th International Conference on Enterprise Information Systems


The average bitrate for all experiments remained between 310 and 350 kBps, meaning that in the average case all scenarios supported a similar rate of video data streaming to the player client. This rate is lower than the 500 kBps specified for the video encoder of the GStreamer pipeline because of bandwidth conditions and motion compensation. Also, the LAN and 4G experiments had higher bitrate variations when compared to Local because of the remote nature of the stream. It is noticeable that the bitrate variability increased as the complexity of the experiments increased, from Local to 4G.

GID stayed below 20 ms in the experiments and increased with the complexity of the scenario. This result favors QoE, since it is well below the 60 ms delay considered small even for online action games (Quax et al., 2004). Also, GID metrics were collected every second for a simple game input, which means that these results may vary for complex games that require a higher player interaction rate.
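
The comparison against the cited QoE limits can be written down as a small check. The averages are copied from Table 2 and the limits are the ones cited in the text (100 ms for jitter, 60 ms for GID). The scenario labels follow the names used in the text, and the dictionary layout and the `within_limits` helper are assumptions introduced only for this sketch.

```python
from typing import Dict, Mapping

# Average jitter and GID per scenario, in milliseconds, copied from Table 2.
AVERAGES_MS: Dict[str, Dict[str, float]] = {
    "Local": {"jitter": 12.45, "gid": 0.80},
    "LAN": {"jitter": 13.33, "gid": 6.85},
    "4G": {"jitter": 10.04, "gid": 14.78},
}

# Limits cited in the text: jitter (Amin et al., 2013), GID (Quax et al., 2004).
LIMITS_MS: Dict[str, float] = {"jitter": 100.0, "gid": 60.0}


def within_limits(metrics: Mapping[str, float],
                  limits: Mapping[str, float] = LIMITS_MS) -> bool:
    """Return True when every metric is strictly below its limit."""
    return all(metrics[name] < limit for name, limit in limits.items())


report = {scenario: within_limits(values) for scenario, values in AVERAGES_MS.items()}
for scenario, ok in report.items():
    print(f"{scenario}: {'within' if ok else 'outside'} the cited QoE limits")
```

This check uses averages only. Individual samples can exceed the average by several standard deviations, so a stricter evaluation would apply the same test to the per-second samples.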

The GOTE architecture enables game streaming on an edge rendering server without the need for any instrumentation of game code, which related work requires (Oros and Bâcu, 2020). This means that the rendering server is able to stream any game that runs on a Windows PC edge node regardless of the technology it uses. In addition, a commercial application based on GOTE can be scaled with a system that rewards virtual currency in exchange for donated hardware resources, enabling flexible ways of monetizing a service that leverages this architecture to provide remote gaming to customers.

RenderLink reported an average of 60 FPS at a resolution of 1600x900 (Oros and Bâcu, 2020). In addition, Games@Large showed a peak of 26 FPS and problems running games on edge devices with no hardware acceleration features (Nave et al., 2008).

The proposed approach averaged between 54.43 and 56.17 FPS at 1280x720 (720p), well above the peak reported for Games@Large and close to the average reported for RenderLink. Also, the average GID measured in the 4G experiment was lower than EdgeGame’s reported network delay of 16.2 ms (Zhang et al., 2019).

7 FINAL CONSIDERATIONS

This work proposed the GOTE architecture, which enables complex games to be played on smartphone devices by leveraging edge infrastructure with graphics processing capabilities. The architecture’s core was implemented using WebRTC, GStreamer and NVENC. All experiments sustained desirable frame rates, quality and stable streaming across the tested time frame.

The implementation of the architecture relied on WebRTC for displaying an RTP stream on a player smartphone browser client. Commercial applications of this architecture should consider implementing a WebRTC compliant API on other platforms, or using another abstraction to deliver low latency video streaming to mobile clients. Such services could make use of a mobile app with service discovery capabilities, via SSDP or other protocols, to communicate with a local compliant orchestrator without the need for an Internet connection. This approach can be useful for closed events, cruise ships, trains and other means of transportation without a stable Internet connection, for example.
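
As an illustration of the discovery step, the sketch below builds and sends an SSDP M-SEARCH request using only the Python standard library. The multicast address 239.255.255.250:1900 and the header layout follow the UPnP Device Architecture. The search target `urn:example:service:gote-orchestrator:1` is hypothetical, since this work does not define a service type for a GOTE orchestrator, and the `build_msearch` and `discover` helpers are names chosen for this example.

```python
import socket
from typing import List

SSDP_ADDR = ("239.255.255.250", 1900)  # standard SSDP multicast group and port
# Hypothetical search target: a real deployment would define its own service type.
ORCHESTRATOR_ST = "urn:example:service:gote-orchestrator:1"


def build_msearch(search_target: str, mx_seconds: int = 2) -> bytes:
    """Build an SSDP M-SEARCH request for the given search target."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {mx_seconds}",
        f"ST: {search_target}",
        "",
        "",
    ]
    # Joining with CRLF leaves the blank line that terminates the header block.
    return "\r\n".join(lines).encode("ascii")


def discover(search_target: str = ORCHESTRATOR_ST, timeout: float = 3.0) -> List[str]:
    """Send one M-SEARCH and return the raw responses received before the timeout."""
    responses: List[str] = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_msearch(search_target), SSDP_ADDR)
        try:
            while True:
                data, _addr = sock.recvfrom(65507)
                responses.append(data.decode("utf-8", errors="replace"))
        except socket.timeout:
            pass  # no more replies within the timeout window
    return responses


request = build_msearch(ORCHESTRATOR_ST)
```

Calling `discover()` on a local network would return the raw HTTP-over-UDP replies of any device answering that search target. Parsing the `LOCATION` header of each reply is left out to keep the sketch short.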

The implementation applied the H.264 codec to the video being streamed via RTP to the player client. Besides this codec, VP8 is also supported by WebRTC compliant browsers (Mozilla, 2021). Further iterations of this architecture should consider the usage of VP8 and a comparison with H.264 and other media pipeline optimizations. Also, all experiments relied on hardware-accelerated video encoding using NVENC. Future work should evaluate other encoding techniques and hardware for this task.

This work implemented and experimented with single player experiences for remote gaming. The value proposition of a remote gaming service increases if it supports multiplayer gaming. Therefore, future work should repeat the proposed experiments in a multiplayer scenario and investigate its impact on video streaming metrics and on resource consumption on the rendering server. Further work should also consider network stress scenarios for the video streaming and the signaling process, since all experiments ran under stable network conditions. Also, since GID metrics were collected periodically, stress scenarios should also test GID for games with a high player interaction rate.

The mobility of the GOTE architecture can be enhanced by improving the orchestrating algorithm so that it discovers eligible edge hardware locally and calculates the most efficient VN provisioning (code offloading) strategy according to the game being requested, the number of players in a gaming session, the distance from the player client device, network conditions and other parameters. Such a strategy may even conclude that running the requested game locally on the player’s smartphone device is the most efficient decision when the available edge resources are insufficient. Also regarding mobility, there are challenges in seamless game session handover from one VN to another without instrumentation of game code while maintaining QoE.
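
One possible shape for such a provisioning decision is sketched below. Everything in it is hypothetical: the `EdgeNode` fields, the weights, the cost function and the notion of GPU slots are placeholders chosen for illustration and are not part of the GOTE orchestrator. The sketch picks the eligible node with the lowest cost and falls back to local execution when no node has enough free resources.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EdgeNode:
    name: str
    rtt_ms: float         # measured round-trip time to the player client
    distance_km: float    # distance from the player client device
    free_gpu_slots: int   # hypothetical unit of available rendering capacity


def node_cost(node: EdgeNode, rtt_weight: float = 1.0, distance_weight: float = 0.1) -> float:
    """Lower is better; the weights are arbitrary illustrative values."""
    return rtt_weight * node.rtt_ms + distance_weight * node.distance_km


def choose_node(nodes: Sequence[EdgeNode], players: int) -> Optional[EdgeNode]:
    """Return the cheapest node able to host the session, or None to run locally."""
    eligible = [n for n in nodes if n.free_gpu_slots >= players]
    if not eligible:
        return None  # insufficient edge resources: run the game on the smartphone
    return min(eligible, key=node_cost)


candidates = [
    EdgeNode("edge-a", rtt_ms=8.0, distance_km=2.0, free_gpu_slots=1),
    EdgeNode("edge-b", rtt_ms=15.0, distance_km=10.0, free_gpu_slots=4),
]
single_player = choose_node(candidates, players=1)
four_players = choose_node(candidates, players=4)
too_many = choose_node(candidates, players=8)
```

With these made-up candidates, a single player is placed on the closer node, a four-player session goes to the node with enough capacity, and an eight-player session returns `None`, which stands for the local fallback described above.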

Edge remote gaming represents a potential successor of the traditional cloud based streaming model. Therefore, future work should evaluate the advantages and disadvantages, both for the player and the service provider, of the GOTE architecture relative to cloud gaming architectures such as PlayStation Now® and GeForce Now®.

ACKNOWLEDGEMENTS

This work received financial support from the Coordination for the Improvement of Higher Education Personnel - CAPES - Brazil - PROAP-AUXPE/PDPG-CONSOLIDACAO-3-4 1367/2022.

REFERENCES

Amin, R., Jackson, F., Gilbert, J. E., Martin, J., and Shaw, T. (2013). Assessing the impact of latency and jitter on the perceived quality of Call of Duty Modern Warfare 2. In Kurosu, M., editor, Human-Computer Interaction. Users and Contexts of Use, pages 97–106, Berlin, Heidelberg. Springer Berlin Heidelberg.

Cai, W., Shea, R., Huang, C.-Y., Chen, K.-T., Liu, J., Leung, V. C. M., and Hsu, C.-H. (2016). A survey on cloud gaming: Future of computer games. IEEE Access, 4:7605–7620.

Chen, J., Koc, U.-V., and Liu, K. R. (2001). Design of digital video coding systems: a complete compressed domain approach. CRC Press.

Day, M. D., Perkins, C. E., Veizades, J., and Guttman, E. (1999). Service Location Protocol, Version 2. RFC 2608.

Donoho, A., Roe, B., Bodlaender, M., Gildred, J., Messer, A., Kim, Y., Fairman, B., and Tourzan, J. (2020). UPnP Device Architecture 2.0.

Eisert, P. and Fechteler, P. (2007). Remote rendering of computer games. SIGMAP, 7:438–443.

Google (2022). A message about Stadia and our long term streaming strategy.

GStreamer (2021). Application development manual.

ITU (2021). H.264: Advanced video coding for generic audiovisual services. ITU-T H.264 (V14) (08/2021).

Jansen, B., Goodwin, T., Gupta, V., Kuipers, F., and Zussman, G. (2018). Performance evaluation of WebRTC-based video conferencing. SIGMETRICS Perform. Eval. Rev., 45(3):56–68.

Kufa, J. and Kratochvil, T. (2017). Software and hardware HEVC encoding. In 2017 International Conference on Systems, Signals and Image Processing (IWSSIP), pages 1–5. IEEE.

Lin, L., Liao, X., Jin, H., and Li, P. (2019). Computation offloading toward edge computing. Proceedings of the IEEE, 107(8):1584–1607.

Liu, F., Tang, G., Li, Y., Cai, Z., Zhang, X., and Zhou, T. (2019). A survey on edge computing systems and tools. Proceedings of the IEEE, 107(8):1537–1562.

Loreto, S. and Romano, S. P. (2014). Real-time communication with WebRTC: peer-to-peer in the browser. O’Reilly Media, Inc.

Mach, P. and Becvar, Z. (2017). Mobile edge computing: A survey on architecture and computation offloading. IEEE Communications Surveys & Tutorials, 19(3):1628–1656.

Messaoudi, F., Ksentini, A., Simon, G., and Bertin, P. (2017). Performance analysis of game engines on mobile and fixed devices. ACM Trans. Multimedia Comput. Commun. Appl., 13(4).

Microsoft (2022). Deploy graphics devices using Discrete Device Assignment.

Mozilla (2021). WebRTC supported video codecs.

Nave, I., David, H., Shani, A., Tzruya, Y., Laikari, A., Eisert, P., and Fechteler, P. (2008). Games@Large graphics streaming architecture. In 2008 IEEE International Symposium on Consumer Electronics, pages 1–4.

Newzoo (2021). Global games market report.

NVIDIA (2021). NVIDIA Video Codec SDK.

NVIDIA (2022). NVENC Video Encoder API Prog Guide: Recommended NVENC Settings.

Oros, B.-I. and Bâcu, V. I. (2020). RenderLink remote rendering platform for computer games: A WebRTC solution for streaming computer games. In 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP), pages 555–561. IEEE.

Pande, A., Ahuja, V., Sivaraj, R., Baik, E., and Mohapatra, P. (2013). Video delivery challenges and opportunities in 4G networks. IEEE MultiMedia, 20(3):88–94.

Quax, P., Monsieurs, P., Lamotte, W., De Vleeschauwer, D., and Degrande, N. (2004). Objective and subjective evaluation of the influence of small amounts of delay and jitter on a recent first person shooter game. In Proceedings of 3rd ACM SIGCOMM workshop on Network and system support for games, pages 152–156.

Rosenberg, J. (2010). Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols. RFC 5245.

Rosenberg, J., Mahy, R., Matthews, P., and Wing, D. (2008). Session Traversal Utilities for NAT (STUN). RFC 5389.

Zadtootaghaj, S., Schmidt, S., and Möller, S. (2018). Modeling gaming QoE: Towards the impact of frame rate and bit rate on cloud gaming. In 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pages 1–6. IEEE.

Zhang, X., Chen, H., Zhao, Y., Ma, Z., Xu, Y., Huang, H., Yin, H., and Wu, D. O. (2019). Improving cloud gaming experience through mobile edge computing. IEEE Wireless Communications, 26(4):178–183.
