Everest: A Cloud Platform for Computational Web Services

Oleg Sukhoroslov and Alexander Afanasiev

Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoy Karetny per. 19, Moscow, Russia

Keywords: Service-Oriented Computing, Computational Web Services, Web Service API, REST, Service Framework,

Cloud Platform, Platform as a Service.

Abstract: The ability to effortlessly reuse and combine existing computational tools is an important factor influencing

research productivity in many scientific domains. While the service-oriented approach proved to be essen-

tial in order to enable wide-scale sharing of applications, we argue that its full potential in scientific compu-

ting is still not realized. In this paper, we present Everest, a cloud platform that supports publication, sharing

and reuse of scientific applications as web services. The underlying approach is based on a uniform repre-

sentation of computational web services and its implementation using REST architectural style. In compari-

son with existing work, Everest has a number of novel features such as the use of PaaS model, flexible bind-

ing of services with externally provisioned computing resources and remotely accessible API.

1 INTRODUCTION

Modern scientific research is often associated with

complex computations and use of high performance

computing resources. In their research scientists

actively use software applications that implement

computational algorithms, methods and models.

The ability to reuse existing computational tools

is one of important factors influencing research

productivity. However, such software often requires

specific expertise in order to install, configure and

run it that is beyond the expertise of an ordinary

researcher. This also applies to configuration and

use of high performance computing resources to run

the software. Finally, researchers increasingly need

to combine multiple tools in order to solve a com-

plex problem, which brings an important issue of

application composition.

The aforementioned problems can be addressed

by provision of scientific applications in the form of

remotely accessible, interoperable services. The use

of service-oriented approach can enable wide-scale

sharing, publication and reuse of applications, as

well as automation of scientific tasks and composi-

tion of applications into new services (Foster, 2005).

While the underlying principles of this approach are

well-known, it is still an open question how to im-

plement it in scientific computing in order to realize

its full potential.

So far, most efforts in this area were focused on

the provision of remote access to scientific tools via

convenient web user interfaces. Examples of such

approach include grid portals (Kacsuk, 2011), sci-

ence gateways (Miller et al., 2010) and scientific

hubs (McLennan and Kennell, 2010). While being

successful among unskilled users, such systems do

not actually expose applications as web services or

provide programming interfaces thus limiting oppor-

tunities for application reuse, composition and inte-

gration with external applications.

This approach is in stark contrast to Web 2.0 ap-

plications and cloud computing services that support

programmatic access via web service based APIs

(Programmable Web, 2013). The proliferation of

Web APIs has spawned development of mashups

(Yu et al., 2008) that combine data, presentation and

functionality from multiple services. Web service

composition tools, such as Yahoo! Pipes, provided

convenient interfaces for building mashups and

making them available to everyone as new services.

This aspect is largely ignored in existing web-

based scientific environments. As a rule, such sys-

tems do not provide tools for application composi-

tion or support workflows only on the level of com-

putational jobs. The notable exception is Galaxy

platform (Afgan et al., 2011) that supports tool com-

position and sharing of produced workflows. At the

same time, Galaxy also doesn’t expose tools and

workflows as services thus limiting their use outside

the platform.

411

Sukhoroslov O. and Afanasiev A..

Everest: A Cloud Platform for Computational Web Services.

DOI: 10.5220/0004941404110416

In Proceedings of the 4th International Conference on Cloud Computing and Services Science (CLOSER-2014), pages 411-416

ISBN: 978-989-758-019-2

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

Other efforts were focused on building tools for

transformation of scientific applications into web

services (Delaitre et al., 2005; Krishnan et al., 2009).

These tools don’t provide user interfaces beyond

basic web forms for service invocation and rely on

existing solutions for web service composition.

While more powerful and well-aligned with SOA

principles, this approach requires more effort in

order to build a convenient environment for scien-

tists. Service developers need an infrastructure to

host services. The environment should provide

mechanisms for service discovery, invocation and

composition taking into account security require-

ments. These mechanisms should be accessible via

convenient user interfaces facilitating the use of

services for problem solving.

Both approaches also require considerable efforts

to integrate an environment with high performance

computing resources and grid infrastructures needed

to run applications. As a rule, existing systems are

tied to a single computing infrastructure and doesn’t

allow users to attach external resources.

The paper presents Everest, a cloud platform for

computational web services that addresses discussed

issues. It combines both approaches discussed above

by exposing computational applications as web ser-

vices with a uniform interface and implementing a

web user interface for creating, sharing and access-

ing services. In contrast to previous work (Afana-

siev et al., 2013), all functionality of the platform is

provided remotely using the Platform as a Service

model.

The paper is structured as follows. Section 2 dis-

cusses the model, interface and implementation of

computational web services that underpin the pro-

posed approach. Section 3 describes architecture and

components of Everest platform in its current im-

plementation. Section 4 concludes and discusses

future work.

2 COMPUTATIONAL WEB

SERVICES

2.1 Service Model

Computational web services (CWS) are the main

entities managed by Everest. On the conceptual

level, CWS represent a special type of web services

targeted at processing computationally intensive

requests. Such services should support management

of long-running jobs and transfer of job data.

In contrast to generic web service interfaces to

computing infrastructures, such as grids, CWS are

specialized in running specific applications, i.e.,

solving specific classes of problems. Therefore a

request to CWS normally doesn’t contain an execut-

able, but instead represents a set of input parameters

describing a problem to be solved. We will refer to

such requests as service-level jobs or just jobs. The

job results can be represented as a set of output pa-

rameters in the same fashion.

It is responsibility of a CWS implementation to

translate service-level jobs to one or more compute

jobs submitted to underlying computing infrastruc-

ture in order to obtain desired results. Therefore

CWS implement more specialized and high level

interfaces than computing infrastructures. This

makes it possible to hide the complexity of running

compute jobs from service users and to enable trans-

parent use of resources from multiple infrastructures.

In contrast to stateful web services typically

found in enterprise systems, CWS process each in-

coming job in isolation. The only state managed by

CWS is the state of processed jobs, so all data need-

ed for a job should be provided in a request. While

such restriction leaves out interactive session-based

applications, it aligns well with the majority of com-

putational tools such as solvers. This restriction also

contributes to scalability properties of CWS.

The above description of CWS is rather general

and can be related to a large class of services found

not only in the scientific computing domain. Never-

theless, it explains the motivation and reasons be-

hind implementation of CWS in Everest.

2.2 Service Interface

In technical terms, CWS can be implemented using

any web service technology or style. Such freedom

and lack of standards for implementation of CWS

led to a multitude of approaches introduced by dif-

ferent systems. In order to facilitate reuse and com-

position of CWS implemented by different parties it

is crucial to unify service interfaces.

The described model of CWS makes it possible

to introduce a uniform service interface consisting of

four operations:

 Job submission (as a set of input parameters);

 Retrieval of job state and results (as a set of

output parameters);

 Job cancellation;

 Retrieval of service description (including de-

scription of input and output parameters).

All services implementing this interface support

the same set of operations but can accept and return

different sets of parameters. The last operation ena-

bles introspection of service parameters in order to

CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience

412

facilitate construction of job submission requests

and processing of job results.

The described uniform interface follows an ap-

proach used by the HTTP protocol (Fielding et al.,

1999) which defines a standard set of methods to

indicate the desired action to be performed on the

web resource identified by URI. This approach and

the underlying REST architectural style (Fielding,

2000) proved to be essential to make the Web suc-

cessful. In contrast, SOAP-based web services

(Curbera et al., 2002) encourage creation of special-

ized interfaces and operations which provides a

greater flexibility but complicates service reuse.

2.3 Interface Implementation

Using the REST architectural style the described

uniform interface can be implemented as follows

(Afanasiev et al., 2013). CWS represents a RESTful

web service (Richardson and Ruby, 2008) identified

by a Service URL. A job managed by the service is

identified by a Job URL.

The Service resource supports the following

HTTP methods:

 GET, which returns service description;

 POST, which performs job submission and

returns a Job URL.

The Job resource created during job submission

supports the following methods:

 GET, which returns the job state and results (if

any available);

 PUT, which enables changing of the job state

(e.g., job cancellation);

 DELETE, which destroys the job resource and

deletes its data.

An additional File resource can be introduced to

identify files passed to or returned by a service via

its parameters. In such case parameter value contains

a file URL. This enables passing large amount of

data, which is particularly important for scientific

computing, via appropriate data transfer mecha-

nisms, such as HTTP, FTP or GridFTP.

Consider resource representation formats and

means of describing service parameters.

The most widely used data representation for-

mats for web services are XML and JSON. Among

these JSON has been chosen for the following rea-

sons. First, JSON provides more compact and reada-

ble representation of data structures, while XML is

focused on representation of arbitrary documents.

Second, JSON supports native integration with Ja-

vaScript language facilitating creation of web user

interfaces for CWS.

The description and validation of service pa

rameters can be accomplished by means of JSON

Schema (JSON Schema, 2013), a de facto standard

for defining the structure of JSON data.

2.4 Service Implementation

Consider an implementation of computational web

service. Just like its interface, the inner workings of

CWS follow a common pattern. A service listens to

incoming job requests over HTTP and performs the

following steps for each request:

 Authenticate and authorize the client;

 Parse and validate input parameters from the

request;

 Translate input parameters to a compute job

specification (executable, arguments, input

and output files, etc.);

 Submit the compute job to configured compu-

ting resource;

 Monitor compute job state and provide this

information to the client;

 Retrieve compute job results upon job com-

pletion;

 Translate compute job results to output pa-

rameters;

 Pass output parameters to the client.

Most of these steps can be implemented in the

same fashion for any service disregarding its appli-

cation domain. The only application specific parts

are the ones that deal with processing of input and

generation of output parameters. This makes it pos-

sible to implement a software framework, which

provides a generic service skeleton that can be con-

figured with the application specific parts (Afana-

siev et al., 2013).

3 EVEREST PLATFORM

Everest is a cloud platform for computational web

services that is based on considerations presented in

the previous section. It implements a development

framework and a hosting environment for CWS that

adhere to the described uniform interface.

In contrast to traditional service development

tools, Everest follows the Platform as a Service

cloud delivery model by providing all its functionali-

ty via remote interfaces. A single instance of the

platform can be accessed by many users in order to

create, run and share services with each other with-

out the need to install additional software on users’

computers.

Another distinct feature of Everest is the ability

to connect services with external computing

Everest:ACloudPlatformforComputationalWebServices

413

Figure 1: Architecture of Everest platform.

resources. That means that service developer can

provide computing resource for running service jobs.

This feature is useful in situations when platform’s

computing infrastructure has limited capacity or

service developers need more control over an execu-

tion environment. A service user can also override

the default resource by providing another resource

for running her jobs.

The architecture of Everest is represented in Fig-

ure 1. Consider each of the platform’s components

in detail.

3.1 REST API

REST API is the platform’s application program-

ming interface implemented as a RESTful web ser-

vice. It serves as a single entry point for all clients,

including the web user interface.

The API includes operations for accessing and

manipulating entities managed by the platform such

as users, user groups, services, jobs and resources. In

particular, the API implements operations from the

uniform interface of CWS described in Section 2. It

also provides additional operations related to ser-

vices, such as service configuration and service dis-

covery.

For each incoming request the API performs au-

thentication of a client. The default authentication

mechanism is implemented by means of OAuth

bearer tokens (Jones and Hardt, 2012). A client can

obtain a token by providing user credentials, i.e.,

username and password.

Upon successful authentication, the API also per-

forms authorization of the requested action. Each

entity managed by the platform has its’ owner. The

default security policy allows access to the entity

only to its’ owner. An owner can modify this policy,

e.g., a service owner can specify a white list of users

or user groups that are allowed to use the service.

The API relies on the data storage component to

read and write information about platform entities. It

also communicates with the compute bridge by pass-

ing it incoming job requests.

3.2 Web User Interface

Web user interface (Web UI) provides a convenient

graphical interface for interaction with the platform.

It is implemented as a JavaScript application that

can run in any modern web browser without installa-

tion of additional software.

Web UI provides access to all functionality of

the platform. It is built directly on top of the REST

API, i.e., it uses the same interface as all other plat-

form clients. While technically more challenging

than traditional server-side web interface generation,

this approach allowed us to reduce the server com-

plexity and directly test the REST API.

The most important parts of Web UI are service

configuration and job submission interfaces.

Service configuration interface is used to create

new and edit existing services. It is implemented as

a set of web forms that enable a user to specify all

required information about a service including:

CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience

414

 Service metadata (name, description, etc.);

 Input parameters;

 Output parameters;

 Job template;

 Required files;

 Computing resource to run service jobs;

 Security configuration (white list, etc.).

Job template represents an application specific

part of service configuration that is used by the plat-

form to translate service requests to compute jobs. It

includes the following information:

 Job command template that supports input pa-

rameter substitution;

 Mapping of input parameters to job input files;

 Mapping of job output files to output parame-

ters.

Job submission interface is used to submit job

requests to services. This interface is dynamically

generated for each service according to the descrip-

tion of its input parameters. This information is also

used to validate the request before its submission to

the API. The implemented approach frees service

developer from manual implementation of job sub-

mission forms.

3.3 Compute Bridge

Compute bridge is the core component of Everest

that performs translation of service requests (ser-

vice-level jobs) to compute jobs. It acts as a media-

tor between REST API and Compute subsystem that

manages execution of compute jobs.

All job requests coming to REST API are asyn-

chronously forwarded to the bridge. For each request

the bridge performs translation of input parameters

to a compute job specification according to the ser-

vice configuration. The bridge also downloads input

files that are referenced in the request.

A compute job specification produced by the

bridge includes a command to be run, a list of job

input files, a list of output files and a resource to run

the job. The job specification is passed to the Com-

pute subsystem for execution. The bridge also sub-

scribes to notifications about the job state changes

and translates these changes to the data storage.

Upon the job completion the bridge performs

translation of job output files to output parameters

according to the service configuration and saves the

final result in the data storage.

3.4 Compute

Compute subsystem manages execution of compute

jobs received from the bridge on computing re-

sources attached to the platform. It performs all rou-

tine tasks related to staging of input files, submitting

a job, monitoring a job state and downloading job

results. All job state changes are translated to the

bridge. Compute subsystem also monitors the state

of all resources attached to the platform.

A computing resource can be attached to the

platform by any user. A resource owner can config-

ure a policy for accessing the resource. Any allowed

user can bind the resource to any service.

Currently two approaches for integration with

computing resources have been implemented. These

approaches represent different tradeoffs between

ease of integration and resource protection.

The first approach relies on existing remote ac-

cess mechanisms supported by resources, such as

SSH. In this case such mechanism is configured to

accept credentials provided by the platform, e.g.,

SSH keypair. This enables the platform to directly

execute any commands on the attached resource.

Such approach makes it easy to attach computing

servers or clusters without the need to install addi-

tional software on a resource. However, it also

brings some issues. For example, sometimes it can

be desirable to restrict commands that can be run by

the platform, or a user can’t provide full access to its

account due to resource usage policy. This approach

also doesn’t support integration with resources that

are not accessible remotely, such as desktop com-

puters or resources behind a firewall.

The second approach addresses the mentioned is-

sues by running a special agent on each attached

resource. The agent acts as a mediator between the

platform and the resource. This approach requires

deployment of additional software on resources, but

enables implementation of arbitrary security policies

on the agent level and integration with resources

behind a firewall. The communication between an

agent and the platform is implemented through the

WebSocket protocol (Fette and Melnikov, 2011).

Upon startup an agent initiates connection with the

platform to establish a bidirectional communication

channel. This channel is used only for control and

status messages. Job data transfer is performed by an

agent via the HTTP protocol.

Currently the Compute subsystem doesn’t per-

form resource selection during the submission of

compute job. It is assumed that each service has only

one resource linked with it. A user can override this

resource with another one in a job request. In any

case, the job specification passed to the Compute

contains a single resource reference.

Everest:ACloudPlatformforComputationalWebServices

415

3.5 Data Storage

Data storage component implements long-term stor-

age of information related to all entities managed by

the platform. It is based on MongoDB (MongoDB,

2013), a document-oriented database system. Native

support for JSON data structures with dynamic

schemas proved to be useful during the platform

development. The data storage also relies on GridFS

feature of MongoDB for storing job data and other

files.

4 CONCLUSIONS

The paper presented Everest, a cloud platform that

supports development and hosting of computational

web services. In comparison with existing work,

Everest has a number of novel features such as the

use of PaaS model, flexible binding of services with

externally provisioned computing resources and

remotely accessible API. While the platform doesn’t

provide its own infrastructure to run compute jobs as

classic PaaS examples, it can handle the problems of

resource allocation, job management, data transfer

and so on without the interference of users.

Everest is work in progress. The platform is cur-

rently undergoing experimental evaluation and pilot

deployment. The results of this work and application

case studies will be presented in future publications.

Future work will also address remaining gaps in

platform’s functionality and other challenges, such

as development of programming APIs, supporting

service composition, implementation of job schedul-

ing mechanism enabling binding of multiple re-

sources to a service, integration with grid infrastruc-

tures, and optimization of data transfer for services

handling large amounts of data.

ACKNOWLEDGEMENTS

The work is supported by the Russian Foundation

for Basic Research (grant No. 14-07-00309 А).

REFERENCES

Foster, I. (2005). Service-Oriented Science. Science,

308(5723), 814-817.

Kacsuk, P. (2011). P-GRADE portal family for grid infra-

structures. Concurrency and Computation: Practice

and Experience, 23(3), 235-245.

Miller, M. A., Pfeiffer, W., & Schwartz, T. (2010). Creat-

ing the CIPRES Science Gateway for inference of

large phylogenetic trees. In Gateway Computing Envi-

ronments Workshop (GCE), 2010 (pp. 1-8). IEEE.

McLennan, M., Kennell, R. (2010). HUBzero: A Platform

for Dissemination and Collaboration in Computational

Science and Engineering. Computing in Science and

Engineering, 12(2), pp. 48-52.

ProgrammableWeb (2013). ProgrammableWeb -

Mashups, APIs, and the Web as Platform.

http://www.programmableweb.com/.

Yu, J., Benatallah, B., Casati, F., & Daniel, F. (2008).

Understanding mashup development. Internet Compu-

ting, IEEE, 12(5), 44-52.

Afgan, E., Goecks, J., Baker, D., Coraor, N., Nekrutenko,

A., Taylor, J. (2011). Galaxy - a Gateway to Tools in

e-Science. In: K. Yang, Ed. (ed) Guide to e-Science:

Next Generation Scientific Research and Discovery,

Springer, pp. 145-177.

Delaitre, T., Kiss, T., Goyeneche, A., Terstyanszky, G.,

Winter, S., Kacsuk, P. (2005). GEMLCA: Running

Legacy Code Applications as Grid Services. Journal

of Grid Computing, Vol. 3. No. 1-2, pp. 75-90.

Krishnan, S., Clementi, L., Ren, J., Papadopoulos, P., Li,

W. (2009). Design and Evaluation of Opal2: A Toolkit

for Scientific Software as a Service. In 2009 IEEE

Congress on Services (SERVICES-1 2009), pp.709-

716.

Afanasiev, A., Sukhoroslov, O., Voloshinov, V. (2013).

MathCloud: Publication and Reuse of Scientific Ap-

plications as RESTful Web Services. In Parallel

Computing Technologies (PaCT 2013). Lecture Notes

in Computer Science, Vol. 7979, Springer, pp. 394-

408.

Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter,

L., Leach, P., & Berners-Lee, T. (1999). Hypertext

transfer protocol—HTTP/1.1. Internet RFC 2616.

Fielding, R. T. (2000). Architectural Styles and the Design

of Network-based Software Architectures. Ph.D. dis-

sertation, University of California, Irvine.

Curbera, F., Duftler, M., Khalaf, R., Nagy, W., Mukhi, N.,

& Weerawarana, S. (2002). Unraveling the Web ser-

vices web: an introduction to SOAP, WSDL, and

UDDI. Internet Computing, IEEE, 6(2), 86-93.

Richardson, L., & Ruby, S. (2008). RESTful Web Services,

O’Reilly.

JSON Schema (2013). JSON Schema and Hyper-Schema.

http://json-schema.org/.

Jones, M., & Hardt, D. (2012). The OAuth 2.0 Authoriza-

tion Framework: Bearer Token Usage. RFC 6750.

Fette, I., & Melnikov, A. (2011) The WebSocket Protocol.

RFC 6455, Internet Engineering Task Force.

MongoDB (2013). MongoDB. http://www.mongodb.org/

CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience

416