A Novel Method for the Inverse QSAR/QSPR based on Artiﬁcial Neural

Networks and Mixed Integer Linear Programming with Guaranteed

Admissibility

Naveed Ahmed Azam

1 a

, Rachaya Chiewvanichakorn

, Fan Zhang

, Aleksandar Shurbevski

1 b

Hiroshi Nagamochi

and Tatsuya Akutsu

2 c

Department of Applied Mathematics and Physics, Kyoto University, Kyoto, Japan

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji-city, Japan

Keywords:

QSAR/QSPR, Artiﬁcial Neural Networks, Mixed Integer Programming, Feature Vectors, Chemical Graphs.

Abstract:

Inverse QSAR/QSPR is a well-known approach for computer-aided drug design. In this study, we propose

a novel method for inverse QSAR/QSPR using artiﬁcial neural network (ANNs) and mixed integer linear

programming. In this method, we introduce a feature function f that converts each chemical compound G

into a vector f (G) of several descriptors of G. Next, given a set of chemical compounds along with their

chemical properties, we construct a prediction function ψ with an ANN so that ψ( f (G)) takes a value nearly

equal to a given chemical property for many chemical compounds G in the set. Then, given a target value y

∗

of the chemical property, we conversely infer a chemical structure G

∗

having the desired property y

∗

in the

following way. We formulate the problem of ﬁnding a vector x

∗

such that (i) ψ(x

∗

) = y

∗

and (ii) there exists a

chemical compound G

∗

such that f (G

∗

) = x

∗

(if one exists over all vectors x

∗

in (i)) as a mixed integer linear

programming problem (MILP). In an existing method for the inverse QSAR/QSPR, the second condition (ii)

was not guaranteed. For acyclic chemical compounds and some chemical properties such as heat of formation,

boiling point, and retention time, we conducted computational experiments.

1 INTRODUCTION

Drug design is one of the major goals of bioin-

formatics and chemo-informatics. Many computa-

tional methods have been proposed for computer-

aided drug design. Forward QSAR/QSPR (quanti-

tative structure-activity and structure-property rela-

tionships) is the problem of computing a numeri-

cal relationship between a chemical structure and a

biological activity from a given set of known data

and using this relationship to predict the activity of

unknown structures, whereas inverse QSAR/QSPR,

given such a relationship, attempts to ﬁnd a chem-

ical compound that has a desired activity. There-

fore, inverse QSAR/QSPR (quantitative structure-

activity and structure-property relationships) is one of

the major approaches for the purpose (Miyao et al.,

2016; Skvortsova et al., 1993). In this approach, de-

sired chemical activities and/or properties are speci-

https://orcid.org/0000-0002-7941-3419

https://orcid.org/0000-0001-9224-6929

https://orcid.org/0000-0001-9763-797X

ﬁed along with some constraints and then chemical

compounds satisfying these conditions are inferred,

where chemical compounds are usually represented

as chemical graphs. In many cases, it is also for-

mulated as an optimization problem to ﬁnd a chem-

ical graph maximizing (or minimizing) an objective

function under various constraints, where objective

functions reﬂect certain chemical activities or prop-

erties. Objective functions are often derived from a

set of training data consisting of known molecules

and their activities/properties using statistical learn-

ing methods.

Recently, novel approaches have been proposed

for inverse QSAR/QSPR using Artiﬁcial Neural Net-

work (ANN) and deep learning technologies. For ex-

ample, recurrent neural networks (Segler et al., 2017;

Yang et al., 2017), variational autoencoders (G

omez-

Bombarelli et al., 2018), and grammar variational au-

toencoders (Kusner et al., 2017) have been applied. In

these approaches, neural networks are trained using

known compound/activity data and then novel chem-

ical graphs are obtained by solving a kind of inverse

problems on neural networks, that is, by looking for

Azam, N., Chiewvanichakorn, R., Zhang, F., Shurbevski, A., Nagamochi, H. and Akutsu, T.

A Novel Method for the Inverse QSAR/QSPR based on Artiﬁcial Neural Networks and Mixed Integer Linear Programming with Guaranteed Admissibility.

DOI: 10.5220/0008876801010108

In Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 3: BIOINFORMATICS, pages 101-108

ISBN: 978-989-758-398-8; ISSN: 2184-4305

101

input values for a trained neural network that will pro-

duce a desired output value. Usually, these inverse

problems are solved using various statistical meth-

ods. However, the solutions found by statistical meth-

ods are generally not optimal. In order to cope with

this issue, novel mixed integer linear programming

(MILP)-based methods have been proposed (Akutsu

and Nagamochi, 2019) for ANNs with rectiﬁed lin-

ear unit (ReLU) functions and sigmoid functions. In

their methods, activation functions on neurons are ef-

ﬁciently encoded as piece-wise linear functions so

as to represent ReLU functions exactly and sigmoid

functions approximately.

Chiewvanichakorn et al. (2020) recently com-

bined these previous approaches; efﬁcient enumera-

tion of tree-like graphs (Fujiwara et al., 2008), and

MILP-based formulation of the inverse problem on

ANNs (Akutsu and Nagamochi, 2019). This com-

bined framework for QSAR/QSPR mainly consists

of two phases. The ﬁrst phase solves (I) PREDIC-

TION PROBLEM, where each chemical compound G

is transformed into a feature vector f (G) and an ANN

N is trained from existing chemical compounds and

their values a(G) on a chemical property π to obtain

a prediction function ψ

so that a(G) is predicted as

( f (G)). The second phase solves (II) INVERSE

PROBLEM, where (II-a) given a target value y

∗

of the

chemical property π, a feature vector x

∗

is inferred

from the trained ANN N so that ψ

∗

) is close to

∗

and (II-b) then a set of chemical structures G

∗

such

that f (G

∗

) = x

∗

is enumerated.

We call an inferred feature vector admissible if

it admits a chemical structure whose feature vec-

tor is equal to the inferred feature vector. How-

ever the above-mentioned previous method (Chiew-

vanichakorn et al., 2020) has no guarantee that ev-

ery inferred vector in (II-a) is admissible. In their

experiment, the probability that an inferred vector is

admissible was ranging from 20% to 50% on aver-

age. In this paper, we improve the method so that (i)

every feature vector x

∗

inferred from a trained ANN

N in (II-a) becomes admissible; and (ii) no chemi-

cal structure exists for a given target value when no

feature vector is inferred from the ANN N . Since

the new idea can be applied to any inference prob-

lem of a trained ANN, we describe the method in a

more general setting (see Section 2.1). We conducted

some preliminary computational experiments to in-

fer acyclic chemical compounds on several chemical

properties.

2 PRELIMINARY

Let R and Z denote the sets of reals and non-negative

integers, respectively. For two integers a and b, let

[a,b] denote the set of integers i with a ≤ i ≤ b.

2.1 Inference of Structures based on

ANNs and MILPs

This section reviews the previous method for the in-

verse QSAR/QSPR (Chiewvanichakorn et al., 2020),

and then shows how to improve the theoretical frame-

work of the method so as to increase the chance of

ﬁnding a larger number of solutions to the inverse

QSAR/QSPR. We here describe their and our meth-

ods in a general setting where the object to be inferred

with an ANN is not necessarily chemical compounds.

Fig. 1 illustrates the general setting.

x:=f(S)

a(S)

(x)

ANN N=(A,B,w)

M(x,y,z;C

)

M(x,y,z,s;C

)

MILP

conversion

to feature vector

f : feature

function

: prediction

function

: a set of

structures

a: propety

function

admissible

inadmissible

S S*

necessary condition only

necessary & sufficent condition

y*: target

vector

f(S*)

input

output

Figure 1: An illustration of a property function a, a fea-

ture function f , a prediction function ψ

and MILPs

M (x,y,z, s;C

) and M (x,y,z, s;C

We start to formulate a prediction problem and its

inverse problem. Let S be a set of structures, and

a : S → R

be a property function for an integer ` ≥ 1

We assume that the function a is not explicitly given

and we are given the value a(S) only for structures S

in some subset S

of S.

(I) PREDICTION PROBLEM: Given a structure S ∈ S

(especially when S 6∈ S

), predict the value a(S) based

on the information {(S,a(S)) | S ∈ S

};

BIOINFORMATICS 2020 - 11th International Conference on Bioinformatics Models, Methods and Algorithms

102

(II) INVERSE PROBLEM: Given a vector y ∈ R

, gen-

erate a structure S ∈ S such that a(S) = y.

For Problem (I), we use artiﬁcial neural networks

(ANN) in a standard way. We ﬁrst deﬁne a feature

space R

for an integer k ≥ 1 and a feature function

f : S → R

that converts each structure S ∈ S into a

vector f (S) ∈ R

, called the feature vector of S We

next construct an ANN N = (A,B,w), which consists

of an architecture A, a set B of activation functions and

a vector w ∈ R

that consists of arc weights and node

bias for some integer h ≥ 1. An ANN N = (A,B,w)

deﬁnes a prediction function ψ

: R

→ R

. For

some subset S

of S

, we try to choose an ANN

N = (A,B,w) so that the prediction ψ

( f (S)) of

each structure S ∈ S

is close to the value a(S) There

have been developed many learning algorithms that

choose such an ANN.

For Problem (II), Akutsu and Nagamochi (Akutsu

and Nagamochi, 2019) proposed a formulation based

on MILP assuming that an ANN N and its prediction

function ψ

are given. Let N = (A,B, w) be an ANN

such that all activation functions in B are continuous

piece-wise linear functions, n

denote the number of

nodes in the architecture A in N , and n

denote the

total number of break-points over all activation func-

tions in B. (We approximate activation functions that

are not piece-wise linear with piece-wise linear func-

tions, where the approximation may cause some nu-

merical errors in computing the predicted values with

ψ. See (Akutsu and Nagamochi, 2019) for details.)

Then it is known (Akutsu and Nagamochi, 2019)

that there is an MILP M (x, y, z; C

) without an objec-

tive function that consists of variable vectors x ∈ R

y ∈ R

and z ∈ R

and a set C

of integer restrictions

and linear equalities on these variables such that

- p = O(n

+ n

) and |C

| = O(n

+ n

);

- ψ

(x) = y if and only if there is a vector z ∈

such that (x, y, z) is feasible to the system

M (x,y,z;C

When a target value y

∗

∈ R

is speciﬁed for the

variable vector y ∈ R

, we can ﬁnd a vector x

∗

∈ R

such that ψ

∗

) = y

∗

by solving MILP M (x, y, z; C

)

with y = y

∗

. However, in general, there may be no

structure S

∗

∈ S such that f (S

∗

) = x

∗

, since no in-

formation on the structure of S is included in the

formulation of MILP M (x,y,z;C

). We call a vec-

tor x ∈ R

admissible if it admits a structure S

∗

∈ S

such that f (S

∗

) = x

∗

. To increase the chance to

have an admissible vector x ∈ R

, we impose a set

of linear constraints on x ∈ R

that are “neces-

sary conditions” for a vector x ∈ R

to be admissible.

Let M (x,y,z;C

) denote the MILP obtained from

M (x,y,z;C

) by adding the set C

of linear constraints

on x ∈ R

. The previous method (Chiewvanichakorn

et al., 2020) for the inverse QSAR/QSPR followed

this idea. However there was no guarantee that every

inferred vector x

∗

by this method is admissible

In this paper, we impose to MILP M (x, y, z; C

)

a new set C

of constraints so that, for a given tar-

get value y

∗

, (i) every feature vector x

∗

inferred from

the new system becomes admissible; and (ii) no struc-

ture exists in S when the new system has no feasible

solution. For this, we introduce a new variable vec-

tor s ∈ R

for some integer h, and prepare a set C

of linear constraints on x ∈ R

and s ∈ R

to obtain

an MILP M (x, s; C

) that consists of variable vectors

x ∈ R

and s ∈ R

and a set C

of integer restrictions

and linear inequalities such that

- x ∈ R

is admissible if and only if there is a vec-

tor s ∈ R

such that (x,s) is feasible to MILP

M (x,s;C

); and

- s ∈ R

forms a structure S ∈ S if and only if there

is a vector x ∈ R

such that (x, s) is feasible to

MILP M (x,s;C

That is, the constraint set C

is “necessary and sufﬁ-

cient condition” for a vector x ∈ R

to be admissible.

Now we combine the two systems M (x, y, z; C

) and

M (x,s;C

) to obtain an MILP M (x, y, z,s;C

) that

consists of variable vectors x ∈ R

, y ∈ R

, z ∈ R

and

s ∈ R

and a set C

of constraints on x, y and z and a

set C

of constraints on x and s. Hence given a target

vector y

∗

∈ R

, we solve MILP M (x, y, z,s;C

) to

ﬁnd a feasible vector (x

∗

), where the vector

∗

∈ R

satisﬁes ψ

∗

) = y

∗

, and the vector s

∗

∈ R

forms a structure S

∗

∈ S such that f (S

∗

) = x

∗

. Based

on the new idea, we design a method for the inverse

QSAR/QSPR in Section 3. Our M (x, s; C

) in the

method is presented in Section 4.

2.2 Modeling of Chemical Compounds

Graphs. Let H = (V,E) be a graph with a set V of

vertices and a set E of edges. For a vertex v ∈ V , the

set of neighbors of v in H is denoted by N

(v), and

the degree deg

(v) of v is deﬁned to be |N

(v)|. The

length of a path is deﬁned to be the number of edges

in the path. The distance dist

(u,v) between two ver-

tices u,v ∈ V is deﬁned to be the minimum length of

a path connecting u and v in H. The diameter dia(H)

of H is deﬁned to be the maximum distance between

two vertices in H. The sum-distance smdt(H) of H

is deﬁned to be the sum of distances over all vertex

pairs.

Chemical Graphs. We represent the graph struc-

ture of a chemical compound as a graph with labels

A Novel Method for the Inverse QSAR/QSPR based on Artiﬁcial Neural Networks and Mixed Integer Linear Programming with Guaranteed

Admissibility

103

on vertices and multiplicity on edges in a hydrogen-

suppressed model. Let Λ be a set of labels each of

which represents a chemical element such as C (car-

bon), O (oxygen), N (nitrogen), S (sulfur) and so on,

where we assume that Λ does not contain H (hydro-

gen). Let mass(a) and val(a) denote the mass and

valence of a chemical element a ∈ Λ, respectively.

A chemical graph in a hydrogen-suppressed

model is deﬁned to be a tuple G = (H,α,β) of a graph

H = (V,E), a function α : V → Λ (called an atom-

function) and a function β : E → {1,2,3} (called a

bond-function) such that

(i) H is connected; and

(ii)

∑

uv∈E

β(uv) ≤ val(α(u)) for each vertex u ∈ V.

Figure 2 illustrates an example of a chemical graph

G = (H,α,β), where the structure of the vector f (G)

is explained in Section 2.3.

We introduce a total order < over the elements

in Λ according to their mass values; i.e., we write

a < b for chemical elements a,b ∈ Λ with mass(a) <

mass(b). Choose a set Γ

of tuples γ = (a, b,k) ∈ Λ×

Λ × [1,3] such that a < b. For a tuple γ = (a,b, k) ∈

Λ × Λ × [1,3], let γ denote the tuple (b, a,k). Set

= {γ | γ ∈ Γ

}, Γ

= {(a,a,k) | a ∈ Λ, k ∈ [1,3]}

and Γ = Γ

∪ Γ

. A pair of two atoms a and b joined

with a bond of multiplicity k is denoted by a tuple

γ = (a, b,k) ∈ Γ.

f(G)=(n(H)=6, n

(H)=3, n

(H)=3,

(H)=1, n

(H)=0, n

(H)=0,

dia(H)=0.667, smdt(H)=0.1435,

(G)=4, n

(G)=1, n

(G)=1, ms(G)=13,

(G)=1, b

(G)=0, n

(C,C,1)

(G)=2,

(C,C,2)

(G)=1, n

(C,O,1)

(G)=1, n

(C,N,1)

(G)=1 )

Figure 2: An example of a chemical graph G = (H =

(V, E),α,β) and its feature vector f (G).

2.3 Descriptors in Feature Vectors

In our method, we use only graph-theoretical descrip-

tors for deﬁning a feature vector, which facilitates

our designing an algorithm for constructing graphs.

Given a chemical graph G = (H = (V,E),α,β), we

deﬁne a feature vector f (G) that consists of the fol-

lowing eight kinds of descriptors: n(H): the number

of vertices; n

(H) (d ∈ [1,6]): the number of vertices

of degree d in H; dia(H): the diameter of H divided

by |V |; smdt(H): the sum of distances of H divided

by |V |

; n

(G) (a ∈ Λ): the number of vertices with

label a ∈ Λ; ms(G): the average mass of atoms in

G; b

(G) (i = 2, 3): the number of double and triple

bonds; n

(G) (γ = (a,b, k) ∈ Γ): the number of label

pairs {a,b} with multiplicity k; Figure 2 illustrates an

example of a feature vector f (G).

3 A METHOD FOR INFERRING

CHEMICAL GRAPHS

Analogously with the previous method (Chiew-

vanichakorn et al., 2020), our new method for the

inverse QSAR/QSPR also solves two problems: (I)

PREDICTION PROBLEM and (II) INVERSE PROBLEM

introduced in Section 2.1. For a speciﬁed chemical

property such as boiling point, we denote by a(G) the

value of the speciﬁed chemical property for a given

chemical compound G, which is represented by a

chemical graph G = (H,α,β). We solve Problem (I)

for the inverse QSAR/QSPR as follows.

Phase 1.

1. Prepare a data set D = {(G

,a(G

)) | i =

1,2,.. ., p} for a speciﬁed chemical property,

where G

is a chemical graph. Set reals a, a ∈ R

such that a ≤ min{a(G

) | i = 1,2,.. ., p} and a ≥

max{a(G

) | i = 1,2,... , p}.

2. Set a graph class G to be a set of chemical graphs

such that G ⊇ {G

| i = 1,2, ..., p}. Introduce a

function f : G → R

for a positive integer K, as

we have observed in Section 2.3. We call f (G)

the feature vector of G ∈ G, and call each entry of

a vector f (G) a descriptor of G.

3. Select an architecture A and a set B of activation

functions to construct an ANN N that, given a

vector in R

, returns a real in the range [a,a].

Using the data set D as training data, choose a

vector w of arc weights and node biases of N

to construct a prediction function ψ

with N =

(A,B,w) so that ψ

( f (G)) takes a value nearly

equal to a(G) for many chemical graphs in D.

In this paper, we solve Problem (II) for the in-

verse QSAR/QSPR by treating the following infer-

ence problems.

(II-a) Inference of Vectors

Input: A real y

∗

∈ [a, a].

Output: Vectors x

∗

∈ R

and s

∗

∈ R

such that

∗

) = y

∗

and s

∗

forms a chemical graph G

∗

∈ G

with f (G

∗

) = x

∗

(II-b) Inference of Graphs

Input: A vector x

∗

∈ R

Output: All graphs G

∗

∈ G such that f (G

∗

) = x

∗

In the previous method (Chiewvanichakorn et al.,

2020), they tried to ﬁnd a vector x

∗

∈ R

such that

∗

) = y

∗

in (II-a), which may not be admissible.

In this paper, we include into (II-a) another vector s

that represents a chemical graph so that an inferred

vector x

∗

becomes admissible. To treat Problem (II-

a), we assume that an ANN N with a piecewise-linear

BIOINFORMATICS 2020 - 11th International Conference on Bioinformatics Models, Methods and Algorithms

104

activation function (such as a ReLU) is given. As ob-

served in Section 2.1, there is an MILP M (x, y, z; C

)

that consists of variable vectors x ∈ R

, y ∈ R and

z ∈ R

for some integer p and a set C

of constraints

on these variables such that ψ

∗

) = y

∗

if and only if

there is a vector (x

∗

) feasible to M (x,y,z;C

We here show how to construct an MILP M (x, s; C

)

that consists of variable vectors x ∈ R

and s ∈ R

and

a constraint set C

such that x

∗

∈ R

is admissible and

∗

∈ R

forms a chemical graph G

∗

∈ G if and only if

∗

) is feasible to MILP M (x,s; C

). A more pre-

cise description of MILP M (x, s; C

) will be given in

Section 4. Given a value y

∗

∈ R, we can ﬁnd an ad-

missible vector x

∗

∈ R

such that ψ

∗

) = y

∗

(if one

exists) by solving the MILP M (x,y, z,s; C

We design an algorithm to Problem (II-b) based

on the branch-and-bound method (see (Fujiwara et al.,

2008) for enumerating acyclic chemical compounds).

Phase 2.

4. Formulate Problem (II-a) as an MILP

M (x,y,z,s;C

) based on N . Find a set

∗

of admissible vectors x

∗

∈ R

such that

(1 − ε)y

∗

≤ ψ

∗

) ≤ (1 + ε)y

∗

for a tolerance ε

set to be a small positive real. A vector s

∗

∈ R

obtained together with x

∗

forms a chemical graph

∗

∈ G such that f (G

∗

) = x

∗

5. To solve Problem (II-b), enumerate all graphs

∗

∈ G such that f (G

∗

) = x

∗

for each admissible

vector x

∗

∈ F

∗

In our experiment in Section 5, we further im-

pose another set C

of constraints on the descriptors

in vector x not to change the feasibility of MILP

M (x,y,z,s;C

) but to try to reduce the search

space required when we solve the MILP with some

solver. Also we ﬁx the numbers of vertices and ver-

tices of degree 1 and the diameter to speciﬁed integers

∗

, n

∗

and dia

∗

, respectively in Section 5.

In Step 5 of the previous method (Chiew-

vanichakorn et al., 2020), some vector x

∗

∈ F

∗

may

not be admissible; i.e., there may not exist a graph

∗

∈ G such that f (G

∗

) = x

∗

. As discussed in Sec-

tion 2.1, we formulate an MILP in Phase 2 so that

every vector x

∗

∈ F

∗

obtained in Step 4 is admissible.

4 MILPs FOR REPRESENTING

ACYCLIC CHEMICAL GRAPHS

In this section, we propose a formulation of the MILP

M (x,s;C

) introduced in Section 2.1. For space lim-

itation, we show a sketch of the MILP. In our MILP,

we construct a tree H with n vertices and diameter

dia

∗

by selecting n − 1 pairs of vertices in an n × n

adjacency matrix (or a complete graph K

with n ver-

tices). For this, we introduce the following variables

and constraints:

- e(i, j) ∈ {0,1}, 1 ≤ i, j ≤ n (e(i, j) = 1 implies that

is the parent of v

in a tree H rooted at v

);

e( j, i) = 0, 1 ≤ i ≤ j ≤ n,

e(i,i + 1) = 1, i ∈ [1, dia

∗

e(dia

∗

+ 1, i) = 0, i ∈ [dia

∗

+ 2, n],

∑

1≤i< j

e(i, j) = 1, j ∈ [2, n].

For a tree H, we select functions α and β to deﬁne a

chemical graph G = (H,α,β) with the following vari-

ables and constraints:

α(i) ∈ {[a] | a ∈ Λ}, i ∈ [1,n] (

α(i) = [a] implies

α(v

) = a in G);

- δ

(i,a) ∈ {0,1}, i ∈ [1, n], a ∈ Λ (δ

(i,a) = 1

implies α(v

) = a in G);

β(i, j) ∈ [0,3], 1 ≤ i < j ≤ n (

β(i, j) = k implies

β(v

) = k for k ≥ 1 and v

6∈ E for k = 0);

∑

a∈Λ

(i,a) = 1, i ∈ [1, n],

∑

a∈Λ

[a] · δ

(i,a) =

α(i), i ∈ [1, n],

e(i, j) ≤

β(i, j) ≤ 3e(i, j), 1 ≤ i < j ≤ n,

∑

h<i

β(h,i) +

∑

i< j

β(i, j) ≤

∑

a∈Λ

val(a) · δ

(i,a),

i ∈ [1, n].

Among the descriptors in our feature vector, we

show how the number n

(G) of tuple γ ∈ Γ is rep-

resented in our MILP. We omit to show the linear

constraints for the other descriptors. Let Γ

fbn

de-

note the set of forbidden tuples; i.e., Γ

fbn

= (Λ × Λ ×

{1,2,3})\ (Γ

∪ Γ

). We introduce the follow-

ing variables and constraints for descriptor n

(G):

- δ

(i,k) ∈ {0,1}, i = 2,3,.. .,n, k ∈ [1,3] (for a

function β

: [2, n] → [1,3], δ

(i,k) = 1 implies that

is the parent of v

and β(v

) = k in G);

- δ

(i,γ) ∈ {0,1}, i ∈ [2, n],γ ∈ Γ

∪ Γ

(for

a function τ : {2,3,. ..,n} → {γ | γ ∈ Γ

∪ Γ

∪Γ

}, δ

(i,γ) = 1 implies that v

is the parent of

and (α(v

),α(v

),β(v

)) = γ ∈ Γ

∪ Γ

in G);

- n(γ) ∈ [0,n − 1], γ ∈ Γ

∪ Γ

(n(γ) represents

(G));

∑

k∈[1,3]

(i,k) ≤ 1, i ∈ [2, n],

∑

k∈[1,3]

kδ

(i,k) =

∑

h<i

β(h,i), i ∈ [2, n],

A Novel Method for the Inverse QSAR/QSPR based on Artiﬁcial Neural Networks and Mixed Integer Linear Programming with Guaranteed

Admissibility

105

( j, (a,b,k)) ≥ δ

(i,a) + δ

( j, b)

+ δ

( j, k)+ e(i, j ) − 3,

1 ≤ i < j ≤ n,(a,b,k) ∈ Γ

∪ Γ

(i,a) + δ

( j, b) + δ

( j, k)+ e(i, j ) ≤ 3,

1 ≤ i < j ≤ n,(a,b,k) ∈ Γ

fbn

∑

γ∈Γ

∪Γ

(i,γ) = 1, i ∈ [2, n],

∑

i∈[2,n]

(δ

(i,γ) + δ

(i,γ)) = n(γ), γ ∈ Γ

∑

i∈[2,n]

(i,γ) = n(γ), γ ∈ Γ

To reduce the number of graph-isomorphic so-

lutions in this MILP, we require the indexing

,... ,v

of vertices in a tree H to be a depth-ﬁrst

search ordering from v

. This can be attained with the

following variables and constraints:

- dist(i, j) ∈ [0,n], 1 ≤ i < j ≤ n (dist(i, j) represents

dist

));

- dist

lca

(i, j), 1 ≤ i < j ≤ n (dist

lca

(i, j) represents

dist

(w, v

) for the least common ancestor w

of v

and v

);

dist

lca

( j, i) ≥ dist(1, j) − n(1 − e ( j,h)),

1 ≤ j < i < h ≤ n.

5 EXPERIMENTAL RESULTS

We implemented our method for inferring acyclic

chemical graphs and executed on a PC with Intel Core

i5 1.6 GHz CPU and 8GB of RAM running under

the Mac OS operating system version 10.14.4. We

select ﬁve chemical properties: heat of atomization

(HA), heat of formation (HF), boiling point (BP), oc-

tanol/water partition coefﬁcient (KOW) and retention

time (RT).

Results on Phase 1. In Step 1, we used a data set D

of acyclic chemical graphs in (Roy and Saha, 2003)

(resp., (Jalali-Heravi and Fatemi, 2001)) for HA, HF,

BP and KOW (resp., RT). Table 1 shows the size and

range of data sets that we prepared for each chemical

property, where we denote the following:

π: one of the chemical properties HA, HF, BP, KOW

and RT;

|D|: the size of data set D for a chemical property π;

[n,n]: the minimum and maximum number of

vertices in H over data set D.

The set Γ of all tuples γ and the number

K = |Λ| + |Γ| + 12 of descriptors in f (G) in

D are Γ = {(C, O,1),(C,S, 1),(C,C,1), (C,C,2)} and

K = 19 for HA, HF, BP and KOW and Γ =

{(C,O,1), (C,O,2),(C,C,1),(C,C,2)} and K = 18 for

RT. Some of the above parameters may be denoted

with a sufﬁx π such as a

to indicate the underlying

property π ∈ {HA, HF, BP, KOW, RT}.

Table 1: The results on Steps 1 to 3 in Phase 1.

π |D| Λ [n,n] L-time R

HA 128 C,O,S [2,11] 11.240 0.999

HF 88 C,O,S [2,16] 1.437 0.985

BP 131 C,O,S [2,16] 7.335 0.974

KOW 62 C,O,S [2,16] 0.534 0.969

RT 39 C,O [11,16] 9.529 0.890

In Step 2, we set a graph class G to be the set of

all acyclic chemical graphs over the sets Λ and Γ in

Table 1.

In Step 3, we used scikit-learn version 0.21.3

with Python 3.7.4 to construct ANNs N where the

tool and activation function are set to be MLPRe-

gressor and ReLU, respectively. We tested several

different architectures of ANNs for each chemical

property. To evaluate the performance of the resulting

prediction function ψ

with cross-validation, we

partition a given data set D into ﬁve subsets D

i = 1, 2,3,4,5 randomly, where D \ D

is used for a

training set and D

is used for a test set in ﬁve trials

i = 1,2,3, 4,5. Table 1 shows the results on Step 3,

where

L-time: the average time (sec) to construct ANNs

for each trial;

: the average of coefﬁcient of determination

over the ﬁve test sets.

For each chemical property π, we selected the

ANN N that attained the best test R

score among

the ﬁve ANNs to formulate an MILP M (x,y,z;C

) in

the second phase.

Results on Phase 2. We implemented Steps 4 and 5

in Phase 2 as follows.

Step 4. In this step, we solve the MILP

M (x,y,z,s;C

) formulated based on the ANN N

obtained in Phase 1. In our computational experi-

ment, we further impose a set C

of several other

necessary conditions on the descriptors of our fea-

ture vector x ∈ R

so as to reduce the search space

for solving this MILP with a solver. Thus we solve

an MILP M (x,y,z,s; C

). We here omit de-

scribing the additional constraints in C

. To solve an

MILP in Step 4, we use CPLEX (ILOG CPLEX ver-

sion 12.8).

In Step 4, we conducted the following two options

in our experiments.

BIOINFORMATICS 2020 - 11th International Conference on Bioinformatics Models, Methods and Algorithms

106

(a) Choose some target y

∗

∈ [a,a] for each chemical

property in {HA, HF, BP, KOW, RT}; and

(b) For two combinations (π

,π

) ∈ {(HA, KOW), (BP,

HF)}, choose a pair of target values y

∗

∈ [a

i = 1,2. When the same feature function f is used for

two chemical properties π

,π

∈ {HA, HF, BP, KOW,

RT}, we can combine the corresponding MILP

and

MILP

to ﬁnd a vector x

∗

for a pair of target values

∗

∈ [a

], i = 1,2. In our experiment, we choose

a target value y

∗

and ﬁx or bound some descriptors in

our feature vector as follows:

- Choose two target values y

∗

∈ [a,a] for each

chemical property π and one pair of target val-

ues y

∗

∈ [a

], i = 1,2 for each pair (π

,π

) ∈

{(HA, KOW), (BP, HF)};

- Choose some two values n

∗

∈ [n,n] to ﬁx the num-

ber n(H) of vertices;

- Fix the number n

(H) of vertices of degree 1

to each n

∗

∈ {4,5,6} if n

∗

= 10,11,12, 13; n

∗

∈

{5,6,7} if n

∗

= 14, 15; and n

∗

∈ {6, 7,8} if n

∗

16;

- Choose some three values dia

∗

to ﬁx the diameter

dia(H) so that dia

∗

∈ {4,5, 6} if n

∗

= 10; dia

∗

∈

{4,6,7} if n

∗

= 11; dia

∗

∈ {5,7, 8} if n

∗

= 12; and

dia

∗

∈ {6, 7,8} if n

∗

= 13, 14,15,16; and

- Bound the number n

(C,C,2)

(G) of double bonds

between two carbon atoms by choosing each of

the two ranges r ∈ {[0,1/2],(1/2, 1]} so that

(C,C,2)

(G)/(|V | − 1) ∈ r.

This scheme results in 3 × 3 × 2 = 18 MILPs

with different combinations of speciﬁcations for each

pair (y

∗

) (resp., each tuple (y

∗

)) of spec-

iﬁed y

∗

, π ∈ {HA, HF, BP, KOW, RT} and n (resp.,

∗

), (π

,π

) ∈ {(HA, KOW), (BP, HF)} and n).

Each of these MILPs is either feasible or infeasible

and we ﬁnd one feasible vector x

∗

to each feasible

MILP. We set ε = 0.02 in Step 4.

Tables 2 to 4 show the results on Step 4, where we

denote the following:

∗

: a target value in [a, a] for a property π;

∗

: a speciﬁed number of vertices in [n,n];

∗

|: the size of set F

∗

of vectors x

∗

generated from

feasible instances of these 18 MILPs in Step 4;

IP-time: the time (sec.) to solve the 18 MILP

instances to ﬁnd a set F

∗

of vectors x

∗

We observe that MILPs with some restrictions

were infeasible. For example, |F

∗

| = 5 for (y

∗

2700,n

∗

= 10) in Table 2 means that six instances out

of the 18 MILPs were infeasible.

Step 5. In this step, we modiﬁed the algorithm

proposed in (Fujiwara et al., 2008) to enumerate all

acyclic graphs. For each x

∗

∈ F

∗

, we enumerate all

graphs G

∗

∈ G such that f (G

∗

) = x

∗

Tables 2 to 4 show the results on Step 5, where we

denote the following:

∗

(): the number of acyclic chemical graphs G

∗

such that f (G

∗

) = x

∗

, where the number of

chemical graphs already registered in the

database PubChem among the generated

chemical graphs is indicated in parentheses;

G-time: the time (sec.) to generate all chemical

graphs G

∗

such that f (G

∗

) ∈ F

∗

Table 2: Results on Steps 4 and 5 with ε = 0.02.

π y

∗

| IP-time #G

∗

G-time

HA 2700 10 7 0.503 50 (10) 0.0045

HA 2700 11 6 0.799 293 (4) 0.0067

HA 2900 10 7 0.362 8 (8) 0.0042

HA 2900 11 6 0.614 208 (29) 0.0043

HF 70 11 5 0.503 57 (45) 0.0033

HF 70 12 2 0.501 4 (4) 0.0016

HF 90 12 7 1.293 374 (0) 0.0147

HF 90 13 9 1.679 663 (1) 0.0126

BP 190 11 10 2.021 339 (6) 0.0334

BP 190 12 13 3.807 496 (0) 0.1423

BP 220 12 13 4.261 245 (0) 0.1240

BP 220 13 17 5.884 1675 (0) 0.3110

KOW 5 11 6 1.883 94 (4) 0.0047

KOW 5 12 7 4.196 356 (2) 0.0095

KOW 7 14 2 7.058 11 (0) 0.0041

KOW 7 15 9 16.174 975 (0) 0.0216

RT 1600 14 17 15.815 880 (0) 0.5046

RT 1600 15 17 13.933 6152 (0) 1.3607

RT 1900 15 16 14.829 6616 (2) 0.4782

RT 1900 16 18 24.100 19299 (0) 1.6018

Table 3: Results on Steps 4 and 5 for (HA, KOW) with y

∗

2700 and y

∗

Kow

= 5.

∗

| IP-time #G

∗

G-time

10 5 0.817 19 (6) 0.0027

11 3 1.298 25 (0) 0.0020

Table 4: Results on Step 4 and 5 for (HF, BP) with y

∗

= 90

and y

∗

= 220.

∗

| IP-time #G

∗

G-time

12 7 2.194 1056 (0) 0.0130

13 3 3.345 18 (1) 0.0090

From these tables, we observe that, for all chem-

ical properties we tested, each set of the 18 instances

for ﬁnding an acyclic chemical graph with speciﬁed

sizes n

∗

≤ 16, n

∗

(< n

∗

) and dia

∗

(< n

∗

) from a target

∗

was solved within 2 seconds based on our novel

MILP formulation; and that, for each inferred feature

A Novel Method for the Inverse QSAR/QSPR based on Artiﬁcial Neural Networks and Mixed Integer Linear Programming with Guaranteed

Admissibility

107

vector x

∗

, enumeration of all acyclic chemical graphs

∗

inferred from x

∗

was completed in 2 milli sec-

onds to 2 seconds. Note that our MILP outputs a fea-

ture vector for which we enumerated all structurally

unique chemical graphs as deﬁned in Section 2.2, and

hence we were able to generate some chemical graphs

which are not registered in the PubChem database. In

these results, each inferred vector is admissible (i.e.,

has one or more corresponding chemical structures),

whereas only around 20% to 50% of the feature vec-

tors were admissible under a tolerance ε = 0.02 in

the previous method (Chiewvanichakorn et al., 2020).

Note that, for any small tolerance ε > 0, our new

method will infer an admissible vector, if one exists.

6 CONCLUDING REMARKS

In this paper, we proposed an improved method over

the previous method (Chiewvanichakorn et al., 2020)

for the inverse QSAR/QSPR by extending the MILP

so that, for a given target y

∗

, either (i) any feasible

solution to the MILP provides an admissible vector

∗

∈ R

and a chemical graph G = (H,α,β) such

that ψ

∗

) = y

∗

and n

∗

= n(H), and n

∗

= n

(H)

and dia

∗

= dia(H) or (ii) the infeasibility of the MILP

tells us that there exists no such chemical graph G.

Although this paper presented such an MILP for the

class G of acyclic chemical graphs, it is not difﬁcult

to modify the MILP to treat any class of chemical

graphs. Note that our MILP formulation provides a

vector s that forms a chemical graph, which means

that we can ﬁnd an inferred chemical graph in Step 4

without designing an algorithm for Step 5 separately.

REFERENCES

Akutsu, T. and Nagamochi, H. (2019). A mixed integer lin-

ear programming formulation to artiﬁcial neural net-

works. In Proceedings of the 2nd International Con-

ference on Information Science and Systems, pages

215–220.

Chiewvanichakorn, R., Wang, C., Zhang, Z., Shurbevski,

A., Nagamochi, H., and Akutsu, T. (2020). A method

for the inverse QSAR/QSPR based on artiﬁcial neu-

ral networks and mixed integer linear programming.

ICBBB2020 (to appear).

Fujiwara, H., Wang, J., Zhao, L., Nagamochi, H., and

Akutsu, T. (2008). Enumerating treelike chemical

graphs with given path frequency. Journal of Chemi-

cal Information and Modeling, 48(7):1345–1357.

omez-Bombarelli, R., Wei, J. N., Duvenaud, D.,

Hern

andez-Lobato, J. M., S

anchez-Lengeling, B.,

Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D.,

Adams, R. P., and Aspuru-Guzik, A. (2018). Auto-

matic chemical design using a data-driven continuous

representation of molecules. ACS Central Science,

4(2):268–276.

IBM ILOG CPLEX Optimization Studio 12.8. https://www.

ibm.com/support/knowledgecenter/SSSA5P

12.8.0/

ilog.odms.studio.help/pdf/usrcplex.pdf.

Jalali-Heravi, M. and Fatemi, M. H. (2001). Artiﬁcial neu-

ral network modeling of Kovats retention indices for

noncyclic and monocyclic terpenes. Journal of Chro-

matography A, 915(1-2):177–183.

Kusner, M. J., Paige, B., and Hern

andez-Lobato, J. M.

(2017). Grammar variational autoencoder. In Pro-

ceedings of the 34th International Conference on Ma-

chine Learning-Volume 70, pages 1945–1954.

Miyao, T., Kaneko, H., and Funatsu, K. (2016). Inverse

QSPR/QSAR analysis for chemical structure genera-

tion (from y to x). Journal of Chemical Information

and Modeling, 56(2):286–299.

Roy, K. and Saha, A. (2003). Comparative QSPR stud-

ies with molecular connectivity, molecular negentropy

and TAU indices. Journal of Molecular Modeling,

9(4):259–270.

Segler, M. H., Kogej, T., Tyrchan, C., and Waller, M. P.

(2017). Generating focused molecule libraries for

drug discovery with recurrent neural networks. ACS

Central Science, 4(1):120–131.

Skvortsova, M. I., Baskin, I. I., Slovokhotova, O. L., Pa-

lyulin, V. A., and Zeﬁrov, N. S. (1993). Inverse prob-

lem in QSPR/QSAR studies for the case of topological

indices characterizing molecular shape (Kier indices).

Journal of Chemical Information and Computer Sci-

ences, 33(4):630–634.

Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., and

Tsuda, K. (2017). ChemTS: an efﬁcient python li-

brary for de novo molecular generation. Science and

Technology of Advanced Materials, 18(1):972–976.

BIOINFORMATICS 2020 - 11th International Conference on Bioinformatics Models, Methods and Algorithms

108