vector $x^*$, enumeration of all acyclic chemical graphs $G^*$ inferred from $x^*$ was completed in 2 milliseconds to 2 seconds. Note that our MILP outputs a feature vector for which we enumerated all structurally unique chemical graphs as defined in Section 2.2, and hence we were able to generate some chemical graphs which are not registered in the PubChem database. In these results, each inferred vector is admissible (i.e., has one or more corresponding chemical structures), whereas only around 20% to 50% of the feature vectors were admissible under a tolerance $\varepsilon = 0.02$ in the previous method (Chiewvanichakorn et al., 2020). Note that, for any small tolerance $\varepsilon > 0$, our new method will infer an admissible vector, if one exists.
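To make the notion of admissibility concrete, the following minimal sketch checks it for a toy descriptor that is not the feature vector used in this paper: a count of vertices per degree in an acyclic graph. It relies on the standard fact that positive integers $d_1, \dots, d_n$ with $n \ge 2$ are the degrees of some tree exactly when they sum to $2(n-1)$. The function name and the valence bound of 4 are assumptions made for illustration only.

```python
from typing import Dict


def is_admissible_tree_degree_vector(deg_count: Dict[int, int],
                                     max_valence: int = 4) -> bool:
    """Return True if some tree has exactly deg_count[d] vertices of degree d.

    Toy stand-in for admissibility: the vector is admissible when at least
    one acyclic structure realizes it. max_valence is a hypothetical bound.
    """
    # Counts must be non-negative.
    if any(c < 0 for c in deg_count.values()):
        return False
    # Every degree that is actually used must lie in 1..max_valence.
    if any(d < 1 or d > max_valence for d, c in deg_count.items() if c > 0):
        return False
    n = sum(deg_count.values())
    if n < 2:
        return False  # the single-vertex case is outside this sketch
    # A tree on n vertices has n - 1 edges, so its degrees sum to 2(n - 1);
    # conversely, any such positive sequence is realized by some tree.
    degree_sum = sum(d * c for d, c in deg_count.items())
    return degree_sum == 2 * (n - 1)


if __name__ == "__main__":
    print(is_admissible_tree_degree_vector({1: 4, 4: 1}))  # star on 5 vertices
    print(is_admissible_tree_degree_vector({1: 3, 2: 1}))  # degree sum is odd
```

A vector that fails this test has no corresponding tree at all, which is the toy analogue of an inadmissible feature vector.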
6 CONCLUDING REMARKS
In this paper, we proposed a method for the inverse QSAR/QSPR that improves on the previous method (Chiewvanichakorn et al., 2020) by extending the MILP so that, for a given target $y^*$, either (i) any feasible solution to the MILP provides an admissible vector $x^* \in \mathbb{R}^K$ and a chemical graph $G = (H, \alpha, \beta)$ such that $\psi_N(x^*) = y^*$, $n^* = n(H)$, $n_1^* = n_1(H)$, and $\mathrm{dia}^* = \mathrm{dia}(H)$, or (ii) the infeasibility of the MILP tells us that there exists no such chemical graph $G$.
Although this paper presented such an MILP for the
class G of acyclic chemical graphs, it is not difficult
to modify the MILP to treat any class of chemical
graphs. Note that our MILP formulation provides a
vector s that forms a chemical graph, which means
that we can find an inferred chemical graph in Step 4
without designing an algorithm for Step 5 separately.
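As an illustration of two of the graph quantities that a feasible solution must match, the sketch below computes the number of vertices and the diameter of an acyclic graph given as an adjacency list. It assumes that $n(H)$ denotes the number of vertices of $H$ and that $\mathrm{dia}(H)$ is the number of edges on a longest path; the function names and the adjacency-list encoding are hypothetical and are not part of the MILP formulation. The diameter is found with the usual two-pass breadth-first search, which is valid for trees.

```python
from collections import deque
from typing import Dict, List, Tuple

Adjacency = Dict[int, List[int]]


def farthest_vertex(adj: Adjacency, start: int) -> Tuple[int, int]:
    """Breadth-first search; return a farthest vertex from start and its distance."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        u = queue.popleft()
        last = u  # BFS dequeues vertices in non-decreasing distance order
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return last, dist[last]


def tree_size_and_diameter(adj: Adjacency) -> Tuple[int, int]:
    """Return (number of vertices, diameter in edges) of a tree."""
    n = len(adj)
    any_vertex = next(iter(adj))
    # Pass 1: a farthest vertex from any start is an endpoint of a longest path.
    endpoint, _ = farthest_vertex(adj, any_vertex)
    # Pass 2: the farthest distance from that endpoint is the diameter.
    _, diameter = farthest_vertex(adj, endpoint)
    return n, diameter


if __name__ == "__main__":
    # Carbon skeleton of 2-methylbutane: chain 0-1-2-3 with a branch 1-4.
    skeleton = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
    print(tree_size_and_diameter(skeleton))
```

Checking an inferred graph against target values in this way is only a post hoc sanity check; in the formulation above these equalities are enforced by the MILP constraints themselves.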
REFERENCES
Akutsu, T. and Nagamochi, H. (2019). A mixed integer linear programming formulation to artificial neural networks. In Proceedings of the 2nd International Conference on Information Science and Systems, pages 215–220.
Chiewvanichakorn, R., Wang, C., Zhang, Z., Shurbevski, A., Nagamochi, H., and Akutsu, T. (2020). A method for the inverse QSAR/QSPR based on artificial neural networks and mixed integer linear programming. ICBBB2020 (to appear).
Fujiwara, H., Wang, J., Zhao, L., Nagamochi, H., and Akutsu, T. (2008). Enumerating treelike chemical graphs with given path frequency. Journal of Chemical Information and Modeling, 48(7):1345–1357.
Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez-Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., and Aspuru-Guzik, A. (2018). Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 4(2):268–276.
IBM ILOG CPLEX Optimization Studio 12.8. https://www.ibm.com/support/knowledgecenter/SSSA5P_12.8.0/ilog.odms.studio.help/pdf/usrcplex.pdf.
Jalali-Heravi, M. and Fatemi, M. H. (2001). Artificial neural network modeling of Kovats retention indices for noncyclic and monocyclic terpenes. Journal of Chromatography A, 915(1-2):177–183.
Kusner, M. J., Paige, B., and Hernández-Lobato, J. M. (2017). Grammar variational autoencoder. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1945–1954.
Miyao, T., Kaneko, H., and Funatsu, K. (2016). Inverse QSPR/QSAR analysis for chemical structure generation (from y to x). Journal of Chemical Information and Modeling, 56(2):286–299.
Roy, K. and Saha, A. (2003). Comparative QSPR studies with molecular connectivity, molecular negentropy and TAU indices. Journal of Molecular Modeling, 9(4):259–270.
Segler, M. H., Kogej, T., Tyrchan, C., and Waller, M. P.
(2017). Generating focused molecule libraries for
drug discovery with recurrent neural networks. ACS
Central Science, 4(1):120–131.
Skvortsova, M. I., Baskin, I. I., Slovokhotova, O. L., Palyulin, V. A., and Zefirov, N. S. (1993). Inverse problem in QSPR/QSAR studies for the case of topological indices characterizing molecular shape (Kier indices). Journal of Chemical Information and Computer Sciences, 33(4):630–634.
Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., and Tsuda, K. (2017). ChemTS: an efficient python library for de novo molecular generation. Science and Technology of Advanced Materials, 18(1):972–976.
BIOINFORMATICS 2020 - 11th International Conference on Bioinformatics Models, Methods and Algorithms