Preprocessing Graphs for Network Inference Applications
H. R. Sachin Prabhu and Hua-Liang Wei
Department of Automatic Control & Systems Engineering, The University of Sheffield,
Mappin Street, S1 3JD, Sheffield, U.K.
Keywords:
Bipartite Graphs, Reduced Graphs, Ordered Matching, Rank, Subnetworks.
Abstract:
The problem of network inference can be solved as a constrained matrix factorization problem where some
sparsity constraints are imposed on one of the matrix factors. The solution is unique up to a scaling factor
when certain rank conditions are imposed on both the matrix factors. Two key issues in factorising a matrix of
data from a network are establishing simple identifiability conditions and decomposing the network
into identifiable subnetworks. This paper solves both problems by introducing the notion of an ordered
matching in a bipartite graph. Novel and simple graph theoretical conditions are developed which can replace
the aforementioned computationally intensive rank conditions. A simple algorithm to reduce a bipartite graph
and a graph preprocessing algorithm to decompose a network into a set of identifiable subsystems are proposed.
1 INTRODUCTION
The problem of network inference arises when the
regulatory pattern of a network is known and its
outputs are measured, whereas the inputs that drive
the network and the regulatory strengths are unknown.
A regulatory pattern indicates causal relationships
between inputs and outputs of a network. In terms of
steady-state analysis of systems, input-output
relationships can be represented as a system of linear
equations whose coefficients represent steady-state
gains. The challenge is to simultaneously estimate
the regulatory strengths and input activities. In other
words, a data matrix is to be factorised into a product
of two matrices, a regulatory matrix and an input
matrix, such that the error in data reconstruction
is minimised. Such problems are common in studies
on social networks and biological regulatory networks
(Newman, 2003; Brugere et al., 2016).
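The factorisation and its scaling ambiguity can be illustrated with a small numerical sketch. All matrices and values below are hypothetical and chosen only for illustration: `A` plays the role of the regulatory matrix (its zeros encode the known regulatory pattern), `P` the input matrix, and `E` the noise-free data matrix. Rescaling each input by a non-zero factor changes both factors but leaves the data and the zero pattern unchanged, which is why uniqueness can only be expected up to a scaling factor.

```python
import numpy as np

# Hypothetical regulatory matrix: 4 outputs (rows), 2 inputs (columns).
# A zero entry means "this input does not regulate this output".
A = np.array([[1.5, 0.0],
              [0.0, 2.0],
              [0.7, 0.0],
              [0.0, -1.2]])

# Hypothetical input activities: 2 inputs observed in 3 experiments.
P = np.array([[1.0, 0.5, 2.0],
              [0.3, 1.0, 0.8]])

# Noise-free data matrix (outputs x experiments).
E = A @ P

# Rescale every input by a non-zero factor: A -> A D, P -> D^{-1} P.
D = np.diag([2.0, -0.5])
A_scaled = A @ D
P_scaled = np.linalg.inv(D) @ P

# The rescaled factors reproduce the same data and the same zero pattern.
same_data = np.allclose(A_scaled @ P_scaled, E)
same_pattern = np.array_equal(A_scaled != 0, A != 0)
print(same_data, same_pattern)
```

Because `D` is diagonal, it only rescales columns of `A` and rows of `P`, so no sparsity constraint is violated; a general invertible matrix in place of `D` would typically destroy the zero pattern.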
It is hard to simultaneously estimate the matrix
factors using conventional techniques such as Princi-
pal Component Analysis or Singular Value Decom-
position as the structure of the regulatory matrix is
constrained (Liao et al., 2003). Network Compo-
nent Analysis (NCA) (Liao et al., 2003) solves the
network inference problem as a bilevel optimisation
problem while taking these constraints into
consideration. NCA imposes several rank conditions on
the regulatory matrix and the input matrix to ensure
that the estimates obtained via optimisation are
unique up to a scaling factor. A regulatory network must
satisfy all relevant NCA rank conditions imposed on
it whereas the input matrix can only be assumed to
satisfy the conditions imposed on it.
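A brute-force check of the rank conditions on the regulatory matrix might look like the sketch below. It assumes the conditions as commonly stated for NCA in (Liao et al., 2003): the regulatory matrix has full column rank, and after removing any one input together with all outputs it regulates, the remaining matrix still has full column rank. The function name, the example patterns, and the use of random non-zero values to probe the rank of a pattern are assumptions of this sketch, not part of the original method.

```python
import numpy as np

def _rank(M):
    # Treat an empty matrix as rank 0 to avoid calling the SVD on it.
    return 0 if M.size == 0 else int(np.linalg.matrix_rank(M))

def satisfies_pattern_rank_conditions(pattern, seed=0):
    """Check the assumed NCA rank conditions for a 0/1 regulatory pattern.

    Rows are outputs, columns are inputs. Non-zero entries are filled with
    random values so that the numerical rank reflects the pattern itself
    (an assumption that holds with probability one).
    """
    pattern = np.asarray(pattern, dtype=bool)
    n_inputs = pattern.shape[1]
    rng = np.random.default_rng(seed)
    A = np.where(pattern, rng.uniform(1.0, 2.0, size=pattern.shape), 0.0)

    # Condition 1: full column rank.
    if _rank(A) < n_inputs:
        return False

    # Condition 2: drop input j and every output it regulates;
    # the remaining matrix must have rank n_inputs - 1.
    for j in range(n_inputs):
        keep_rows = ~pattern[:, j]
        keep_cols = np.arange(n_inputs) != j
        reduced = A[keep_rows][:, keep_cols]
        if _rank(reduced) < n_inputs - 1:
            return False
    return True

# Hypothetical patterns: in the second one, input 0 regulates every output.
good = [[1, 0], [0, 1], [1, 0], [0, 1]]
bad = [[1, 1], [1, 1], [1, 0]]
print(satisfies_pattern_rank_conditions(good))
print(satisfies_pattern_rank_conditions(bad))
```

The loop performs one rank computation per input, which hints at why rank-based testing becomes expensive for large networks and why graph theoretical alternatives are attractive.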
The patterns found in complex real world net-
works are not random (Newman, 2003). Therefore,
there is a need to characterise such networks when-
ever possible. Regulatory networks can be formally
described using graphs and the benefits of doing so
are multifold. Parameter estimation is relatively eas-
ier as the solution space is well defined. Graph the-
oretical descriptions are more comprehensible to a
layman than matrix rank conditions. In addition to
that, subnetworks can be identified in cases where
parameter estimation for the original network is
unsolvable. These advantages motivated the development
of graph theoretical interpretations of regulatory
networks in (Boscolo et al., 2005) and (Fritzilas et al.,
2013). In the context of this paper, identifying
subnetworks refers to searching for a part of the
regulatory pattern that allows application of NCA. It
should not be mistaken for parameter estimation.
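As a minimal sketch of how a regulatory pattern can be read as a bipartite graph, the code below treats inputs and outputs as the two vertex sets and each non-zero entry as an edge, then computes the size of a maximum matching with a standard augmenting-path search. For a matrix whose non-zero entries are generic, this size equals the rank of the matrix, so a matching that covers every input indicates full column rank. Note that this is ordinary maximum matching, not the ordered matching introduced in this paper; the function name and example patterns are hypothetical.

```python
def maximum_matching_size(pattern):
    """Size of a maximum matching between inputs (columns) and outputs (rows)."""
    n_outputs = len(pattern)
    n_inputs = len(pattern[0]) if n_outputs else 0

    # Adjacency list of the bipartite graph: outputs regulated by each input.
    neighbours = [[i for i in range(n_outputs) if pattern[i][j]]
                  for j in range(n_inputs)]

    # matched_input[i] is the input currently matched to output i, if any.
    matched_input = [None] * n_outputs

    def try_augment(j, visited):
        # Look for a free output for input j, re-routing earlier matches
        # along an augmenting path when necessary.
        for i in neighbours[j]:
            if i in visited:
                continue
            visited.add(i)
            if matched_input[i] is None or try_augment(matched_input[i], visited):
                matched_input[i] = j
                return True
        return False

    return sum(try_augment(j, set()) for j in range(n_inputs))

# Hypothetical patterns (rows are outputs, columns are inputs).
separable = [[1, 0], [0, 1], [1, 0], [0, 1]]
shared_only = [[1, 1], [0, 0], [0, 0]]  # both inputs reach only output 0
print(maximum_matching_size(separable))
print(maximum_matching_size(shared_only))
```

In the second pattern the two inputs compete for a single output, so no matching covers both inputs and the corresponding matrix cannot have full column rank for any choice of non-zero values.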
Graph theoretical conditions based on analysing the
structure of the regulatory matrix are developed in
(Boscolo et al., 2005). These conditions are
comprehensible and offer a simple way to test the
NCA compatibility of relatively small networks by
inspection. However, a more formal description is
possible. More importantly, the proposed limit on the
number of outputs that an input can regulate is
inaccurate. The maximal matching property of a graph is
used in (Fritzilas et al., 2013) to obtain formal and
computationally simpler conditions to test a network
for its NCA compatibility. Though the matching condition
Prabhu, H. and Wei, H-L.
Preprocessing Graphs for Network Inference Applications.
DOI: 10.5220/0006401104060413
In Proceedings of the 14th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2017) - Volume 1, pages 406-413
ISBN: 978-989-758-263-9
Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved