Preprocessing Graphs for Network Inference Applications

H. R. Sachin Prabhu and Hua-Liang Wei

Department of Automatic Control & Systems Engineering, The University of Sheffield, Mappin Street, S1 3JD, Sheffield, U.K.

Keywords:

Bipartite Graphs, Reduced Graphs, Ordered Matching, Rank, Subnetworks.

Abstract:

The problem of network inference can be solved as a constrained matrix factorization problem where some sparsity constraints are imposed on one of the matrix factors. The solution is unique up to a scaling factor when certain rank conditions are imposed on both the matrix factors. Two key issues in factorising a matrix of data from some network are that of establishing simple identifiability conditions and decomposing a network into identifiable subnetworks. This paper solves both problems by introducing the notion of an ordered matching in a bipartite graph. Novel and simple graph theoretical conditions are developed which can replace the aforementioned computationally intensive rank conditions. A simple algorithm to reduce a bipartite graph and a graph preprocessing algorithm to decompose a network into a set of identifiable subsystems are proposed.

1 INTRODUCTION

The problem of network inference arises when the regulatory pattern of a network is known and its outputs are measured, whereas the inputs that drive the network and the regulatory strengths are unknown. A regulatory pattern indicates causal relationships between inputs and outputs of a network. In terms of steady-state analysis of systems, input-output relationships can be represented as a system of linear equations whose coefficients represent steady-state gains. The challenge is to simultaneously estimate the regulatory strengths and input activities. In other words, a data matrix is to be factorised into a product of two matrices, a regulatory matrix and an input matrix, such that the error in data reconstruction is minimised. Such problems are common in studies on social networks and biological regulatory networks (Newman, 2003; Brugere et al., 2016).

It is hard to simultaneously estimate the matrix factors using conventional techniques such as Principal Component Analysis or Singular Value Decomposition as the structure of the regulatory matrix is constrained (Liao et al., 2003). Network Component Analysis (NCA) (Liao et al., 2003) solves the network inference problem as a bilevel optimisation problem while taking these constraints into consideration. NCA imposes several rank conditions on the regulatory matrix and input matrix in order to ensure uniqueness, to a certain degree, of the estimates obtained via optimisation. A regulatory network must satisfy all relevant NCA rank conditions imposed on it, whereas the input matrix can only be assumed to satisfy the conditions imposed on it.

The patterns found in complex real world networks are not random (Newman, 2003). Therefore, there is a need to characterise such networks whenever possible. Regulatory networks can be formally described using graphs and the benefits of doing so are manifold. Parameter estimation is relatively easier as the solution space is well defined. Graph theoretical descriptions are more comprehensible to a layman than matrix rank conditions. In addition, subnetworks can be identified in cases where parameter estimation for the original network is unsolvable. These advantages motivated the development of graph theoretical interpretations of regulatory networks in (Boscolo et al., 2005) and (Fritzilas et al., 2013). In the context of this paper, identifying subnetworks refers to searching for a part of the regulatory pattern that allows application of NCA. It should not be mistaken for parameter estimation.

Graph theoretical conditions based on analysing the structure of the regulatory matrix are developed in (Boscolo et al., 2005). These conditions are comprehensible and offer a simple way to test NCA compatibility of relatively small networks by inspection. However, a more formal description is possible. More importantly, the proposed limit on the number of outputs that an input can regulate is inaccurate. The maximal matching property of a graph is used in (Fritzilas et al., 2013) to obtain formal and computationally simpler conditions to test a network for its NCA compatibility. Though the matching condition can greatly reduce the computational burden, there are cases where some networks can be erroneously classified as NCA compatible. Furthermore, both approaches present algorithms to identify the largest NCA compatible subnetwork, but they ignore the rest of the original network.

Prabhu, H. and Wei, H-L.

Preprocessing Graphs for Network Inference Applications.

DOI: 10.5220/0006401104060413

In Proceedings of the 14th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2017) - Volume 1, pages 406-413

ISBN: 978-989-758-263-9

Two key problems addressed in this paper are that of deriving an accurate and comprehensible formal description of NCA compatible regulatory networks, and that of decomposing NCA incompatible networks into a set of NCA compatible subnetworks without ignoring any part of the original network. In this paper, a new graph property called ordered matching is introduced. A formal description of NCA compatible networks based on finding a maximal ordered matching in a reduced graph is developed. A new algorithm to reduce a graph and a method to decompose a given network into a set of NCA compatible subnetworks are proposed.

The paper is organised as follows. Section 2 presents a formal description of the regulatory network inference problem followed by graph theoretical interpretations of NCA found in the literature. Shortcomings of those interpretations of NCA are demonstrated in section 3 with the help of counter-examples. Novel NCA conditions are developed thereafter. Algorithms to reduce a graph, test a network for NCA compatibility, and identify all NCA compatible subnetworks in a given network are presented in the latter part of section 3. The results of applying these algorithms to an example network are presented in section 4.

2 PRELIMINARIES

2.1 Regulatory Network Inference

Consider a regulatory network with $M$ outputs and $L$ inputs, $L \le M$. Let $D \in \mathbb{R}^{M \times N}$ be the output or data matrix and $S \in \mathbb{R}^{L \times N}$ be the input or source matrix. $D_{ik}$ and $S_{jk}$, $1 \le k \le N$, respectively denote the $k$th sample of the $i$th output and the $j$th input. At steady-state, the system is represented by a set of linear equations given by

$$D = WS \tag{1}$$

Let $Y = \{v_i, 1 \le i \le M\}$ be the set of vertices corresponding to the outputs, $X = \{u_j, 1 \le j \le L\}$ be the set of vertices corresponding to the inputs, and $E = \{(u_j, v_i) : u_j \in X \text{ regulates } v_i \in Y\}$ be a set of edges. Let $W \in \mathbb{R}^{M \times L}$ encode all the edge strengths. The system in (1) is a bipartite graph $G = \{X \cup Y, E, W\}$ with $X$ and $Y$ as its bipartitions. The regulatory pattern of the system is given by the adjacency matrix of $G$, $B(G) \in [0, 1]^{M \times L}$, defined as

$$B(G)_{ij} = \begin{cases} 1, & \text{if } (u_j, v_i) \in E \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

A mapping $\phi : \mathbb{R}^{M \times L} \mapsto [0, 1]^{M \times L}$ can be defined such that

$$\phi(W) = B(G)$$

The problem of regulatory network inference is such that $B(G)$ and $D$ are known, whereas $W$ and $S$ are to be estimated simultaneously. The regulatory network inference problem is posed as an optimisation problem below:

$$\min_{W,S} \; J(W, S) = \|D - WS\|_F \quad \text{subject to } \phi(W) = B(G) \tag{3}$$

where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix.

Note that the objective is to be minimised simultaneously over $W$ and $S$. Thus, this cannot be solved as a standard quadratic program in one optimisation variable. Matrix factorization approaches such as Principal Component Analysis and Singular Value Decomposition are not applicable here because of the constraint in (3). Network Component Analysis (NCA) (Liao et al., 2003) proposes a bi-level optimization based approach to factorize matrix $D$ into a product of two matrices $W$ and $S$.
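To make the structure of problem (3) concrete, the following is a minimal sketch of one way to alternate between the two factors while keeping the zero pattern of $W$ fixed. It is a generic alternating least-squares update written for illustration, not the NCA iteration of (Liao et al., 2003); the function names, the small pattern `B` and the synthetic data are all assumptions.

```python
import numpy as np

def objective(D, W, S):
    """J(W, S) in (3): Frobenius norm of the reconstruction error."""
    return np.linalg.norm(D - W @ S, ord="fro")

def alternating_step(D, W, B):
    """One alternating least-squares pass that respects the zero pattern B.

    Assumption: a generic ALS update, not the exact NCA iteration.
    """
    # Best S for the current W (ordinary least squares).
    S, *_ = np.linalg.lstsq(W, D, rcond=None)
    # Best W for that S, one output (row) at a time, using only allowed inputs.
    W_new = np.zeros_like(W)
    for i in range(B.shape[0]):
        support = np.flatnonzero(B[i])
        if support.size:
            coef, *_ = np.linalg.lstsq(S[support].T, D[i], rcond=None)
            W_new[i, support] = coef
    return W_new, S

rng = np.random.default_rng(0)
B = np.array([[1, 0], [1, 1], [0, 1], [1, 1]])    # hypothetical 4x2 pattern
W_true = B * rng.uniform(0.5, 2.0, size=B.shape)  # synthetic ground truth
S_true = rng.normal(size=(2, 30))
D = W_true @ S_true

W = B * rng.uniform(0.5, 2.0, size=B.shape)       # feasible starting point
history = []
for _ in range(20):
    W, S = alternating_step(D, W, B)
    history.append(objective(D, W, S))
```

Each half-step minimises the objective over one factor with the other held fixed, so the recorded values in `history` cannot increase, and entries of `W` outside the pattern stay at zero. Whether the factors recovered this way are unique is exactly what the rank conditions discussed next are meant to settle.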

The following set of conditions that $W$ and $S$ must satisfy for NCA to be applicable is established in (Liao et al., 2003):

1. $W$ has full column rank, i.e., $\operatorname{rank}(W) = L$

2. Reduced submatrices $W_j$, $\forall u_j \in X$, have full column rank, i.e., $\operatorname{rank}(W_j) = L - 1$

3. $S$ has full row rank, i.e., $\operatorname{rank}(S) = L$

where the reduced submatrix $W_j$ of $W$ is obtained by deleting from $W$ the $j$th column and every $i$th row such that $\phi(W)_{ij} = 1$. As $W$ is unknown a priori, it is to be replaced by $B(G)$, and $W_j$ by $B(G)_j = B(W_j)$. Consider a system with $M = 10$, $L = 5$, and $W$ such that


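$\phi(W) = B(G)$, where $B(G)$ is given by

$$B(G) = \begin{bmatrix}
1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 1
\end{bmatrix} \tag{4}$$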

$B(G)_j$ is obtained by deleting rows and columns as shown in Fig. 1.

Figure 1: Deleting rows and columns to obtain $B(H_j(G))$ (the matrix in (4) with the deleted rows and column struck out).

It can be easily verified that $\operatorname{rank}(B(G)) = L$. However, any regulatory network with $\phi(W) = B(G)$ as described in (4) will not be NCA compatible as the submatrices $B(G)_j$ are such that

$$\operatorname{rank}(B(G)_j) \begin{cases} = L - 1, & j = 1, 2, 4 \\ < L - 1, & j = 3, 5 \end{cases}$$

Testing for NCA compatibility is computationally intensive as it involves checking ranks of $L + 1$ matrices for a system with $L$ inputs. Graph theoretical interpretation of these conditions can reduce the involved complexity to a great extent as explained in the next section.
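The rank test described above can be sketched directly with NumPy. The snippet below assumes the rows of $B(G)$ exactly as they are listed in (4); the helper name `reduced_submatrix` and the 0-based column index are illustrative choices, not notation from the paper.

```python
import numpy as np

# Rows of B(G) as listed in (4).
B = np.array([
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
])

def reduced_submatrix(B, j):
    """Delete column j (0-based) and every row that has a 1 in column j."""
    keep_rows = B[:, j] == 0
    return np.delete(B[keep_rows], j, axis=1)

L = B.shape[1]
full_rank = int(np.linalg.matrix_rank(B))
reduced_ranks = [int(np.linalg.matrix_rank(reduced_submatrix(B, j)))
                 for j in range(L)]

print(full_rank)      # 5
print(reduced_ranks)  # [4, 4, 3, 4, 2]
```

The output agrees with the statement above: the reduced submatrices for $j = 1, 2, 4$ have rank $L - 1 = 4$, while those for $j = 3, 5$ fall short. The test needs $L + 1$ rank computations, which is the cost the graph theoretical conditions aim to avoid.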

2.2 Graph Theoretical Interpretations of NCA

Let $N(u_j) = \{v_i \in Y \mid (u_j, v_i) \in E\}$ be the set of neighbours of some vertex $u_j \in X$. Trivially, $N(u_j) = \{v_i \in Y \mid B(G)_{ij} = 1\}$. Given a graph $G$, $G - V$ denotes an induced subgraph obtained by deleting the set of vertices $V$ and all associated edges from $G$. A graph theoretical interpretation of the submatrix $W_j$ of $W$, referred to as the reduced matrix in NCA literature, is defined as follows.

Definition 1. An induced subgraph in the sense of NCA, $H_j(G)$, of $G$ is defined as $H_j(G) = G - \{u_j, N(u_j)\}$.

Definition 2. Let $W \in \mathbb{R}^{M \times L}$ be such that $\phi(W) = B(G)$. An induced submatrix $W_j$ is obtained by deleting rows and columns of $W$ such that $\phi(W_j) = B(H_j(G))$.

$B(H_j(G))$ is nothing but the submatrix $B(G)_j$ described in section 2.1.

Various constraints are imposed on NCA compatible networks in (Boscolo et al., 2005) that are more comprehensible compared to the original NCA rank conditions. However, no computationally simple algorithm is given to implement those. Moreover, the condition imposed on the degree of input vertices, $d(u_j) = |N(u_j)| \le L - 1$, is inaccurate as demonstrated in the next section. The interpretation in definition 2 is implicitly introduced in (Fritzilas et al., 2013) where it is argued that NCA conditions are equivalent to finding a maximal matching of size $L - 1$ in $H_j(G)$.

Definition 3. A matching in a bipartite graph $G = \{X \cup Y, E, W\}$ is a pair $(X', Y')$, $X' \subseteq X$, $Y' \subseteq Y$, such that $(u_j, v_i) \in E$, $|N(u_j) \cap Y'| = 1$, and $|N(v_i) \cap X'| = 1$, $\forall u_j \in X', v_i \in Y'$.

In other words, every vertex in a matching has a unique neighbour and $|X'| = |Y'|$ is the size of the matching $(X', Y')$. A maximal matching $(X^*, Y^*)$ is such that $|X^*| \ge |X'|$, where $(X', Y')$ is any other matching in $G$. Let $G^*$ denote a subgraph defined by a maximal matching $(X^*, Y^*)$. Its adjacency matrix $B(G^*)$ will have standard basis vectors $e_i$, i.e., vectors with 1 in the $i$th position and the rest of the entries being zero. Thus

$$\operatorname{rank}(B(G^*)) = |X^*|$$

This fact forms the basis for the argument that matching conditions are equivalent to the original NCA conditions.

This interpretation of NCA reduces the problem of computing the rank of an induced submatrix to a computationally less intensive problem of finding a maximal matching in a bipartite graph. However, the case of duplicate vertices is not considered, which may lead to inaccurate descriptions of NCA compatible networks. Merely finding a matching in $G$ will not suffice as demonstrated in the next section.

3 GRAPH PREPROCESSING FOR NCA

The problem of identifying NCA compatible subnetworks arises when a given regulatory network does not satisfy the original NCA conditions. This problem is addressed in (Boscolo et al., 2005) and (Fritzilas et al., 2013), which first develop graph theoretical descriptions of compatible regulatory networks and then propose methods to identify the largest NCA compatible subnetwork. However, the sets of conditions claimed to be equivalent to the original NCA conditions are inaccurate, as shown with the help of counter-examples in this section.

Consider testing $B(H_1(G))$ given by

$$B(H_1(G)) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1
\end{bmatrix} \tag{5}$$

against the conditions in (Fritzilas et al., 2013). A pair $(X^*, Y^*)$, $X^* \subset X$, $Y^* \subset Y$, defines a maximal matching of size $L - 1$. $B(H^*)$ defined by this matching is shown below.

$$B(H^*) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1
\end{bmatrix} \tag{6}$$

Note that $\operatorname{rank} B(H^*) = 3 < L - 1$ as the rows corresponding to two of its output vertices are identical. Therefore, it is possible to find a case where an induced submatrix $B(H_j(G))$ for some $u_j$ has rank smaller than $L - 1$, but contains a maximal matching of size $L - 1$. This observation is summarised as follows.

Remark 1. Duplicate vertices can lead to wrong classification of a network as NCA compatible.

Consider the following submatrix of $B(H_1(G))$, whose columns correspond to $u_2$, $u_3$, $u_4$ and $u_5$:

$$\begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1
\end{bmatrix}$$

Its three rows together with the columns $\{u_2, u_3, u_4\}$ define a matching of size 3. Though $\operatorname{rank}(B) = 3$, this matching does not represent a set of linearly independent columns: column $u_4$ is linearly dependent on columns $u_2$ and $u_3$, since $u_4 = u_2 + u_3$.

Remark 2. A maximal matching may contain linearly dependent vertices.

Both remarks 1 and 2 show that a maximal matching does not necessarily correlate to the rank of a graph as argued in (Fritzilas et al., 2013).
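The gap between matching size and rank can be checked numerically. The sketch below uses a textbook augmenting-path routine (Kuhn's algorithm) to compute the size of a maximum matching and compares it with the matrix rank. The first input is the first three columns of the three-row submatrix above; the second is a hypothetical 2x2 all-ones pattern added here to show the duplicate-vertex case in its smallest form.

```python
import numpy as np

def max_matching_size(B):
    """Size of a maximum row-column matching of a 0/1 matrix
    (Kuhn's augmenting-path algorithm)."""
    n_rows, n_cols = B.shape
    match_col = [-1] * n_cols  # row currently matched to each column

    def try_row(i, seen):
        for j in range(n_cols):
            if B[i, j] and not seen[j]:
                seen[j] = True
                # Column j is free, or its current row can be re-matched.
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(try_row(i, [False] * n_cols) for i in range(n_rows))

sub = np.array([[0, 1, 1, 0],
                [0, 1, 1, 1],
                [1, 0, 1, 1]])
first_three = sub[:, :3]          # three columns, third = first + second
dup = np.array([[1, 1],
                [1, 1]])          # hypothetical pattern with duplicate vertices

print(max_matching_size(first_three), np.linalg.matrix_rank(first_three))  # 3 2
print(max_matching_size(dup), np.linalg.matrix_rank(dup))                  # 2 1
```

In both cases a matching that covers every column exists even though the columns are linearly dependent, which is the situation remarks 1 and 2 describe.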

It is evident that for $B(G)$ in (4), $d(u_j) > L - 1$, $\forall u_j \in X$. It can be verified that a subnetwork formed by all rows of $B(G)$ and columns $u_1$, $u_2$, and $u_4$ is NCA compatible. This contradicts condition 2 of Lemma 1 in (Boscolo et al., 2005) (page 292) which limits $d(u_j)$ to $L - 1$ for every $u_j \in X$.

Remark 3. $d(u_j)$, $u_j \in X$, can be greater than $L - 1$.

Novel conditions that resolve the issues pointed out in remarks 1, 2 and 3 are presented in the next section.

3.1 Novel NCA Compatibility Conditions

Identifying a maximal matching $(X^*, Y^*)$ is not necessarily equivalent to calculating the rank of the original graph $G$,

$$\operatorname{rank}(B(G)) \neq |X^*|$$

as demonstrated in the previous section. The issue in remark 1 arises because one output vertex is a duplicate of another, in the sense that both have the same set of neighbours, $N(v_a) = N(v_b)$ for some $a \neq b$. Another maximal matching $(X^*, Y^*)$ of $B(H_1(G))$ can be found such that $B(H^*(G))$, defined by $X^*$ and $Y^*$, contains all rows in (5) except the second row. It can be verified that the rank of this matrix is $L - 1 = 4$, which is equal to $|X^*|$. This demonstrates the need to reduce a graph before finding a maximal matching. A reduced bipartite graph is defined as follows.

Definition 4. A reduced bipartite graph $\tilde{G} = \{\tilde{X} \cup \tilde{Y}, \tilde{E}, \tilde{W}\}$ is a subgraph of $G$ such that

1. $\tilde{X} \subseteq X$ and $\tilde{Y} \subseteq Y$

2. there are no isolated vertices, $d(v) > 0$, $\forall v \in \tilde{X} \cup \tilde{Y}$

3. there are no duplicate vertices, $N(v_i) \neq N(v_j)$, $\forall v_i, v_j \in \tilde{X} \cup \tilde{Y}$, $i \neq j$

and $\tilde{E} \subseteq E$ and $\tilde{W} \subseteq W$ are submatrices whose rows and columns are indexed by $\tilde{X}$ and $\tilde{Y}$.

In contrast, a matching on a different set of vertices in (6) represents linearly independent columns of $B(H^*)$. This demonstrates the need to rearrange the rows and columns of the adjacency matrix of a bipartite graph before looking for a maximal matching. It will be shown in this section that imposing a partial order based on degrees of vertices is helpful in simplifying NCA compatibility conditions.

Definition 5. An ordered matching is a matching $(\hat{X}, \hat{Y})$ obtained from partially ordered sets $D(X)$ and $D(Y)$, where $D(.)$ sorts a set of vertices in increasing order of degrees of its elements.

The act of imposing such an order is referred to as ordering a graph in the context of this paper. Let $D(G)$ denote an ordered graph. Ordering and reduction operations are independent of each other and hence can be executed in any order. The relationship between graph reduction and ordering, and the rank of a graph can be derived as shown below.

Lemma 1. The rank of a bipartite graph $G = \{X \cup Y, E, W\}$ such that $|X| \le |Y|$ can be determined as

$$\operatorname{rank}(G) = |\hat{X}^*(G)|$$

where $(\hat{X}^*(G), \hat{Y}^*(G))$ define a maximal matching in the ordered reduced bipartite graph $D(\tilde{G})$.

Proof. The rank of a graph is equal to the rank of the reduced graph $\tilde{G}$ (Li et al., 2012). Defining a partial order $D(.)$ on $\tilde{G}$ does not affect its rank. It is sufficient to show that $(\hat{X}^*, \hat{Y}^*)$, a maximal matching in the ordered reduced graph $D(\tilde{G})$, corresponds to a set of linearly independent columns in $D(\tilde{G})$, and hence in $\tilde{G}$ and $G$.

The ordered reduced graph $D(\tilde{G})$ has no duplicate or isolated vertices. Assume that there exists a vertex $u_i \in D(\tilde{X})$ the column corresponding to which can be expressed as a linear combination of other columns in the adjacency matrix $B(D) = B(D(\tilde{G}))$ as

$$B(D)_{:i} = \sum_{j \neq i} B(D)_{:j}$$

Thus, for every $v \in N(u_i)$, there exists some $u_j \in D(\tilde{X})$, $j \neq i$, such that $v \in N(u_j)$. Since every vertex in a matching has exactly one neighbour, $u_i$ will not be a part of any maximal ordered matching as the columns corresponding to the $u_j$s will precede the column corresponding to $u_i$ in $B(D)$. Thus, all columns of $B(\tilde{G})$ corresponding to $(\hat{X}^*, \hat{Y}^*)$ are linearly independent. A similar argument with respect to the rows of $B(\tilde{G})$, replacing $u$ with $v$ and $X$ by $Y$, completes the proof. $\square$

A novel and simple set of conditions that is equivalent to the original NCA conditions can now be developed.

Theorem 1. A regulatory network as described in (1) is NCA-compatible if and only if every input vertex $u_j \in X$ is such that

$$|\hat{X}^*(H_j(G))| = L - 1$$

where $G = (X \cup Y, E, W)$ is a bipartite graph representing the network, $H_j(G)$s are induced subgraphs of $G$, and $\hat{X}^*(.)$ represents a maximal ordered matching in the ordered reduced graph $D(\tilde{H}_j(G))$.

Proof. If every vertex $u_j \in X$ satisfies the condition in theorem 1, $\operatorname{rank}(W_j) = L - 1$ from Lemma 1. The graph $H_j(G) + v$ for any $u_j \in X$ and $v \in N(u_j)$ is reduced and it will have a maximal ordered matching of size $L$, and hence $\operatorname{rank}(W) = L$.

Conversely, by definition, maximal ordered matchings of size $L - 1$ can be found for all $H_j$s of NCA compatible networks.

We use the argument on the rank of $S$ provided in (Liao et al., 2003). Thus, all original NCA conditions are satisfied. $\square$

3.2 Identifying NCA Compatible Subnetworks

As pointed out towards the end of section 2.2 and demonstrated in section 3, the conditions developed in (Boscolo et al., 2005) and (Fritzilas et al., 2013) are inaccurate graph theoretical interpretations of the original NCA conditions. Both present their own approaches to identify NCA compatible subnetworks of NCA incompatible networks. The algorithm proposed in the former starts with an initial guess of the largest NCA compatible subnetwork and continues to add more input vertices such that no NCA conditions are violated. The simplest starting point could be one where only one input vertex and its neighbours are considered, but randomly adding more vertices thereafter is a tedious task. The latter proposes a more structured approach where nicely separable subsets of $X$ and $Y$ are to be identified. Nicely separable NCA compliant subnetworks are those that have no vertices in common and are NCA compatible (Fritzilas et al., 2013). The motivation for our work comes from the fact that both of these approaches look for the single largest NCA compatible subnetwork and ignore the rest of the vertices. In addition, neither of the two is computationally simple.


Several matching algorithms are available in the literature. In-depth discussions on such algorithms can be found in (Burkard et al., 2009). It is necessary to define a clear objective in order to find a suitable matching. For example, an algorithm designed to find online a maximum cardinality matching with minimum cost is proposed in (Azad et al., 2015). Our objective is to find matchings such that the conditions of theorem 1 are satisfied. All the graphs that we consider in the preprocessing phase are unlabelled and undirected bipartite graphs. Thus, the set of relatively simple algorithms proposed in this section is sufficient to meet our objectives.

In this section, we propose simple approaches to reduce a bipartite graph, identify a maximal ordered matching and decompose the whole network into NCA compatible subnetworks. If a system at steady state can be clearly represented as a regulatory network as described in (1), the algorithms presented in this section will not only avoid unnecessary computational overhead, but also provide a decomposition of the network without ignoring any of the subnetworks.

A method to remove duplicate and zero rows and columns is presented below. This algorithm identifies duplicates by looking for zero entries in the inner product over the binary field (Gudder and Latrémolière, 2009) of $B(G)$ and its conjugate $\neg B(G)$.

Algorithm 1: Conjugate reduction.

input: $B \in \mathbb{R}^{M \times L}$

STEP 1:

    Set $B_c = B(\neg B)^T + (\neg B)B^T$ and $M$ = number of rows in $B$

STEP 2:

    for i = 1 to M do
        for j = i + 1 to M do
            if $B_c(i, j) == 0$ then
                remove row j from B
            end if
        end for
    end for

STEP 3:

    remove any zero rows from B

    Repeat all steps with $B = B^T$

output: $B$
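A minimal NumPy sketch of Algorithm 1 follows. For a 0/1 matrix, entry $(i, j)$ of $B(\neg B)^T + (\neg B)B^T$ counts the positions in which rows $i$ and $j$ differ, so a zero entry marks a duplicate pair. The function names are illustrative, and repeating the row and column passes until the shape stops changing is an assumption made here so that the result has no duplicates left after columns are removed.

```python
import numpy as np

def drop_duplicate_and_zero_rows(B):
    """STEPS 1-3 of Algorithm 1 applied to the rows of a 0/1 matrix."""
    B = np.asarray(B, dtype=int)
    not_B = 1 - B
    # Entry (i, j) counts positions where rows i and j differ; 0 means duplicate.
    dist = B @ not_B.T + not_B @ B.T
    keep = np.ones(B.shape[0], dtype=bool)
    for i in range(B.shape[0]):
        if not keep[i]:
            continue
        for j in range(i + 1, B.shape[0]):
            if dist[i, j] == 0:
                keep[j] = False      # row j duplicates the earlier row i
    keep &= B.any(axis=1)            # also drop all-zero rows
    return B[keep]

def conjugate_reduction(B):
    """Apply the row pass, then the same pass on the transpose (columns),
    repeating until nothing more is removed (assumption, see lead-in)."""
    B = np.asarray(B, dtype=int)
    while True:
        reduced = drop_duplicate_and_zero_rows(B)
        reduced = drop_duplicate_and_zero_rows(reduced.T).T
        if reduced.shape == B.shape:
            return reduced
        B = reduced

# Reduced submatrix for j = 3, built from the rows of (4) as listed.
H3 = np.array([[1, 1, 1, 0],
               [1, 1, 1, 0],
               [0, 1, 1, 1],
               [1, 0, 0, 1]])
H3_reduced = conjugate_reduction(H3)
print(H3_reduced)
```

On this input the second row and one of the two identical middle columns are removed, leaving a 3x3 matrix of rank 3.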

The reduced submatrix $B(\tilde{G})$ can be obtained by applying Algorithm 1 on $B(G)$. In order to test a network against the conditions in theorem 1, a maximal ordered matching should be found in $B(\tilde{G})$. To do that, the rows and columns of $B(\tilde{G})$ are rearranged in increasing order of the number of ones in them. Several matching algorithms available in the literature can be used to find a maximal matching. However, a simple linear-search-and-eliminate based approach is sufficient as there is no need to minimise any associated cost. It can be shown that such an algorithm runs in $O(L)$ time for $B(\tilde{G})$ with $|\tilde{X}| = L$. The algorithm has not been described here for brevity.
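Since the matching routine itself is not given, the following is only a guess at what a linear-search-and-eliminate pass could look like: sort rows and columns by increasing degree, then scan the rows in order and pair each with the first unused column that holds a 1. The function names and the stable tie-breaking are assumptions, and a greedy scan of this kind is not guaranteed to return a maximum matching on arbitrary graphs.

```python
import numpy as np

def order_by_degree(B):
    """Sort rows and columns by increasing number of ones (stable sort)."""
    B = np.asarray(B, dtype=int)
    row_order = np.argsort(B.sum(axis=1), kind="stable")
    col_order = np.argsort(B.sum(axis=0), kind="stable")
    return B[np.ix_(row_order, col_order)], row_order, col_order

def greedy_ordered_matching(B):
    """Scan ordered rows top to bottom; match each row to its first free
    column containing a 1, then eliminate that row and column."""
    ordered, row_order, col_order = order_by_degree(B)
    used_cols = set()
    pairs = []  # (original row index, original column index), 0-based
    for r in range(ordered.shape[0]):
        for c in range(ordered.shape[1]):
            if ordered[r, c] == 1 and c not in used_cols:
                used_cols.add(c)
                pairs.append((int(row_order[r]), int(col_order[c])))
                break
    return ordered, pairs

# Induced submatrix for the second input, from the rows of (4) as listed.
H2 = np.array([[0, 1, 1, 0],
               [0, 1, 1, 1],
               [1, 1, 0, 1],
               [1, 0, 0, 1]])
ordered, pairs = greedy_ordered_matching(H2)
print(ordered)
print(len(pairs))  # 4
```

For this input every row finds a free column, so the matching has size $L - 1 = 4$, in line with the rank reported for $j = 2$ in section 2.1.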

We can now establish a simple approach to test NCA-compatibility of a given network as described in Algorithm 2.

Algorithm 2: NCA compatibility test.

input: $B(G) \in \mathbb{R}^{M \times L}$

    if $\sum_i B_{ij}(G) > M - L + 1$ for any $u_j \in X$ then
        NCA-incompatible
    else
        Obtain $B(\tilde{G})$ using algorithm 1
        if for any $u_j \in \tilde{X}$, $|\hat{X}^*(H_j(G))| < L - 1$ then
            NCA-incompatible
        else
            NCA-compatible
        end if
    end if

NCA compatible subnetworks can be identified if a network is found to be incompatible after applying Algorithm 2. This can be done by grouping together vertices $u_i, u_j \in X$ such that $u_i \cup \hat{X}^*(H_i(G))$ is identical to $u_j \cup \hat{X}^*(H_j(G))$ as outlined below.

Algorithm 3: Graph preprocessing for NCA.

input: $B(G) \in \mathbb{R}^{M \times L}$

STEP 1: for every $u_j \in X$ set $X^*_j = \hat{X}^*(H_j(G)) \cup u_j$ and $n_j = |X^*_j|$

STEP 2: identify sets $C_n$ such that $u_i, u_j \in C_n \Rightarrow X^*_i \equiv X^*_j$, $\forall u_i, u_j \in X$

outputs: set $G_n = \cup_{u_j \in C_n} ((u_j \cup Y), E_{:,C_n}, W_{:,C_n})$

All vertices in a set $C_n$ form a maximal ordered matching of size $n$ with other vertices in the same set. In Algorithm 3, $E_{:,C_n}$ and $W_{:,C_n}$ respectively represent submatrices of $E$ and $W$ with all rows and columns corresponding to the vertices in set $C_n$.

Algorithm 3 is not only capable of identifying NCA compatible subnetworks, but also of implicitly testing a network for NCA compatibility. If a network is NCA compatible, Algorithm 3 returns one subnetwork $G_{L-1}$ which is the original network $G$. Thus, Algorithm 3 alone can be used to preprocess a graph to identify NCA compatible subnetworks. The subnetworks can then be inferred individually and recombined to infer the whole network. Such a divide and conquer approach is beyond the scope of this paper. Part of the results of applying the algorithms in this section to the regulatory network in (4) is presented in the next section.
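The two steps of Algorithm 3 can be sketched end to end as below. Several pieces are assumptions made for illustration: duplicates are detected by comparing rows as tuples rather than through the conjugate product, the matching is a greedy first-free-column scan on the degree-ordered matrix, and inputs are indexed from 0. The input is the set of rows of $B(G)$ as listed in (4).

```python
import numpy as np

def first_occurrence_mask(rows):
    """True for rows that are non-zero and not a repeat of an earlier row."""
    seen, mask = set(), []
    for r in map(tuple, rows):
        mask.append(any(r) and r not in seen)
        seen.add(r)
    return np.array(mask, dtype=bool)

def reduce_labeled(B, cols):
    """Remove duplicate/zero rows and columns, tracking column labels."""
    while True:
        rmask = first_occurrence_mask(B)
        B = B[rmask]
        cmask = first_occurrence_mask(B.T)
        B, cols = B[:, cmask], cols[cmask]
        if rmask.all() and cmask.all():
            return B, cols

def ordered_matching_cols(B, cols):
    """Labels of the columns matched by a greedy scan of the ordered, reduced matrix."""
    B, cols = reduce_labeled(B, cols)
    rows = np.argsort(B.sum(axis=1), kind="stable")
    order = np.argsort(B.sum(axis=0), kind="stable")
    B, cols = B[np.ix_(rows, order)], cols[order]
    used = set()
    for row in B:
        for c in np.flatnonzero(row):
            if int(c) not in used:
                used.add(int(c))
                break
    return {int(cols[c]) for c in used}

def preprocess(B):
    """STEP 1: X*_j for every input j. STEP 2: group inputs with identical X*_j."""
    B = np.asarray(B, dtype=int)
    L = B.shape[1]
    groups = {}
    for j in range(L):
        others = np.array([k for k in range(L) if k != j])
        H = B[B[:, j] == 0][:, others]          # induced submatrix for input j
        key = frozenset(ordered_matching_cols(H, others) | {j})
        groups.setdefault(key, []).append(j)
    return groups

B4 = np.array([[1, 1, 0, 1, 0], [0, 0, 1, 1, 0], [1, 1, 0, 1, 0],
               [0, 0, 1, 1, 1], [0, 1, 0, 1, 1], [1, 0, 1, 0, 1],
               [1, 1, 1, 0, 1], [1, 0, 0, 0, 1], [0, 1, 1, 0, 1]])
groups = preprocess(B4)
print(sorted(groups.values()))          # [[0, 1, 3], [2], [4]]
print(sorted(len(k) for k in groups))   # [3, 4, 5]
```

With 0-based indices the groups correspond to $\{u_1, u_2, u_4\}$, $\{u_3\}$ and $\{u_5\}$, with set sizes 5, 4 and 3. A network would be reported as compatible when a single group containing every input is returned.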

4 RESULTS

In this section, we apply the algorithms presented in section 3.2 to the example network in (4). We demonstrate each step involved in obtaining the ordered matchings $\hat{X}^*(H_1(G))$ and $\hat{X}^*(H_2(G))$. Similar steps are to be applied for vertices $u_3$, $u_4$ and $u_5$ (not shown here). The objective of the algorithms in (Boscolo et al., 2005) and (Fritzilas et al., 2013) is to identify the largest possible NCA compatible subnetwork while ignoring the remainder of the network, whereas our goal is to divide a given network into several NCA compatible subnetworks. Thus, a comparison between the results of applying those algorithms and the ones presented in this section is unwarranted.

The first step in identifying NCA compatible subnetworks is identifying maximal ordered matchings for every vertex $u_j \in X$. Consider $u_1$ and $u_2$. The induced submatrix corresponding to $u_1$, $B(H_1(G))$, is given in (5) and that corresponding to $u_2$, $B(H_2(G))$, is given below.

$$B(H_2(G)) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 0 & 0 & 1
\end{bmatrix} \tag{7}$$

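We reduce $B(H_1(G))$ and $B(H_2(G))$, impose the order $D(.)$ on the reduced versions $B(\tilde{H}_1(G))$ and $B(\tilde{H}_2(G))$, and then look for maximal matchings as demonstrated next. The steps in reducing matrices are illustrated by striking out duplicate rows and columns.

Reduced induced submatrices:

$$B(\tilde{H}_1(G)) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1
\end{bmatrix}, \qquad
B(\tilde{H}_2(G)) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 0 & 0 & 1
\end{bmatrix}$$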

It can be seen that $B(D(\tilde{H}_1(G))) = B(\tilde{H}_1(G))$ as the columns and rows in $B(\tilde{H}_1(G))$ are already in increasing order of degrees of vertices. However, the rows and columns of $B(\tilde{H}_2(G))$ must be reordered: the fourth row moves up to the second position and the second and third columns are swapped.

Partial ordering:

$$B(D(\tilde{H}_2(G))) = \begin{bmatrix}
0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1
\end{bmatrix}$$

A linear search is conducted to look for the first 1 entry in every row, and the row and column corresponding to the found entry are eliminated. These steps are repeated for all columns until no more 1 entries are found or all columns are exhausted. The set of vertices that correspond to all the columns in which a 1 entry is found defines a maximal matching. Finding the first 1 entry in the first row of $B(D(\tilde{H}_1(G)))$ is illustrated by encircling the entry and eliminating the corresponding row and column by striking them out.

$$\begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1
\end{bmatrix}$$

A maximal ordered matching $(\hat{X}^*(H_1(G)), \hat{Y}^*(H_1(G)))$ in $D(B(\tilde{H}_1(G)))$ is obtained, which pairs all four input vertices $\{u_2, u_3, u_4, u_5\}$ with the four output vertices of $\tilde{H}_1(G)$. The encircled entries represent this maximal ordered matching.

$$\begin{bmatrix}
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1
\end{bmatrix}$$

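Executing all the illustrated steps on all induced submatrices results in the following sets as indicated in Step 1 of Algorithm 3:

$$\begin{aligned}
X^*_1 &= \hat{X}^*(H_1(G)) \cup \{u_1\}, & n_1 &= 5 \\
X^*_2 &= \hat{X}^*(H_2(G)) \cup \{u_2\}, & n_2 &= 5 \\
X^*_3 &= \hat{X}^*(H_3(G)) \cup \{u_3\}, & n_3 &= 4 \\
X^*_4 &= \hat{X}^*(H_4(G)) \cup \{u_4\}, & n_4 &= 5 \\
X^*_5 &= \hat{X}^*(H_5(G)) \cup \{u_5\}, & n_5 &= 3
\end{aligned}$$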

The NCA compatible sets obtained as described in Step 2 of Algorithm 3 are $C_5 = \{u_1, u_2, u_4\}$, $C_4 = \{u_3\}$, and $C_3 = \{u_5\}$. Thus, the given NCA incompatible network can be decomposed into a collection of 3 NCA compatible subnetworks as illustrated in Fig. 2.

    u_1 u_2 u_4 | u_3 | u_5
     1   1   1  |  0  |  0
     0   0   1  |  1  |  0
     1   1   1  |  0  |  0
     0   0   1  |  1  |  1
     0   1   1  |  0  |  1
     1   0   0  |  1  |  1
     1   1   0  |  1  |  1
     1   0   0  |  0  |  1
     0   1   0  |  1  |  1

Figure 2: Decomposition of original network into subnetworks.

5 CONCLUSIONS

In this paper, the need to consider graph theoretical interpretations of NCA was emphasized. It was demonstrated that merely finding a maximal matching in bipartite graphs may lead to wrong classification of NCA incompatible networks as NCA compatible. In order to overcome this issue, a new property called ordered matching in a bipartite graph was introduced. The rank of a bipartite graph was proven to be equal to the size of an ordered matching in its reduced form. This result was used to develop new conditions for NCA compatibility. A simple algorithm to reduce a bipartite graph was proposed. An algorithm was proposed to identify NCA compatible subnetworks in a given network. The results presented in this paper solve two important preprocessing problems: simplifying NCA compatibility conditions and decomposing a network into identifiable parts.

REFERENCES

Azad, A., Buluç, A., and Pothen, A. (2015). A parallel tree grafting algorithm for maximum cardinality matching in bipartite graphs. In Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, pages 1075–1084. IEEE Computer Society.

Boscolo, R., Sabatti, C., Liao, J. C., and Roychowdhury, V. P. (2005). A generalized framework for network component analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2(4):289–301.

Brugere, I., Gallagher, B., and Berger-Wolf, T. Y. (2016). Network structure inference, a survey: Motivations, methods, and applications. CoRR, abs/1610.00782.

Burkard, R., Dell'Amico, M., and Martello, S. (2009). Assignment Problems, Revised Reprint. Other Titles in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA.

Fritzilas, E., Milanič, M., Monnot, J., and Rios-Solis, Y. A. (2013). Resilience and optimization of identifiable bipartite graphs. Discrete Applied Mathematics, 161(4-5):593–603.

Gudder, S. and Latrémolière, F. (2009). Boolean inner-product spaces and boolean matrices. Linear Algebra and its Applications, 431:274–296.

Li, H., Su, L., and Sun, H. (2012). On bipartite graphs which attain minimum rank among bipartite graphs with given diameter. Electronic Journal of Linear Algebra, 23:1–14.

Liao, J. C., Boscolo, R., Yang, Y., Tran, L. M., Sabatti, C., and Roychowdhury, V. P. (2003). Network component analysis: Reconstruction of regulatory signals in biological systems. Proceedings of the National Academy of Sciences, 100(26):15522–15527.

Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45(2):167–256.
