forms that are used by various types of networking
technologies, is also a contributing factor toward the
error term, ε.
In this paper, we apply the sparse code shrinkage (SCS) technique to denoise noisy link delay data. The key idea behind SCS is to use a basis that is better suited to the data at hand. Denoising proceeds in three steps: transform the data to a sparse code, apply a maximum likelihood (ML) estimation procedure component-wise, and transform back to the original variables. The simulation results show that the proposed technique requires fewer inputs and assumptions to denoise the data and recover an almost noise-free (original) signal.
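As a rough illustration of these three steps, the sketch below assumes an orthogonal sparsifying basis W (in SCS such a basis is typically estimated, e.g. with ICA, from noise-free training data) and a Laplacian prior on the sparse components, under which the component-wise ML estimator reduces to soft-thresholding. The function name, parameters, and prior choice are illustrative assumptions, not the exact procedure evaluated in this paper.

```python
import numpy as np

def scs_denoise(X_noisy, W, sigma, d):
    """Sparse code shrinkage, minimal sketch.

    X_noisy : observations, one column per sample
    W       : orthogonal sparsifying basis (assumed given here)
    sigma   : noise standard deviation
    d       : scale of the assumed Laplacian prior on each component
    """
    S = W @ X_noisy                          # 1. transform to the sparse code
    thr = np.sqrt(2.0) * sigma**2 / d        # ML shrinkage threshold under the Laplacian prior
    S = np.sign(S) * np.maximum(np.abs(S) - thr, 0.0)  # 2. shrink component-wise
    return W.T @ S                           # 3. transform back (W is orthogonal)
```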
The rest of the paper is organized as follows. Section 2 briefly describes network tomography and the various factors that introduce errors into tomography data. Section 3 reviews related work. Section 4 discusses SCS and the rationale for using it. Section 5 explains the application of NNMF in the context of network tomography and sparsity. Section 6 presents and discusses results showing that SCS successfully denoises noisy link delay data without a priori knowledge of the routing matrix. Section 7 concludes the paper.
2 FACTORS INTRODUCING ERRORS IN NETWORK TOMOGRAPHY
Vardi (Vardi, 1996) was the first to introduce the term network tomography for the indirect inference of desired network statistics. Three categories of network tomography problems (active, passive, and topology identification) have been addressed in the literature. In passive network tomography (Vardi, 1996), link-level statistics such as bit rate are passively measured as the matrix Y, and the origin-destination (OD) flows are estimated as X.
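To make this setup concrete, a toy instance of the underlying linear model is sketched below; the 3-link, 4-OD-pair routing matrix and the flow values are purely illustrative.

```python
import numpy as np

# Hypothetical routing matrix A: A[i, j] = 1 if OD flow j traverses link i.
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
X = np.array([10.0, 5.0, 8.0, 2.0])        # true (unobserved) OD flows
eps = np.random.normal(0.0, 0.5, size=3)   # measurement error term
Y = A @ X + eps                            # passively observed link statistics
# Passive tomography must infer X from Y; the problem is under-determined
# here (3 equations, 4 unknowns), which is typical in practice.
```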
In active network tomography (Castro et al., 2004), (Coates and Nowak, 2001), unicast or multicast probes are sent from one or more sources to destinations, and parameters such as packet loss rate (PLR), delay, and bandwidth are determined from the source-to-destination measurements.
The key idea in most of the existing topology identification methods is to collect measurements at pairs of receivers (Castro et al., 2004).
Simple Network Management Protocol (SNMP) and NetFlow are the main contributors to the error term (ε), along with the heterogeneity of network components in terms of vendors and hardware/software platforms used by the various networking technologies.
SNMP is used to collect data for management purposes, including network delay tomography. SNMP (Zhao et al., 2006) periodically polls statistics such as the byte count of each link in an IP network. The commonly adopted sampling interval is 5 minutes. The management station cannot start the management information base (MIB) polling for hundreds of router interfaces in a network at the same time (at the beginning of each 5-minute sampling interval). The actual polling interval is therefore shifted and may differ from 5 minutes. This polling discrepancy becomes a source of error in SNMP measurements.
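A minimal sketch of this effect, assuming a constant link rate and uniformly distributed polling offsets (both purely illustrative), shows how attributing a jittered counter delta to the nominal 300 s window perturbs the measured rate:

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 1.0e6                                     # assumed steady link rate (bytes/s)
polls = 300.0 * np.arange(101) + rng.uniform(0.0, 30.0, size=101)  # jittered poll times (s)
byte_deltas = rate * np.diff(polls)              # SNMP counter differences between polls
measured_rate = byte_deltas / 300.0              # attributed to the nominal 5-minute window
error = measured_rate - rate                     # polling-induced contribution to the error term
```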
Traffic flow statistics are measured at each ingress node via NetFlow (Clemm, 2006), (Systems, 2010). A flow is a unidirectional sequence of packets between a particular source and destination IP address pair. The high cost of deployment limits the number of NetFlow-capable routers; moreover, products from vendors other than Cisco have limited or no support for NetFlow (Clemm, 2006), (Systems, 2010). Sampling is therefore a common technique to reduce the overhead of detailed flow-level measurement. The flow statistics are computed after sampling at both the packet and flow levels. Since the sampling rates are often low, inference from the NetFlow data may be noisy.
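The effect of low sampling rates can be illustrated with a small simulation (the flow size and sampling rates below are arbitrary): sampling packets at rate p and scaling the count back by 1/p gives an unbiased flow-size estimate whose relative error grows as p shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_packets = 10_000                         # true size of one flow (illustrative)
for p in (0.1, 0.01, 0.001):               # hypothetical packet-sampling rates
    sampled = rng.binomial(n_packets, p, size=10_000)  # sampled counts over many trials
    estimates = sampled / p                # inverted-sampling flow estimates
    print(f"p={p}: relative std of estimate = {estimates.std() / n_packets:.3f}")
```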
Both SNMP and NetFlow use the user datagram protocol (UDP) as their transport protocol. Because UDP offers no delivery guarantees, hardware or software problems can result in data being lost in transit, adding to the error term of the model (Zhao et al., 2006), (Clemm, 2006), (Systems, 2010).
The presence of network components from different vendors, running on diverse hardware/software platforms used by the various networking technologies, together with the inherent shortcomings of distributed computing, also introduces errors. The risk of errors increases as the number of components in a system grows. Physical and temporal separation of components, and the resulting consistency problems, are also sources of error (Zhao et al., 2006).
The next section reviews related work and distinguishes our contribution from it.
3 REVIEW OF RELATED WORK
The authors of (Zhao et al., 2006), in the course of estimating the traffic matrix with imperfect information, have mentioned the presence of errors in network measurements. However, they did not present any solution specifically for the errors in link measurements. Though they have considered these errors when they have