based on RPSL because of its wider usage, and it can be
applied to SWIP with minor modifications.
RPSL is designed to specify routing policy at
various levels, ranging from router to AS. In an ideal
case, low-level router configurations can be directly
generated from the routing policies described at AS
level. Like a typical object-oriented language, RPSL
comprises several classes, each of which uses a set
of attributes to describe its object instances. In our
model, RPSL classes are classified into three
categories according to the content they describe:
PoC (Point of Contact), NR (Number Resource) and
RP (Routing Policy).
(1) PoC classes.
PoC classes describe contact information.
Specifically, they comprise the mntner, person and
role classes. The mntner class specifies the
authentication information required to add, delete or
modify other objects. The person class describes the
information necessary to contact a person. The role
class is very similar to the person class, except that
a role object describes a role performed by one or
more human beings rather than a single human being.
In this way, a role object does not have to change
when the person performing the role changes.
(2) NR classes.
NR classes describe Internet number resources;
examples in RPSL are the inetnum, inet6num and
domain classes.
(3) RP classes.
RP classes are used to describe routing policy.
For example, the inet-rtr class defines a router by
its DNS name, the IP address of each interface, the
number of the AS that owns or operates the router,
and similar information.
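The three-way classification above can be sketched as a simple lookup table. This is a minimal illustration that covers only the classes named in this section; the names `CATEGORY_OF_CLASS` and `categorize` are hypothetical and are not part of RPSL or of our tooling.

```python
# Category of each RPSL class named in the text. Classes that the text does
# not mention are deliberately left out rather than guessed.
CATEGORY_OF_CLASS = {
    "mntner": "PoC",
    "person": "PoC",
    "role": "PoC",
    "inetnum": "NR",
    "inet6num": "NR",
    "domain": "NR",
    "inet-rtr": "RP",
}


def categorize(rpsl_class: str) -> str:
    """Return 'PoC', 'NR' or 'RP', or 'unknown' for a class not listed above."""
    return CATEGORY_OF_CLASS.get(rpsl_class.strip().lower(), "unknown")
```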
The NR and RP objects are linked directly to
network operators by referring to the class keys of
PoC objects through their admin-c, tech-c and mnt-*
attributes (including mnt-by, mnt-lower, mnt-routes
and so on), as shown in Figure 3.
Figure 3: PoC objects are referenced as contact points.
The admin-c attribute usually refers to someone
who is physically located at the site of the network,
whereas the tech-c attribute indicates a person who
is responsible for the day-to-day operation of the
network but does not need to be physically located
at the site.
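To make the referencing mechanism concrete, the sketch below reads one RPSL object in its usual "attribute: value" text form and collects the PoC keys it refers to through admin-c, tech-c and mnt-* attributes. It is a simplified illustration, not a full RPSL parser: continuation lines are ignored, and the function names and the sample handles (JD1-EXAMPLE and so on) are hypothetical.

```python
from typing import List, Tuple


def parse_rpsl_object(text: str) -> List[Tuple[str, str]]:
    """Split one RPSL object into (attribute, value) pairs, in order."""
    pairs = []
    for line in text.strip().splitlines():
        # Simplification: skip blank lines, comments and continuation lines.
        if not line.strip() or line[0] in " \t+#%":
            continue
        # Split on the first colon only, so IPv6 values are kept intact.
        name, sep, value = line.partition(":")
        if sep:
            pairs.append((name.strip().lower(), value.strip()))
    return pairs


def poc_references(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (attribute, referenced PoC key) for admin-c, tech-c and mnt-*."""
    refs = []
    for name, value in pairs[1:]:  # the first pair names the object itself
        if name in ("admin-c", "tech-c") or name.startswith("mnt-"):
            # An attribute such as mnt-by may list several keys.
            for key in value.split(","):
                if key.strip():
                    refs.append((name, key.strip()))
    return refs
```

For an inetnum object carrying `admin-c: JD1-EXAMPLE` and `mnt-by: MAINT-A, MAINT-B`, `poc_references` yields one entry for the admin-c handle and one entry per listed maintainer.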
3 METHODOLOGY
Our methodology is as follows. First, we build an
MDN (multi-dimension network) to characterize
the interrelationships among the various elements in
registry data, which is the outcome of network
operators’ registry activities. Second, we quantify
how close two network operators are by the tie
strength between them, which is calculated from the
paths between the two operators in the MDN.
Finally, network operators are grouped into clusters
according to the tie strength among them, and each
cluster is considered an organization.
3.1 Building Multi-Dimension Network
Let D = {as-block, as-set, aut-num, inet6num,
inetnum, mntner, …, mail, phone/fax number} denote
the dimension vector of the MDN; each element of
D represents either an RPSL class or a user-defined
class.
3.1.1 Vertexes of MDN
Let V^(i) denote the set of vertexes from dimension
i ∈ D; then V^(i) is the union of the class keys of all
object instances of class i. Similarly, the vertex sets
of the email and phone/fax number dimensions are
all the email addresses and phone/fax numbers that
appear in the dataset.
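The vertex sets can be sketched as a mapping from each dimension to a set of keys. The sketch assumes that each object's class and class key have already been extracted, and that email addresses and phone/fax numbers are carried in the e-mail, phone and fax-no attributes. That attribute choice, the `RpslObject` type and the `build_vertex_sets` function are illustrative assumptions, not a description of our implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RpslObject:
    cls: str  # RPSL class name, i.e. the dimension i
    key: str  # class key of this object instance
    attrs: List[Tuple[str, str]] = field(default_factory=list)  # non-key attributes


def build_vertex_sets(objects: List[RpslObject]) -> Dict[str, Set[str]]:
    """Return V^(i) for every dimension i that occurs in the dataset."""
    vertexes: Dict[str, Set[str]] = defaultdict(set)
    for obj in objects:
        # RPSL dimensions: one vertex per distinct class key.
        vertexes[obj.cls].add(obj.key)
        # User-defined dimensions (assumed attribute names, see lead-in).
        for name, value in obj.attrs:
            if name == "e-mail":
                vertexes["mail"].add(value)
            elif name in ("phone", "fax-no"):
                vertexes["phone/fax number"].add(value)
    return dict(vertexes)
```

Using sets means that an email address or phone number shared by several objects yields a single vertex.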
3.1.2 Discovering Links from RPSL Objects
MDN links are primarily generated from RPSL
objects, each of which is essentially a collection of
attributes. Each RPSL object r has a key attribute
(r.k) and a set of non-key attributes (r.NK). Each
non-key attribute x ∈ r.NK can be:
(1) The key of another RPSL object. The definition
of r reuses information that has already been
defined by another RPSL object. In this case, a
link r.k→x is added to the link set.
(2) Plain text. Because natural language processing
is not sufficiently accurate, no links are generated in
this case, to avoid introducing uncertainty.
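The two cases above can be sketched as follows: a non-key attribute value produces a link only when it exactly matches the class key of another object, and every other value is treated as plain text and ignored. Exact string matching is a simplifying assumption, and the `RpslObject` type and `discover_links` function are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Vertex = Tuple[str, str]  # (dimension, class key)


@dataclass
class RpslObject:
    cls: str  # RPSL class name
    key: str  # key attribute r.k
    attrs: List[Tuple[str, str]] = field(default_factory=list)  # r.NK


def discover_links(objects: List[RpslObject]) -> Set[Tuple[Vertex, Vertex]]:
    """Return directed links r.k -> x for every x that is another object's key."""
    # Index every known class key so attribute values can be looked up.
    by_key: Dict[str, Set[Vertex]] = defaultdict(set)
    for obj in objects:
        by_key[obj.key].add((obj.cls, obj.key))

    links: Set[Tuple[Vertex, Vertex]] = set()
    for obj in objects:
        source = (obj.cls, obj.key)
        for _name, value in obj.attrs:
            # Case (1): the value is the key of another object.
            # Case (2): plain text matches no key, so nothing is added.
            for target in by_key.get(value, ()):
                if target != source:
                    links.add((source, target))
    return links
```

Under this scheme, a free-text value that happens to equal a class key would also produce a link; a stricter variant could restrict matching to attributes known to hold references.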