computers can be done from any IP address not
affiliated to the institution), less than honourable
intentions can be surmised, e.g. discovering which
systems exist and gathering information on them.
3.2 DNS Scanning
Queries for non-existing domains make up a significant
portion of the traffic: on average, 6,577 domain names
that do not exist are requested each hour. This
translates to 10% of all requests. As it is unlikely
that humans enter that many incorrect domain names,
and even notoriously non-specification-conforming HTML
usually gets the host part of links right, a different
explanation is required. After manual investigation of
these errors, we could identify the following
subgroups:
There seems to be a lot of checking for existing (or
non-existing) domains going on, using very good lists
or sensible automation. Few nonsensical names are
tried; almost all make (some) sense. For example, the
queries contained (besides numerous similar others)
the following series: worldwidereveal.com,
worldwiderevenue.net, worldwidescort.com,
worldwidescubatravel.com, worldwideshoponline.com,
worldwideshopspot.com, worldwidetomatosociety.com,
worldwidetowers.com, worldwidetowinginc.com,
worldwidetravelmembership.com,
worldwideunderstanding.com, world-wide-web-host.com,
worldwidewebstersonline.com, worldwidewebtec.com,
worldwideweed.xyz. However, there seems to be no
obvious generator in use: many more “worldwide*” names
definitely exist, multiple TLDs are used (but with
different second-level names), and typos are perhaps
also part of it (worldwidescort should probably be
worldwideescort). Also, if the queries were merely
dictionary-based, many more combinations than the ones
above would be tried. A possible explanation is that
multiple exit nodes might be used, so that we saw only
a portion of all queries. Note that, unlike the
examples below, each of these domain names was queried
only once over the whole observation period. This
therefore seems unlikely to be a prelude to attacks; it
looks more like a search for opportunities to buy
domain names, or like creating or maintaining a list of
existing top-/second-level domains.
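The distinction between names queried only once and names queried repeatedly can be sketched as a simple count over the log of failed lookups. This is a minimal illustration, not the tooling used for the measurements; the function name `split_by_repetition` and the sample list are assumptions made for the example.

```python
from collections import Counter


def split_by_repetition(nxdomain_names):
    """Partition non-existing names into single and repeated queries.

    Names are normalised (lower case, no trailing dot) before counting,
    so that "GEO.mozilla.org." and "geo.mozilla.org" count as one name.
    """
    counts = Counter(name.lower().rstrip(".") for name in nxdomain_names)
    once = sorted(n for n, c in counts.items() if c == 1)
    repeated = {n: c for n, c in counts.items() if c > 1}
    return once, repeated


# Hypothetical sample log of failed lookups (illustration only).
sample = [
    "worldwidereveal.com",
    "worldwiderevenue.net",
    "geo.mozilla.org",
    "geo.mozilla.org",
    "GEO.mozilla.org.",
]
once, repeated = split_by_repetition(sample)
```

Names in `once` would correspond to the scan-like series above, while `repeated` collects candidates for the groups discussed next.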
Numerous non-existing domain names are queried
multiple times. The top one is “geo.mozilla.org” with
37,395 queries in total over all five months, a domain
name that did, however, exist in the past. The next
most common one (15,929 queries) is
cdn.api.twitter.com, which seems to have been a working
(but non-official) server that has since been shut
down. A small number of queries stem from mistakes on
websites, at least partly because server names were
changed or removed without the websites being updated.
Some domain names are obviously simply erroneous, like
“index.php” (3,778 queries) or “wp-login.php” (1,320
queries), which were probably meant as a path and not
as a host. Others, such as “web.archive.orghttp” (2,433
queries), “web.archive.org.https” (19 queries) or
“web.archive.org.localhost” (4 queries), are typos or
signs of misconfigurations or mistakes. Even
aggregated, these do not constitute a significant
number of queries.
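Such obviously erroneous names could be flagged with a few string heuristics. The sketch below is only one possible approach and not the manual investigation described here; the category labels, the suffix list `FILE_SUFFIXES` and the function `classify_bad_name` are assumptions chosen for illustration.

```python
# Hypothetical file extensions that suggest a path was used as a host.
FILE_SUFFIXES = (".php", ".html", ".htm", ".asp", ".aspx")


def classify_bad_name(name):
    """Assign a queried name to a rough error category."""
    n = name.lower().rstrip(".")
    last_label = n.rsplit(".", 1)[-1]
    if n.endswith(FILE_SUFFIXES):
        return "path-like"        # e.g. "index.php"
    if last_label.endswith(("http", "https")):
        return "scheme-residue"   # e.g. "web.archive.orghttp"
    if last_label == "localhost":
        return "local-suffix"     # e.g. "web.archive.org.localhost"
    return "other"


labels = [classify_bad_name(n) for n in (
    "index.php",
    "wp-login.php",
    "web.archive.orghttp",
    "web.archive.org.https",
    "web.archive.org.localhost",
    "geo.mozilla.org",
)]
```

The heuristics are deliberately narrow: a name is only flagged if its last label cannot be a plausible TLD in the sense used above.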
Not directly explainable is the huge number of queries
for domains of the form “forum.*”: 714,174 such
non-existing domain names were queried. As each of the
top names (“forum.eurostimul.com”, “forum.zawya.com”,
“forum.roots-archives.com”, etc.) occurs more than
2,900 times, this cannot be merely a scan. According to
Google searches, these domain names neither exist nor
existed in the past, although there might have been
forums on these sites (e.g.
“eurostimul.com/forum/memberlist.php” is in the result
list). As it is unlikely that several thousand scans
with the same lists occur, this looks more like an
error made while performing scans.
Apart from non-existing domain names, many queries also
receive a “no-data” reply. Technically, this happens
when a specific type of DNS record is queried and the
domain name does exist, but has no record of this type.
Because Tor supports only a limited set of DNS queries
(A = IPv4, AAAA = IPv6, and PTR = reverse lookup), the
explanation is simple: these are queries for IPv6
addresses where only IPv4 data exists (or potentially
the reverse). This can be exemplified by the most
common name in this category: e13829.x.akamaiedge.net
was queried 1,111,778 times! This domain name does
exist, but it only serves IPv4 and was often queried
for its non-existing IPv6 record. The same applies to
the second largest count in this category:
shops.myshopify.com (363,117 queries; IPv4 data only).
These requests are therefore legitimate and not signs
of a scan, but of the increasing share of IPv6 being
used.
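The difference between a non-existing name and a “no-data” reply can be sketched in terms of the DNS response code and the number of answer records. This is a simplified illustration under the assumption that only these two fields are available from a log; the function name `classify_response` is hypothetical, and special cases such as referrals or CNAME chains are ignored.

```python
# DNS response codes as defined in RFC 1035.
NOERROR = 0
NXDOMAIN = 3


def classify_response(rcode, answer_count):
    """Classify a DNS reply by response code and answer record count."""
    if rcode == NXDOMAIN:
        return "nxdomain"      # the name itself does not exist
    if rcode == NOERROR and answer_count == 0:
        return "no-data"       # name exists, but not this record type
    if rcode == NOERROR:
        return "answer"
    return "other-error"       # e.g. SERVFAIL or REFUSED


# An AAAA query for a name that only has A records yields "no-data".
aaaa_result = classify_response(NOERROR, 0)
```

Under this scheme, an AAAA query for an IPv4-only name falls into the "no-data" class, whereas the names from the previous groups fall into "nxdomain".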
DNS scans can also be used as attacks: little outgoing
traffic causes large return traffic. Together with
falsifying the source address, a DoS attack becomes
possible. As the exit node determines the source
address of query packets, this is not relevant here.
But the fact remains that a DNS server must produce a
large answer (and expend computing time for producing
it), thereby, although not allowing
ICISSP 2019 - 5th International Conference on Information Systems Security and Privacy