more carefully in our portal. The Not Verified category represents those
rankings that differ between two data sources. Let us look at a concrete
case. Listing 6 is used to generate Table 3, which presents a Not Verified
case. Brasilia is ranked 4th by WolframAlpha, whereas Fortaleza is ranked
4th in GeoNames. According to Wikipedia¹⁶, Belo Horizonte is the 6th largest
city in Brazil, but it is ranked 5th in GeoNames (surprising: one city,
three different rankings in three different data sources). Similarly, if we
look at the city rankings of Bosnia and Herzegovina in the list below, we
find that Banja Luka is ranked 2nd in GeoNames, whereas Zenica is ranked 2nd
in Wolfram. According to Wikipedia¹⁷, Zenica is the 4th largest city in
Bosnia and Herzegovina.
Wolfram: Sarajevo; Zenica; Banja Luka; Tuzla; Mostar
GeoNames: Sarajevo; Banja Luka; Zenica; Tuzla; Mostar
A few cases in the Not Verified category are due to a lack of completeness.
For instance, Solomon Islands (a country associated with Australia) has no
city list in WolframAlpha. The result also depends on whether towns,
villages, atolls and capital cities are included in or excluded from the
largest-city list; e.g. Vaiaku is a village that is included in the city
list of Tuvalu (a country associated with Australia) in Wolfram.
Now we turn to mountain rankings. The following list shows the highest
mountains of Austria.
Wolfram: Grossglockner; Wildspitze; Weisskugel;
Grossvenediger; Similaun
GeoNames: Grossglockner; Wildspitze; Palla Bianca;
Grossvenediger; Ramolkogel
The mountains ranked 1st and 2nd are correct. The mountains ranked 3rd and
4th are also correct, as Palla Bianca is the Italian name of Weisskugel (a
different-name case again!). Rank 5 seems incorrect, as Hinterer Brochkogel
(3628 m) is higher than Similaun, see (Kulathuramaiyer et al., 2014).
According to Wikipedia¹⁸, the Similaun is the 6th highest summit, so rank 5
represents a Not Verified case. Sometimes data sources list different peaks
of the same mountain, which affects the ranking. Table 4 shows the highest
mountains of some European countries. Monte Rosa is the highest mountain in
Switzerland according to Wolfram, whereas Dom comes out on top in GeoNames;
this therefore represents a Not Verified case. In the last row of Table 4,
Hoverla and Goverla are different names for the same mountain in Ukraine,
so this row falls under the Verified category.
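The rank-by-rank comparison described above can be sketched as follows. This is only an illustration and not the portal's actual implementation: the alias table, the function names and the exact-match rule are assumptions introduced here, and the two lists are the Austrian mountain rankings quoted in the text.

```python
# Hypothetical alias table (an assumption for this sketch): maps an
# alternative name to the canonical name used for comparison.
ALIASES = {
    "palla bianca": "weisskugel",  # Italian name of Weisskugel
    "goverla": "hoverla",          # alternative spelling of Hoverla
}


def canonical(name):
    """Normalise a name and resolve known aliases."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def compare_rankings(source_a, source_b):
    """Compare two rankings position by position.

    Returns a list of (rank, name_a, name_b, status) tuples, where status
    is "Verified" if both sources name the same entity at that rank.
    """
    results = []
    for rank, (a, b) in enumerate(zip(source_a, source_b), start=1):
        same = canonical(a) == canonical(b)
        results.append((rank, a, b, "Verified" if same else "Not Verified"))
    return results


# Rankings as quoted in the text above.
wolfram = ["Grossglockner", "Wildspitze", "Weisskugel",
           "Grossvenediger", "Similaun"]
geonames = ["Grossglockner", "Wildspitze", "Palla Bianca",
            "Grossvenediger", "Ramolkogel"]

rows = compare_rankings(wolfram, geonames)
for row in rows:
    print(row)
```

With these inputs, ranks 1 to 4 are classified as Verified (rank 3 only because of the alias entry) and rank 5 as Not Verified. Without the alias table, rank 3 would be reported as a mismatch, which is the different-name case discussed above.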
¹⁶ http://en.wikipedia.org/wiki/Belo_Horizonte
¹⁷ http://en.wikipedia.org/wiki/Zenica
¹⁸ http://en.wikipedia.org/wiki/Similaun
Against our expectations, we also found discrepancies in area (size of
country) figures. Norway, France and Sweden are some examples, as shown in
Table 5, whereas countries like the Philippines and Sri Lanka have
consistent area figures across the data sources. A map of European
countries, coded in shades of grey, displaying the area difference is shown
in Figure 6. The transition from white to dark grey represents an increasing
area difference. Countries like Austria and Poland have almost no area
difference across the data sources, whereas France and Norway have a much
larger area difference, see Table 5. The continent-wise area verification
results are shown in Table 6. The methodology for identifying Verified and
Not Verified cases regarding area using multiple data sources is discussed
in (Kulathuramaiyer et al., 2014). For 79 of the 193 UN countries the area
facts mismatch (Not Verified); for the remaining 114 the area facts match
(Verified).
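As a rough illustration of how area figures from two sources might be classified and tallied, consider the sketch below. The actual methodology is the one described in (Kulathuramaiyer et al., 2014); the relative-difference rule, the 1% tolerance, the function names and the placeholder figures here are all assumptions made for this example and are not taken from the portal or its data.

```python
from collections import Counter


def relative_difference(a, b):
    """Relative difference between two positive area figures (0.0 = equal)."""
    return abs(a - b) / max(a, b)


def classify_area(a, b, tolerance=0.01):
    """Label a pair of area figures.

    The 1% tolerance is an assumed value for this sketch, not the
    threshold used in the paper.
    """
    if relative_difference(a, b) <= tolerance:
        return "Verified"
    return "Not Verified"


# Placeholder figures in square kilometres (invented for illustration):
# name -> (area in source 1, area in source 2).
areas = {
    "Country A": (100_000.0, 100_000.0),
    "Country B": (250_000.0, 262_500.0),
    "Country C": (50_000.0, 50_200.0),
}

labels = {name: classify_area(a, b) for name, (a, b) in areas.items()}
tally = Counter(labels.values())
print(labels)
print(tally)
```

Applied to all 193 UN countries, a tally of this kind yields the Verified and Not Verified counts reported above; the relative difference per country could also drive a grey-shade map like the one in Figure 6.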
While verifying information, we have learnt that even simple quantitative
questions like “what is the area of a country in square kilometers” cannot
be answered easily. We hope that, with the help of the community and domain
experts, we will be able to clarify the reasons for the discrepancies in as
many cases as possible. Another important consideration is that some data
sources simply copy facts from others. It is hard to determine the exact
source of a fact, because the reference sections contain multiple entries.
After exploring the reference sections of the mentioned data sources,
however, we noticed that WolframAlpha takes data from the CIA World
Factbook. The reference sections of the Wikipedia pages of several countries
point to the CIA World Factbook, besides other sources, e.g. UN data¹⁹.
Geographic web portals like Infoplease and World Atlas also take data from
the CIA World Factbook and the U.S. Census Bureau²⁰.
A Reliable Geographic Web Portal
We therefore seem to have at least partially succeeded in building a large
geographic portal that is reliable and contains a large number of facts.
The portal is online and open to the general public, see Austria-Forum²¹.
The main interface showing each country is shown in Figure 7. The portal
provides factual information about the 193 countries recognized by the
United Nations, along with a number of territories, oceans and islands.
Smart Display of Facts for further Verification
In order to get help from domain experts, when we
find discrepancies in area figures we list the various
numbers with their sources. In some cases it is pos-
¹⁹ https://data.un.org/
²⁰ http://www.census.gov/
²¹ http://austria-forum.org/af/Geography/
DATA 2015 - 4th International Conference on Data Management Technologies and Applications