because of the economy of scale, sharing resources with other Virtual Organizations (VOs). In our particular case study, for simplicity, minor fixed-cost differences, such as building amortization, are not considered. There are also vendor discounts for capital purchases and maintenance, which are difficult to estimate because the negotiated pricing is often confidential. Such discounts can be significant for the Grid option compared with the dedicated datacentre, but they are not taken into consideration in the Grid cost estimation.
Among the other capital costs of the Grid option, the file system servers and the storage area network benefit from a cost reduction because they are shared with other VOs. We estimate a reduction of 2/3, which is easy to reach in the Grid context of EGI. Regarding the maintenance and licensing costs, we take as an example the complex Grid infrastructure of our institute. We operate a site belonging to the federated Tier-2 for the ATLAS experiment (http://atlas.ch/), and a site belonging to the Grid-CSIC infrastructure (http://www.grid.csic.es), with a total storage of nearly 1 PB. These infrastructures are integrated in the National Grid Initiative (NGI/EGI), and there is an operational collaboration with the rest of the Grid site teams for coordination and self-support. The operation of this complex Grid site includes not only the datacentre, but also the administration of the computing resources.
This complex infrastructure is supported by a team of 3 FTEs, working without third-party maintenance support. Additional support is obtained from the collaboration with the NGI-EGI operation groups. This is possible in the collaborative environment of the Grid communities, with its context of know-how sharing. Regarding licensing, the gLite middleware is provided by the European Middleware Initiative (EMI) (http://www.eu-emi.eu/) under an open-source license, at no cost. The mass storage systems supported by the gLite middleware are dCache and CASTOR (Burke et al., 2009). In our complex Grid infrastructure we also use Lustre (Sun Microsystems, 2009), which has no licensing costs either. Other third-party mass storage systems can have licensing costs. Under these premises, we can adopt a similar scheme for the AGATA storage Grid option and eliminate the maintenance and licensing costs. Finally, in the cost breakdown analysis, Table 1 assigns 2 FTEs to the AGATA Grid storage, 1 FTE less than the dedicated datacentre, because of the economy of scale in the operational tasks, which can be performed for many VOs at once.
Since most Cloud providers offer a wide range of storage services, the AGATA storage requirements fall within the standard range of Cloud storage.
2.2 Data Communication Costs
As far as AGATA is concerned, the data source is the Data Acquisition system (DAQ). The DAQ includes a premium-range storage system, which is able to deal with the raw-data throughput of the AGATA detector. For cost reasons, the DAQ storage size is reduced to the space required for processing the data of the active experiment. Therefore, the DAQ needs to transfer the produced raw data in quasi-real time to the mass storage system. In the following we analyse the transfer requirements.
In the Introduction, we have shown that the experimental data produced by the AGATA demonstrator with 4 triple detectors is 10 TB per experiment on average. For the network requirements analysis we take into account not the average but the peak throughput of the experiments, which is about 20 TB for the AGATA demonstrator, that is, 5 TB per triple detector. If we consider the filtering factors mentioned there, this throughput can reach 10 TB per triple detector in the complete AGATA ball, which, with 60 triple detectors, gives a total of 600 TB for a peak experiment.
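This estimate reduces to a few lines of arithmetic, summarised in the Python sketch below (purely illustrative); reading the filtering factors as doubling the per-detector peak volume from 5 TB to 10 TB in the complete ball is our assumption, as are the variable names.

# Sketch of the peak data-volume estimate (illustrative; the doubling of
# the per-detector volume by the relaxed filtering is an assumption).

demonstrator_detectors = 4
demonstrator_peak_tb = 20.0                 # peak experiment of the demonstrator

per_detector_peak_tb = demonstrator_peak_tb / demonstrator_detectors   # 5 TB

filtering_factor = 2.0                      # assumed: 5 TB -> 10 TB per detector
full_ball_detectors = 60
per_detector_full_ball_tb = per_detector_peak_tb * filtering_factor    # 10 TB

peak_experiment_tb = per_detector_full_ball_tb * full_ball_detectors   # 600 TB
print(f"Peak experiment volume: {peak_experiment_tb:.0f} TB")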
In our transfer tests we obtained an effective transfer rate of 170 MB/s. At that rate, the 600 TB of raw data of a peak experiment would be transferred in about 42 days. This is clearly not scalable, since about 30 experiments are planned per year and the AGATA DAQ has storage space for only one experiment. For experiments of this peak size, the AGATA project can book some extra off-time before the next experiment, and the transfers can start during the data taking, so the peak transfers can take up to two weeks. For our peak network requirement, 600 TB in 14 days, an effective transfer rate of 520 MB/s is necessary, equivalent to 4,160 Mbps, which requires a dedicated network link of 5 Gbps.
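The bandwidth figures follow from simple unit conversions. A minimal Python sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and our own helper names, is:

# Back-of-the-envelope check of the transfer figures above.

PEAK_VOLUME_TB = 600
SECONDS_PER_DAY = 24 * 3600

def days_to_transfer(volume_tb: float, rate_mb_s: float) -> float:
    """Days needed to move volume_tb at a sustained rate of rate_mb_s."""
    return volume_tb * 1e12 / (rate_mb_s * 1e6) / SECONDS_PER_DAY

def required_rate_mb_s(volume_tb: float, days: float) -> float:
    """Sustained MB/s needed to move volume_tb within the given days."""
    return volume_tb * 1e12 / (days * SECONDS_PER_DAY) / 1e6

print(days_to_transfer(PEAK_VOLUME_TB, 170))    # ~41 days at the measured 170 MB/s
rate = required_rate_mb_s(PEAK_VOLUME_TB, 14)   # ~496 MB/s for a two-week window
print(rate, rate * 8)                           # ~3,970 Mbps; the 520 MB/s (4,160 Mbps)
                                                # used in the text leaves some headroom,
                                                # hence the dedicated 5 Gbps link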
This analysis illustrates premium network requirements, not only at the physical layer but also in the transfer software, in order to scale to the data transfer of the 60 triple detectors. Private leased lines are dedicated circuits, with prices depending basically on speed and distance. A good option for our purpose could be two OC-48 circuits (Optical Carrier at 2,488 Mbps each) to reach the required 5 Gbps of full-time dedicated connection. An estimation in (NortelNetworks, 2009) quotes £2,500 per fibre mile for an 18-month lease, including installation and the rest of the costs, which is equivalent to 1,281 €/km for a one-year channel lease.
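To illustrate how the per-kilometre figure and a yearly budget would be obtained, the Python sketch below redoes the conversion; the £/€ exchange rate (about 1.24 €/£) is not given in the text and is our assumption, and the 100 km distance is a purely illustrative placeholder, not the actual DAQ-to-storage distance.

# Sketch of the leased-line cost arithmetic (assumptions: ~1.24 EUR/GBP
# exchange rate, not stated in the text; the distance is a placeholder).

MILE_KM = 1.609344
GBP_TO_EUR = 1.24            # assumed exchange rate of the period

quote_gbp_per_mile_18m = 2500
eur_per_km_year = quote_gbp_per_mile_18m / MILE_KM * (12 / 18) * GBP_TO_EUR
print(f"{eur_per_km_year:.0f} EUR/km/year")   # ~1,284, close to the 1,281 quoted above

def yearly_lease_cost(distance_km: float, channels: int = 2) -> float:
    """Yearly leasing cost for `channels` circuits over distance_km,
    ignoring multi-channel vendor discounts as the text does."""
    return channels * eur_per_km_year * distance_km

print(yearly_lease_cost(100.0))   # example with a purely illustrative 100 km distance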
Vendor discounts are usual for multiple channels after the first one, but they are difficult to estimate, so for our purposes we take the cost of two complete channels. For the distance estimation, we take the distance between the AGATA demonstrator DAQ and the data storage, ab-