Some Estimates of Labor Contribution for Creating Digital Libraries
N. Kalenov^a, G. Savin^b, I. Sobolevskaya^c and A. Sotnikov^d
Joint Supercomputer Center of the Russian Academy of Sciences, Branch of Federal State Institution
“Scientific Research Institute for System Analysis of the Russian Academy of Sciences” (JSCC RAS - Branch of SRISA),
119334, Moscow, Leninsky av., 32 a, Russia
Keywords: Scientific Heritage, Digital Library, Russian Scientists, Information System, Network Technologies, Virtual
Exhibitions, Museum Objects, Digitization, Scientific Digital Library, Digitalization, Digital Books,
3D-Models, Technology, Labor Contribution, Span Time.
Abstract: One of the main directions of modern technological development is the digitalization of various areas of
activity. In science, this takes the form of integrated digital libraries that bring together various digital objects,
including digital copies of printed publications, 3D models of museum items, digitized images, and audio and film
materials. Scientific digital libraries are characterized by high requirements for the quality of digital copies of
printed scientific sources, since any ambiguity or contamination within chemical formulas or mathematical
expressions can lead to misreading or misunderstanding of the meaning. Special requirements for digital copies
also apply when digitizing rare editions and archival documents whose scientific and historical value lies not
only in their content but also in the notes made by scientists in the margins of a book or archival document.
These quality requirements make the preparation of digitized materials highly labor-intensive, and this labor
must be estimated when planning work on filling scientific libraries. This article presents a methodology for
calculating the span time for creating integrated digital scientific libraries, using the example of the
technology for forming the digital library "Scientific Heritage of Russia" (DL SHR). DL SHR contains detailed
information about scientists, their most important publications (digital copies of full texts), related archival
documents, and 3D models of museum items related to their activities. The methodology decomposes the entire
technological process into stages performed by specialists of a certain profile (librarians, editors, scanner
operators, etc.). Each stage is divided into several operations, and for each operation the time spent on
processing one accounting unit is estimated. Such units can be a page of a book, an entire book, a biography of a
scientist, etc. Span time is estimated either from published standards or, in their absence, from an analysis of
the experience of performing the operation. The article provides data on the time costs of individual operations
and of the formation of digital objects and their collections in the DL SHR, based on Russian standards and
10 years of experience.
1 INTRODUCTION
The creation of digital libraries is one of the most
rapidly developing fields in computer science. Today
there is a huge number of collections of scientific
information resources called digital libraries. These
digital libraries differ in subject matter and type
of management and use different approaches to the
formation of their holdings (content). It should be noted
that although the term "digital library" is used in a
great number of publications, no single clear definition
of it has yet been given. Without going back to the
definitions discussed 20 or more years ago, we give
several definitions from more recent publications.
a https://orcid.org/0000-0001-5269-0988
b https://orcid.org/0000-0003-4189-1244
c https://orcid.org/0000-0002-9461-3750
d https://orcid.org/0000-0002-0137-1255
Kalenov, N., Savin, G., Sobolevskaya, I. and Sotnikov, A.
Some Estimates of Labor Contribution for Creating Digital Libraries.
DOI: 10.5220/0010512500590066
In Proceedings of the 18th International Conference on e-Business (ICE-B 2021), pages 59-66
ISBN: 978-989-758-527-2
Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
In some publications (Jupp, 1997; Schwartz,
2000) the term "digital library" is defined as an
information system that makes it possible to store
securely and use effectively various types of digital
documents (text, visual, audio, video, etc.), whether
held in the system itself or available to it through
telecommunication networks.
More recent publications (Bogdanova,
2017; Parn, 2017) describe digital libraries
(DL) as complex distributed information
systems that provide new opportunities for working
with heterogeneous information. DL are considered
the basis for creating a global distributed repository
of knowledge. The authors of these articles also
point out that there is no single generally
accepted definition of a digital library.
Here are some definitions
(https://dic.academic.ru/dic.nsf/fin_enc/31885) of the
term "digital library":
Digital libraries are organized collections of
information resources and associated tools for
creating, archiving, sharing, searching, and using
information that can be accessed electronically
(https://www.encyclopedia.com/literature-and-arts/journalism-and-publishing/libraries-books-and-printing/digital-libraries).
A digital library, digital repository, or digital
collection, is an online database of digital objects that
can include text, still images, audio, video, digital
documents, or other digital media formats
(https://en.wikipedia.org/wiki/Digital_library).
Without going into a discussion of the above
definitions, we note that, no matter how the term
"digital library" is interpreted, a special technology
is needed to implement an "access network",
"information system" or "ordered collection" in
practice. Providing information resources to
users is what unites traditional and digital
libraries. Both must create a reference
apparatus that allows users to find what interests them
among the resources the library provides. For
traditional libraries, this apparatus consists of various
kinds of catalogs (including electronic ones) with search
elements developed over many years of library
practice. For digital libraries (in the broad sense of
the term), it consists of metadata databases with a search
interface of varying complexity (Chen, Lu, 2015). The
formation of digital libraries often requires
purposeful digitization of certain
publications. This is typical, first of all, of scientific
DL formed according to narrow thematic or other
fixed principles.
Creation and maintenance of digital libraries is
labor-intensive. Some of this labor relates to the DL
ontology (the choice of the database structure, the
definition of classes of objects included in the DL and
their relationships, metadata profiles and information
presentation formats) or to the creation or adaptation of
the software shell (Kozlova, 2019).
DL maintenance is required constantly, and the
amount of work it involves is related not so much to
technical support as to the formation of content,
which includes both the digital objects themselves and
the metadata that ensure the quality of search.
The subject of this article is the assessment of
permanent labor contributions to support digital
libraries using the example of the Digital Library
"Scientific Heritage of Russia" (Kalenov, Savin,
Serebryakov, Sotnikov, 2012; Kalenov, 2014;
Sotnikov, 2015; Zabrovskaya, 2017).
The digital library "Scientific Heritage of Russia"
(DL SHR) (http://e-heritage.1gb.ru/Catalog/IndexL)
has been operating on the Internet since 2010. The
main goal of the DL SHR is to create, preserve, and
provide access to accurate and reliable information
about outstanding scientists who have contributed to
the development of Russian science, and about their
scientific achievements. The DL SHR contains biographical
information about scientists, their major publications
(bibliography and scanned full texts), and archival
information and museum objects related to them. The
library includes text information, digitized printed
publications, archival documents, photographs and films,
and 3D models of museum items.
To date, the DL SHR provides information on
more than 6100 scientists who worked in Russia from
the 18th century to the first quarter of the 20th century;
about 25,000 books published during this period have been
digitized and made available to users.
The DL SHR is based on the principle of
distributed data with centralized editorial processing,
content downloading and technology support. More
than 20 libraries, institutes and museums prepare
information for DL SHR according to uniform rules.
The tasks of content providers are:
- the selection of materials in accordance with the
principles adopted for the DL SHR;
- the formation of metadata about the objects to be
included in the DL SHR (personalities, publications,
archival documents, museum items, photographs,
multimedia materials);
- the digitization of publications and the processing of
information in accordance with the rules of the system
(under the rules adopted for the DL SHR, scanned text is
not recognized, with the exception of the table of
contents);
- the transfer of processed materials to the editorial
group.
The editorial team performs the following
functions:
- makes the final decision on the inclusion of
publications proposed by providers in the digital
library;
- validates the metadata;
- prepares the editions that have passed the control
for loading into the software shell of the demo part
of the digital library.
The object-oriented approach chosen in the design of
the DL SHR, the use of distributed data preparation
technology, the representation in the DL SHR of various
digital objects of scientific purpose, and the positive
experience of many years of its operation make it possible
to consider the DL SHR as a prototype of the Common
Digital Space of Scientific Knowledge (CDSSK)
(Antopol'skij, Kalenov, Serebryakov, Sotnikov,
2019). The creation of the CDSSK is the most
important challenge in developing a system
for scientific, educational and cultural
information support and for the preservation of
scientific data and knowledge.
In this regard, it is important to build a labor input
model for the CDSSK on the basis of the DL SHR
operating experience. The model is built on examples
of the most labor-intensive processes: the
digitization of printed publications (the formation of
digital books) and the creation of 3D models of
museum objects.
2 PREPARATION OF DIGITAL
COPIES OF BOOKS
The DL SHR content is determined according to the
principle "from person to publication". Therefore, to
include a book in the DL SHR content, it is necessary
to enter information about its author first (first stage).
To determine the scientific interests of the
selected person, a universal hierarchical classification
($K$) of knowledge areas is used. This classification is
adopted to systematize the entire flow of scientific
and technical information.
In accordance with the DL SHR metadata
standards, biographical data about scientists,
their scientific interests in terms of the classification,
and a bibliography of their main works are entered into
the library.
Librarians perform this work. It includes four
operations:
- the search for sources of the scientist's biographical
data;
- the compilation of a detailed biography;
- the selection of the bibliography;
- the input of data into the DL SHR technological
block.
Let us denote by $t_p^1$, $t_p^2$ and $t_p^3$ the average
times spent, respectively, on finding sources and
compiling the biography, on compiling the bibliography,
and on entering the data.
Suppose that the information about the scientist has been
entered into the system. The technological processes then
carried out in the preparation of a publication for
inclusion in the DL SHR are presented in Table 1.
Table 1: Technological processes carried out in the
preparation of the publication for inclusion in the DL SHR.

| Stage number | Project scope | By whom | Accounting unit | Time |
|---|---|---|---|---|
| 1 | Selection of the publication; input into the technological block of primary metadata of the publication, including connections with persons | Librarian (organization - participant of the project) | Book | $t_k^1$ |
| 2 | Application consideration | Editorial team member | Book | $t_k^2$ |
| 3 | Getting the book from the library stock; the introduction of extended metadata of publications (including the serialization of the book to classification $K$) | Librarian (organization - participant of the project) | Book | $t_k^3$ |
| 4 | Sending for scanning, preparing a book for scanning | Librarian (organization - participant of the project) | Book | $t_k^4$ |
| 5 | Page scanning | Scanner operator | Page | $t_s^1$ |
| 6 | Image processing | Technical specialist | Page | $t_s^2$ |
| 7 | Table of contents processing; digital book build | Technical specialist | Book | $t_k^5$ |
| 8 | Book metadata quality control | Editor | Book | $t_k^6$ |
| 9 | Page metadata and navigation system quality control | Editor | Page | $t_s^3$ |
| 10 | Downloading the digital book into the DL SHR (link start-up between the book layout and metadata) | Technical specialist | Book | $t_k^7$ |
Thus, if a book of $N$ pages is entered into the DL
SHR, the total span time $T_B$ for its inclusion in the
library will be:
$$T_B = \sum_{i=1}^{7} t_k^i + N \cdot \sum_{i=1}^{3} t_s^i \qquad (1)$$
where $t_k^i$ are the per-book times and $t_s^i$ are the per-page times from Table 1.
Generating the information on a scientist that is
reflected in the DL SHR involves three time
intervals.
To implement the first stage (time interval $t_p^1$), it
is necessary to perform the following operations:
- selection of authoritative publications (primary
documents);
- obtaining information about the scientist, ordering
items from a library;
- delivery from storage;
- issuing to the user (in this case, to the employee
compiling the biography of the scientist);
- compiling a biography of the scientist based on
information from the publications received;
- returning the publications to the holdings.
The time spent on the selection of publications can be
estimated using the norm “Implementation of
thematic information; search and selection of
documents”. Given that, according to the DL SHR data,
2 to 3 sources are used on average when compiling a
biography of a scientist, this time is 15 minutes.
The library's technological operations related to
issuing editions from the holdings and accepting them
back are normalized per edition and total 13 minutes. We
estimate that these operations take about 30 minutes
(assuming that 2 items are borrowed).
To estimate the time spent on compiling a
biography of a scientist, we use the norm “writing
an abstract: studying and analyzing the document for
which the abstract is being prepared; writing a text”,
conditionally equating the compilation of a biography
with the compilation of an abstract of the selected
publications. This norm is 5920 minutes per author's
sheet (40,000 characters). An analysis of the data in
the DL SHR shows that the text of a scientist's
biography ranges from 1000 to 31000 characters and
averages about 6000 characters, or 15% of an
author's sheet. Thus, the standard time for compiling a
biography of a scientist and entering it into the system
is 888 minutes, and the total time for completing the first
stage of forming data about a scientist is
$t_p^1 = 15 + 30 + 888 = 933$ minutes.
The span time of the second stage (the formation of a
bibliographic list of the scientist's publications) can
be estimated on the basis of the norm for compiling a
bibliographic index, which is 13500 minutes per author's
sheet. The volume of a scientist's bibliography depends
on the period of his scientific activity: the average
number of publications by one scientist has increased
several times over the last century. Analysis of the data
entered in the DL SHR shows that the bibliographic
list of one scientist is, on average, 2200 characters, or
5.5% of an author's sheet. According to the norms,
compiling it takes $t_p^2 = 742$ minutes.
Entering structured data about a scientist can be
treated as the operation “typing on the keyboard
information about the reader: last name, first name,
patronymic, information characterizing him
(education, specialty, other information)”. According
to the norms, 6 minutes are given for this operation
($t_p^3 = 6$ minutes).
Thus, the total time spent on creating the digital
library information about one scientist
($T_P = t_p^1 + t_p^2 + t_p^3$) is 1681 minutes, or about
28 hours of a librarian's work.
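The arithmetic behind the total time per scientist can be checked with a short sketch. It uses only the norms quoted above; the function and variable names are illustrative and are not part of the DL SHR software.

```python
AUTHORS_SHEET_CHARS = 40_000  # one author's sheet, as stated in the text


def norm_minutes(norm_per_sheet: float, text_chars: int) -> float:
    """Scale a per-author's-sheet norm (in minutes) to a text of text_chars characters."""
    return norm_per_sheet * text_chars / AUTHORS_SHEET_CHARS


# Stage 1: source selection (15 min) + issuing and returning sources (30 min)
# + compiling a biography of about 6000 characters at 5920 min per sheet.
t_p1 = 15 + 30 + norm_minutes(5920, 6000)

# Stage 2: bibliography of about 2200 characters at 13500 min per sheet.
# The exact product is 742.5; the text uses 742, so the fraction is dropped here.
t_p2 = int(norm_minutes(13500, 2200))

# Stage 3: keyboard entry of structured data about the scientist.
t_p3 = 6

T_P = t_p1 + t_p2 + t_p3
print(f"T_P = {T_P:.0f} min = {T_P / 60:.1f} h")
```

The biography dominates the first stage (888 of 933 minutes), so the average biography length is the parameter to which the estimate is most sensitive.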
When assessing the librarians' labor costs $t_k^1$, $t_k^3$
and $t_k^4$, we use, together with the already
considered norms for the selection of literature, the
norms for "forming a bibliographic record for
documents in a language (descriptive cataloging)" (18
minutes per document), "indexing (meaningful
cataloging)" (18 minutes), “entering into a computer basic
information about the document (author, title) in a
specialized program” (5 minutes), “preparing
documents for microfilming and scanning”
(5 minutes) and “transferring documents for
microfilming and scanning” (16 minutes). The result is
$t_k^1 + t_k^3 + t_k^4 = 75$ min.
Consider the processes (indicated as stages in Table
1) performed by the editorial team, scanner operators
and technical specialists. As a basis, we take the
experience of filling the DL SHR and
the norms for non-contact scanning of documents
(the technology used in the DL SHR)
presented in (Yumasheva, 2012).
At stage 2 of Table 1, a member of the editorial team
decides whether to include in the digital library the
books proposed by the project participants. To do so,
each item is checked against the following criteria:
- compliance with the system journal of the library;
- compliance with copyright rules;
- absence of a duplicate bibliographic record.
A unique number (ID) is then assigned to the e-book.
If the item (book) cannot be registered, a note is made
indicating the reason for the decision. The
rate for one employee is 30 books per shift.
Based on this, we get $t_k^2 = 16$ min.
The rate per operator for page scanning (stage 5) is 800
pages per shift, which gives $t_s^1 = 0.6$ min.
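The per-unit times in this section follow from shift production rates. The sketch below assumes an 8-hour (480-minute) shift; this length is not stated explicitly in the text but is consistent with its figures (30 books per shift corresponding to 16 minutes per book). The function name is illustrative.

```python
SHIFT_MINUTES = 480  # assumption: 8-hour shift, inferred from 30 books/shift -> 16 min


def minutes_per_unit(units_per_shift: int, shift_minutes: int = SHIFT_MINUTES) -> float:
    """Convert a production rate (units per shift) into minutes per unit."""
    if units_per_shift <= 0:
        raise ValueError("units_per_shift must be positive")
    return shift_minutes / units_per_shift


print(minutes_per_unit(30))   # application review, minutes per book
print(minutes_per_unit(800))  # page scanning, minutes per page
```

The same conversion applies to any other stage whose norm is given as a number of books or pages per shift.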
The main task of the 6th stage (image processing) is
to check and edit the graphic images of the digital
pages. It includes three technological processes:
- automatic scan processing by a special program;
- checking the results of scanning and automatic
processing;
- manual correction.
As a result of the first process, typical scanning
defects are corrected to an acceptable level.
The second process involves:
- checking the sequential display of pages (page
numbering should be sequential, search for pages
missed during scanning);
- checking the quality of scanning (the degree of
readability of the text, at least 98% of the
information presented on the page must be
readable);
- checking the quality of automatic processing of
scanned pages (correct page cropping, geometric
correction of text, text bends and distortions);
- the simplest editing of scanned pages (cropping,
removing extraneous elements).
As part of the third process, the pages are manually
processed in one of the graphic editors. This process is
intended for the most complex editions, which contain
many formulas, tables, illustrations, etc. It includes:
- geometric correction of text, text bends and
distortions;
- removal of extraneous elements on the pages of
electronic books (operator's fingers, stripes,
shadows, and other extraneous elements);
- color correction.
The rate per operator during this stage is 800 pages
per shift. Thus $t_s^2 = 0.6$ min.
The main tasks of the 7th stage are:
- formation of the table of contents of the book
(recognition and editing of text or its manual
input);
- layout of the e-book in a special program from the
prepared high-quality page images and the
generated table of contents;
- creation of the most accurate navigation system of
the digital book.
In the process of creating a navigation system, the
technician must ensure:
- the correctness of typing, titles, notes and other
parts of the navigation system;
- the correctness of the electronic links and the
navigation system;
- completeness of the e-book: sequential number of
pages, order of sections.
The work rate for one specialist is 5 e-books per
shift: $t_k^5 = 96$ min.
Stage 8 (book metadata quality control) includes:
- checking that the author name, the title and the
output data correspond to those on the cover page;
- checking the formatting of records: spelling,
punctuation, accepted word abbreviations in
bibliographic data;
- checking that the information entered in the fields
"type of publication", "language" and "pages"
corresponds to the original. The "pages"
field is verified strictly against the electronic
version of the book and contains the total number
of files in the digital version prepared for
uploading to the site;
- checking for the presence of appropriate indexes;
- checking the formatting of the bibliographic
description (according to standards).
The work rate for one specialist is 10 e-books per
shift, from which it follows that $t_k^6 = 48$ min.
At stage 9 (page metadata and navigation system
quality control), the issuing editor checks the layout
of the e-book on the production server. The work of
the editor includes the analysis of graphic images of
the pages and checking the navigation system. It
includes:
- checking the sequential display of pages;
- checking the quality of scanning (the degree of
readability of the text, at least 99% of the
information presented on the page must be
readable);
- checking the quality of processing of scanned
pages (correct page cropping, geometric text
correction, absence of text bends and other
distortions, absence of "extraneous elements" -
stripes, shadows, operator fingerprints, etc.);
- checking links for their opening;
- checking links for compliance with the chapters
and contents of the book.
When shortcomings are identified, the
corresponding information is passed to the
operator of the 6th stage. The norm for this work is
1200 pages per shift; based on this, we get $t_s^3 = 0.4$
min.
At the final stage, the issuing editor publishes the
book and metadata on the e-library portal and checks
the availability of the loaded information. The
production rate for one specialist is 50 e-books per
shift, so $t_k^7 = 9.6$ min.
Substituting the obtained values into formula (1), we
find that the average time spent on digitizing one book
of $N$ pages and including it in the digital library
is (in minutes)
$$T_B = 244.6 + 1.6 \cdot N \qquad (2)$$
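The substitution can be reproduced with a minimal sketch that groups the stage times from the text into per-book and per-page components. The dictionary keys and the function name are illustrative; stages 1, 3 and 4 appear only as their combined value because the text reports them that way.

```python
# Per-book times in minutes (stages 1-4, 7, 8, 10 of Table 1).
# Stages 1, 3 and 4 are given in the text only as a combined 75 minutes.
BOOK_LEVEL = {
    "librarian, stages 1+3+4": 75.0,
    "application review, stage 2": 16.0,
    "contents and book build, stage 7": 96.0,
    "book metadata control, stage 8": 48.0,
    "loading into the library, stage 10": 9.6,
}

# Per-page times in minutes (stages 5, 6, 9 of Table 1).
PAGE_LEVEL = {
    "scanning, stage 5": 0.6,
    "image processing, stage 6": 0.6,
    "page and navigation control, stage 9": 0.4,
}


def book_span_time(n_pages: int) -> float:
    """Formula (1): fixed per-book time plus per-page time multiplied by N."""
    return sum(BOOK_LEVEL.values()) + n_pages * sum(PAGE_LEVEL.values())


fixed = round(sum(BOOK_LEVEL.values()), 1)
per_page = round(sum(PAGE_LEVEL.values()), 1)
print(f"T_B = {fixed} + {per_page} * N")
```

Keeping the two groups separate makes it easy to see that the per-page part overtakes the fixed part only for books longer than about 150 pages.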
Of this time, library workers spend $T_L = 75$ min,
editors $T_R = 64 + 0.4 \cdot N$, technical specialists
$T_c = 105.6 + 0.6 \cdot N$, and
scanning operators $T_0 = 0.6 \cdot N$.
Preparing and entering into the DL the first book,
200 pages in volume, of a scientist not previously
represented in the DL will take about 38 hours,
including ~ 29.5 hours of work of library specialists,
~ 2.5 hours of work of an editor, ~ 2 hours of work of
a scanner operator and ~ 4 hours of work of a technical
specialist. When another book by the same
author is added, the work of librarians is reduced
to about one and a half hours, and the total
preparation time for the book will
be about 10 hours.
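The split by staff role can be sketched as follows, using the coefficients given above and the 1681 minutes needed once per scientist. The function name and role labels are illustrative.

```python
def role_minutes(n_pages: int, first_book_of_author: bool) -> dict:
    """Minutes per staff role for one book, using the coefficients from the text."""
    # The 1681 minutes for the scientist's record are spent only once per author.
    librarian = 75.0 + (1681.0 if first_book_of_author else 0.0)
    return {
        "librarian": librarian,
        "editor": 64.0 + 0.4 * n_pages,
        "technical specialist": 105.6 + 0.6 * n_pages,
        "scanner operator": 0.6 * n_pages,
    }


for first in (True, False):
    minutes = role_minutes(200, first)
    total_hours = sum(minutes.values()) / 60
    label = "first book" if first else "further book"
    print(label, {role: round(m / 60, 1) for role, m in minutes.items()},
          f"total {total_hours:.1f} h")
```

For a 200-page book the unrounded totals come to roughly 37.4 hours for the first book of an author and 9.4 hours for each further one, which the text rounds to about 38 and 10 hours.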
3 LAYOUT PREPARATION OF 3D
DIGITAL MODELS OF
MUSEUM OBJECTS
Along with digital publications, DL SHR contains
multimedia content and, in particular, 3D-models of
museum objects. These objects can be associated with
a specific person (or several persons) or they can be
combined into an independent collection dedicated,
among other things, to a certain research area or
event. The estimated staff time required to create
a 3D-model of one object and to form digital collections
that include several objects is discussed below.
Various methods are used to visualize a three-
dimensional object (Kalenov, 2020; Scopigno, 2017;
Garstki, 2017; Guidi, 2020; Medina, 2020; Hipsley,
2020).
The photogrammetry method (Lobanov, 1984;
Carstensen, 1991) makes it possible to build a
high-quality 3D-model that conveys the texture and color
of the object. However, building 3D-models from a set of
images by photogrammetry is computationally
expensive.
For example, the processing of 124 photographs on
one of the nodes of the MVS-10P cluster
(http://new.jscc.ru/resources/hpc/#item1587)
installed at the JSCC RAS took 41 hours of
calculations (Sobolevskaya, 2019).
To form digital 3D-models in the DL
SHR, an interactive animation technology
was adopted (Sobolevskaya, 2019). With this technology,
a 3D-model is constructed by
programmatically changing (scrolling through) fixed views
of an object (frames) using standard interactive display
programs that simulate a change in the point of view
on the original object. To create such an interactive
animation, a set of pre-prepared scenes is needed that
serve as separate exposition frames.
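The scrolling principle can be illustrated with a minimal sketch. It assumes that the views are taken at equal angular steps around the object (120 views per object are mentioned in the description of the digitization stage) and that a viewer maps the requested viewing angle to the nearest stored frame. The function name and the mapping are illustrative and do not describe the actual DL SHR viewer.

```python
def frame_for_angle(angle_deg: float, n_frames: int = 120) -> int:
    """Return the index of the stored frame closest to the requested viewing angle."""
    step = 360.0 / n_frames  # angular distance between neighbouring frames
    return round((angle_deg % 360.0) / step) % n_frames


# Dragging the view from 0 to 9 degrees steps through neighbouring frames.
print([frame_for_angle(a) for a in (0, 3, 6, 9)])
```

Because only stored photographs are shown, no geometry has to be computed at viewing time, in contrast to photogrammetric reconstruction.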
Before proceeding with the formation of digital
3D-models of museum objects in order to include
them in the digital library, it is necessary to carry out
certain preparatory work performed by the staff of the
museum, which owns the modeled object.
This work includes:
- Selection of an object for digitization with the
documentation preparation;
- Inspection of the object for preservation with the
preparation of a protocol of inspection or entry into
the register of museum items;
- Issuance of an object for digitization.
The standard time $T_0$ required for this type of work
is, on average, 130 minutes per object.
After this preparatory work is completed, the
main cycle of work on creating a digital
3D-model of the museum object begins.
This cycle of work includes the following main
stages:
1. Preparing for digitizing: setting up the
object at the shooting location, adjusting lighting,
etc.
2. Digitization of the object. This work is carried out
with a special complex based on a specialized rotary
platform and a digital camera. The end result of
this stage is an array of data: files with
photographs of the object taken from 120 angles.
3. Processing of the data set obtained at the
digitization stage. At this stage, the background
against which the object was photographed is removed
from each photo. This is done using a software module
specially designed for this stage.
4. Layout and quality control of the digital resource
image. The result of this stage is a digital 3D-
image of the museum item.
5. Description of the museum item whose digital 3D-
model is included in the digital
library. This work is done by museum staff.
6. Loading the generated model into the DL SHR.
Let $T_1, T_2, T_3, T_4, T_5, T_6$ be the time intervals
required for processing one museum object at stages 1-6,
respectively.
Table 2 shows the technological processes carried
out in the creation of museum 3D-objects for
inclusion in the DL SHR.
Table 2: The technological processes carried out in the
creation of museum 3D-objects for inclusion in the DL
SHR.

| Stage number | Project scope | By whom | Accounting unit | Time |
|---|---|---|---|---|
| 1 | Preparing for digitizing | Museum employee | Museum object | $T_1$ |
| 2 | Digitization of the object | Technical specialist | Folder containing 120 jpg files for each object photographed | $T_2$ |
| 3 | Processing of the data set obtained at the digitization stage | Technical specialist | Obtained files | $T_3$ |
| 4 | Layout and quality control of the digital resource image | Technical specialist | Digital 3D-object | $T_4$ |
| 5 | Description of the museum item whose digital 3D-model is included in the digital library | Museum employee | Digital 3D-object | $T_5$ |
| 6 | Loading the generated model into the DL SHR | Technical specialist | Digital 3D-object | $T_6$ |
Thus, if $M$ digital museum 3D-objects
are introduced into the DL SHR, the average time
$T_{av}$ for the inclusion of this volume of digital
resources in the DL SHR is:
$$T_{av} = M \cdot \sum_{i=0}^{6} T_i$$
After several objects have been digitized, they can be
combined into one or more collections. Let $T_k$ be the
average time required to form and describe a
collection. Then the total time $T$ for the
formation of a digital collection of museum 3D
objects is: $T = T_{av} + T_k$.
The following are the numerical values of the average
time spent on the formation of digital 3D-models of
museum items based on the experience of creating
content in the DL SHR. In the process of replenishing
the digital library content, more than 100 3D-models
of museum items were prepared, combined into
several collections. Among them are a digital 3D-
collection of models of fruits by I.V. Michurin, stored
in the State Biological Museum named after K. A.
Timiryazev (GBMT), and a digital 3D-collection of
anthropological reconstructions by M.M. Gerasimov,
stored in the GBMT and the State Darwin Museum.
The average time values $T_1, T_2, T_3, T_4, T_5, T_6$ are
given below, based on the experience of forming
collections, including these ones.
To implement the first stage (preparation of the
object for digitization, time interval $T_1$), an average
of 45 minutes per object is required.
To implement the second stage (digitization of the
selected content, time interval $T_2$), an average of
20 minutes per object is required.
To implement the third stage (processing the files
obtained as a result of digitization, time interval $T_3$),
an average of 290 minutes per object is required.
To implement the fourth stage (layout and quality
control of the image of a digital resource, time
interval $T_4$), an average of 25 minutes per object is
required.
To implement the fifth stage (description of a
digital 3D-object, time interval $T_5$), an average of 15
minutes per object is required.
To implement the sixth stage (loading a 3D-object
into the DL SHR, time interval $T_6$), an average of 35
minutes per object is required.
Thus, the total time spent on presenting one digital
3D-model of a museum object in the DL SHR, including the
preparatory work $T_0$, is:
$T = 45 + 20 + 290 + 25 + 15 + 35 + 130 = 560$ min.
To form and describe a collection of at least 40 digital
3D-models of museum objects (time $T_k$), an average of
180 minutes is required.
When forming the digital 3D-collection of
anthropological reconstructions by M.M. Gerasimov,
3D-models of 50 of his works were created and uploaded
to the site http://acadlib.ru/, which is integrated with
the DL SHR. The total time taken to create this
collection was $T_{Ger} = 560 \cdot 50 + 180 = 28\,180$
min, that is, about 470 hours.
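The per-object and collection totals above can be reproduced with a short sketch that also shows how the time is distributed across stages. The stage labels and the function name are illustrative; the minute values are those reported in the text.

```python
# Average minutes per museum object, as reported in the text.
STAGE_MINUTES = {
    "preparatory work in the museum (T0)": 130,
    "preparing for digitizing (T1)": 45,
    "digitization (T2)": 20,
    "processing of photographs (T3)": 290,
    "layout and quality control (T4)": 25,
    "description (T5)": 15,
    "loading into the DL SHR (T6)": 35,
}


def collection_minutes(n_objects: int, collection_overhead: float) -> float:
    """Time for n_objects models plus the time to form and describe the collection."""
    return n_objects * sum(STAGE_MINUTES.values()) + collection_overhead


per_object = sum(STAGE_MINUTES.values())
for stage, minutes in STAGE_MINUTES.items():
    print(f"{stage}: {minutes} min ({minutes / per_object:.0%})")
print(f"per object: {per_object} min")
print(f"50 objects + collection: {collection_minutes(50, 180) / 60:.0f} h")
```

Background removal from the photographs accounts for roughly half of the time per object, so it is the stage where automation would reduce the total the most.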
4 CONCLUSIONS
Using the results obtained, it is possible to solve the
problem of optimizing the time spent on creating
digital copies of printed materials and museum
objects by parallelizing the technological processes
performed by library or museum specialists
(preparation of object metadata) and by technical
specialists (digitization of materials and quality
control).
The estimates obtained can be further extended to
the creation of digital copies in other types of
institutions when planning work on the formation of a
common digital space of scientific knowledge.
ACKNOWLEDGEMENTS
The research is carried out by Joint SuperComputer
Center of the Russian Academy of Sciences – Branch
of Federal State Institution “Scientific Research
Institute for System Analysis of the Russian Academy
of Sciences” within the framework of a state
assignment 0580-2021-0014.
REFERENCES
Antopol'skij A.B., Kalenov N.E., Serebryakov V.A.,
Sotnikov A.N. 2019. O edinom cifrovom prostranstve
nauchnyh znanij. In Vestnik Rossijskoj akademii, Vol.
89 (7). pages 728-735.
Bogdanova I.F. Bogdanova N.F. 2017. Elektronnye
biblioteki: istoriya i sovremennost'. In Informacionnoe
obshchestvo: obrazovanie, nauka, kul'tura i tekhnologii
budushchego. Vypusk 1. pages 133-153.
Carstensen, LW. 1991. Desk-top scanning for cartographic
digitization and spatial-analysis. In Photogrammetric
engineering and remote sensing. Vol. 57 (11). pages
1437-1446.
Chen J., Lu Q. 2015. A method for automatic analysis Table
of Contents in Chinese books. In Library hi tech. Vol.
33 (3). pages 424-438.
Garstki K. 2017. Virtual representation: the production of
3D digital artifacts. In Archaeol. Method Theory. Vol.
24. pages 726โ€“750.
Guidi G., Malik, US., Micoli, LL. 2020. Optimal Lateral
Displacement in Automatic Close-Range
Photogrammetry. In Sensors. Vol. 20 (21). โ„– 6280.
Hipsley CA., Aguilar R., Black, JR., Hocknull SA. 2020.
High-throughput microCT scanning of small
specimens: preparation, packing, parameters and post-
processing. In SCIENTIFIC REPORTS. Vol. 10 (1). โ„–
13863.
https://dic.academic.ru/dic.nsf/fin_enc/31885 (last access
26.01.2021).
http://documentation.sorbonne-universites.fr/en/resources/
digital-libraries/jubilotheque-upmcs-scientific-digital-
library.html (last access 24.01.2021).
http://e-heritage.1gb.ru/Catalog/IndexL (last access
24.01.2021).
http://new.jscc.ru/resources/hpc/#item1587 (last access
24.01.2021 (24.01.2021).
https://en.wikipedia.org/wiki/Digital_library (last access
24.01.2021).
https://www.encyclopedia.com/literature-and-arts/journal
ism-and-publishing/libraries-books-and-printing/digi
tal-libraries (last access 24.01.2021).
Jupp B. 1997. The Internet library of early journals. In Aslib
proceedings. Vol. 49 (6). pages 153-158.
Kalenov N.E. 2014. Upravlenie tekhnologiej napolneniya
elektronnoj biblioteki "Nauchnoe nasledie Rossii". In
Elektronnye biblioteki: perspektivnye metody i
tekhnologii, elektronnye kollekcii: trudy XVI
Vserossijskaya nauchnaya konferenciya RCDL. pages
357-361.
Kalenov N.E., Kirillov S.A., Sobolevskaya I.N., Sotnikov
A.N. 2020. Vizualizaciya cifrovyh 3d- ob"ektov pri
formirovanii virtual'nyh vystavok. In Elektronnye
biblioteki. Vol. 23 (4). pages 418-432.
Kalenov N.E., Savin G.I., Serebryakov V.A., Sotnikov
A.N. 2012. Principy postroeniya i formirovaniya
elektronnoj biblioteki "Nauchnoe nasledie Rossii". In
Programmnye produkty i sistemy, 2012. Vol. 4 (100).
pages 30-40.
Kozlova T., Zambrzhitskaia E., Simakov D., Balbarin Y.
2019. Algorithms for calculating the cost in the
conditions of digitalization of industrial production. In
International scientific conference digital
transformation on manufacturing, infrastructure and
service. Vol. 497. โ„– 012078.
Lobanov, A.N. 1984. Fotogrammetriya. In ยซNedraยป, pages
552.
Medina J.J, Maley J.M., Sannapareddy S.S., Medina N.N.,
Gilman C.M., McCormack J.E., 2020. A rapid and cost-
effective pipeline for digitization of museum specimens
with 3D photogrammetry. In Plos one. Vol. 15 (8). โ„–
e0236417
Parn EA., Edwards D. 2017. Vision and advocacy of
optoelectronic technology developments in the AECO
sector. In Built environment project and asset
management. SI. Vol. 7 (3). pages 330-348.
Scopigno R. 2017. Digital fabrication techniques for
cultural heritage: a survey. In Comput. Graph. Forum.
Vol. 36. pages 6โ€“21.
Schwartz C. 2000. Digital libraries: An overview. In
Journal of academic librarianship. Vol. 26 (6). pages
385-393.
Sobolevskaya I. N., Sotnikov A. N. 2019. Principles of 3D
Web-collections Visualization. In Proceedings of the
3rd International Conference on Computer-Human
Interaction Research and Applications. pages 145-151.
Sotnikov A.N., Kirillov S.A. 2015. Tekhnologiya
podgotovki elektronnyh izdanij dlya elektronnoj
biblioteki "Nauchnoe nasledie Rossii". In
Informacionnoe obespechenie nauki: novye
tekhnologii. pages 178-190.
YUmasheva YU.YU. 2012. Metodicheskie rekomendacii
po elektronnomu kopirovaniyu arhivnyh dokumentov i
upravleniyu poluchennym informacionnym massivom.
VNIIDAD. pages 125.
Zabrovskaya I.E., Kirillov S.A., Kondrat'eva E.A. Pruglo
O.A., Sotnikov A.N. 2017. Voprosy formirovaniya
fondov elektronnoj biblioteki "Nauchnoe nasledie
Rossii". In Informacionnoe obespechenie nauki: novye
tekhnologii. pages 184-191.
ICE-B 2021 - 18th International Conference on e-Business