3.1.3 Scene Space
Scene space refers to the place where a scene exists; every scene has one. Scene space can be classified as natural or artificial, indoor or outdoor, and so on, and it can be a country, a city, or a particular location. For instance, if we regard the Great Wall as a scene, its scene space is China, or more specifically Beijing.
3.1.4 Scene Objects
Scene objects are things that can be visualized in the scene. They usually refer to entities, that is, things that can be directly visualized. Scene objects are crucial: they are among the most basic factors that make up a scene, and a scene without scene objects is like a blank sheet of drawing paper. From the perspective of part of speech, scene objects belong to the category of nouns, but not all nouns denote scene objects. Nouns such as apple, table, wind, and China are scene objects, because scene objects are things that exist in the world or can be represented by existing things. Apple and table denote concrete things that exist in the real world, so they are also called entities. Entities have their own visual attributes, including color, size, shape, and texture. In addition, some nouns reflect the process or result of human cognition of the objective world. They refer not to concrete, existing things but to invisible concepts, so they are not suitable for visualization; spirit, matter, and friendship are typical examples.
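The distinction drawn above, between visualizable entities carrying visual attributes and abstract nouns that carry none, can be captured in a simple data structure. The following sketch is illustrative only; the field names are our assumptions, not part of the original system:

```python
from dataclasses import dataclass

# Illustrative sketch: an entity noun carries the visual attributes
# named in the text (color, size, shape, texture); an abstract noun
# such as "friendship" would have no such record.
@dataclass
class Entity:
    name: str
    color: str
    size: str
    shape: str
    texture: str

apple = Entity(name="apple", color="red", size="small",
               shape="round", texture="smooth")
```

A scene could then be populated from the set of such entity records extracted from the text, while abstract nouns are simply skipped.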
3.1.5 Scene Relationships
Scene relationships include the spatial and non-spatial relationships between objects in the scene. Spatial relations are a crucial component of scene descriptions: they define the basic layout of a scene, covering the orientation relationship, the distance relationship, and the topological relationship. Examples include "the computer is on the table", "John is 5 feet in front of the tree", and "a picture is hanging on the wall". Spatial relations are often denoted by prepositions such as on, under, beyond, and in front of. In describing a spatial relationship, the specific location of an object is also related to its size and shape. Non-spatial relationships are not directly visible in the scene, but they are important as well. They include the part-of relation, the container-of relation, and others: for example, "arm of the chair" expresses a part-of relation, and "bowl of cherries" a container-of relation.
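Both kinds of relationship can be represented uniformly as (subject, relation, object) triples; the encoding below is a hypothetical sketch, not the paper's actual representation:

```python
# Hypothetical sketch: scene relationships as (subject, relation, object)
# triples. Spatial relations carry layout information; non-spatial ones
# (part-of, container-of) describe structure rather than position.
spatial = [
    ("computer", "on", "table"),
    ("John", "in front of", "tree"),
    ("picture", "hanging on", "wall"),
]
non_spatial = [
    ("chair", "part-of", "arm"),          # the arm is part of the chair
    ("bowl", "container-of", "cherries"), # the bowl contains the cherries
]
```

A scene-layout component could consume the spatial triples to place objects, while the non-spatial triples constrain which objects attach to or contain which others.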
4 SCENE DIVISION
An article consists of a series of words, and the most important words in a multi-scene text are the entity nouns. Therefore, to divide a multi-scene text, we must first annotate the entity nouns with topic IDs. The division is based not on the words themselves but on the topic IDs assigned by the LDA model. This reduces sparsity, because the word space is mapped into a lower-dimensional topic space. To be effective, the topic model must be trained on documents similar in content to the test document. A sentence is taken to be the smallest basic unit. We introduce a window parameter that defines the number of sentences contained in a window, and then calculate the similarity of two adjacent windows. The value of the window parameter cannot be declared in advance; it depends on the multi-scene text being divided. To calculate the similarity, we use only the topic IDs assigned to the entity nouns. Assuming an LDA model with T topics, the frequency of each topic in a window block is counted to form a T-dimensional vector, and the cosine similarity between the vectors of adjacent windows is computed. A value close to zero indicates that the two adjacent window blocks are not associated and belong to two different scenes; a value close to one indicates that they are associated and belong to the same scene. In this experiment, we identify scene boundaries by setting a threshold on this similarity.
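The procedure above can be sketched in a few lines. In this illustration each sentence is assumed to be already reduced to the list of topic IDs assigned to its entity nouns; the window size and threshold values are placeholders, since the paper notes they depend on the text being divided:

```python
import math
from collections import Counter

def topic_vector(window_sentences, T):
    """Count topic-ID frequencies over a window of sentences,
    forming a T-dimensional vector."""
    counts = Counter(t for sent in window_sentences for t in sent)
    return [counts.get(i, 0) for i in range(T)]

def cosine(u, v):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def scene_boundaries(sentences, T, window=2, threshold=0.3):
    """Mark a boundary wherever the topic vectors of two adjacent
    windows have cosine similarity below the threshold."""
    boundaries = []
    for i in range(window, len(sentences) - window + 1):
        left = topic_vector(sentences[i - window:i], T)
        right = topic_vector(sentences[i:i + window], T)
        if cosine(left, right) < threshold:
            boundaries.append(i)  # boundary before sentence i
    return boundaries

# Toy input: six sentences, each a list of LDA topic IDs; the topic
# shift between sentences 2 and 3 should be detected as a boundary.
sents = [[0, 0], [0, 1], [0, 0], [3, 3], [3, 2], [3, 3]]
print(scene_boundaries(sents, T=4, window=2, threshold=0.3))  # → [3]
```

With the toy input, the left window [[0, 1], [0, 0]] and right window [[3, 3], [3, 2]] share no topics, so their cosine similarity is zero and a scene boundary is reported before sentence 3.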
4.1 LDA Model
LDA is a generative model for text and other collections of discrete data introduced by Blei et al. (Blei, 2003), and it is a classic probabilistic topic model. The LDA model contains three layers: words, topics, and documents. Blei added Bayesian prior probabilities to the PLSA algorithm, which gave birth to LDA. LDA is an unsupervised machine learning model that can identify latent topic information in large-scale document sets or corpora, and it is widely applied in natural language processing and document classification. The method assumes that each word in a document is drawn from a topic. Model training estimates two distributions: the document-topic distribution and the topic-word distribution. Since LDA is a