ver, in our approach, we use an MLP neural network
trained on a set of metamodel elements, where all
elements are well known and the encoding scheme is
simple. The work in (Zhang and Chen, 2018) deals
with the link prediction problem in
network-structured data. It proposes a method that
learns heuristics from local subgraphs using a
graph neural network (GNN). A document or a model
could be encoded as a graph, but that work offers no
specific treatment for metamodel elements. An
integration of these approaches with our solution
could improve the capabilities of the classifier.
We presented an approach for classifying JSON
documents into existing metamodels. The solution
makes it possible to discover the domain of JSON
documents and can serve as an initial typing
scheme. We presented the automated steps of the
approach, consisting of metamodel extraction into
an MLP using a one-hot encoding (OHE) of the
elements, network training, and translation and
classification of the input JSON documents. The
extraction algorithm relies on the presence (or
absence) of the elements in a given input document,
since it translates the elements into a binary
classification problem. The results show that the
approach is effective for classifying JSON
documents, with precision varying from 46 to 97
percent, depending on the kinds of elements. We
achieved our main goal of showing that a simple,
domain-specific extraction algorithm can be useful
for classifying documents, instead of adapting more
complex structure-based classification approaches.
The results and the implemented algorithms are
publicly available for download.
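
As an illustration of the presence-based encoding summarized
above, the following is a minimal sketch and not the published
implementation. It assumes that metamodel elements are matched
against JSON object keys by name, which is our simplification
for the example. The names collect_keys, encode_presence and
VOCABULARY are hypothetical, as is the sample vocabulary.

```python
import json


def collect_keys(node, found=None):
    """Recursively gather every object key in a parsed JSON value."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


def encode_presence(document_text, vocabulary):
    """Return one bit per vocabulary element: 1 if present, 0 if absent.

    Assumption: an element counts as present when a JSON key with the
    same name occurs anywhere in the document.
    """
    keys = collect_keys(json.loads(document_text))
    return [1 if element in keys else 0 for element in vocabulary]


# Hypothetical vocabulary of metamodel element names (illustrative only).
VOCABULARY = ["name", "address", "street", "isbn", "title"]

doc = '{"name": "Ada", "address": {"street": "Main St"}, "tags": ["a"]}'
vector = encode_presence(doc, VOCABULARY)
print(vector)
```

A vector of this form has a fixed length equal to the size of
the vocabulary, so it could be fed directly to the input layer
of an MLP. Keys that are not in the vocabulary (such as "tags"
in the sample document) are ignored.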
There are several open issues left for future
work, such as testing the output of the extraction
algorithm with other classification algorithms. We
also plan to extend the algorithm to cover more
complex relationships between model elements and to
test whether this improves the results.
