Table 4: Average precision (P), recall (R), and F-measure (F1) with standard deviation for the compared methods.

Dataset  wGCS                                   wGCS positive                          ADIOS
         P            R            F1           P            R            F1           P            R            F1
ab       0.87 ± 0.00  1.00 ± 0.00  0.93 ± 0.00  0.84 ± 0.00  1.00 ± 0.00  0.92 ± 0.00  0.65 ± 0.10  1.00 ± 0.00  0.78 ± 0.07
bra1     1.00 ± 0.00  1.00 ± 0.00  1.00 ± 0.00  0.90 ± 0.00  1.00 ± 0.00  0.95 ± 0.00  0.70 ± 0.10  1.00 ± 0.00  0.82 ± 0.07
pal2     0.89 ± 0.02  0.99 ± 0.01  0.94 ± 0.01  0.73 ± 0.00  1.00 ± 0.00  0.85 ± 0.02  0.61 ± 0.09  1.00 ± 0.00  0.75 ± 0.07
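
As a reading aid for the F1 columns of Table 4, the following minimal Python sketch computes the F-measure as the harmonic mean of precision and recall. The function name `f1_score` is an illustrative assumption and is not taken from any of the compared systems. The table reports F1 averaged over runs, which need not coincide with the F1 of the averaged P and R.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper).

    Returns 0.0 when both inputs are zero to avoid division by zero.
    """
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Averaged P and R of wGCS on the "ab" dataset, as listed in Table 4.
ab_wgcs_f1 = f1_score(0.87, 1.00)
print(round(ab_wgcs_f1, 2))
```

For this row the value computed from the averaged P and R rounds to the reported 0.93; for rows with a non-zero standard deviation the two quantities may differ slightly.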
Table 5: p values for F1 obtained from Welch’s t test.

Dataset  wGCS vs. ADIOS  wGCS positive vs. ADIOS  wGCS vs. wGCS positive
ab       9.35e-05        1.56e-04                 6.63e-127
bra1     2.48e-05        2.82e-04                 3.39e-133
pal2     1.39e-05        2.39e-03                 1.23e-10
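
The test behind Table 5 can be sketched as follows. This is a minimal standard-library illustration of Welch's unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom, not the code used in the experiments. The helper name `welch_t` and the per-run F1 values are hypothetical and serve only to show the computation.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's t test on two independent samples.

    Hypothetical helper: uses sample variances (n - 1 denominator) and
    assumes at least one sample has non-zero variance.
    """
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up per-run F1 scores for two methods, NOT the paper's data.
runs_a = [0.93, 0.94, 0.92, 0.93, 0.95]
runs_b = [0.78, 0.85, 0.70, 0.81, 0.74]
t_stat, dof = welch_t(runs_a, runs_b)
print(t_stat, dof)
```

The two-sided p value is then obtained from a Student t distribution with `dof` degrees of freedom; in practice a library routine such as SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the statistic and the p value directly.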
ACKNOWLEDGEMENTS
The research was supported by the National Science Centre Poland (NCN), project registration no. 2016/21/B/ST6/02158.