Model for automatic detection of lexical-syntactic errors in texts written in Spanish
Evaluating written texts mainly involves two aspects: syntax and semantics. The first concerns the form of the text; the second, its meaning. Performing this task manually demands time and resources, which can be reduced if part of the process is carried out automatically. According to the reviewed literature, there are several techniques for automatically correcting texts. One of them is the linguistic approach, which focuses on syntactic, semantic, and pragmatic elements. Within that approach, this ongoing research addresses the automatic evaluation of syntactic errors in texts written in Spanish as a starting point for ensuring coherence and cohesion in text composition, which may be useful in academic settings. To carry out the study, a set of texts written by students enrolled in an academic program was collected and analyzed using natural language processing and machine learning techniques. The corpus was also corrected manually so that the results of the two methods could be compared, and a correspondence was established between them. On that basis, it was concluded that the automatic method supports the syntactic correction of texts written in Spanish.
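The comparison between manual and automatic correction described in the abstract can be illustrated with a small agreement calculation. The following is a minimal sketch, not the authors' procedure: it assumes each sentence receives a yes/no error flag from a human corrector and from the automatic method, and it reports precision, recall, and Cohen's kappa as one possible way to quantify the correspondence. The function name `agreement_report` and the sample labels are hypothetical.

```python
from typing import Dict, Sequence


def agreement_report(manual: Sequence[bool], automatic: Sequence[bool]) -> Dict[str, float]:
    """Compare per-sentence error flags from manual and automatic correction.

    Hypothetical helper: True means "this sentence contains a syntactic error".
    The manual flags are treated as the reference annotation.
    """
    if len(manual) != len(automatic):
        raise ValueError("both annotations must cover the same sentences")
    if not manual:
        raise ValueError("at least one sentence is required")

    n = len(manual)
    # Confusion-matrix counts.
    tp = sum(m and a for m, a in zip(manual, automatic))
    fp = sum((not m) and a for m, a in zip(manual, automatic))
    fn = sum(m and (not a) for m, a in zip(manual, automatic))
    tn = n - tp - fp - fn

    # Observed agreement: fraction of sentences where both methods coincide.
    observed = (tp + tn) / n
    # Agreement expected by chance, from each method's marginal rates.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    # Cohen's kappa; defined as 1.0 when chance agreement is already total.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "observed_agreement": observed,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
    }


# Invented labels for eight sentences, for illustration only.
manual_flags = [True, True, False, False, True, False, False, False]
automatic_flags = [True, False, False, False, True, True, False, False]
report = agreement_report(manual_flags, automatic_flags)
print(report)
```

Kappa is included alongside raw agreement because most sentences in a corpus may be error-free, in which case two methods can agree often purely by chance.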
In accordance with Article 19 of Agreement 034 of 2014 (ITM Intellectual Property Statute): "The ideas expressed in the works and research published and/or stated by its professors, contractors, administrative officers, staff, collaborators, apprentices, visitors, students, and researchers in any context are the sole responsibility of their authors and do not express the official position of the Institution."
The articles published by the journal TecnoLógicas are literary and scientific works protected by copyright law. By signing the Declaration of Originality, and by submitting the work for consideration or possible publication, the author(s) authorize the Metropolitan Technological Institute (ITM), free of charge, to publish, reproduce, communicate, distribute, and transform the work, and declare under oath that the work is original, unpublished, and of their exclusive authorship.
The full texts of the articles are published under a Creative Commons Attribution-NonCommercial-ShareAlike license.