Application of a text extraction algorithm to user profiles: the case of the researchers of the Universidad Técnica de Cotopaxi.
| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | bachelorThesis |
| Language: | spa |
| Published: | 2019 |
| Subjects: | |
| Online Access: | http://repositorio.utc.edu.ec/handle/27000/5752 |
| Summary: | The Universidad Técnica de Cotopaxi is a higher education institution whose research professors generate scientific output in the form of research papers, published books, and lectures. Producing scientific knowledge is fundamental, not only as a commitment but also because it brings personal and institutional benefits. As the volume of data managed by the institution grows, organizing it according to the research parameters it belongs to becomes difficult: doing so consumes effort, time, and money, and can become unworkable when the amount to classify is excessive. The main proposal of this technological project is therefore to develop a scientific platform that gathers a substantial body of information and then applies an automatic text classification algorithm to structure the relevant data into a specific domain (classes or categories). To achieve this, research methods for software development and text mining were used. First, a documentary and explanatory study was carried out, and research techniques such as interviews and surveys were applied to obtain reliable information. Second, the Scrum methodology was used to define the product backlog, which identified the 8 functionalities that make up the scientific platform "EcuCiencia", used to collect the relevant data. Finally, the Knowledge Discovery in Databases (KDD) methodology was applied, using machine learning techniques to prepare, filter, normalize, and label the text, apply the SVM classification algorithm, and evaluate the results. As a result, the platform is able to store significant information: it currently holds 468 research papers, 152 books, and 430 indexed lectures, from which data were extracted and processed to build a training model that served as the basis for the automatic classification applied in the Computer Systems Engineering program, making access to information easier, better organized, and faster. The technological proposal is part of the research study "Red de Estudios Cienciométricos REDEC". |
|---|
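
The KDD steps named in the summary (preparing, filtering, normalizing, and labeling the text, then SVM classification and evaluation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the stopword list, the example titles and their labels, the hyperparameters, and the tiny linear SVM trained by subgradient descent on the hinge loss, which stands in for whatever SVM implementation the project actually used.

```python
import re
from collections import Counter

# Hypothetical stopword list; the thesis does not state which one was used.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def normalize(text):
    """Prepare, filter and normalize: lowercase, keep letters, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def vectorize(tokens):
    """Bag-of-words counts scaled to unit L2 norm."""
    counts = Counter(tokens)
    norm = sum(c * c for c in counts.values()) ** 0.5 or 1.0
    return {term: c / norm for term, c in counts.items()}


def score(model, features):
    """Linear decision value w.x + b for a sparse feature dict."""
    weights, bias = model
    return sum(weights.get(t, 0.0) * v for t, v in features.items()) + bias


def train_linear_svm(samples, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1 / -1."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, label in samples:
            margin = label * score((weights, bias), features)
            for term in weights:  # regularization step: shrink every weight
                weights[term] *= 1 - lr * reg
            if margin < 1:  # sample violates the margin: move toward its label
                for term, value in features.items():
                    weights[term] = weights.get(term, 0.0) + lr * label * value
                bias += lr * label
    return weights, bias


def predict(model, text):
    """Return +1 or -1 for a raw title."""
    return 1 if score(model, vectorize(normalize(text))) >= 0 else -1


# Invented, hand-labeled titles: +1 = software engineering, -1 = other areas.
LABELED = [
    ("Agile methods for web application development", 1),
    ("Software testing in mobile application projects", 1),
    ("Scrum backlog management for software teams", 1),
    ("Soil quality in Andean agriculture", -1),
    ("Milk production on dairy farms", -1),
    ("Crop irrigation and soil moisture", -1),
]
model = train_linear_svm([(vectorize(normalize(t)), y) for t, y in LABELED])

# Evaluation on two held-out (also invented) titles.
HELD_OUT = [
    ("Testing a web application with Scrum", 1),
    ("Soil moisture on farms", -1),
]
accuracy = sum(predict(model, t) == y for t, y in HELD_OUT) / len(HELD_OUT)
```

A real deployment over hundreds of papers, books, and lectures would more plausibly use an established library, TF-IDF weighting, and a multi-class setup; the two-class toy above only shows how the pipeline stages connect.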