MINERÍA DE DATOS EN LA ACCIDENTABILIDAD VEHICULAR EN LA ZONA URBANA DEL CANTÓN LOJA
| Author: | |
|---|---|
| Format: | bachelorThesis |
| Language: | spa |
| Published: | 2023 |
| Subjects: | |
| Online access: | https://dspace.unl.edu.ec/jspui/handle/123456789/27840 |
| Summary: | Studies of vehicular accident rates help identify the factors that contribute to road accidents. This work applies data mining to vehicular accident data for the urban area of Loja, following the Knowledge Discovery in Databases (KDD) methodology in five stages: (i) data integration and collection; (ii) selection, cleaning and transformation; (iii) data mining; (iv) interpretation and presentation of results; and (v) dissemination and use. The data analyzed were obtained from the standardized traffic accident records held by the Operational Traffic Control Unit (UCOT) for the period 2018-2021. Data selection, cleaning and transformation were performed with the OpenRefine tool, including a comparison of the most influential variables in the traffic records. For the data mining stage, the decision tree technique was applied using the J48 algorithm in WEKA and the CART algorithm in Python. Forty-three tests were run to compare the predictive models. The Python tool showed better performance and accuracy for the variables hour (41.62%) and urban parish (34.59%), while the WEKA tool produced higher percentages of correctly classified instances for the variables "day", "typology", "causes", "nro_injured" and "nro_dead", with reported values of 36.21%, 58.37%, 38.10% and 98.64%. The study concludes that data mining can be applied in the urban area of Loja Canton through predictive models capable of forecasting the probability of a traffic accident, based on the 370 records from 2021. This yielded 370 probability percentages and distinct patterns for each of the vehicular accident attributes. Keywords: KDD Methodology, Decision trees, WEKA, Python, Traffic accident. |
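
The summary mentions decision trees built with CART and evaluated by the percentage of correctly classified instances. As a rough illustration only, the sketch below picks a single CART-style binary split by weighted Gini impurity and reports that percentage. The records, the attribute values, and all function names are hypothetical and are not taken from the thesis; the record does not state which split criterion or library the author used, so Gini impurity is an assumption here.

```python
from collections import Counter

# Hypothetical toy records (hour band, accident typology); NOT data from the thesis.
RECORDS = [
    ("morning", "collision"),
    ("morning", "collision"),
    ("morning", "run-over"),
    ("afternoon", "collision"),
    ("afternoon", "collision"),
    ("night", "run-over"),
    ("night", "run-over"),
    ("night", "collision"),
]


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_binary_split(records):
    """Try every 'feature == value' test; return (value, weighted Gini) of the best one."""
    best_value, best_score = None, float("inf")
    for value in sorted({feature for feature, _ in records}):
        left = [label for feature, label in records if feature == value]
        right = [label for feature, label in records if feature != value]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
        if score < best_score:
            best_value, best_score = value, score
    return best_value, best_score


def stump_accuracy(records, value):
    """Percentage of records correctly classified by a one-split tree on 'feature == value'.

    Each branch predicts its majority class; both branches are assumed non-empty.
    """
    left = [label for feature, label in records if feature == value]
    right = [label for feature, label in records if feature != value]
    predict = {
        True: Counter(left).most_common(1)[0][0],
        False: Counter(right).most_common(1)[0][0],
    }
    correct = sum(predict[feature == value] == label for feature, label in records)
    return 100.0 * correct / len(records)


if __name__ == "__main__":
    split_value, impurity = best_binary_split(RECORDS)
    print(f"split on hour band == {split_value!r}, weighted Gini = {impurity:.4f}")
    print(f"correctly classified: {stump_accuracy(RECORDS, split_value):.2f}%")
```

A full CART tree repeats this split search recursively inside each branch. The figure is computed on the same records used to choose the split, so it is a training accuracy and not an estimate of performance on unseen accidents.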