Deep learning neural network development for the classification of bacteriocin sequences produced by lactic acid bacteria
| Main Author: | |
|---|---|
| Format: | bachelorThesis |
| Language: | eng |
| Published: | 2024 |
| Subjects: | |
| Online Access: | http://repositorio.yachaytech.edu.ec/handle/123456789/734 |
| Summary: | The rise of antibiotic-resistant bacteria creates a pressing need to explore new natural compounds with novel mechanisms of action that can replace existing antibiotics. Bacteriocins offer promising alternatives for developing therapeutic and preventive strategies in livestock, aquaculture, and human health. Specifically, those produced by lactic acid bacteria (LAB) are recognized as GRAS (Generally Recognized as Safe) and QPS (Qualified Presumption of Safety). This study used a deep learning neural network for binary classification of bacteriocin amino acid sequences, distinguishing those produced by LAB from those that are not. This type of network can learn complex patterns and representations of data. Features were extracted using the k-mer method and vector embedding. Ten feature groups were tested, combining embedding vectors (EV) and k-mers: 'EV', 'EV+3-mers', 'EV+5-mers', 'EV+7-mers', 'EV+15-mers', 'EV+20-mers', 'EV+3-mers+5-mers', 'EV+3-mers+7-mers', 'EV+5-mers+7-mers', and 'EV+15-mers+20-mers'. As a result, five sets of 100 characteristic k-mers unique to bacteriocins produced by LAB were obtained, one for each of k = 3, 5, 7, 15, and 20. A significant difference was observed between the 'EV' group and the 'EV+5-mers+7-mers' group, with the latter achieving better accuracy and loss. Under k-fold cross-validation with k = 30, the average loss, accuracy, precision, recall, and F1 score were 9.900%, 90.143%, 90.300%, 90.100%, and 90.100%, respectively. Fold 22 stood out with 8.500% loss, 91.471% accuracy, and 91.000% precision, recall, and F1 score. This performance is consistent with the existing literature. |
|---|
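
The k-mer step mentioned in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the summary does not say how "characteristic" k-mers were selected or how they were combined with the embedding vectors. The sketch assumes that a characteristic k-mer is one that occurs in LAB sequences and never in non-LAB sequences, ranked by how many LAB sequences contain it. The function names (`kmers`, `characteristic_kmers`, `kmer_features`) and the toy strings are hypothetical and are not real bacteriocin sequences.

```python
from collections import Counter


def kmers(sequence: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k (empty if k > len)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def characteristic_kmers(positive, negative, k, top_n=100):
    """Assumed criterion: k-mers seen in the positive class only,
    ranked by the number of positive sequences that contain them."""
    pos_counts = Counter()
    for seq in positive:
        # Count each k-mer once per sequence (document frequency).
        pos_counts.update(set(kmers(seq, k)))

    neg_seen = set()
    for seq in negative:
        neg_seen.update(kmers(seq, k))

    unique = [(km, n) for km, n in pos_counts.items() if km not in neg_seen]
    # Most frequent first; ties broken alphabetically for determinism.
    unique.sort(key=lambda item: (-item[1], item[0]))
    return [km for km, _ in unique[:top_n]]


def kmer_features(sequence, vocabulary, k):
    """Binary presence vector of the vocabulary k-mers in one sequence."""
    present = set(kmers(sequence, k))
    return [1 if km in present else 0 for km in vocabulary]


# Toy strings over the amino acid alphabet (hypothetical data).
lab = ["ACDEF", "CDEFG"]
non_lab = ["ACDKL"]

vocab = characteristic_kmers(lab, non_lab, k=3, top_n=3)
print(vocab)                              # ['CDE', 'DEF', 'EFG']
print(kmer_features("ACDEF", vocab, 3))   # [1, 1, 0]
```

In a setup like the 'EV+5-mers+7-mers' group, presence vectors of this kind for k = 5 and k = 7 would be concatenated with the embedding vector before being passed to the network; how the thesis actually performs that combination is not stated in the summary.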