I am a statistician with a strong foundation in data analysis, statistical modeling, and academic research. I am proficient in SPSS, R, and Python, with a keen eye for detail and a commitment to producing accurate, insightful results. I am eager to contribute to data-driven projects and grow in a professional analytics environment.
University of Southeastern Philippines, Bachelor of Science in Statistics, Aug 2021 - June 2025
This section showcases the data analytics projects I created during my college years and more recently.
Title: A Machine Learning Approach to Weather Prediction in Davao City, Philippines: Integrating K-Means Clustering and K-Nearest Neighbors (k-NN) Objectives:
Conclusion: The k-NN model applied to the full dataset outperformed the cluster-based approach in predicting relative humidity, suggesting that clustering did not enhance model performance in this case.
Code for K-means Clustering integrated with K-Nearest Neighbors (k-NN)
Code for K-Nearest Neighbors (k-NN)
Title: Comparative Analysis of Avocado Ripeness Classification Using Random Forest and k-Nearest Neighbors (k-NN) Objectives:
Conclusion: Of the two models evaluated, the Random Forest Classifier provided the more accurate and consistent performance, making it the more suitable algorithm for classifying avocado ripeness in this study.
Code for Random Forest and k-NN: Avocado Ripeness Classification
Retrieved Dataset: Avocado Ripeness
Title: Correlational Analysis on Years of Experience and Salary Objective:
Conclusion: The correlation coefficient between Years of Experience and Salary is approximately 0.98, indicating a very strong positive linear relationship. This means that as the number of years of experience increases, salary also tends to increase. The strength of this correlation suggests that experience is strongly associated with salary in this dataset, although correlation alone does not establish that experience causes salary growth.
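For readers unfamiliar with the statistic, here is a minimal sketch of how a Pearson correlation coefficient like the one reported above can be computed. It is an illustration only, not the project's actual code: the function name `pearson_r` and the toy values are hypothetical and are not taken from the Experience and Salary dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values for illustration, NOT the project's data.
years = [1, 2, 3, 4, 5]
salary = [40, 45, 52, 58, 66]
r = pearson_r(years, salary)
print(f"r = {r:.3f}")
```

A value of r close to 1 means the points lie close to an upward-sloping straight line; it says nothing by itself about why the two variables move together.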
Code for the Correlational Analysis
Retrieved Dataset: Experience and Salary
Title: Real Estate Price Prediction in the Philippines Using Ensemble and Regularization-Based Machine Learning Models Objectives:
Conclusion: The k-NN model showed reasonable performance in predicting property prices, with an RMSE of 3.91, R² of 0.75, and MAE of 3.76 on the test set, though it was the weakest of the four models. In comparison, Ridge Regression achieved an RMSE of 0.50, R² of 0.84, and MAE of 0.33, demonstrating a better fit with lower prediction errors. Lasso Regression had similar results to Ridge, with an RMSE of 0.50, R² of 0.84, and MAE of 0.38, suggesting that both regularized regression models handled the data effectively. The Random Forest model performed best, with an RMSE of 0.40, R² of 0.90, and MAE of 0.25, outperforming the k-NN, Ridge, and Lasso models and showing its ability to capture complex relationships in the data.
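The three metrics used in the comparison above can be sketched in a few lines. This is a hedged illustration using the standard textbook definitions, not the project's actual evaluation code: the function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for paired true and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Hypothetical values for illustration, NOT the project's data.
actual = [3.0, 5.0, 7.5, 10.0]
predicted = [2.5, 5.5, 7.0, 11.0]
print(regression_metrics(actual, predicted))
```

Lower RMSE and MAE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the target. RMSE is never smaller than MAE on the same predictions, because squaring gives large errors more weight.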
Title: A Robust k-NN Model for Breast Cancer Survival Analysis: Tackling Class Imbalance with Upsampling and Downsampling Objectives:
Conclusion: The k-NN model with cross-validation (downsampling) achieved an accuracy of 84.2%, while the k-NN model with bootstrapping (downsampling) achieved an accuracy of 84.5%. The difference between the two is small, only 0.3 percentage points. Since the bootstrap variant has the higher accuracy, we conclude that the k-NN model with bootstrapping (downsampling) performs slightly better.
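To make the downsampling step concrete, here is a minimal sketch of random downsampling, in which larger classes are randomly reduced to the size of the smallest class. It is an assumption about the general technique, not the project's actual code: the function name `downsample`, the `seed` parameter, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def downsample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes are equal in size."""
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and of equal length")
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement down to the minority-class size.
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical imbalanced toy data: 8 rows of class 1, 2 rows of class 0.
toy_rows = list(range(10))
toy_labels = [1] * 8 + [0] * 2
balanced_rows, balanced_labels = downsample(toy_rows, toy_labels)
print(len(balanced_rows), balanced_labels.count(0), balanced_labels.count(1))
```

In practice, resampling of this kind is usually applied only to the training portion of each cross-validation fold or bootstrap sample, so that the held-out data keeps its original class balance.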