| slug | title | type | description | shortDescription | publishedAt | readingTime | status | tags | icon |
|---|---|---|---|---|---|---|---|---|---|
| sl-breast-cancer | Breast Cancer Detection | Academic Project | Prediction of breast cancer presence by comparing several supervised classification models using machine learning techniques. | A project comparing supervised classification models to predict breast cancer presence using machine learning. | 2025-06-06 | 2 | Completed | | i-ph-heart-half-duotone |
This project was carried out as part of the Statistical Learning course at Paris-Dauphine PSL University. The objective is to identify the most effective model for predicting or explaining the presence of breast cancer based on a set of biological and clinical features.
📊 Project Objectives
Develop and evaluate several supervised classification models to predict the presence of breast cancer based on biological features extracted from the Breast Cancer Coimbra dataset, provided by the UCI Machine Learning Repository.
The dataset contains 116 observations divided into two classes:
- 1: healthy individuals (controls)
- 2: patients diagnosed with breast cancer
There are 9 explanatory variables, including clinical measurements such as age, insulin level, leptin level, and insulin resistance.
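As a minimal sketch of how this layout might be handled in code, the snippet below recodes the 1/2 class labels to 0/1 so that the cancer class becomes the positive class. The column names are assumed to match the header of the UCI CSV file and should be checked against the downloaded copy; `recode_label` and `class_balance` are hypothetical helpers, not part of the project code.

```python
from collections import Counter

# Assumed column names of the UCI Breast Cancer Coimbra CSV (verify locally).
FEATURE_COLUMNS = [
    "Age", "BMI", "Glucose", "Insulin", "HOMA",
    "Leptin", "Adiponectin", "Resistin", "MCP.1",
]
TARGET_COLUMN = "Classification"

# Dataset coding: 1 = healthy control, 2 = breast cancer patient.
LABEL_MAP = {1: 0, 2: 1}


def recode_label(raw_label: int) -> int:
    """Map the dataset's 1/2 coding to 0 (control) / 1 (patient)."""
    if raw_label not in LABEL_MAP:
        raise ValueError(f"unexpected class label: {raw_label!r}")
    return LABEL_MAP[raw_label]


def class_balance(raw_labels) -> dict:
    """Count observations per recoded class, to check for imbalance."""
    return dict(Counter(recode_label(label) for label in raw_labels))


if __name__ == "__main__":
    # Toy labels for illustration only, not real dataset rows.
    print(class_balance([1, 2, 2, 1, 2]))
```

Recoding first means that metrics with a default positive label of 1 score the patient class rather than the control class.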
🔍 Methodology
The project follows a comparative approach between several algorithms:
- Logistic Regression
- k-Nearest Neighbors (k-NN)
- Naive Bayes
- Artificial Neural Network (MLP with a 16-8-1 architecture)
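The four algorithms above could be set up with scikit-learn roughly as follows. This is a sketch under stated assumptions, not the project's actual code: the hyperparameters (`n_neighbors=5`, `alpha=1e-3`, five folds), the Gaussian variant of Naive Bayes, the feature scaling, and the synthetic stand-in data are all illustrative choices. Labels are assumed to be recoded to 0/1 with 1 as the cancer class.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_models(random_state: int = 0) -> dict:
    """Return one estimator per algorithm; settings are illustrative."""
    return {
        "logistic_regression": make_pipeline(
            StandardScaler(),
            LogisticRegression(class_weight="balanced", max_iter=1000),
        ),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "naive_bayes": GaussianNB(),  # Gaussian variant is an assumption
        # Two hidden layers (16, 8); the single output unit of the
        # 16-8-1 architecture is added by MLPClassifier for binary targets.
        "mlp_16_8_1": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(16, 8),
                alpha=1e-3,           # L2 penalty strength (assumed value)
                early_stopping=True,  # holds out part of each training fold
                max_iter=2000,
                random_state=random_state,
            ),
        ),
    }


def compare_models(X, y, n_splits: int = 5, random_state: int = 0) -> dict:
    """Mean F1 per model under stratified k-fold cross-validation."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="f1").mean())
        for name, model in build_models(random_state).items()
    }


if __name__ == "__main__":
    # Synthetic data with the same shape as the dataset (116 rows, 9 features).
    X, y = make_classification(n_samples=116, n_features=9, n_informative=5, random_state=0)
    for name, score in compare_models(X, y).items():
        print(f"{name}: mean F1 = {score:.3f}")
```

Wrapping the scaler and the estimator in one pipeline keeps the scaling statistics inside each training fold, so no information from the held-out fold leaks into the fit.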
Model evaluation is primarily based on the F1-score, which balances precision and recall and is better suited than raw accuracy to a medical context where missing a positive case is costly. Particular attention was paid to stratified cross-validation, to handling class imbalance through class weights, and to limiting overfitting through regularization (L2 penalty, early stopping).
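To make the metric concrete, here is a small worked sketch of how the F1-score is derived from confusion-matrix counts. The counts in the usage section are hypothetical, chosen only so that they sum to 116 observations; they are not results from this project.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Compute precision, recall and F1 for the positive class.

    tp: positives correctly flagged, fp: negatives wrongly flagged,
    fn: positives that were missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all observations that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


if __name__ == "__main__":
    # Hypothetical counts: many missed patients (fn) despite few false alarms.
    tp, fp, fn, tn = 40, 4, 24, 48
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
    print(f"accuracy={accuracy(tp, fp, fn, tn):.3f}")
```

Because F1 is the harmonic mean of precision and recall, a model that misses many positive cases is pulled down by its low recall even when its precision is high.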
This project illustrates a concrete application of data science techniques to a public health issue, while implementing a rigorous methodology for supervised modeling.
📚 Resources
You can find the code here: Breast Cancer Detection