mirror of
https://github.com/ArthurDanjou/artsite.git
synced 2026-02-09 17:05:58 +01:00
feat: Add personal profile, projects, and skills documentation
- Created index.md for personal introduction and interests.
- Added languages.json to specify language proficiencies.
- Developed profile.md detailing academic background, skills, and career goals.
- Introduced multiple project markdown files showcasing personal and academic projects, including ArtChat, ArtHome, and various data science initiatives.
- Implemented skills.json to outline technical skills and competencies.
- Compiled uses.md to document hardware and software tools utilized for development and personal projects.
38
content/projects/artchat.md
Normal file
@@ -0,0 +1,38 @@
---
slug: artchat
title: ArtChat - Portfolio & Blog
type: Personal Project
description: My personal space on the web — a portfolio, a blog, and a digital lab where I showcase my projects, write about topics I care about, and experiment with design and web technologies.
publishedAt: 2024-06-01
readingTime: 1
cover: artchat/cover.png
favorite: true
status: Active
tags:
  - Vue.js
  - Nuxt
  - TypeScript
  - Tailwind CSS
  - Web
emoji: 🌍
---

[**ArtChat**](https://go.arthurdanjou.fr/website) is my personal space on the web — a portfolio, a blog, and a digital lab where I showcase my projects, write about topics I care about, and experiment with design and web technologies.

It's designed to be fast, accessible, and fully responsive. The site also serves as a playground to explore and test modern frontend tools.

## ⚒️ Tech Stack

- **UI** → [Vue.js](https://vuejs.org/): A progressive JavaScript framework for building interactive interfaces.
- **Framework** → [Nuxt](https://nuxt.com/): A powerful full-stack framework built on Vue, perfect for modern web apps.
- **Content System** → [Nuxt Content](https://content.nuxtjs.org/): File-based CMS to manage blog posts and pages using Markdown.
- **Design System** → [Nuxt UI](https://nuxtui.com/): Fully styled, customizable UI components tailored for Nuxt.
- **CMS & Editing** → [Nuxt Studio](https://nuxt.studio): Visual editing and content management integrated with Nuxt Content.
- **Language** → [TypeScript](https://www.typescriptlang.org/): A statically typed superset of JavaScript.
- **Styling** → [Sass](https://sass-lang.com/) & [Tailwind CSS](https://tailwindcss.com/): Utility-first CSS framework enhanced with SCSS flexibility.
- **Deployment** → [NuxtHub](https://hub.nuxt.com/): A Cloudflare-powered platform to deploy and scale Nuxt apps globally with minimal latency and full-stack capabilities.
- **Package Manager** → [pnpm](https://pnpm.io/): A fast, disk-efficient package manager for JavaScript/TypeScript projects.
- **Linter** → [ESLint](https://eslint.org/): A tool for identifying and fixing problems in JavaScript/TypeScript code.
- **ORM** → [Drizzle ORM](https://orm.drizzle.team/): A lightweight, type-safe ORM for TypeScript.
- **Validation** → [Zod](https://zod.dev/): A TypeScript-first schema declaration and validation library with full static type inference.
30
content/projects/arthome.md
Normal file
@@ -0,0 +1,30 @@
---
slug: arthome
title: ArtHome - Browser Homepage
type: Personal Project
description: A customizable browser homepage that lets you organize all your favorite links in one place with categories, tabs, icons and colors.
publishedAt: 2024-09-04
readingTime: 1
cover: arthome/cover.png
status: Active
tags:
  - Nuxt
  - Vue.js
  - Web
  - Productivity
emoji: 🏡
---

[ArtHome](https://go.arthurdanjou.fr/arthome) is a customizable browser homepage that lets you organize all your favorite links in one place.

Create categories and tabs to group your shortcuts, personalize them with icons and colors, and make the page private if you want to keep your links just for yourself. The interface is clean, responsive, and works across all modern browsers.

## 🛠️ Built with

- [Nuxt](https://nuxt.com): An open-source framework for building performant, full-stack web applications with Vue.
- [NuxtHub](https://hub.nuxt.com): A Cloudflare-powered platform to deploy and scale Nuxt apps globally with minimal latency and full-stack capabilities.
- [NuxtUI](https://ui.nuxt.com): A sleek and flexible component library that helps create beautiful, responsive UIs for Nuxt applications.
- [ESLint](https://eslint.org): A linter that identifies and fixes problems in your JavaScript/TypeScript code.
- [Drizzle ORM](https://orm.drizzle.team/): A lightweight, type-safe ORM built for TypeScript, designed for simplicity and performance.
- [Zod](https://zod.dev/): A TypeScript-first schema declaration and validation library with full static type inference.
- and a lot of ❤️
46
content/projects/artlab.md
Normal file
@@ -0,0 +1,46 @@
---
slug: artlab
title: ArtLab - Personal HomeLab
type: Infrastructure Project
description: A personal homelab environment where I deploy, test, and maintain self-hosted services with privacy-focused networking through VPN and Cloudflare Tunnels.
publishedAt: 2025-09-04
readingTime: 1
cover: artlab/cover.png
favorite: true
status: Active
tags:
  - Docker
  - Proxmox
  - HomeLab
  - Self-Hosted
  - Infrastructure
emoji: 🏡
---

[**ArtLab**](https://go.arthurdanjou.fr/status) is my personal homelab, where I experiment with self-hosting and automation.

My homelab is a self-hosted environment where I deploy, test, and maintain personal services. Everything is securely exposed **only through a private VPN** using [Tailscale](https://tailscale.com/), ensuring encrypted, access-controlled connections across all devices.

For selected services, I also use **Cloudflare Tunnels** to enable secure external access without opening ports or exposing my public IP.

## 🛠️ Running Services

- **MinIO**: S3-compatible object storage for static files and backups.
- **Immich**: Self-hosted photo management platform — a private alternative to Google Photos.
- **Jellyfin**: Media server for streaming movies, shows, and music.
- **Portainer & Docker**: Container orchestration and service management.
- **Traefik**: Reverse proxy and automatic HTTPS with Let's Encrypt.
- **Homepage**: A sleek dashboard to access and monitor all services.
- **Proxmox**: Virtualization platform used to manage VMs and containers.
- **Uptime Kuma**: Self-hosted uptime monitoring.
- **Home Assistant**: Smart home automation and device integration.
- **AdGuard Home**: Network-wide ad and tracker blocking via DNS.
- **Beszel**: Lightweight, self-hosted server and container monitoring.
- **Palmr**: Self-hosted file sharing platform.

## 🖥️ Hardware

- **Beelink EQR6**: AMD Ryzen mini PC, main server host.
- **TP-Link 5-port Switch**: Network connectivity for all devices.
- **UGREEN NASync DXP4800 Plus**: 4-bay NAS, currently populated with 2 × 8TB drives for storage and backups.

This homelab is a sandbox for DevOps experimentation, infrastructure reliability, and privacy-respecting digital autonomy.
75
content/projects/artstudies.md
Normal file
@@ -0,0 +1,75 @@
---
slug: artstudies
title: ArtStudies - Academic Projects Collection
type: Academic Project
description: A curated collection of mathematics and data science projects developed during my academic journey, spanning Bachelor's and Master's studies.
publishedAt: 2023-09-01
readingTime: 1
favorite: true
status: Active
tags:
  - Python
  - R
  - Data Science
  - Machine Learning
  - Mathematics
emoji: 🎓
---

# ArtStudies

[ArtStudies Projects](https://github.com/ArthurDanjou/artstudies) is a curated collection of academic projects completed throughout my mathematics studies. The repository showcases work in both _Python_ and _R_, focusing on mathematical modeling, data analysis, and numerical methods.

The projects are organized into three main sections:

- **L3** – Third year of the Bachelor's degree in Mathematics
- **M1** – First year of the Master's degree in Mathematics
- **M2** – Second year of the Master's degree in Mathematics

## 📁 File Structure

- `L3`
  - `Analyse Matricielle`
  - `Analyse Multidimensionnelle`
  - `Calculs Numériques`
  - `Équations Différentielles`
  - `Méthodes Numériques`
  - `Probabilités`
  - `Projet Numérique`
  - `Statistiques`
- `M1`
  - `Data Analysis`
  - `General Linear Models`
  - `Monte Carlo Methods`
  - `Numerical Methods`
  - `Numerical Optimization`
  - `Portfolio Management`
  - `Statistical Learning`
- `M2`
  - `Data Visualisation`
  - `Deep Learning`
  - `Linear Models`
  - `Machine Learning`
  - `VBA`
  - `SQL`

## 🛠️ Technologies & Tools

- [Python](https://www.python.org): A high-level, interpreted programming language, widely used for data science, machine learning, and scientific computing.
- [R](https://www.r-project.org): A statistical computing environment, perfect for data analysis and visualization.
- [Jupyter](https://jupyter.org): Interactive notebooks combining code, results, and rich text for reproducible research.
- [Pandas](https://pandas.pydata.org): A data manipulation library providing data structures and operations for manipulating numerical tables and time series.
- [NumPy](https://numpy.org): Core package for numerical computing with support for large, multi-dimensional arrays and matrices.
- [SciPy](https://www.scipy.org): A library for advanced scientific computations including optimization, integration, and signal processing.
- [Scikit-learn](https://scikit-learn.org): A robust library offering simple and efficient tools for machine learning and statistical modeling, including classification, regression, and clustering.
- [TensorFlow](https://www.tensorflow.org): A comprehensive open-source framework for building and deploying machine learning and deep learning models.
- [Keras](https://keras.io): A high-level neural networks API, running on top of TensorFlow, designed for fast experimentation.
- [Matplotlib](https://matplotlib.org): A versatile plotting library for creating high-quality static, animated, and interactive visualizations in Python.
- [Plotly](https://plotly.com): An interactive graphing library for creating dynamic visualizations in Python and R.
- [Seaborn](https://seaborn.pydata.org): A statistical data visualization library built on top of Matplotlib, providing a high-level interface for drawing attractive and informative graphics.
- [RMarkdown](https://rmarkdown.rstudio.com): A dynamic tool for combining code, results, and narrative into high-quality documents and presentations.
- [FactoMineR](https://factominer.free.fr/): An R package focused on multivariate exploratory data analysis (e.g., PCA, MCA, CA).
- [ggplot2](https://ggplot2.tidyverse.org): A grammar-based graphics package for creating complex and elegant visualizations in R.
- [RShiny](https://shiny.rstudio.com): A web application framework for building interactive web apps directly from R.
- and my 🧠.
57
content/projects/bikes-glm.md
Normal file
@@ -0,0 +1,57 @@
---
slug: bikes-glm
title: Generalized Linear Models for Bikes Prediction
type: Academic Project
description: Predicting the number of bikes rented in a bike-sharing system using Generalized Linear Models and various statistical techniques.
publishedAt: 2025-01-24
readingTime: 1
status: Completed
tags:
  - R
  - Statistics
  - Data Analysis
  - GLM
  - Mathematics
emoji: 🚲
---

# Generalized Linear Models for Bikes Prediction

## Overview

This project was completed as part of the **Generalized Linear Models** course at Paris-Dauphine PSL University. The objective was to develop and compare statistical models to predict the number of bicycle rentals in a bike-sharing system based on various environmental and temporal characteristics.

## 📊 Project Objectives

- Determine the best predictive model for bicycle rental counts
- Analyze the impact of various features (temperature, humidity, wind speed, seasonality, etc.)
- Apply and evaluate different generalized linear modeling techniques
- Validate model assumptions and performance metrics

## 🔍 Methodology

The study employs rigorous statistical approaches, including:

- **Exploratory Data Analysis (EDA)** - Understanding feature distributions and relationships
- **Model Comparison** - Testing multiple GLM families (Poisson, Negative Binomial, Gaussian)
- **Feature Selection** - Identifying the most influential variables
- **Model Diagnostics** - Validating assumptions and checking residuals
- **Cross-validation** - Ensuring robust performance estimates
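The project itself was carried out in R; as a minimal Python sketch of the central idea (the data below is synthetic and scikit-learn's `PoissonRegressor` stands in for R's `glm` with a Poisson family — none of this is the project's actual dataset or code), fitting a log-linear count model looks like:

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)

# Synthetic features: temperature and humidity, with Poisson-distributed rental counts
n = 500
temp = rng.uniform(0, 35, n)
humidity = rng.uniform(20, 90, n)
X = np.column_stack([temp, humidity])

# True log-linear model: rentals increase with temperature, decrease with humidity
mu = np.exp(1.0 + 0.05 * temp - 0.01 * humidity)
y = rng.poisson(mu)

# Poisson GLM with log link — the canonical family for count data
model = PoissonRegressor(alpha=0.0, max_iter=1000).fit(X, y)
print(model.coef_)  # roughly recovers [0.05, -0.01]
```

The same scaffold extends to the Negative Binomial and Gaussian families compared in the report, and diagnostics then check whether the Poisson equidispersion assumption actually holds.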

## 📁 Key Findings

The analysis identified critical factors influencing bike-sharing demand:

- Seasonal patterns and weather conditions
- Temperature and humidity effects
- Holiday and working day distinctions
- Time-based trends and cyclical patterns

## 📚 Resources

- **Code Repository**: [GLM Bikes Code](https://go.arthurdanjou.fr/glm-bikes-code)
- **Full Report**: See the embedded PDF below

## 📄 Detailed Report

<iframe src="/projects/bikes-glm/Report.pdf" width="100%" height="1000px">
</iframe>
47
content/projects/breast-cancer.md
Normal file
@@ -0,0 +1,47 @@
---
slug: breast-cancer
title: Breast Cancer Detection
type: Academic Project
description: Prediction of breast cancer presence by comparing several supervised classification models using machine learning techniques.
publishedAt: 2025-06-06
readingTime: 2
status: Completed
tags:
  - Python
  - Machine Learning
  - Data Science
  - Classification
  - Healthcare
emoji: 💉
---

The project was carried out as part of the `Statistical Learning` course at Paris-Dauphine PSL University. Its objective is to identify the most effective model for predicting or explaining the presence of breast cancer based on a set of biological and clinical features.

This project aims to develop and evaluate several supervised classification models to predict the presence of breast cancer based on biological features extracted from the Breast Cancer Coimbra dataset, provided by the UCI Machine Learning Repository.

The dataset contains 116 observations divided into two classes:

- 1: healthy individuals (controls)
- 2: patients diagnosed with breast cancer

There are 9 explanatory variables, including clinical measurements such as age, insulin levels, leptin, and insulin resistance, among others.

The project follows a comparative approach between several algorithms:

- Logistic Regression
- k-Nearest Neighbors (k-NN)
- Naive Bayes
- Artificial Neural Network (MLP with a 16-8-1 architecture)

Model evaluation is primarily based on the F1-score, which is more suitable in a medical context where identifying positive cases is crucial. Particular attention was paid to stratified cross-validation and to handling class imbalance, notably through the use of class weights and regularization techniques (L2, early stopping).
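That evaluation setup can be sketched in a few lines of scikit-learn. The data below is synthetic and only stands in for the Coimbra dataset (116 observations, 9 features); the pipeline mirrors the ideas of class weighting, L2 regularization, stratified cross-validation, and F1 scoring:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in: 116 observations, 9 features, binary labels
X = rng.normal(size=(116, 9))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=116) > 0.3).astype(int)

# class_weight="balanced" counters class imbalance; L2 is the default penalty
clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", penalty="l2"),
)

# Stratified folds preserve the class ratio in each split; F1 is the target metric
scores = cross_val_score(
    clf, X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="f1",
)
print(scores.mean())
```

Swapping `LogisticRegression` for `KNeighborsClassifier`, `GaussianNB`, or `MLPClassifier(hidden_layer_sizes=(16, 8))` reproduces the comparative setup across the four model families.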

This project illustrates a concrete application of data science techniques to a public health issue, while implementing a rigorous methodology for supervised modeling.

You can find the code here: [Breast Cancer Detection](https://go.arthurdanjou.fr/breast-cancer-detection-code)

<iframe src="/projects/breast-cancer/report.pdf" width="100%" height="1000px">
</iframe>
157
content/projects/dropout-reduces-underfitting.md
Normal file
@@ -0,0 +1,157 @@
---
slug: dropout-reduces-underfitting
title: Dropout Reduces Underfitting
type: Research Project
description: TensorFlow/Keras implementation and reproduction of "Dropout Reduces Underfitting" (Liu et al., 2023). A comparative study of Early and Late Dropout strategies to optimize model convergence.
publishedAt: 2024-12-10
readingTime: 4
status: Active
tags:
  - Python
  - TensorFlow
  - Machine Learning
  - Deep Learning
  - Research
emoji: 🔬
---

📉 [Dropout Reduces Underfitting](https://github.com/arthurdanjou/dropoutreducesunderfitting): Reproduction & Analysis

> **Study and reproduction of the paper:** Liu, Z., et al. (2023). *Dropout Reduces Underfitting*. arXiv:2303.01500.

The paper is available at: [https://arxiv.org/abs/2303.01500](https://arxiv.org/abs/2303.01500)

This repository contains a robust and modular implementation in **TensorFlow/Keras** of **Early Dropout** and **Late Dropout** strategies. The goal is to verify the hypothesis that dropout, traditionally used to reduce overfitting, can also combat underfitting when applied solely during the initial training phase.

## 🎯 Scientific Objectives

The study aims to validate the operating regimes of Dropout described in the paper, comparing four training configurations:

1. **Early Dropout** (targeting underfitting): active only during the initial phase to reduce gradient variance and align gradient directions, allowing for better final optimization.
2. **Late Dropout** (targeting overfitting): disabled at the start to allow rapid learning, then activated to regularize final convergence.
3. **Standard Dropout**: constant rate throughout training (baseline).
4. **No Dropout**: control experiment without dropout.

## 🛠️ Technical Architecture

Unlike naive Keras callback implementations, this project uses a **dynamic approach via the TensorFlow graph** to ensure the dropout rate is properly updated on the GPU without model recompilation.

### Key Components

* **`DynamicDropout`**: A custom layer inheriting from `keras.layers.Layer` that reads its rate from a shared `tf.Variable`.
* **`DropoutScheduler`**: A Keras `Callback` that drives the rate variable based on the current epoch and the chosen strategy (`early`, `late`, `standard`).
* **`ExperimentPipeline`**: An orchestrator class that handles data loading (MNIST, CIFAR-10, Fashion MNIST), model creation (Dense or CNN), and execution of comparative benchmarks.
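The rule the scheduler applies is simple to state. As a framework-agnostic sketch (the function name and signature below are illustrative, not the repository's actual API), the rate for a given epoch under each strategy is:

```python
def dropout_rate(epoch: int, mode: str, switch_epoch: int = 10, rate: float = 0.4) -> float:
    """Return the dropout rate to apply at `epoch` under each strategy.

    - "early": active only before `switch_epoch` (targets underfitting)
    - "late": active only from `switch_epoch` on (targets overfitting)
    - "standard": constant rate throughout training (baseline)
    - "none": control experiment without dropout
    """
    if mode == "standard":
        return rate
    if mode == "early":
        return rate if epoch < switch_epoch else 0.0
    if mode == "late":
        return 0.0 if epoch < switch_epoch else rate
    if mode == "none":
        return 0.0
    raise ValueError(f"unknown mode: {mode}")
```

A Keras `Callback` would compute this value in `on_epoch_begin` and assign it to the shared `tf.Variable` that `DynamicDropout` reads, so no recompilation is needed when the rate flips at `switch_epoch`.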

## File Structure

```
.
├── README.md                          # This documentation file
├── Dropout reduces underfitting.pdf   # Original research paper
├── pipeline.py                        # Main experiment pipeline
├── pipeline.ipynb                     # Jupyter notebook for experiments
├── pipeline_mnist.ipynb               # Jupyter notebook for MNIST experiments
├── pipeline_cifar10.ipynb             # Jupyter notebook for CIFAR-10 experiments
├── pipeline_cifar100.ipynb            # Jupyter notebook for CIFAR-100 experiments
├── pipeline_fashion_mnist.ipynb       # Jupyter notebook for Fashion MNIST experiments
├── requirements.txt                   # Python dependencies
├── .python-version                    # Python version specification
└── uv.lock                            # Dependency lock file
```

## 🚀 Installation

```bash
# Clone the repository
git clone https://github.com/arthurdanjou/dropoutreducesunderfitting.git
cd dropoutreducesunderfitting

# Install dependencies
pip install tensorflow numpy matplotlib seaborn scikit-learn
```

## 📊 Usage

The main notebook `pipeline.ipynb` contains all the necessary code. Here is how to run a typical experiment via the pipeline API.

### 1. Initialization

Choose your dataset (`cifar10`, `fashion_mnist`, `mnist`) and architecture (`cnn`, `dense`).

```python
from pipeline import ExperimentPipeline

# Fashion MNIST is recommended to observe underfitting/overfitting nuances
exp = ExperimentPipeline(dataset_name="fashion_mnist", model_type="cnn")
```

### 2. Learning Curves Comparison

Compare the training dynamics (loss & accuracy) of the three strategies.

```python
exp.compare_learning_curves(
    modes=["standard", "early", "late"],
    switch_epoch=10,  # The epoch where the dropout state changes
    rate=0.4,         # Dropout rate
    epochs=30
)
```

### 3. Ablation Studies

Study the impact of the "early" phase duration and of the dropout intensity.

```python
# Impact of the switch epoch on final performance
exp.compare_switch_epochs(
    switch_epochs=[5, 10, 15, 20],
    modes=["early"],
    rate=0.4,
    epochs=30
)

# Impact of the dropout rate
exp.compare_drop_rates(
    rates=[0.2, 0.4, 0.6],
    modes=["standard", "early"],
    switch_epoch=10,
    epochs=25
)
```

### 4. Data Regimes (Data Scarcity)

Verify the paper's hypothesis that Early Dropout shines on large datasets (or limited models) while Standard Dropout protects small datasets.

```python
# Training on 10%, 50% and 100% of the dataset
exp.run_dataset_size_comparison(
    fractions=[0.1, 0.5, 1.0],
    modes=["standard", "early"],
    rate=0.3,
    switch_epoch=10
)
```

## 📈 Expected Results

According to the paper, you should observe:

- **Early Dropout**: higher initial loss, followed by a sharp drop after the `switch_epoch`, often reaching a lower minimum than Standard Dropout (reduction of underfitting).
- **Late Dropout**: rapid rise in accuracy at the start (potential overfitting), then stabilized by the activation of dropout.

## 📝 Authors

- [Arthur Danjou](https://github.com/ArthurDanjou)
- [Alexis Mathieu](https://github.com/Alex6535)
- [Axelle Meric](https://github.com/AxelleMeric)
- [Philippine Quellec](https://github.com/Philippine35890)
- [Moritz Von Siemens](https://github.com/MoritzSiem)

M.Sc. Statistical and Financial Engineering (ISF) - Data Science Track at Université Paris-Dauphine PSL

Based on the work of Liu, Z., et al. (2023). *Dropout Reduces Underfitting*.
54
content/projects/loan-ml.md
Normal file
@@ -0,0 +1,54 @@
---
slug: loan-ml
title: Machine Learning for Loan Prediction
type: Academic Project
description: Predicting loan approval and default risk using machine learning classification techniques.
publishedAt: 2025-01-24
readingTime: 2
status: Completed
tags:
  - Python
  - Machine Learning
  - Classification
  - Data Science
  - Finance
emoji: 💰
---

# Machine Learning for Loan Prediction

## Overview

This project focuses on building machine learning models to predict loan approval outcomes and assess default risk. The objective is to develop robust classification models that can effectively identify creditworthy applicants.

## 📊 Project Objectives

- Build and compare multiple classification models for loan prediction
- Identify key factors influencing loan approval decisions
- Evaluate model performance using appropriate metrics
- Optimize model parameters for better predictive accuracy

## 🔍 Methodology

The study employs various machine learning approaches:

- **Exploratory Data Analysis (EDA)** - Understanding applicant characteristics and patterns
- **Feature Engineering** - Creating meaningful features from raw data
- **Model Comparison** - Testing multiple algorithms (Logistic Regression, Random Forest, Gradient Boosting, etc.)
- **Hyperparameter Tuning** - Optimizing model performance
- **Cross-validation** - Ensuring robust generalization
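As an illustrative sketch of this workflow (synthetic data and hypothetical feature names; not the project's actual dataset, models, or results), comparing a linear baseline against a tuned random forest with scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(7)

# Synthetic stand-in for loan applications: e.g. income, debt ratio, credit history
n = 400
X = rng.normal(size=(n, 3))
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.7, size=n) > 0).astype(int)

# Linear baseline, evaluated with 5-fold cross-validation
baseline = cross_val_score(LogisticRegression(), X, y, cv=5).mean()

# Hyperparameter tuning: grid search over forest size and depth
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=5,
)
search.fit(X, y)
print(baseline, search.best_score_, search.best_params_)
```

Gradient boosting slots into the same comparison by adding `GradientBoostingClassifier` with its own parameter grid.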

## 📁 Key Findings

[To be completed with your findings]

## 📚 Resources

- **Code Repository**: [Add link to your code]
- **Dataset**: [Add dataset information]
- **Full Report**: See the embedded PDF below

## 📄 Detailed Report

<iframe src="/projects/loan-ml/Report.pdf" width="100%" height="1000px">
</iframe>
31
content/projects/monte-carlo-project.md
Normal file
@@ -0,0 +1,31 @@
---
slug: monte-carlo-project
title: Monte Carlo Methods Project
type: Academic Project
description: An implementation of different Monte Carlo methods and algorithms in R, including inverse CDF simulation, accept-reject methods, and stratification techniques.
publishedAt: 2024-11-24
readingTime: 3
status: Completed
tags:
  - R
  - Mathematics
  - Statistics
  - Monte Carlo
  - Numerical Methods
emoji: 💻
---

This is the report for the Monte Carlo Methods Project. The project was completed as part of the `Monte Carlo Methods` course at Paris-Dauphine University. The goal was to implement different Monte Carlo methods and algorithms in R.

Methods and algorithms implemented:

- Plotting graphs of functions
- Inverse c.d.f. random variable simulation
- Accept-reject random variable simulation
- Random variable simulation with stratification
- Cumulative distribution function
- Empirical quantile function
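The project code is in R; as a quick illustration of the inverse-c.d.f. technique from the list above, here it is for the exponential distribution, sketched in Python/NumPy (the distribution choice is mine, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def inverse_cdf_exponential(u, lam):
    """Simulate Exp(lam) via the inverse c.d.f.: F^{-1}(u) = -ln(1 - u) / lam."""
    return -np.log(1.0 - u) / lam

# Uniform draws pushed through the inverse c.d.f. follow the target distribution
u = rng.uniform(size=100_000)
samples = inverse_cdf_exponential(u, lam=2.0)
print(samples.mean())  # ≈ 1 / lam = 0.5
```

The same recipe works for any distribution with a tractable inverse c.d.f.; accept-reject and stratification cover the cases where it is not tractable.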

You can find the code here: [Monte Carlo Project Code](https://go.arthurdanjou.fr/monte-carlo-code)

<iframe src="/projects/monte-carlo-project/Report.pdf" width="100%" height="1000px">
</iframe>
23
content/projects/schelling-segregation-model.md
Normal file
@@ -0,0 +1,23 @@
---
slug: schelling-segregation-model
title: Schelling Segregation Model
type: Academic Project
description: A Python implementation of the Schelling Segregation Model using statistics and data visualization to analyze spatial segregation patterns.
publishedAt: 2024-05-03
readingTime: 4
status: Completed
tags:
  - Python
  - Data Visualization
  - Statistics
  - Modeling
  - Mathematics
emoji: 📊
---

This is the French version of the report for the Schelling Segregation Model project. The project was completed as part of the `Projet Numérique` course at Paris-Saclay University. The goal was to implement the Schelling segregation model in Python and analyze the results using statistics and data visualization.
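A minimal sketch of the model's dynamics (grid size, tolerance threshold, and update rule here are illustrative choices, not necessarily those used in the project): agents of two types sit on a grid, and any agent with too few like-typed neighbours relocates to an empty cell.

```python
import numpy as np

rng = np.random.default_rng(0)
SIZE, EMPTY_FRAC, THRESHOLD, STEPS = 20, 0.2, 0.5, 30

# 0 = empty cell; 1 and 2 are the two agent types
grid = rng.choice([0, 1, 2], size=(SIZE, SIZE), p=[EMPTY_FRAC, 0.4, 0.4])

def neighbours(grid, i, j):
    """The 8 neighbouring cells, on a torus to avoid edge cases."""
    return [grid[(i + di) % SIZE, (j + dj) % SIZE]
            for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

def satisfied(grid, i, j):
    """Agent stays if >= THRESHOLD of its occupied neighbours share its type."""
    occ = [n for n in neighbours(grid, i, j) if n != 0]
    if not occ:
        return True
    return sum(n == grid[i, j] for n in occ) / len(occ) >= THRESHOLD

def segregation(grid):
    """Mean share of like-typed neighbours over all agents (higher = more segregated)."""
    shares = []
    for i in range(SIZE):
        for j in range(SIZE):
            if grid[i, j] != 0:
                occ = [n for n in neighbours(grid, i, j) if n != 0]
                if occ:
                    shares.append(sum(n == grid[i, j] for n in occ) / len(occ))
    return float(np.mean(shares))

before = segregation(grid)
for _ in range(STEPS):
    unhappy = [(i, j) for i in range(SIZE) for j in range(SIZE)
               if grid[i, j] != 0 and not satisfied(grid, i, j)]
    empty = [(i, j) for i in range(SIZE) for j in range(SIZE) if grid[i, j] == 0]
    rng.shuffle(unhappy)
    for (i, j) in unhappy:
        if not empty:
            break
        ei, ej = empty.pop(rng.integers(len(empty)))  # move to a random empty cell
        grid[ei, ej], grid[i, j] = grid[i, j], 0
        empty.append((i, j))
after = segregation(grid)
print(before, after)  # segregation typically rises even with mild preferences
```

Schelling's striking result, which the statistical analysis in the report explores, is that strong spatial segregation emerges even though each individual agent is tolerant of being a local minority.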

You can find the code here: [Schelling Segregation Model Code](https://go.arthurdanjou.fr/schelling-code)

<iframe src="/projects/schelling/Projet.pdf" width="100%" height="1000px">
</iframe>
31
content/projects/sevetys.md
Normal file
@@ -0,0 +1,31 @@
---
slug: sevetys
title: Data Engineer Internship at Sevetys
type: Internship Project
description: Summary of my internship as a Data Engineer at Sevetys, focusing on data quality, cleaning, standardization, and comprehensive data quality metrics.
publishedAt: 2025-07-31
readingTime: 2
status: Completed
tags:
  - Python
  - PySpark
  - Data Engineering
  - Azure
  - Big Data
emoji: 🐶
---

[Sevetys](https://sevetys.fr) is a leading French network of over 200 veterinary clinics, employing more than 1,300 professionals. Founded in 2017, the group provides comprehensive veterinary care for companion animals, exotic pets, and livestock, with services ranging from preventive medicine and surgery to cardiology, dermatology, and 24/7 emergency care.

Committed to digital innovation, Sevetys leverages centralized data systems to optimize clinic operations, improve patient data management, and enhance the overall client experience. This combination of medical excellence and operational efficiency supports veterinarians in delivering the highest quality care nationwide.

During my two-month internship as a Data Engineer, I focused primarily on cleaning and standardizing customer and patient data — a critical task, as this data is extensively used by clinics, Marketing, and Performance teams. Ensuring data quality was therefore essential to the company's operations.

Additionally, I took charge of revising and enhancing an existing data quality report designed to evaluate the effectiveness of my cleaning processes. The report encompassed 47 detailed metrics assessing data completeness and consistency, providing valuable insights that helped maintain high standards across the organization.
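The internship work itself used PySpark on Azure; as a small pandas sketch of the same ideas (the column names and toy records below are hypothetical, not Sevetys's schema), standardizing fields and computing two of the kinds of metrics such a quality report aggregates:

```python
import pandas as pd

# Hypothetical customer records with typical quality issues
df = pd.DataFrame({
    "email": ["a@x.fr", None, "b@x.fr", "b@x.fr"],
    "phone": ["+33611223344", "06 11 22 33 44", None, "0611223344"],
    "city":  ["Paris", "paris ", "PARIS", "Lyon"],
})

# Standardization: trim whitespace and normalize case
df["city"] = df["city"].str.strip().str.title()

# Completeness: share of non-null values per column
completeness = df.notna().mean()

# Consistency: rows sharing a non-null email are flagged as duplicates
duplicates = df.duplicated(subset="email", keep=False) & df["email"].notna()

print(completeness.to_dict())
print(int(duplicates.sum()))
```

In Spark the same logic maps onto `F.trim`/`F.initcap` column expressions and aggregations over `isNull` flags, computed at scale across the clinic network's tables.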

## ⚙️ Stack

- [Microsoft Azure Cloud](https://azure.microsoft.com/)
- [PySpark](https://spark.apache.org/docs/latest/api/python/)
- [Python](https://www.python.org/)
- [GitLab](https://gitlab.com/)