3 Commits

Author SHA1 Message Date
c45b1d6f25 fix: update the links to the GitHub repository and the live application in the tuberculosis visualization project 2026-03-10 12:25:20 +01:00
1537343e44 fix: delete the "Data Visualisation Project" project file 2026-03-10 12:23:32 +01:00
ac5ccb3555 Refactor project documentation and structure
- Updated data visualization project documentation to remove incomplete warning.
- Deleted the glm-financial-assets project file and replaced it with glm-implied-volatility project file, detailing a comprehensive study on implied volatility prediction using GLMs and machine learning.
- Marked n8n automations project as completed.
- Added new project on reinforcement learning applied to Atari Tennis, detailing agent comparisons and results.
- Removed outdated rl-tennis project file.
- Updated package dependencies in package.json for improved stability and performance.
2026-03-10 12:07:09 +01:00
9 changed files with 1221 additions and 889 deletions

1344
bun.lock

File diff suppressed because it is too large


@@ -1,58 +0,0 @@
---
slug: data-visualisation
title: Data Visualisation Project
type: Academic Project
description: An interactive data visualization project built with R, R Shiny, and ggplot2 for creating dynamic, explorable visualizations.
shortDescription: An interactive data visualization project using R and R Shiny.
publishedAt: 2026-01-05
readingTime: 1
status: Completed
tags:
- R
- R Shiny
- Data Visualization
- ggplot2
icon: i-ph-chart-bar-duotone
---
::warning
The project is complete, but the documentation is still being expanded with more details.
::
This project involves building an interactive data visualization application using R and R Shiny. The goal is to deliver dynamic, explorable visualizations that let users interact with the data in meaningful ways.
::BackgroundTitle{title="Technologies & Tools"}
::
- **[R](https://www.r-project.org)**: A statistical computing environment, perfect for data analysis and visualization.
- **[R Shiny](https://shiny.rstudio.com)**: A web application framework for R that enables the creation of interactive web applications directly from R.
- **[ggplot2](https://ggplot2.tidyverse.org)**: A powerful R package for creating static and dynamic visualizations using the Grammar of Graphics.
- **[dplyr](https://dplyr.tidyverse.org)**: An R package for data manipulation, providing a consistent set of verbs to help you solve common data manipulation challenges.
- **[tidyr](https://tidyr.tidyverse.org)**: An R package for tidying data, making it easier to work with and visualize.
- **[tidyverse](https://www.tidyverse.org)**: A collection of R packages designed for data science that share an underlying design philosophy, grammar, and data structures.
- **[sf](https://r-spatial.github.io/sf/)**: An R package for working with simple features, providing support for spatial data manipulation and analysis.
- **[rnaturalearth](https://docs.ropensci.org/rnaturalearth/)**: An R package that provides easy access to natural earth map data for creating geographical visualizations.
- **[rnaturalearthdata](https://github.com/ropensci/rnaturalearthdata)**: Companion package to rnaturalearth containing large natural earth datasets.
- **[knitr](https://yihui.org/knitr/)**: An R package for dynamic report generation, enabling the integration of code and text.
- **[kableExtra](https://haozhu233.github.io/kableExtra/)**: An R package for customizing tables and enhancing their visual presentation.
- **[gridExtra](https://cran.r-project.org/web/packages/gridExtra/)**: An R package for arranging multiple grid-based plots on a single page.
- **[moments](https://cran.r-project.org/web/packages/moments/)**: An R package for computing moments, skewness, kurtosis and related statistics.
- **[factoextra](http://www.sthda.com/english/rpkgs/factoextra/)**: An R package for multivariate data analysis and visualization, including PCA and clustering methods.
- **[shinydashboard](https://rstudio.github.io/shinydashboard/)**: An R package for creating dashboards with Shiny.
- **[leaflet](https://rstudio.github.io/leaflet/)**: An R package for creating interactive maps using the Leaflet JavaScript library.
- **[plotly](https://plotly.com/r/)**: An R package for creating interactive visualizations with the Plotly library.
- **[RColorBrewer](https://cran.r-project.org/web/packages/RColorBrewer/)**: An R package providing color palettes for maps and other graphics.
- **[DT](https://rstudio.github.io/DT/)**: An R package for creating interactive data tables.
::BackgroundTitle{title="Resources"}
::
You can find the code here: [Data Visualisation Code](https://go.arthurdanjou.fr/datavis-code)
And the online application here: [Data Visualisation App](https://go.arthurdanjou.fr/datavis-app)
::BackgroundTitle{title="Detailed Report"}
::
<iframe src="/projects/datavis.pdf" width="100%" height="1000px">
</iframe>


@@ -0,0 +1,97 @@
---
slug: dataviz-tuberculose
title: Monitoring & Segmentation of Tuberculosis Cases
type: Academic Project
description: An interactive data visualization project built with R, R Shiny, and ggplot2 for creating dynamic, explorable visualizations.
shortDescription: An interactive data visualization project using R and R Shiny.
publishedAt: 2026-01-05
readingTime: 1
status: Completed
tags:
- R
- R Shiny
- Data Visualization
- ggplot2
icon: i-ph-chart-bar-duotone
---
Interactive Shiny dashboard for WHO tuberculosis data analysis and clustering.
- **GitHub Repository:** [Tuberculose-Visualisation](https://github.com/ArthurDanjou/Tuberculose-Visualisation)
- **Live Application:** [Tuberculose Data Visualization](https://go.arthurdanjou.fr/datavis-app)
::BackgroundTitle{title="Overview"}
::
This project provides an interactive visualization tool for monitoring and segmenting global tuberculosis data from the World Health Organization (WHO). It applies multivariate analysis to reveal operational typologies of global health risks.
**Author:** Arthur Danjou
**Program:** M2 ISF - Dauphine PSL
**Course:** Data Visualisation (2025-2026)
::BackgroundTitle{title="Features"}
::
- Interactive world map with cluster visualization
- K-means clustering for country segmentation (Low/Moderate/Critical Impact)
- Time series analysis with year selector (animated)
- Region filtering by WHO regions
- Key Performance Indicators (KPIs) dashboard
- Raw data exploration with data tables
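The K-means segmentation listed above can be illustrated with a minimal sketch. It is written in plain Python rather than the project's R and is not taken from `app.R`; the function name `kmeans`, the two indicators, the toy values, and the rule for naming clusters are all assumptions made for illustration.

```python
import math


def kmeans(points, init_centroids, n_iter=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster where it is
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


# Made-up (incidence, mortality) pairs per 100k population, NOT WHO figures.
toy_points = [(5.0, 0.5), (8.0, 0.7), (120.0, 10.0), (150.0, 14.0), (500.0, 60.0), (560.0, 75.0)]
labels, centroids = kmeans(toy_points, [toy_points[0], toy_points[2], toy_points[4]])

# Name clusters by ascending centroid incidence (this naming rule is an assumption).
order = sorted(range(len(centroids)), key=lambda c: centroids[c][0])
tier_names = {c: name for c, name in zip(order, ["Low", "Moderate", "Critical"])}
tiers = [tier_names[lab] for lab in labels]
```

Passing the initial centroids explicitly keeps the sketch deterministic; a real run would usually standardize the indicators first and try several random starts.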
::BackgroundTitle{title="Project Structure"}
::
```
├── app.R # Shiny application
├── NoticeTechnique.Rmd # Technical report (R Markdown)
├── NoticeTechnique.pdf # Compiled technical report
├── data/
│ ├── TB_analysis_ready.RData # Processed data with clusters
│ └── TB_burden_countries_2025-12-09.csv # Raw WHO data
└── renv/ # R package management
```
::BackgroundTitle{title="Requirements"}
::
- R (>= 4.0.0)
- R packages (see `renv.lock`):
  - shiny
  - shinydashboard
  - leaflet
  - plotly
  - dplyr
  - sf
  - RColorBrewer
  - DT
  - rnaturalearth
::BackgroundTitle{title="Installation"}
::
1. Clone this repository
2. Open R/RStudio in the project directory
3. Restore packages with `renv::restore()`
4. Run the application:
```r
shiny::runApp("app.R")
```
::BackgroundTitle{title="Detailed Report"}
::
<iframe src="/projects/datavis.pdf" width="100%" height="1000px">
</iframe>
::BackgroundTitle{title="License"}
::
© 2026 Arthur Danjou. All rights reserved.
::BackgroundTitle{title="Resources"}
::
You can find the code here: [Data Visualisation Code](https://go.arthurdanjou.fr/datavis-code)
And the online application here: [Data Visualisation App](https://go.arthurdanjou.fr/datavis-app)


@@ -1,71 +0,0 @@
---
slug: implied-volatility-modeling
title: Implied Volatility Surface Modeling
type: Academic Project
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
shortDescription: Predicting the SPX volatility surface using GLMs and black-box models on 1.2 million observations.
publishedAt: 2026-02-28
readingTime: 3
status: In progress
tags:
- R
- GLM
- Finance
- Machine Learning
icon: i-ph-graph-duotone
---
This project targets high-precision calibration of the **Implied Volatility Surface** using a large-scale dataset of S&P 500 (SPX) European options.
The core objective is to stress-test classic statistical models against modern predictive algorithms. **Generalized Linear Models (GLMs)** provide a transparent baseline, while more complex "black-box" architectures are evaluated on whether their accuracy gains justify reduced interpretability in a risk management context.
::BackgroundTitle{title="Dataset & Scale"}
::
The modeling is performed on a high-dimensional dataset with over **1.2 million observations**.
- **Target Variable**: `implied_vol_ref` (implied volatility).
- **Features**: Option strike price ($K$), underlying asset price ($S$), and time to maturity ($\tau$).
- **Volume**: A training set of $1,251,307$ rows and a test set of identical size.
::BackgroundTitle{title="Modeling Methodology"}
::
The project follows a rigorous statistical pipeline to compare two modeling philosophies:
### 1. The Statistical Baseline (GLM)
Using R's GLM framework, I implement models with targeted link functions and error distributions (such as **Gamma** or **Inverse Gaussian**) to capture the global structure of the volatility surface. These models serve as the benchmark for transparency and stability.
### 2. The Black-Box Challenge
To capture local non-linearities such as the volatility smile and skew, I explore more complex architectures. Performance is evaluated by **Root Mean Squared Error (RMSE)** relative to the GLM baselines.
### 3. Feature Engineering
Key financial indicators are derived from the raw data:
- **Moneyness**: Calculated as the ratio $K/S$.
- **Temporal Dynamics**: Transformations of time to maturity to linearize the term structure.
::BackgroundTitle{title="Evaluation & Reproducibility"}
::
Performance is measured strictly via RMSE on the original scale of the target variable. To ensure reproducibility and precise comparisons across model iterations, a fixed random seed is maintained throughout the workflow.
```r
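set.seed(2025)  # fixed seed so model iterations stay comparable
TrainData <- read.csv("train_ISF.csv", stringsAsFactors = FALSE)
TestX <- read.csv("test_ISF.csv", stringsAsFactors = FALSE)

# Root mean squared error on the original scale of the target
rmse_eval <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}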
```
::BackgroundTitle{title="Critical Analysis"}
::
Beyond pure prediction, the project addresses:
- Model Limits: Identifying market regimes where models fail (e.g., deep out-of-the-money options).
- Interpretability: Quantifying the trade-off between complexity and practical utility in a risk management context.
- Future Extensions: Considering richer dynamics, such as historical volatility or skew-specific targets.


@@ -0,0 +1,336 @@
---
slug: implied-volatility-prediction-from-options-data
title: Implied Volatility Prediction from Options Data
type: Academic Project
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
shortDescription: Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.
publishedAt: 2026-02-28
readingTime: 3
status: Completed
tags:
- R
- GLM
- Finance
- Machine Learning
- Statistical Modeling
icon: i-ph-graph-duotone
---
> **M2 Master's Project:** Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.
This project explores the prediction of **implied volatility** from options market data, combining classical statistical methods with modern machine learning approaches. The analysis covers data preprocessing, feature engineering, model benchmarking, and interpretability analysis using real-world financial panel data.
- **GitHub Repository:** [Implied-Volatility-from-Options-Data](https://github.com/ArthurDanjou/Implied-Volatility-from-Options-Data)
---
::BackgroundTitle{title="Project Overview"}
::
### Problem Statement
Implied volatility represents the market's forward-looking expectation of an asset's future volatility. Accurate prediction is crucial for:
- **Option pricing** and valuation
- **Risk management** and hedging strategies
- **Trading strategies** based on volatility arbitrage
### Dataset
The project uses a comprehensive panel dataset tracking **3,887 assets** across **544 observation dates** (2019-2022):
| File | Description | Shape |
|------|-------------|-------|
| `Train_ISF.csv` | Training data with target variable | 1,909,465 rows × 21 columns |
| `Test_ISF.csv` | Test data for prediction | 1,251,308 rows × 18 columns |
| `hat_y.csv` | Final predictions from both models | 1,251,308 rows × 2 columns |
### Key Variables
**Target Variable:**
- `implied_vol_ref`: The implied volatility to predict
**Feature Categories:**
- **Identifiers:** `asset_id`, `obs_date`
- **Market Activity:** `call_volume`, `put_volume`, `call_oi`, `put_oi`, `total_contracts`
- **Volatility Metrics:** `realized_vol_short`, `realized_vol_mid1-3`, `realized_vol_long1-4`, `market_vol_index`
- **Option Structure:** `strike_dispersion`, `maturity_count`
---
::BackgroundTitle{title="Methodology"}
::
### Data Pipeline
```
Raw Data
┌─────────────────────────────────────────────────────────┐
│ Data Splitting (Chronological 80/20) │
│ - Training: 2019-10 to 2021-07 │
│ - Validation: 2021-07 to 2022-03 │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Feature Engineering │
│ - Aggregation of volatility horizons │
│ - Creation of financial indicators │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Data Preprocessing (tidymodels) │
│ - Winsorization (99.5th percentile) │
│ - Log/Yeo-Johnson transformations │
│ - Z-score normalization │
│ - PCA (95% variance retention) │
└─────────────────────────────────────────────────────────┘
Three Datasets Generated:
├── Tree-based (raw, scale-invariant)
├── Linear (normalized, winsorized)
└── PCA (dimensionality-reduced)
```
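Two of the preprocessing steps in the diagram, winsorization and z-score normalization, can be sketched as follows. This is plain Python for illustration only; the project itself uses tidymodels in R. Capping only the upper tail, the rank-based quantile rule, and the names `winsorize_upper` and `zscore` are assumptions.

```python
import statistics


def winsorize_upper(values, q=0.995):
    """Cap every value above the empirical q-quantile (upper tail only)."""
    ordered = sorted(values)
    # Simple rank-based quantile; other quantile definitions shift the cap slightly.
    cap = ordered[round(q * (len(ordered) - 1))]
    return [min(v, cap) for v in values]


def zscore(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


raw = [float(v) for v in range(1, 11)] + [1000.0]  # toy column with one outlier
capped = winsorize_upper(raw, q=0.9)  # q lowered so the tiny toy column shows the effect
scaled = zscore(capped)
```

In a leakage-free setup the cap, mean, and standard deviation would be estimated on the training split only and then reused on the validation split.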
### Feature Engineering
New financial indicators created to capture market dynamics:
| Feature | Description | Formula |
|---------|-------------|---------|
| `pulse_ratio` | Volatility trend direction | RV_short / RV_long |
| `stress_spread` | Asset vs market stress | RV_short - Market_VIX |
| `put_call_ratio_volume` | Immediate market stress | Put_Volume / Call_Volume |
| `put_call_ratio_oi` | Long-term risk structure | Put_OI / Call_OI |
| `liquidity_ratio` | Market depth | Total_Volume / Total_OI |
| `option_dispersion` | Market uncertainty | Strike_Dispersion / Total_Contracts |
| `put_low_strike` | Downside protection density | Strike_Dispersion / Put_OI |
| `put_proportion` | Hedging vs speculation | Put_Volume / Total_Volume |
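The formulas in the table translate almost directly into code. The sketch below uses plain Python on a single record rather than the project's R pipeline. It assumes one aggregated `realized_vol_long` column (the raw data has several long horizons), totals computed as call plus put, and NaN as the result for a zero denominator; none of these choices is confirmed by the source.

```python
def engineer_features(row):
    """Derive the indicator table's features from one observation (a dict)."""

    def ratio(num, den):
        # Assumption: undefined ratios become NaN rather than raising an error.
        return num / den if den else float("nan")

    total_volume = row["call_volume"] + row["put_volume"]  # assumed definition
    total_oi = row["call_oi"] + row["put_oi"]  # assumed definition
    return {
        "pulse_ratio": ratio(row["realized_vol_short"], row["realized_vol_long"]),
        "stress_spread": row["realized_vol_short"] - row["market_vol_index"],
        "put_call_ratio_volume": ratio(row["put_volume"], row["call_volume"]),
        "put_call_ratio_oi": ratio(row["put_oi"], row["call_oi"]),
        "liquidity_ratio": ratio(total_volume, total_oi),
        "option_dispersion": ratio(row["strike_dispersion"], row["total_contracts"]),
        "put_low_strike": ratio(row["strike_dispersion"], row["put_oi"]),
        "put_proportion": ratio(row["put_volume"], total_volume),
    }


# Toy record with invented values, for illustration only.
example = engineer_features({
    "realized_vol_short": 30.0, "realized_vol_long": 20.0, "market_vol_index": 25.0,
    "call_volume": 100.0, "put_volume": 50.0, "call_oi": 400.0, "put_oi": 200.0,
    "strike_dispersion": 12.0, "total_contracts": 60.0,
})
```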
---
::BackgroundTitle{title="Models Implemented"}
::
### Linear Models
| Model | Description | Best RMSE |
|-------|-------------|-----------|
| **OLS** | Ordinary Least Squares | 11.26 |
| **Ridge** | L2 regularization | 12.48 |
| **Lasso** | L1 regularization (variable selection) | 12.03 |
| **Elastic Net** | L1 + L2 combined | ~12.03 |
| **PLS** | Partial Least Squares (on PCA) | 12.79 |
### Linear Mixed-Effects Models (LMM)
Advanced panel data models accounting for asset-specific effects:
| Model | Features | RMSE |
|-------|----------|------|
| LMM Baseline | All variables + Random Intercept | 8.77 |
| LMM Reduced | Collinearity removal | ~8.77 |
| LMM Interactions | Financial interaction terms | ~8.77 |
| LMM + Quadratic | Convexity terms (vol of vol) | 8.41 |
| **LMM + Random Slopes (mod_lmm_5)** | Asset-specific betas | **8.10** ⭐ |
### Tree-Based Models
| Model | Strategy | Validation RMSE | Training RMSE |
|-------|----------|-----------------|---------------|
| **XGBoost** | Level-wise, Bayesian tuning | 10.70 | 0.57 |
| **LightGBM** | Leaf-wise, feature regularization | **10.61** ⭐ | 10.90 |
| Random Forest | Bagging | DNF* | - |
*DNF: Did Not Finish (computational constraints)
### Neural Networks
| Model | Architecture | Status |
|-------|--------------|--------|
| MLP | 128-64 units, tanh activation | Failed to converge |
---
::BackgroundTitle{title="Results Summary"}
::
### Model Comparison
```
RMSE Performance (Lower is Better)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Linear Mixed-Effects (LMM5) 8.38 ████████████████████ Best Linear
Linear Mixed-Effects (LMM4) 8.41 ███████████████████
Linear Mixed-Effects (Baseline) 8.77 ██████████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
LightGBM 10.61 ███████████████ Best Non-Linear
XGBoost 10.70 ██████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OLS (with interactions) 11.26 █████████████
OLS (baseline) 12.01 ███████████
Lasso 12.03 ███████████
Ridge 12.48 ██████████
PLS 12.79 █████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
### Key Findings
1. **Best Linear Model:** LMM with Random Slopes (RMSE = 8.38)
- Captures asset-specific volatility sensitivities
- Includes quadratic terms for convexity effects
2. **Best Non-Linear Model:** LightGBM (RMSE = 10.61)
- Superior generalization vs XGBoost
- Feature regularization prevents overfitting
3. **Interpretability Insights (SHAP Analysis):**
- `realized_vol_mid` dominates (57% of gain)
- Volatility clustering confirmed as primary driver
- Non-linear regime switching in stress_spread
---
::BackgroundTitle{title="Repository Structure"}
::
```
PROJECT/
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd # Main analysis (Quarto)
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.html # Rendered report
├── packages.R # R dependencies installer
├── Train_ISF.csv # Training data (~1.9M rows)
├── Test_ISF.csv # Test data (~1.25M rows)
├── hat_y.csv # Final predictions
├── README.md # This file
└── results/
├── lightgbm/ # LightGBM model outputs
└── xgboost/ # XGBoost model outputs
```
---
::BackgroundTitle{title="Getting Started"}
::
### Prerequisites
- **R** ≥ 4.0
- Required packages (auto-installed via `packages.R`)
### Installation
```r
# Install all dependencies
source("packages.R")
```
Or manually install key packages:
```r
install.packages(c(
"tidyverse", "tidymodels", "caret", "glmnet",
"lme4", "lmerTest", "xgboost", "lightgbm",
"ranger", "pls", "shapviz", "rBayesianOptimization"
))
```
### Running the Analysis
1. **Open the Quarto document:**
```r
# In RStudio
rstudioapi::navigateToFile("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
```
2. **Render the document:**
```r
quarto::quarto_render("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
```
3. **Or run specific sections interactively** using the code chunks in the `.qmd` file
---
::BackgroundTitle{title="Technical Details"}
::
### Data Split Strategy
- **Chronological split** at 80th percentile of dates
- Prevents look-ahead bias and data leakage
- Training: ~1.53M observations
- Validation: ~376K observations
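A chronological split of this kind can be sketched as follows, in plain Python rather than the project's R. The sketch places the cutoff at the 80th percentile of the distinct observation dates; whether the project takes the percentile over distinct dates or over rows is not stated, so that choice and the function name `chronological_split` are assumptions.

```python
def chronological_split(rows, date_key="obs_date", train_frac=0.8):
    """Split records by date so validation is strictly later than training."""
    # ISO-formatted date strings (YYYY-MM-DD) sort correctly as plain text.
    dates = sorted({r[date_key] for r in rows})
    cutoff = dates[int(train_frac * (len(dates) - 1))]
    train = [r for r in rows if r[date_key] <= cutoff]
    valid = [r for r in rows if r[date_key] > cutoff]
    return train, valid, cutoff
```

Because every validation date falls after the cutoff, no future information reaches the training set, which is the look-ahead protection described above.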
### Hyperparameter Tuning
- **Method:** Bayesian Optimization (Gaussian Processes)
- **Acquisition:** Expected Improvement (UCB)
- **Goal:** Maximize negative RMSE
### Evaluation Metric
**Exponential RMSE** on original scale:
$$
RMSE_{real} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \exp(\hat{y}_{\log, i}) - y_i \right)^2}
$$
Models trained on log-transformed target for variance stabilization.
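As a minimal sketch of the metric above (plain Python, not the project's R code; the name `rmse_real` is an assumption), predictions made on the log scale are exponentiated before the error is computed:

```python
import math


def rmse_real(y_true, y_pred_log):
    """RMSE on the original scale for predictions made on the log scale."""
    squared_errors = [
        (math.exp(pred_log) - actual) ** 2
        for pred_log, actual in zip(y_pred_log, y_true)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For example, true values of 10 and 20 with log-scale predictions `log(10)` and `log(22)` give errors of 0 and 2, so the result is `sqrt(2)`.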
---
::BackgroundTitle{title="Key Concepts"}
::
### Financial Theories Applied
1. **Volatility Clustering:** Past volatility predicts future volatility
2. **Variance Risk Premium:** Spread between implied and realized volatility
3. **Fear Gauge:** Put-call ratio as sentiment indicator
4. **Mean Reversion:** Volatility tends to return to long-term average
5. **Liquidity Premium:** Illiquid assets command higher volatility
### Statistical Methods
- Panel data modeling with fixed and random effects
- Principal Component Analysis (PCA)
- Bayesian hyperparameter optimization
- SHAP values for model interpretability
---
::BackgroundTitle{title="Authors"}
::
**Team:**
- Arthur DANJOU
- Camille LEGRAND
- Axelle MERIC
- Moritz VON SIEMENS
**Course:** Classification and Regression (M2)
**Academic Year:** 2025-2026
---
::BackgroundTitle{title="Notes"}
::
- **Computational Constraints:** Some models (Random Forest, MLP) failed due to hardware limitations (16GB RAM, CPU-only)
- **Reproducibility:** Set `seed = 2025` for consistent results
- **Language:** Analysis documented in English, course materials in French
---
::BackgroundTitle{title="References"}
::
Key R packages used:
- `tidymodels`: Modern modeling framework
- `glmnet`: Regularized regression
- `lme4` / `lmerTest`: Mixed-effects models
- `xgboost` / `lightgbm`: Gradient boosting
- `shapviz`: Model interpretability
- `rBayesianOptimization`: Hyperparameter tuning


@@ -6,7 +6,7 @@ description: An academic project exploring the automation of GenAI workflows usi
shortDescription: Automating GenAI workflows with n8n and Ollama in a self-hosted environment.
publishedAt: 2026-03-15
readingTime: 2
-status: In progress
+status: Completed
tags:
- n8n
- Gemini


@@ -0,0 +1,119 @@
---
slug: rl-tennis-atari-game
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: Completed
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---
Comparison of Reinforcement Learning algorithms on Atari Tennis (`ALE/Tennis-v5` via Gymnasium/PettingZoo).
- **GitHub Repository:** [Tennis-Atari-Game](https://github.com/ArthurDanjou/Tennis-Atari-Game)
::BackgroundTitle{title="Overview"}
::
This project implements and compares five RL agents playing Atari Tennis against the built-in AI and in head-to-head tournaments.
::BackgroundTitle{title="Algorithms"}
::
| Agent | Type | Policy | Update Rule |
|-------|------|--------|-------------|
| **Random** | Baseline | Uniform random | None |
| **SARSA** | TD(0), on-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Q-Learning** | TD(0), off-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \max_{a'} \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Monte Carlo** | First-visit MC | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (G_t - \hat{q}(s, a)) \cdot \phi(s)$ |
| **DQN** | Deep Q-Network | ε-greedy | MLP (256→256) with experience replay & target network |
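The Monte Carlo rule in the table relies on the return $G_t$. Assuming the usual discounted definition $G_t = r_t + \gamma G_{t+1}$ (the source does not spell it out), the returns for a whole episode can be computed in one backward pass. The function name is hypothetical and the code is not taken from the notebook.

```python
def discounted_returns(rewards, gamma):
    """Return [G_0, ..., G_{T-1}] for one episode's per-step rewards."""
    returns = [0.0] * len(rewards)
    g = 0.0
    # Walk backwards so each G_t reuses the already computed G_{t+1}.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns
```

With rewards `[0, 0, 1]` and `gamma = 0.5`, the pass yields 1.0 for the last step, then 0.5, then 0.25.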
::BackgroundTitle{title="Architecture"}
::
- **Linear agents** (SARSA, Q-Learning, Monte Carlo): $\hat{q}(s, a; \mathbf{W}) = \mathbf{W}_a^\top \phi(s)$ with $\phi(s) \in \mathbb{R}^{128}$ (RAM observation)
- **DQN**: MLP network (128 → 128 → 64 → 18) trained with Adam optimizer, Huber loss, and periodic target network sync
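The linear value function and the SARSA and Q-Learning targets can be sketched in a few lines. This is an illustrative stand-in in plain Python, not the notebook's `SarsaAgent` or `QLearningAgent`; the class name `LinearQ`, the helper names, and the zero initialization are assumptions.

```python
class LinearQ:
    """q(s, a) = W_a . phi(s), with one weight vector per action."""

    def __init__(self, n_actions, n_features):
        self.W = [[0.0] * n_features for _ in range(n_actions)]

    def q(self, phi, a):
        return sum(w * x for w, x in zip(self.W[a], phi))

    def update(self, phi, a, target, alpha):
        # Semi-gradient step: W_a <- W_a + alpha * (target - q(s, a)) * phi(s)
        td_error = target - self.q(phi, a)
        self.W[a] = [w + alpha * td_error * x for w, x in zip(self.W[a], phi)]
        return td_error


def sarsa_target(model, r, phi_next, a_next, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    return r if done else r + gamma * model.q(phi_next, a_next)


def q_learning_target(model, r, phi_next, gamma, done):
    """Off-policy target: bootstrap from the greedy next action."""
    if done:
        return r
    return r + gamma * max(model.q(phi_next, b) for b in range(len(model.W)))
```

The two agents differ only in the target passed to `update`; a Monte Carlo agent would pass the episode return $G_t$ instead.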
::BackgroundTitle{title="Environment"}
::
- **Game**: Atari Tennis via PettingZoo (`tennis_v3`)
- **Observation**: RAM state (128 features)
- **Action Space**: 18 discrete actions
- **Agents**: 2 players (`first_0` and `second_0`)
::BackgroundTitle{title="Project Structure"}
::
```
.
├── Project_RL_DANJOU_VON-SIEMENS.ipynb # Main notebook
├── README.md # This file
├── checkpoints/ # Saved agent weights
│ ├── sarsa.pkl
│ ├── q_learning.pkl
│ ├── montecarlo.pkl
│ └── dqn.pkl
└── plots/ # Training & evaluation plots
├── SARSA_training_curves.png
├── Q-Learning_training_curves.png
├── MonteCarlo_training_curves.png
├── DQN_training_curves.png
├── evaluation_results.png
└── championship_matrix.png
```
::BackgroundTitle{title="Key Results"}
::
### Win Rate vs Random Baseline
| Agent | Win Rate |
|-------|----------|
| SARSA | 88.9% |
| Q-Learning | 41.2% |
| Monte Carlo | 47.1% |
| DQN | 6.2% |
### Championship Tournament
Full round-robin tournament where each agent faces every other agent in both positions (first_0/second_0).
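One way to enumerate such a tournament is to generate every ordered pair of agents, so each pairing is played once in each seat. The sketch below is not the notebook's match system; `play_match` is a hypothetical callback that returns the winner's name.

```python
from collections import Counter
from itertools import permutations


def round_robin_schedule(agent_names):
    """Every ordered pair (first_0, second_0), so each pairing is played in both seats."""
    return list(permutations(agent_names, 2))


def tally_wins(schedule, play_match):
    """Count wins per agent; play_match(first, second) must return the winner's name."""
    wins = Counter()
    for first, second in schedule:
        wins[play_match(first, second)] += 1
    return wins


agents = ["Random", "SARSA", "Q-Learning", "Monte Carlo", "DQN"]
schedule = round_robin_schedule(agents)
```

With five agents this gives 5 × 4 = 20 matches, and each agent plays four as `first_0` and four as `second_0`.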
::BackgroundTitle{title="Notebook Sections"}
::
1. **Configuration & Checkpoints**: Incremental training workflow with pickle serialization
2. **Utility Functions**: Observation normalization, ε-greedy policy
3. **Agent Definitions**: `RandomAgent`, `SarsaAgent`, `QLearningAgent`, `MonteCarloAgent`, `DQNAgent`
4. **Training Infrastructure**: `train_agent()`, `plot_training_curves()`
5. **Evaluation**: Match system, random baseline, round-robin tournament
6. **Results & Visualization**: Win rate plots, matchup matrix heatmap
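The two utilities named in the list, observation normalization and the ε-greedy policy, might look roughly like this. The function names and the division by 255 (RAM bytes range from 0 to 255) are assumptions; the notebook's own helpers may differ.

```python
import random


def normalize_ram(ram):
    """Scale raw RAM bytes (0..255) to floats in [0, 1]."""
    return [byte / 255.0 for byte in ram]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Ties resolve to the lowest action index.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Passing a seeded `random.Random` instance as `rng` makes exploration reproducible across runs.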
::BackgroundTitle{title="Known Issues"}
::
- **Monte Carlo & DQN**: Checkpoint loading issues — saved weights may not restore properly during evaluation (training works correctly)
::BackgroundTitle{title="Dependencies"}
::
- Python 3.13+
- `numpy`, `matplotlib`
- `torch`
- `gymnasium`, `ale-py`
- `pettingzoo`
- `tqdm`
::BackgroundTitle{title="Authors"}
::
- Arthur DANJOU
- Moritz VON SIEMENS


@@ -1,55 +0,0 @@
---
slug: rl-tennis
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: In progress
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---
::BackgroundTitle{title="Overview"}
::
This project serves as a practical application of theoretical Reinforcement Learning (RL) principles. The goal is to develop and train autonomous agents capable of mastering the complex dynamics of **Atari Tennis**, using the **Arcade Learning Environment (ALE)** via Farama Foundation's Gymnasium.
Instead of simply reaching a high score, this project focuses on **strategy optimization** and **comparative performance** through a multi-stage tournament architecture.
::BackgroundTitle{title="Technical Objectives"}
::
The project is divided into three core phases:
### 1. Algorithm Implementation
I am implementing several key RL algorithms covered during my academic curriculum to observe their behavioral differences in a high-dimensional state space:
* **Value-Based Methods:** Deep Q-Networks (DQN) and its variants (Double DQN, Dueling DQN).
* **Policy Gradient Methods:** Proximal Policy Optimization (PPO) for more stable policy updates over the game's discrete action space.
* **Exploration Strategies:** Implementing epsilon-greedy and entropy-based exploration to handle the sparse reward signals in tennis rallies.
### 2. The "Grand Slam" Tournament (Self-Play)
To determine the most robust strategy, I developed a competitive framework:
* **Agent vs. Agent:** Different algorithms (e.g., PPO vs. DQN) are pitted against each other in head-to-head matches.
* **Evolutionary Ranking:** Success is measured not just by points won, but by the ability to adapt to the opponent's playstyle (serve-and-volley vs. baseline play).
* **Winner Identification:** The agent with the highest win rate and most stable policy is crowned the "Optimal Strategist."
### 3. Benchmarking Against Atari Baselines
The final "Boss Level" involves taking my best-performing trained agent and testing it against the pre-trained, high-performance algorithms provided by the Atari/ALE benchmarks. This serves as a validation step to measure the efficiency of my custom implementations against industry-standard baselines.
::BackgroundTitle{title="Tech Stack & Environment"}
::
* **Environment:** [ALE (Arcade Learning Environment) - Tennis](https://ale.farama.org/environments/tennis/)
* **Frameworks:** Python, Gymnasium, PyTorch (for neural network backends).
* **Key Challenges:** Handling the long-horizon dependency of a tennis match and the high-frequency input of the Atari RAM/Pixels.
---
*This project is currently in the training phase. I am fine-tuning the reward function to discourage "passive" play and reward aggressive net approaches.*


@@ -18,11 +18,11 @@
},
"dependencies": {
"@libsql/client": "^0.17.0",
-"@nuxt/content": "3.11.2",
-"@nuxt/eslint": "1.15.1",
-"@nuxt/ui": "^4.4.0",
-"@nuxthub/core": "0.10.6",
-"@nuxtjs/mdc": "0.20.1",
+"@nuxt/content": "3.12.0",
+"@nuxt/eslint": "1.15.2",
+"@nuxt/ui": "4.5.1",
+"@nuxthub/core": "0.10.7",
+"@nuxtjs/mdc": "0.20.2",
"@nuxtjs/seo": "3.4.0",
"@vueuse/core": "^14.2.1",
"@vueuse/math": "^14.2.1",
@@ -30,23 +30,23 @@
"drizzle-kit": "^0.31.9",
"drizzle-orm": "^0.45.1",
"nuxt": "4.3.1",
-"nuxt-studio": "1.3.2",
-"vue": "3.5.28",
-"vue-router": "5.0.2",
+"nuxt-studio": "1.4.0",
+"vue": "3.5.30",
+"vue-router": "5.0.3",
"zod": "^4.3.6"
},
"devDependencies": {
-"@iconify-json/devicon": "1.2.58",
+"@iconify-json/devicon": "1.2.59",
"@iconify-json/file-icons": "^1.2.2",
"@iconify-json/logos": "^1.2.10",
"@iconify-json/ph": "^1.2.2",
"@iconify-json/twemoji": "1.2.5",
-"@iconify-json/vscode-icons": "1.2.43",
-"@types/node": "25.2.3",
+"@iconify-json/vscode-icons": "1.2.45",
+"@types/node": "25.4.0",
"@vueuse/nuxt": "14.2.1",
-"eslint": "10.0.0",
+"eslint": "10.0.3",
"typescript": "^5.9.3",
-"vue-tsc": "3.2.4",
-"wrangler": "4.66.0"
+"vue-tsc": "3.2.5",
+"wrangler": "4.71.0"
}
}