Mirror of https://github.com/ArthurDanjou/artsite.git, synced 2026-03-16 05:09:46 +01:00
Refactor project documentation and structure
- Updated data visualization project documentation to remove the incomplete-documentation warning.
- Deleted the glm-financial-assets project file and replaced it with the glm-implied-volatility project file, detailing a comprehensive study on implied volatility prediction using GLMs and machine learning.
- Marked the n8n automations project as completed.
- Added a new project on reinforcement learning applied to Atari Tennis, detailing agent comparisons and results.
- Removed the outdated rl-tennis project file.
- Updated package dependencies in package.json for improved stability and performance.
@@ -15,10 +15,6 @@ tags:
 icon: i-ph-chart-bar-duotone
 ---
 
-::warning
-The project is complete, but the documentation is still being expanded with more details.
-::
-
 This project involves building an interactive data visualization application using R and R Shiny. The goal is to deliver dynamic, explorable visualizations that let users interact with the data in meaningful ways.
 
 ::BackgroundTitle{title="Technologies & Tools"}
@@ -1,71 +0,0 @@
---
slug: implied-volatility-modeling
title: Implied Volatility Surface Modeling
type: Academic Project
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
shortDescription: Predicting the SPX volatility surface using GLMs and black-box models on 1.2 million observations.
publishedAt: 2026-02-28
readingTime: 3
status: In progress
tags:
- R
- GLM
- Finance
- Machine Learning
icon: i-ph-graph-duotone
---

This project targets high-precision calibration of the **Implied Volatility Surface** using a large-scale dataset of S&P 500 (SPX) European options.

The core objective is to stress-test classic statistical models against modern predictive algorithms. **Generalized Linear Models (GLMs)** provide a transparent baseline, while more complex "black-box" architectures are evaluated on whether their accuracy gains justify reduced interpretability in a risk management context.

::BackgroundTitle{title="Dataset & Scale"}
::

The modeling is performed on a high-dimensional dataset with over **1.2 million observations**.

- **Target Variable**: `implied_vol_ref` (implied volatility).
- **Features**: Option strike price ($K$), underlying asset price ($S$), and time to maturity ($\tau$).
- **Volume**: A training set of $1,251,307$ rows and a test set of identical size.

::BackgroundTitle{title="Modeling Methodology"}
::

The project follows a rigorous statistical pipeline to compare two modeling philosophies:

### 1. The Statistical Baseline (GLM)
Using R's GLM framework, I implement models with targeted link functions and error distributions (such as **Gamma** or **Inverse Gaussian**) to capture the global structure of the volatility surface. These models serve as the benchmark for transparency and stability.

### 2. The Black-Box Challenge
To capture local non-linearities such as the volatility smile and skew, I explore more complex architectures. Performance is evaluated by **Root Mean Squared Error (RMSE)** relative to the GLM baselines.

### 3. Feature Engineering
Key financial indicators are derived from the raw data:
- **Moneyness**: Calculated as the ratio $K/S$.
- **Temporal Dynamics**: Transformations of time to maturity to linearize the term structure.

::BackgroundTitle{title="Evaluation & Reproducibility"}
::

Performance is measured strictly via RMSE on the original scale of the target variable. To ensure reproducibility and precise comparisons across model iterations, a fixed random seed is maintained throughout the workflow.

```r
set.seed(2025)

TrainData <- read.csv("train_ISF.csv", stringsAsFactors = FALSE)
TestX <- read.csv("test_ISF.csv", stringsAsFactors = FALSE)

rmse_eval <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}
```

::BackgroundTitle{title="Critical Analysis"}
::

Beyond pure prediction, the project addresses:

- **Model Limits**: Identifying market regimes where models fail (e.g., deep out-of-the-money options).
- **Interpretability**: Quantifying the trade-off between complexity and practical utility in a risk management context.
- **Future Extensions**: Considering richer dynamics, such as historical volatility or skew-specific targets.
content/projects/glm-implied-volatility.md (new file, 336 lines)
@@ -0,0 +1,336 @@
---
slug: implied-volatility-prediction-from-options-data
title: Implied Volatility Prediction from Options Data
type: Academic Project
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
shortDescription: Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.
publishedAt: 2026-02-28
readingTime: 3
status: Completed
tags:
- R
- GLM
- Finance
- Machine Learning
- Statistical Modeling
icon: i-ph-graph-duotone
---

> **M2 Master's Project** – Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.

This project explores the prediction of **implied volatility** from options market data, combining classical statistical methods with modern machine learning approaches. The analysis covers data preprocessing, feature engineering, model benchmarking, and interpretability analysis using real-world financial panel data.

- **GitHub Repository:** [Implied-Volatility-from-Options-Data](https://github.com/ArthurDanjou/Implied-Volatility-from-Options-Data)

---

::BackgroundTitle{title="Project Overview"}
::

### Problem Statement

Implied volatility represents the market's forward-looking expectation of an asset's future volatility. Accurate prediction is crucial for:
- **Option pricing** and valuation
- **Risk management** and hedging strategies
- **Trading strategies** based on volatility arbitrage

### Dataset

The project uses a comprehensive panel dataset tracking **3,887 assets** across **544 observation dates** (2019-2022):

| File | Description | Shape |
|------|-------------|-------|
| `Train_ISF.csv` | Training data with target variable | 1,909,465 rows × 21 columns |
| `Test_ISF.csv` | Test data for prediction | 1,251,308 rows × 18 columns |
| `hat_y.csv` | Final predictions from both models | 1,251,308 rows × 2 columns |

### Key Variables

**Target Variable:**
- `implied_vol_ref` – The implied volatility to predict

**Feature Categories:**
- **Identifiers:** `asset_id`, `obs_date`
- **Market Activity:** `call_volume`, `put_volume`, `call_oi`, `put_oi`, `total_contracts`
- **Volatility Metrics:** `realized_vol_short`, `realized_vol_mid1-3`, `realized_vol_long1-4`, `market_vol_index`
- **Option Structure:** `strike_dispersion`, `maturity_count`

---

::BackgroundTitle{title="Methodology"}
::

### Data Pipeline

```
Raw Data
↓
┌──────────────────────────────────────┐
│ Data Splitting (Chronological 80/20) │
│ - Training: 2019-10 to 2021-07       │
│ - Validation: 2021-07 to 2022-03     │
└──────────────────────────────────────┘
↓
┌──────────────────────────────────────┐
│ Feature Engineering                  │
│ - Aggregation of volatility horizons │
│ - Creation of financial indicators   │
└──────────────────────────────────────┘
↓
┌──────────────────────────────────────┐
│ Data Preprocessing (tidymodels)      │
│ - Winsorization (99.5th percentile)  │
│ - Log/Yeo-Johnson transformations    │
│ - Z-score normalization              │
│ - PCA (95% variance retention)       │
└──────────────────────────────────────┘
↓
Three Datasets Generated:
├── Tree-based (raw, scale-invariant)
├── Linear (normalized, winsorized)
└── PCA (dimensionality-reduced)
```
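The winsorization and normalization steps of the preprocessing box can be sketched as follows. This is a minimal Python illustration on synthetic data; the project itself performs these steps with R's tidymodels recipes, and the function names here are placeholders:

```python
import numpy as np

def winsorize_upper(x, q=0.995):
    """Cap values above the q-th empirical quantile (upper winsorization)."""
    return np.minimum(x, np.quantile(x, q))

def zscore(x):
    """Standardize to zero mean and unit variance."""
    return (x - x.mean()) / x.std()

rng = np.random.default_rng(2025)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # heavy right tail

capped = winsorize_upper(raw)      # tame the most extreme 0.5% of values
features = zscore(np.log(capped))  # log-transform, then z-score normalize
```

Note that only the "Linear" dataset goes through this treatment; the tree-based dataset stays raw because tree splits are invariant to monotone rescaling.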
### Feature Engineering

New financial indicators are created to capture market dynamics:

| Feature | Description | Formula |
|---------|-------------|---------|
| `pulse_ratio` | Volatility trend direction | RV_short / RV_long |
| `stress_spread` | Asset vs market stress | RV_short - Market_VIX |
| `put_call_ratio_volume` | Immediate market stress | Put_Volume / Call_Volume |
| `put_call_ratio_oi` | Long-term risk structure | Put_OI / Call_OI |
| `liquidity_ratio` | Market depth | Total_Volume / Total_OI |
| `option_dispersion` | Market uncertainty | Strike_Dispersion / Total_Contracts |
| `put_low_strike` | Downside protection density | Strike_Dispersion / Put_OI |
| `put_proportion` | Hedging vs speculation | Put_Volume / Total_Volume |
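As an illustration of how such ratio features are computed, here is a Python/pandas sketch. The actual analysis is in R; the column names follow the dataset description above, but the toy values are invented:

```python
import pandas as pd

# Two toy rows; values are invented for illustration only.
df = pd.DataFrame({
    "realized_vol_short": [0.25, 0.40],
    "realized_vol_long1": [0.20, 0.32],
    "call_volume": [1200, 800],
    "put_volume": [900, 1600],
})

# Ratio features mirroring the table's formulas
df["pulse_ratio"] = df["realized_vol_short"] / df["realized_vol_long1"]
df["put_call_ratio_volume"] = df["put_volume"] / df["call_volume"]
```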
---

::BackgroundTitle{title="Models Implemented"}
::

### Linear Models

| Model | Description | Best RMSE |
|-------|-------------|-----------|
| **OLS** | Ordinary Least Squares | 11.26 |
| **Ridge** | L2 regularization | 12.48 |
| **Lasso** | L1 regularization (variable selection) | 12.03 |
| **Elastic Net** | L1 + L2 combined | ~12.03 |
| **PLS** | Partial Least Squares (on PCA) | 12.79 |

### Linear Mixed-Effects Models (LMM)

Advanced panel data models accounting for asset-specific effects:

| Model | Features | RMSE |
|-------|----------|------|
| LMM Baseline | All variables + Random Intercept | 8.77 |
| LMM Reduced | Collinearity removal | ~8.77 |
| LMM Interactions | Financial interaction terms | ~8.77 |
| LMM + Quadratic | Convexity terms (vol of vol) | 8.41 |
| **LMM + Random Slopes (mod_lmm_5)** | Asset-specific betas | **8.10** ⭐ |

### Tree-Based Models

| Model | Strategy | Validation RMSE | Training RMSE |
|-------|----------|-----------------|---------------|
| **XGBoost** | Level-wise, Bayesian tuning | 10.70 | 0.57 |
| **LightGBM** | Leaf-wise, feature regularization | **10.61** ⭐ | 10.90 |
| Random Forest | Bagging | DNF* | - |

*DNF: Did Not Finish (computational constraints)

### Neural Networks

| Model | Architecture | Status |
|-------|--------------|--------|
| MLP | 128-64 units, tanh activation | Failed to converge |

---

::BackgroundTitle{title="Results Summary"}
::

### Model Comparison

```
RMSE Performance (Lower is Better)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Linear Mixed-Effects (LMM5)       8.38  ████████████████████  Best Linear
Linear Mixed-Effects (LMM4)       8.41  ███████████████████
Linear Mixed-Effects (Baseline)   8.77  ██████████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
LightGBM                         10.61  ███████████████  Best Non-Linear
XGBoost                          10.70  ██████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OLS (with interactions)          11.26  █████████████
OLS (baseline)                   12.01  ███████████
Lasso                            12.03  ███████████
Ridge                            12.48  ██████████
PLS                              12.79  █████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

### Key Findings

1. **Best Linear Model:** LMM with Random Slopes (RMSE = 8.38)
   - Captures asset-specific volatility sensitivities
   - Includes quadratic terms for convexity effects

2. **Best Non-Linear Model:** LightGBM (RMSE = 10.61)
   - Superior generalization vs XGBoost
   - Feature regularization prevents overfitting

3. **Interpretability Insights (SHAP Analysis):**
   - `realized_vol_mid` dominates (57% of gain)
   - Volatility clustering confirmed as primary driver
   - Non-linear regime switching in `stress_spread`

---

::BackgroundTitle{title="Repository Structure"}
::

```
PROJECT/
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd    # Main analysis (Quarto)
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.html   # Rendered report
├── packages.R                                        # R dependencies installer
├── Train_ISF.csv                                     # Training data (~1.9M rows)
├── Test_ISF.csv                                      # Test data (~1.25M rows)
├── hat_y.csv                                         # Final predictions
├── README.md                                         # This file
└── results/
    ├── lightgbm/                                     # LightGBM model outputs
    └── xgboost/                                      # XGBoost model outputs
```

---

::BackgroundTitle{title="Getting Started"}
::

### Prerequisites

- **R** ≥ 4.0
- Required packages (auto-installed via `packages.R`)

### Installation

```r
# Install all dependencies
source("packages.R")
```

Or manually install key packages:

```r
install.packages(c(
  "tidyverse", "tidymodels", "caret", "glmnet",
  "lme4", "lmerTest", "xgboost", "lightgbm",
  "ranger", "pls", "shapviz", "rBayesianOptimization"
))
```

### Running the Analysis

1. **Open the Quarto document:**

   ```r
   # In RStudio
   rstudioapi::navigateToFile("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
   ```

2. **Render the document:**

   ```r
   quarto::quarto_render("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
   ```

3. **Or run specific sections interactively** using the code chunks in the `.qmd` file.

---

::BackgroundTitle{title="Technical Details"}
::

### Data Split Strategy

- **Chronological split** at the 80th percentile of dates
- Prevents look-ahead bias and data leakage
- Training: ~1.53M observations
- Validation: ~376K observations
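A chronological split of this kind fits in a few lines. Here it is sketched in Python/pandas for illustration (the project's pipeline is in R, and `chronological_split` is a hypothetical helper name):

```python
import pandas as pd

def chronological_split(df, date_col="obs_date", train_frac=0.8):
    """Split at the train_frac quantile of observation dates, so the
    validation set lies strictly in the future of the training set."""
    cutoff = df[date_col].quantile(train_frac)
    train = df[df[date_col] <= cutoff]
    valid = df[df[date_col] > cutoff]
    return train, valid

# Synthetic daily panel covering the study period
panel = pd.DataFrame({
    "obs_date": pd.date_range("2019-10-01", "2022-03-01", freq="D"),
    "y": 0.0,
})
train, valid = chronological_split(panel)
```

Because the cutoff is a date rather than a random row mask, no future observation can leak into the training set.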
### Hyperparameter Tuning

- **Method:** Bayesian Optimization (Gaussian Processes)
- **Acquisition:** Expected Improvement (UCB)
- **Goal:** Minimize validation RMSE (the optimizer maximizes its negative)

### Evaluation Metric

**Exponential RMSE** on the original scale:

$$
RMSE_{real} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \exp(\hat{y}_{\log, i}) - y_i \right)^2}
$$

Models are trained on the log-transformed target for variance stabilization.
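Concretely, the metric maps log-scale predictions back with an exponential before computing RMSE. A small Python sketch of that formula (the function name and toy values are illustrative, not from the project):

```python
import numpy as np

def rmse_original_scale(y_log_pred, y_true):
    """RMSE after mapping log-scale predictions back to the
    original scale with exp()."""
    return np.sqrt(np.mean((np.exp(y_log_pred) - y_true) ** 2))

y_true = np.array([10.0, 20.0, 30.0])
y_log_pred = np.log(np.array([11.0, 19.0, 33.0]))  # errors of +1, -1, +3
score = rmse_original_scale(y_log_pred, y_true)
```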
---

::BackgroundTitle{title="Key Concepts"}
::

### Financial Theories Applied

1. **Volatility Clustering** – Past volatility predicts future volatility
2. **Variance Risk Premium** – Spread between implied and realized volatility
3. **Fear Gauge** – Put-call ratio as a sentiment indicator
4. **Mean Reversion** – Volatility tends to return to its long-term average
5. **Liquidity Premium** – Illiquid assets command higher volatility

### Statistical Methods

- Panel data modeling with fixed and random effects
- Principal Component Analysis (PCA)
- Bayesian hyperparameter optimization
- SHAP values for model interpretability

---

::BackgroundTitle{title="Authors"}
::

**Team:**
- Arthur DANJOU
- Camille LEGRAND
- Axelle MERIC
- Moritz VON SIEMENS

**Course:** Classification and Regression (M2)
**Academic Year:** 2025-2026

---

::BackgroundTitle{title="Notes"}
::

- **Computational Constraints:** Some models (Random Forest, MLP) failed due to hardware limitations (16GB RAM, CPU-only)
- **Reproducibility:** Set `seed = 2025` for consistent results
- **Language:** Analysis documented in English, course materials in French

---

::BackgroundTitle{title="References"}
::

Key R packages used:
- `tidymodels` – Modern modeling framework
- `glmnet` – Regularized regression
- `lme4` / `lmerTest` – Mixed-effects models
- `xgboost` / `lightgbm` – Gradient boosting
- `shapviz` – Model interpretability
- `rBayesianOptimization` – Hyperparameter tuning
@@ -6,7 +6,7 @@ description: An academic project exploring the automation of GenAI workflows usi
 shortDescription: Automating GenAI workflows with n8n and Ollama in a self-hosted environment.
 publishedAt: 2026-03-15
 readingTime: 2
-status: In progress
+status: Completed
 tags:
 - n8n
 - Gemini
content/projects/rl-tennis-atari-game.md (new file, 119 lines)
@@ -0,0 +1,119 @@
---
slug: rl-tennis-atari-game
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: Completed
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---

Comparison of Reinforcement Learning algorithms on Atari Tennis (`ALE/Tennis-v5` via Gymnasium/PettingZoo).

- **GitHub Repository:** [Tennis-Atari-Game](https://github.com/ArthurDanjou/Tennis-Atari-Game)

::BackgroundTitle{title="Overview"}
::

This project implements and compares five RL agents playing Atari Tennis against the built-in AI and in head-to-head tournaments.

::BackgroundTitle{title="Algorithms"}
::

| Agent | Type | Policy | Update Rule |
|-------|------|--------|-------------|
| **Random** | Baseline | Uniform random | None |
| **SARSA** | TD(0), on-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Q-Learning** | TD(0), off-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \max_{a'} \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Monte Carlo** | First-visit MC | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (G_t - \hat{q}(s, a)) \cdot \phi(s)$ |
| **DQN** | Deep Q-Network | ε-greedy | MLP (256→256) with experience replay & target network |
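The update rules in the table can be illustrated for the linear agents. Here is a Python sketch of the Q-Learning row (off-policy TD(0) with linear function approximation); variable names are mine, not the notebook's:

```python
import numpy as np

def q_learning_update(W, phi_s, a, r, phi_s_next, alpha=0.01, gamma=0.99):
    """W[a] <- W[a] + alpha * (r + gamma * max_a' q(s',a') - q(s,a)) * phi(s)
    for a linear value function q(s, a) = W[a] @ phi(s)."""
    td_target = r + gamma * np.max(W @ phi_s_next)
    td_error = td_target - W[a] @ phi_s
    W[a] += alpha * td_error * phi_s
    return W

n_actions, n_features = 18, 128            # Atari action space, RAM features
W = np.zeros((n_actions, n_features))
phi = np.full(n_features, 1.0 / n_features)
W = q_learning_update(W, phi, a=3, r=1.0, phi_s_next=phi)
```

The SARSA update differs only in the target: it uses the value of the action actually taken next instead of the max over actions.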
::BackgroundTitle{title="Architecture"}
::

- **Linear agents** (SARSA, Q-Learning, Monte Carlo): $\hat{q}(s, a; \mathbf{W}) = \mathbf{W}_a^\top \phi(s)$ with $\phi(s) \in \mathbb{R}^{128}$ (RAM observation)
- **DQN**: MLP network (128 → 128 → 64 → 18) trained with the Adam optimizer, Huber loss, and periodic target network sync
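For shape intuition only, a forward pass through an MLP with those layer sizes can be sketched in plain NumPy with random placeholder weights (the actual agent is a trained PyTorch network; nothing here is the notebook's code):

```python
import numpy as np

def mlp_forward(x, sizes=(128, 128, 64, 18), seed=0):
    """Forward pass through a ReLU MLP with the given layer sizes.
    Weights are random placeholders, not trained parameters."""
    rng = np.random.default_rng(seed)
    h = x
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_in, n_out))
        h = h @ W
        if i < len(sizes) - 2:      # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h

ram_obs = np.ones(128) / 255.0      # normalized 128-byte RAM observation
q_values = mlp_forward(ram_obs)     # one Q-value per discrete action
```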
::BackgroundTitle{title="Environment"}
::

- **Game**: Atari Tennis via PettingZoo (`tennis_v3`)
- **Observation**: RAM state (128 features)
- **Action Space**: 18 discrete actions
- **Agents**: 2 players (`first_0` and `second_0`)

::BackgroundTitle{title="Project Structure"}
::

```
.
├── Project_RL_DANJOU_VON-SIEMENS.ipynb   # Main notebook
├── README.md                             # This file
├── checkpoints/                          # Saved agent weights
│   ├── sarsa.pkl
│   ├── q_learning.pkl
│   ├── montecarlo.pkl
│   └── dqn.pkl
└── plots/                                # Training & evaluation plots
    ├── SARSA_training_curves.png
    ├── Q-Learning_training_curves.png
    ├── MonteCarlo_training_curves.png
    ├── DQN_training_curves.png
    ├── evaluation_results.png
    └── championship_matrix.png
```

::BackgroundTitle{title="Key Results"}
::

### Win Rate vs Random Baseline

| Agent | Win Rate |
|-------|----------|
| SARSA | 88.9% |
| Q-Learning | 41.2% |
| Monte Carlo | 47.1% |
| DQN | 6.2% |

### Championship Tournament

Full round-robin tournament where each agent faces every other agent in both positions (`first_0`/`second_0`).

::BackgroundTitle{title="Notebook Sections"}
::

1. **Configuration & Checkpoints** — Incremental training workflow with pickle serialization
2. **Utility Functions** — Observation normalization, ε-greedy policy
3. **Agent Definitions** — `RandomAgent`, `SarsaAgent`, `QLearningAgent`, `MonteCarloAgent`, `DQNAgent`
4. **Training Infrastructure** — `train_agent()`, `plot_training_curves()`
5. **Evaluation** — Match system, random baseline, round-robin tournament
6. **Results & Visualization** — Win rate plots, matchup matrix heatmap
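The ε-greedy policy listed among the utility functions can be sketched generically (the notebook's actual signature may differ):

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a uniformly random action with probability epsilon,
    otherwise the greedy (highest-valued) action."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

rng = np.random.default_rng(42)
q = np.array([0.1, 0.9, 0.3])
action = epsilon_greedy(q, epsilon=0.0, rng=rng)  # epsilon=0 is purely greedy
```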
::BackgroundTitle{title="Known Issues"}
::

- **Monte Carlo & DQN**: Checkpoint loading issues — saved weights may not restore properly during evaluation (training works correctly)

::BackgroundTitle{title="Dependencies"}
::

- Python 3.13+
- `numpy`, `matplotlib`
- `torch`
- `gymnasium`, `ale-py`
- `pettingzoo`
- `tqdm`

::BackgroundTitle{title="Authors"}
::

- Arthur DANJOU
- Moritz VON SIEMENS
@@ -1,55 +0,0 @@
---
slug: rl-tennis
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: In progress
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---

::BackgroundTitle{title="Overview"}
::

This project serves as a practical application of theoretical Reinforcement Learning (RL) principles. The goal is to develop and train autonomous agents capable of mastering the complex dynamics of **Atari Tennis**, using the **Arcade Learning Environment (ALE)** via the Farama Foundation's Gymnasium.

Instead of simply chasing a high score, this project focuses on **strategy optimization** and **comparative performance** through a multi-stage tournament architecture.

::BackgroundTitle{title="Technical Objectives"}
::

The project is divided into three core phases:

### 1. Algorithm Implementation
I am implementing several key RL algorithms covered during my academic curriculum to observe their behavioral differences in a high-dimensional state space:
* **Value-Based Methods:** Deep Q-Networks (DQN) and its variants (Double DQN, Dueling DQN).
* **Policy Gradient Methods:** Proximal Policy Optimization (PPO) for more stable policy updates.
* **Exploration Strategies:** Epsilon-greedy and entropy-based exploration to handle the sparse reward signals of tennis rallies.

### 2. The "Grand Slam" Tournament (Self-Play)
To determine the most robust strategy, I developed a competitive framework:
* **Agent vs. Agent:** Different algorithms (e.g., PPO vs. DQN) are pitted against each other in head-to-head matches.
* **Evolutionary Ranking:** Success is measured not just by points won, but by the ability to adapt to the opponent's playstyle (serve-and-volley vs. baseline play).
* **Winner Identification:** The agent with the highest win rate and most stable policy is crowned the "Optimal Strategist."

### 3. Benchmarking Against Atari Baselines
The final "Boss Level" involves taking my best-performing trained agent and testing it against the pre-trained, high-performance algorithms provided by the Atari/ALE benchmarks. This serves as a validation step to measure the efficiency of my custom implementations against industry-standard baselines.

::BackgroundTitle{title="Tech Stack & Environment"}
::

* **Environment:** [ALE (Arcade Learning Environment) - Tennis](https://ale.farama.org/environments/tennis/)
* **Frameworks:** Python, Gymnasium, PyTorch (for neural network backends).
* **Key Challenges:** Handling the long-horizon dependency of a tennis match and the high-frequency input of the Atari RAM/Pixels.

---

*This project is currently in the training phase. I am fine-tuning the reward function to discourage "passive" play and reward aggressive net approaches.*