Mirror of https://github.com/ArthurDanjou/artsite.git, synced 2026-03-16 05:09:46 +01:00

Compare commits: `evlog-impl`...`c45b1d6f25` (9 commits)

Commits: c45b1d6f25, 1537343e44, ac5ccb3555, 6d0e55e188, 5e743cb13e, 68a3b0468b, 0703ac7ff7, 20f17fba4e, 81747fb458
```diff
@@ -5,7 +5,7 @@ defineProps<{
 </script>
 
 <template>
-  <h1 class="mb-2 font-bold text-7xl text-transparent opacity-15 text-stroke-neutral-400 dark:text-stroke-neutral-500 text-stroke-2 -translate-x-16">
+  <h1 class="w-full md:w-[110%] mt-4 mb-2 font-bold text-4xl md:text-7xl text-transparent opacity-15 text-stroke-neutral-500 text-stroke-2 md:-translate-x-16">
     {{ title }}
   </h1>
 </template>
```
```diff
@@ -96,63 +96,65 @@ const grouped_projects = computed(() => {
         />
       </div>
       <div class="space-y-2">
-        <div class="flex items-center gap-2">
-          <h1 class="font-bold">
-            {{ project.title }}
-          </h1>
-          <UTooltip
-            text="Favorite"
-            :delay-duration="4"
-          >
-            <UBadge
-              v-if="project.favorite"
-              color="amber"
-              variant="subtle"
-              size="sm"
-              icon="i-ph-star-four-duotone"
-            />
-          </UTooltip>
-          <UTooltip
-            text="In Progress"
-            :delay-duration="4"
-          >
-            <UBadge
-              v-if="project.status === 'In progress'"
-              color="blue"
-              variant="soft"
-              size="sm"
-              icon="i-ph-hourglass-duotone"
-            />
-          </UTooltip>
-          <UTooltip
-            text="Archived"
-            :delay-duration="4"
-          >
-            <UBadge
-              v-if="project.status === 'Archived'"
-              color="gray"
-              variant="soft"
-              size="sm"
-              icon="i-ph-archive-duotone"
-            />
-          </UTooltip>
-        </div>
+        <h1 class="font-bold">
+          {{ project.title }}
+        </h1>
         <p class="italic text-xs text-muted">
           {{ project.shortDescription }}
         </p>
-        <div
-          v-if="project.tags?.length"
-          class="flex flex-wrap gap-1.5"
-        >
-          <UBadge
-            v-for="tag in project.tags"
-            :key="tag"
-            color="neutral"
-            variant="outline"
-            size="xs"
-          >
-            {{ tag }}
-          </UBadge>
+        <div class="flex items-center justify-between">
+          <div
+            v-if="project.tags?.length"
+            class="flex flex-wrap gap-1.5"
+          >
+            <UBadge
+              v-for="tag in project.tags"
+              :key="tag"
+              color="neutral"
+              variant="outline"
+              size="xs"
+            >
+              {{ tag }}
+            </UBadge>
+          </div>
+          <div class="flex gap-2 items-center justify-center">
+            <UTooltip
+              text="Favorite"
+              :delay-duration="4"
+            >
+              <UBadge
+                v-if="project.favorite"
+                color="amber"
+                variant="subtle"
+                size="sm"
+                icon="i-ph-star-four-duotone"
+              />
+            </UTooltip>
+            <UTooltip
+              text="In Progress"
+              :delay-duration="4"
+            >
+              <UBadge
+                v-if="project.status === 'In progress'"
+                color="blue"
+                variant="soft"
+                size="sm"
+                icon="i-ph-hourglass-duotone"
+              />
+            </UTooltip>
+            <UTooltip
+              text="Archived"
+              :delay-duration="4"
+            >
+              <UBadge
+                v-if="project.status === 'Archived'"
+                color="gray"
+                variant="soft"
+                size="sm"
+                icon="i-ph-archive-duotone"
+              />
+            </UTooltip>
+          </div>
+        </div>
       </div>
     </div>
```
```diff
@@ -9,11 +9,12 @@ Research demands deep focus, but breakthrough ideas often come from stepping bac
 
 ---
 
-## ⚡ High-Velocity Interests
+::BackgroundTitle{title="High-Velocity Interests"}
+::
 
 I am drawn to environments where strategy, speed, and precision intersect. These are not just pastimes, but exercises in optimization under constraints.
 
-::div{class="grid grid-cols-1 md:grid-cols-2 gap-6"}
+:::div{class="grid grid-cols-1 md:grid-cols-2 gap-6"}
 
 ::card{title="Motorsports Strategy" icon="i-ph-flag-checkered-duotone"}
 **Formula 1 Enthusiast**
@@ -27,15 +28,16 @@ Team sports are my foundation for resilience. As a :hover-text{text="former Rugb
 * **Takeaway:** Collective intelligence always outperforms individual brilliance.
 ::
 
-::
+:::
 
 ---
 
-## 🌍 Perspectives & Culture
+::BackgroundTitle{title="Perspectives & Culture"}
+::
 
 Curiosity is the fuel of a researcher. Expanding my horizon helps me approach problems with fresh angles.
 
-::div{class="grid grid-cols-1 md:grid-cols-2 gap-6"}
+:::div{class="grid grid-cols-1 md:grid-cols-2 gap-6"}
 
 ::card{title="Global Exploration" icon="i-ph-airplane-tilt-duotone"}
 **Travel & Adaptation**
@@ -48,14 +50,15 @@ Exposure to diverse systems fosters adaptability. From the history of **Egypt**
 As a long-time supporter of **PSG**, I appreciate the tactical analysis and performance management at the highest level. Football is a game of :hover-text{text="spatial optimization" hover="Controlling space & transitions"}, much like architecting a neural network.
 ::
 
-::
+:::
 
 ---
 
-## 🎵 Creative Patterns
+::BackgroundTitle{title="Creative Patterns"}
+::
 
 **Music** serves as my cognitive reset. Training my ear to recognize harmony and structure translates directly to identifying elegant solutions in system design. It reinforces my belief that great engineering, like great music, requires both **technical precision** and **artistic intuition**.
 
 ::card{title="Philosophy" icon="i-ph-sparkle-duotone"}
 "Balance is not something you find, it's something you create."
 ::
 ::
```
```diff
@@ -20,11 +20,13 @@ icon: i-ph-flask-duotone
 
 [**ArtLab**](https://go.arthurdanjou.fr/status) is my personal homelab: a controlled environment for experimenting with DevOps, distributed systems, and private cloud architecture.
 
-## 🏗️ Architectural Philosophy
+::BackgroundTitle{title="Architectural Philosophy"}
+::
 
 The infrastructure follows a **Zero Trust** model. Access is restricted to a private mesh VPN using **Tailscale (WireGuard)**, removing the need for open ports. For select public endpoints, **Cloudflare Tunnels** provide a hardened entry point, keeping my public IP hidden while preserving end-to-end encryption from the edge to the origin.
 
-## 🛠️ Service Stack
+::BackgroundTitle{title="Service Stack"}
+::
 
 Services are grouped by functional domain to keep orchestration clean and scalable:
 
@@ -51,7 +53,8 @@ Services are grouped by functional domain to keep orchestration clean and scalab
 * **MQTT Broker**: Low-latency message bus for device-to-service communication.
 * **Zigbee2MQTT**: Bridge for local Zigbee device control without cloud dependencies.
 
-## 🖥️ Hardware Specifications
+::BackgroundTitle{title="Hardware Specifications"}
+::
 
 | Component | Hardware | Role |
 | :--- | :--- | :--- |
```
```diff
@@ -20,7 +20,8 @@ icon: i-ph-globe-hemisphere-west-duotone
 
 More than a static site, it is a modern **Portfolio** designed to be fast, accessible, and type-safe. It also acts as a live production environment where I test the latest frontend technologies and Edge computing paradigms.
 
-## ⚡ The Nuxt Stack Architecture
+::BackgroundTitle{title="The Nuxt Stack Architecture"}
+::
 
 This project is built entirely on the **Nuxt ecosystem**, leveraging module synergy for strong developer experience and performance.
```
```diff
@@ -23,7 +23,8 @@ The projects are organized into three main sections:
 - **M1** – First year of the Master's degree in Mathematics
 - **M2** – Second year of the Master's degree in Mathematics
 
-## 📁 File Structure
+::BackgroundTitle{title="File Structure"}
+::
 
 - `L3`
 - `Analyse Matricielle`
@@ -52,7 +53,8 @@ The projects are organized into three main sections:
 - `VBA`
 - `SQL`
 
-## 🛠️ Technologies & Tools
+::BackgroundTitle{title="Technologies & Tools"}
+::
 
 - **[Python](https://www.python.org)**: A high-level, interpreted programming language, widely used for data science, machine learning, and scientific computing.
 - **[R](https://www.r-project.org)**: A statistical computing environment, perfect for data analysis and visualization.
```
content/projects/climate-issues.md (new file, 82 lines):

---
slug: climate-issues
title: Wind Risk Modeling - The 1999 Martin Storm
type: Academic Project
description: An advanced study on wind risk modeling and meteorological hazard assessment, focusing on the historical Martin Storm of December 1999. Combines data analysis, statistical modeling, and GIS mapping to quantify natural disaster impacts.
shortDescription: A comprehensive analysis of wind risk modeling during the 1999 Martin Storm using statistical methods and spatial analysis.
publishedAt: 2026-02-17
readingTime: 5
status: Completed
tags:
- Meteorology
- Risk Assessment
- Data Analysis
- Climate Science
- GIS
- Statistics
icon: i-ph-wind-duotone
---

::BackgroundTitle{title="Overview"}
::

This project is a detailed study of **wind risk assessment and modeling** in the context of natural disasters, using the **December 1999 Martin Storm** as a case study. The analysis combines statistical methods, meteorological data, and spatial analysis techniques to understand and quantify the impacts of extreme wind events.

::BackgroundTitle{title="Objectives"}
::

The primary objectives of this research were:

1. **Characterize extreme meteorological events** and their propagation patterns
2. **Model wind risk** using statistical and probabilistic approaches
3. **Assess spatial distribution** of hazards using GIS mapping techniques
4. **Quantify economic and environmental impacts** of the storm
5. **Develop predictive models** for future risk assessment and disaster preparedness

::BackgroundTitle{title="Methodology"}
::

### Data Sources

- Historical meteorological records from the 1999 Martin Storm
- Wind speed measurements from weather stations across France
- Satellite imagery and atmospheric pressure data
- Damage assessments and economic loss records

### Analytical Techniques

- **Time-series analysis** of wind speed and atmospheric pressure
- **Spatial interpolation** using kriging and other geostatistical methods
- **Probability distribution fitting** (Weibull, Gumbel, and Log-Normal distributions)
- **Return period estimation** for extreme wind events
- **Geographic Information Systems (GIS)** for hazard mapping and visualization

### Statistical Models

- Extreme Value Theory (EVT) for tail risk analysis
- Generalized Extreme Value (GEV) distributions
- Peak-over-threshold (POT) methods
- Spatial correlation analysis
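To make the distribution-fitting and return-period steps concrete, here is a minimal sketch that fits a Gumbel distribution (one of the families named above) to synthetic annual wind maxima by the method of moments and derives return levels. It is not the project's actual code; the data and parameter values are invented for illustration.

```python
import math

import numpy as np

# Synthetic annual-maximum wind speeds (km/h) -- illustrative only,
# not the project's real station data.
rng = np.random.default_rng(42)
speeds = rng.gumbel(loc=100.0, scale=12.0, size=2000)

# Method-of-moments fit for the Gumbel distribution:
#   scale = s * sqrt(6) / pi,   loc = mean - gamma * scale
EULER_GAMMA = 0.5772156649
scale_hat = speeds.std(ddof=1) * math.sqrt(6.0) / math.pi
loc_hat = speeds.mean() - EULER_GAMMA * scale_hat

def return_level(T: float) -> float:
    """Wind speed expected to be exceeded once every T years."""
    # Quantile of the fitted Gumbel at probability 1 - 1/T.
    return loc_hat - scale_hat * math.log(-math.log(1.0 - 1.0 / T))

for T in (10, 50, 100):
    print(f"{T}-year return level: {return_level(T):.1f} km/h")
```

In practice one would compare this fit against the Weibull and Log-Normal candidates and against a GEV/POT analysis before trusting the tail estimates.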
::BackgroundTitle{title="Key Findings"}
::

The analysis revealed:

- Wind speeds exceeding 100 km/h across multiple regions
- Non-uniform spatial distribution of damage intensity
- Correlation patterns between meteorological variables and structural damage
- Seasonal and geographical risk variations
- Return period estimations for comparable extreme events

::BackgroundTitle{title="Applications"}
::

The methodologies developed in this project have applications in:

- **Disaster risk reduction and preparedness** planning
- **Insurance and risk assessment** for natural hazards
- **Urban planning** and infrastructure resilience
- **Climate adaptation** strategies
- **Early warning systems** for extreme weather events

::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/climate-issues.pdf" width="100%" height="1000px"></iframe>
Deleted file (55 lines):

---
slug: data-visualisation
title: Data Visualisation Project
type: Academic Project
description: An interactive data visualization project built with R, R Shiny, and ggplot2 for creating dynamic, explorable visualizations.
shortDescription: An interactive data visualization project using R and R Shiny.
publishedAt: 2026-01-05
readingTime: 1
status: Completed
tags:
- R
- R Shiny
- Data Visualization
- ggplot2
icon: i-ph-chart-bar-duotone
---

::warning
The project is complete, but the documentation is still being expanded with more details.
::

This project involves building an interactive data visualization application using R and R Shiny. The goal is to deliver dynamic, explorable visualizations that let users interact with the data in meaningful ways.

## 🛠️ Technologies & Tools

- **[R](https://www.r-project.org)**: A statistical computing environment, perfect for data analysis and visualization.
- **[R Shiny](https://shiny.rstudio.com)**: A web application framework for R that enables the creation of interactive web applications directly from R.
- **[ggplot2](https://ggplot2.tidyverse.org)**: A powerful R package for creating static and dynamic visualizations using the Grammar of Graphics.
- **[dplyr](https://dplyr.tidyverse.org)**: An R package for data manipulation, providing a consistent set of verbs to help you solve common data manipulation challenges.
- **[tidyr](https://tidyr.tidyverse.org)**: An R package for tidying data, making it easier to work with and visualize.
- **[tidyverse](https://www.tidyverse.org)**: A collection of R packages designed for data science that share an underlying design philosophy, grammar, and data structures.
- **[sf](https://r-spatial.github.io/sf/)**: An R package for working with simple features, providing support for spatial data manipulation and analysis.
- **[rnaturalearth](https://docs.ropensci.org/rnaturalearth/)**: An R package that provides easy access to Natural Earth map data for creating geographical visualizations.
- **[rnaturalearthdata](https://github.com/ropensci/rnaturalearthdata)**: Companion package to rnaturalearth containing large Natural Earth datasets.
- **[knitr](https://yihui.org/knitr/)**: An R package for dynamic report generation, enabling the integration of code and text.
- **[kableExtra](https://haozhu233.github.io/kableExtra/)**: An R package for customizing tables and enhancing their visual presentation.
- **[gridExtra](https://cran.r-project.org/web/packages/gridExtra/)**: An R package for arranging multiple grid-based plots on a single page.
- **[moments](https://cran.r-project.org/web/packages/moments/)**: An R package for computing moments, skewness, kurtosis, and related statistics.
- **[factoextra](http://www.sthda.com/english/rpkgs/factoextra/)**: An R package for multivariate data analysis and visualization, including PCA and clustering methods.
- **[shinydashboard](https://rstudio.github.io/shinydashboard/)**: An R package for creating dashboards with Shiny.
- **[leaflet](https://rstudio.github.io/leaflet/)**: An R package for creating interactive maps using the Leaflet JavaScript library.
- **[plotly](https://plotly.com/r/)**: An R package for creating interactive visualizations with the Plotly library.
- **[RColorBrewer](https://cran.r-project.org/web/packages/RColorBrewer/)**: An R package providing color palettes for maps and other graphics.
- **[DT](https://rstudio.github.io/DT/)**: An R package for creating interactive data tables.

## 📚 Resources

You can find the code here: [Data Visualisation Code](https://go.arthurdanjou.fr/datavis-code)

And the online application here: [Data Visualisation App](https://go.arthurdanjou.fr/datavis-app)

## 📄 Detailed Report

<iframe src="/projects/datavis.pdf" width="100%" height="1000px"></iframe>
content/projects/dataviz-tuberculose.md (new file, 97 lines):

---
slug: dataviz-tuberculose
title: Monitoring & Segmentation of Tuberculosis Cases
type: Academic Project
description: An interactive data visualization project built with R, R Shiny, and ggplot2 for creating dynamic, explorable visualizations.
shortDescription: An interactive data visualization project using R and R Shiny.
publishedAt: 2026-01-05
readingTime: 1
status: Completed
tags:
- R
- R Shiny
- Data Visualization
- ggplot2
icon: i-ph-chart-bar-duotone
---

Interactive Shiny dashboard for WHO tuberculosis data analysis and clustering.

- **GitHub Repository:** [Tuberculose-Visualisation](https://github.com/ArthurDanjou/Tuberculose-Visualisation)
- **Live Application:** [Tuberculose Data Visualization](https://go.arthurdanjou.fr/datavis-app)

::BackgroundTitle{title="Overview"}
::

This project provides an interactive visualization tool for monitoring and segmenting global tuberculosis data from the World Health Organization (WHO). It applies multivariate analysis to reveal operational typologies of global health risks.

**Author:** Arthur Danjou
**Program:** M2 ISF - Dauphine PSL
**Course:** Data Visualisation (2025-2026)

::BackgroundTitle{title="Features"}
::

- Interactive world map with cluster visualization
- K-means clustering for country segmentation (Low/Moderate/Critical Impact)
- Animated time-series analysis with a year selector
- Filtering by WHO region
- Key Performance Indicators (KPIs) dashboard
- Raw data exploration with data tables
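The country segmentation above relies on k-means. As a hedged illustration (the app itself is written in R, and the "indicator" values below are synthetic stand-ins for the WHO variables), the clustering step can be sketched with a minimal Lloyd's-algorithm implementation:

```python
import numpy as np

def kmeans(X: np.ndarray, centroids: np.ndarray, iters: int = 50):
    """Minimal Lloyd's algorithm: alternate assignment and centroid update."""
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        centroids = np.array(
            [X[labels == k].mean(axis=0) for k in range(len(centroids))]
        )
    return labels, centroids

# Synthetic "country indicators" (e.g. incidence, mortality) forming three
# well-separated impact groups.
rng = np.random.default_rng(0)
low = rng.normal([10.0, 1.0], 0.5, size=(30, 2))
moderate = rng.normal([50.0, 10.0], 0.5, size=(30, 2))
critical = rng.normal([200.0, 40.0], 0.5, size=(30, 2))
X = np.vstack([low, moderate, critical])

# Deterministic init: one seed point taken from each group, for reproducibility.
init = X[[0, 30, 60]].copy()
labels, centers = kmeans(X, init)
print(sorted(centers[:, 0].round()))  # roughly the three incidence levels
```

A production version would standardize the indicators first and pick k with a criterion such as the elbow or silhouette method, as `factoextra` does on the R side.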
::BackgroundTitle{title="Project Structure"}
::

```
├── app.R                                  # Shiny application
├── NoticeTechnique.Rmd                    # Technical report (R Markdown)
├── NoticeTechnique.pdf                    # Compiled technical report
├── data/
│   ├── TB_analysis_ready.RData            # Processed data with clusters
│   └── TB_burden_countries_2025-12-09.csv # Raw WHO data
└── renv/                                  # R package management
```

::BackgroundTitle{title="Requirements"}
::

- R (>= 4.0.0)
- R packages (see `renv.lock`):
  - shiny
  - shinydashboard
  - leaflet
  - plotly
  - dplyr
  - sf
  - RColorBrewer
  - DT
  - rnaturalearth

::BackgroundTitle{title="Installation"}
::

1. Clone this repository
2. Open R/RStudio in the project directory
3. Restore packages with `renv::restore()`
4. Run the application:

```r
shiny::runApp("app.R")
```

::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/datavis.pdf" width="100%" height="1000px"></iframe>

::BackgroundTitle{title="License"}
::

© 2026 Arthur Danjou. All rights reserved.

::BackgroundTitle{title="Resources"}
::

You can find the code here: [Data Visualisation Code](https://go.arthurdanjou.fr/datavis-code)

And the online application here: [Data Visualisation App](https://go.arthurdanjou.fr/datavis-app)
````diff
@@ -21,7 +21,8 @@ The paper is available at: [https://arxiv.org/abs/2303.01500](https://arxiv.org/
 
 This repository contains a robust, modular **TensorFlow/Keras** implementation of **Early Dropout** and **Late Dropout** strategies. The goal is to verify the hypothesis that dropout, traditionally used to reduce overfitting, can also combat underfitting when applied only during the initial training phase.
 
-## 🎯 Scientific Objectives
+::BackgroundTitle{title="Scientific Objectives"}
+::
 
 The study aims to validate the operating regimes of Dropout described in the paper:
 
@@ -30,7 +31,8 @@ The study aims to validate the operating regimes of Dropout described in the pap
 3. **Standard Dropout**: Constant rate throughout training (baseline).
 4. **No Dropout**: Control experiment without dropout.
 
-## 🛠️ Technical Architecture
+::BackgroundTitle{title="Technical Architecture"}
+::
 
 Unlike naive Keras callback implementations, this project uses a **dynamic approach via the TensorFlow graph** to ensure the dropout rate updates on the GPU without model recompilation.
 
@@ -40,7 +42,8 @@ Unlike naive Keras callback implementations, this project uses a **dynamic appro
 * **`DropoutScheduler`**: A Keras `Callback` that drives the rate variable based on the current epoch and the chosen strategy (`early`, `late`, `standard`).
 * **`ExperimentPipeline`**: An orchestrator class that handles data loading (MNIST, CIFAR-10, Fashion MNIST), model creation (Dense or CNN), and execution of comparative benchmarks.
 
-## File Structure
+::BackgroundTitle{title="File Structure"}
+::
 
 ```
 .
````
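To make the scheduling logic concrete, here is a small sketch in plain Python, independent of the repository's actual `DropoutScheduler` class, of how an epoch-indexed rate could be computed for the three strategies. The function name and default values are invented for illustration.

```python
def dropout_rate(epoch: int, strategy: str, base_rate: float = 0.5,
                 switch_epoch: int = 20) -> float:
    """Illustrative per-epoch dropout rate for early/late/standard strategies.

    - 'early':    dropout only during the first `switch_epoch` epochs
    - 'late':     dropout only after `switch_epoch`
    - 'standard': constant rate for the whole run
    """
    if strategy == "early":
        return base_rate if epoch < switch_epoch else 0.0
    if strategy == "late":
        return 0.0 if epoch < switch_epoch else base_rate
    if strategy == "standard":
        return base_rate
    raise ValueError(f"unknown strategy: {strategy!r}")

# In a Keras callback, on_epoch_begin would assign this value to a
# tf.Variable feeding the Dropout layers, avoiding model recompilation.
print([dropout_rate(e, "early") for e in (0, 19, 20, 30)])  # [0.5, 0.5, 0.0, 0.0]
```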
````diff
@@ -57,7 +60,8 @@ Unlike naive Keras callback implementations, this project uses a **dynamic appro
 └── uv.lock # Dependency lock file
 ```
 
-## 🚀 Installation
+::BackgroundTitle{title="Installation"}
+::
 
 ```bash
 # Clone the repository
@@ -65,12 +69,14 @@ git clone https://github.com/arthurdanjou/dropoutreducesunderfitting.git
 cd dropoutreducesunderfitting
 ```
 
-## Install dependencies
+::BackgroundTitle{title="Install dependencies"}
+::
 ```bash
 pip install tensorflow numpy matplotlib seaborn scikit-learn
 ```
 
-## 📊 Usage
+::BackgroundTitle{title="Usage"}
+::
 
 The main notebook `pipeline.ipynb` contains all the necessary code. Here is how to run a typical experiment via the pipeline API.
 
@@ -133,19 +139,22 @@ exp.run_dataset_size_comparison(
 )
 ```
 
-## 📈 Expected Results
+::BackgroundTitle{title="Expected Results"}
+::
 
 According to the paper, you should observe:
 
 - **Early Dropout**: Higher initial loss, followed by a sharp drop after the `switch_epoch`, often reaching a lower minimum than Standard Dropout (reduced underfitting).
 - **Late Dropout**: Rapid rise in accuracy at the start (potential overfitting), then stabilization once dropout is activated.
 
-## 📄 Detailed Report
+::BackgroundTitle{title="Detailed Report"}
+::
 
 <iframe src="/projects/dropout-reduces-underfitting.pdf" width="100%" height="1000px">
 </iframe>
 
-## 📝 Authors
+::BackgroundTitle{title="Authors"}
+::
 
 - [Arthur Danjou](https://github.com/ArthurDanjou)
 - [Alexis Mathieu](https://github.com/Alex6535)
````
```diff
@@ -17,14 +17,16 @@ icon: i-ph-bicycle-duotone
 
 This project was completed as part of the **Generalized Linear Models** course at Paris-Dauphine PSL University. The objective was to develop and compare statistical models that predict bicycle rentals in a bike-sharing system using environmental and temporal features.
 
-## 📊 Project Objectives
+::BackgroundTitle{title="Project Objectives"}
+::
 
 - Determine the best predictive model for bicycle rental counts
 - Analyze the impact of key features (temperature, humidity, wind speed, seasonality, etc.)
 - Apply and evaluate different generalized linear modeling techniques
 - Validate model assumptions and performance metrics
 
-## 🔍 Methodology
+::BackgroundTitle{title="Methodology"}
+::
 
 The study uses a rigorous statistical workflow, including:
 
@@ -34,7 +36,8 @@ The study uses a rigorous statistical workflow, including:
 - **Model Diagnostics** - Validating assumptions and checking residuals
 - **Cross-validation** - Ensuring robust performance estimates
 
-## 📁 Key Findings
+::BackgroundTitle{title="Key Findings"}
+::
 
 The analysis identified critical factors influencing bike-sharing demand:
 - Seasonal patterns and weather conditions
@@ -42,11 +45,13 @@ The analysis identified critical factors influencing bike-sharing demand:
 - Holiday and working day distinctions
 - Time-based trends and cyclical patterns
 
-## 📚 Resources
+::BackgroundTitle{title="Resources"}
+::
 
 You can find the code here: [GLM Bikes Code](https://go.arthurdanjou.fr/glm-bikes-code)
 
-## 📄 Detailed Report
+::BackgroundTitle{title="Detailed Report"}
+::
 
 <iframe src="/projects/bikes-glm.pdf" width="100%" height="1000px">
 </iframe>
```
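As a hedged aside on the modeling technique this project is built around (the project's own R code is not shown here): a Poisson GLM with a log link, the canonical choice for count data such as rental totals, can be fit by iteratively reweighted least squares. The sketch below uses synthetic data with an invented "temperature" feature.

```python
import numpy as np

def fit_poisson_glm(X: np.ndarray, y: np.ndarray, iters: int = 25) -> np.ndarray:
    """Fit a Poisson regression (log link) via iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta            # linear predictor
        mu = np.exp(eta)          # mean under the log link
        W = mu                    # Poisson IRLS weights
        z = eta + (y - mu) / mu   # working response
        # Weighted least-squares update: (X'WX) beta = X'Wz
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# Synthetic example: rentals driven by an intercept and a "temperature" feature.
rng = np.random.default_rng(1)
n = 2000
temp = rng.uniform(-1, 1, n)
X = np.column_stack([np.ones(n), temp])
true_beta = np.array([2.0, 0.8])
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson_glm(X, y)
print(beta_hat.round(2))  # close to the generating coefficients [2.0, 0.8]
```

In R this corresponds to `glm(y ~ temp, family = poisson)`; the point of the sketch is only to show what the fitting loop does under the hood.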
Deleted file (67 lines):

---
slug: implied-volatility-modeling
title: Implied Volatility Surface Modeling
type: Academic Project
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
shortDescription: Predicting the SPX volatility surface using GLMs and black-box models on 1.2 million observations.
publishedAt: 2026-02-28
readingTime: 3
status: In progress
tags:
- R
- GLM
- Finance
- Machine Learning
icon: i-ph-graph-duotone
---

This project targets high-precision calibration of the **Implied Volatility Surface** using a large-scale dataset of S&P 500 (SPX) European options.

The core objective is to stress-test classic statistical models against modern predictive algorithms. **Generalized Linear Models (GLMs)** provide a transparent baseline, while more complex "black-box" architectures are evaluated on whether their accuracy gains justify reduced interpretability in a risk management context.

## 📊 Dataset & Scale

The modeling is performed on a high-dimensional dataset with over **1.2 million observations**.

- **Target Variable**: `implied_vol_ref` (implied volatility).
- **Features**: Option strike price ($K$), underlying asset price ($S$), and time to maturity ($\tau$).
- **Volume**: A training set of $1,251,307$ rows and a test set of identical size.

## 🛠️ Modeling Methodology

The project follows a rigorous statistical pipeline to compare two modeling philosophies:

### 1. The Statistical Baseline (GLM)

Using R's GLM framework, I implement models with targeted link functions and error distributions (such as **Gamma** or **Inverse Gaussian**) to capture the global structure of the volatility surface. These models serve as the benchmark for transparency and stability.

### 2. The Black-Box Challenge

To capture local non-linearities such as the volatility smile and skew, I explore more complex architectures. Performance is evaluated by **Root Mean Squared Error (RMSE)** relative to the GLM baselines.

### 3. Feature Engineering

Key financial indicators are derived from the raw data:

- **Moneyness**: Calculated as the ratio $K/S$.
- **Temporal Dynamics**: Transformations of time to maturity to linearize the term structure.
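For concreteness, the two engineered features amount to a couple of array operations. A NumPy sketch with invented quote values follows (the project itself works in R, and the square-root maturity transform shown here is one common choice, assumed rather than taken from the project):

```python
import numpy as np

# Hypothetical option quotes: strike K, spot S, time to maturity tau (years).
K = np.array([3800.0, 4000.0, 4200.0])
S = np.array([4000.0, 4000.0, 4000.0])
tau = np.array([0.25, 0.5, 1.0])

moneyness = K / S        # ratio K/S; <1 and >1 flag in/out-of-the-money calls
sqrt_tau = np.sqrt(tau)  # illustrative maturity transform (assumption)

print(moneyness.tolist())  # [0.95, 1.0, 1.05]
```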
## 📈 Evaluation & Reproducibility

Performance is measured strictly via RMSE on the original scale of the target variable. To ensure reproducibility and precise comparisons across model iterations, a fixed random seed is maintained throughout the workflow.

```r
set.seed(2025)

TrainData <- read.csv("train_ISF.csv", stringsAsFactors = FALSE)
TestX <- read.csv("test_ISF.csv", stringsAsFactors = FALSE)

rmse_eval <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}
```

## 🔍 Critical Analysis

Beyond pure prediction, the project addresses:

- **Model Limits**: Identifying market regimes where models fail (e.g., deep out-of-the-money options).
- **Interpretability**: Quantifying the trade-off between complexity and practical utility in a risk management context.
- **Future Extensions**: Considering richer dynamics, such as historical volatility or skew-specific targets.
336
content/projects/glm-implied-volatility.md
Normal file
336
content/projects/glm-implied-volatility.md
Normal file
@@ -0,0 +1,336 @@
|
||||
---
|
||||
slug: implied-volatility-prediction-from-options-data
|
||||
title: Implied Volatility Prediction from Options Data
|
||||
type: Academic Project
|
||||
description: A large-scale statistical study comparing Generalized Linear Models (GLMs) and black-box machine learning architectures to predict the implied volatility of S&P 500 options.
|
||||
shortDescription: Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.
|
||||
publishedAt: 2026-02-28
|
||||
readingTime: 3
|
||||
status: Completed
|
||||
tags:
|
||||
- R
|
||||
- GLM
|
||||
- Finance
|
||||
- Machine Learning
|
||||
- Statistical Modeling
|
||||
icon: i-ph-graph-duotone
|
||||
---

> **M2 Master's Project** – Predicting implied volatility using advanced regression techniques and machine learning models on financial options data.

This project explores the prediction of **implied volatility** from options market data, combining classical statistical methods with modern machine learning approaches. The analysis covers data preprocessing, feature engineering, model benchmarking, and interpretability analysis using real-world financial panel data.

- **GitHub Repository:** [Implied-Volatility-from-Options-Data](https://github.com/ArthurDanjou/Implied-Volatility-from-Options-Data)

---

::BackgroundTitle{title="Project Overview"}
::

### Problem Statement

Implied volatility represents the market's forward-looking expectation of an asset's future volatility. Accurate prediction is crucial for:
- **Option pricing** and valuation
- **Risk management** and hedging strategies
- **Trading strategies** based on volatility arbitrage

### Dataset

The project uses a comprehensive panel dataset tracking **3,887 assets** across **544 observation dates** (2019-2022):

| File | Description | Shape |
|------|-------------|-------|
| `Train_ISF.csv` | Training data with target variable | 1,909,465 rows × 21 columns |
| `Test_ISF.csv` | Test data for prediction | 1,251,308 rows × 18 columns |
| `hat_y.csv` | Final predictions from both models | 1,251,308 rows × 2 columns |

### Key Variables

**Target Variable:**
- `implied_vol_ref` – The implied volatility to predict

**Feature Categories:**
- **Identifiers:** `asset_id`, `obs_date`
- **Market Activity:** `call_volume`, `put_volume`, `call_oi`, `put_oi`, `total_contracts`
- **Volatility Metrics:** `realized_vol_short`, `realized_vol_mid1-3`, `realized_vol_long1-4`, `market_vol_index`
- **Option Structure:** `strike_dispersion`, `maturity_count`

---

::BackgroundTitle{title="Methodology"}
::

### Data Pipeline

```
Raw Data
    ↓
┌─────────────────────────────────────────────────────────┐
│  Data Splitting (Chronological 80/20)                   │
│  - Training: 2019-10 to 2021-07                         │
│  - Validation: 2021-07 to 2022-03                       │
└─────────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────────┐
│  Feature Engineering                                    │
│  - Aggregation of volatility horizons                   │
│  - Creation of financial indicators                     │
└─────────────────────────────────────────────────────────┘
    ↓
┌─────────────────────────────────────────────────────────┐
│  Data Preprocessing (tidymodels)                        │
│  - Winsorization (99.5th percentile)                    │
│  - Log/Yeo-Johnson transformations                      │
│  - Z-score normalization                                │
│  - PCA (95% variance retention)                         │
└─────────────────────────────────────────────────────────┘
    ↓
Three Datasets Generated:
├── Tree-based (raw, scale-invariant)
├── Linear (normalized, winsorized)
└── PCA (dimensionality-reduced)
```
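The preprocessing stage of the pipeline can be sketched as follows. The project itself implements these steps as a tidymodels recipe in R; this Python/NumPy version (the function name `preprocess_column` is an assumption for illustration) only shows the winsorize → log → z-score sequence on a single positive-valued column.

```python
import numpy as np

def preprocess_column(x, clip_q=0.995):
    """Winsorize the upper tail, log-transform, then z-score one feature column.

    Illustrative sketch of the pipeline steps above, assuming a
    positive-valued column; not the project's R code.
    """
    x = np.minimum(x, np.quantile(x, clip_q))  # winsorization at the 99.5th percentile
    x = np.log1p(x)                            # variance-stabilizing log transform
    return (x - x.mean()) / x.std()            # z-score normalization
```

A heavy-tailed column such as `call_volume` would come out centered at zero with unit variance, with its extreme outliers clipped before the log step.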

### Feature Engineering

New financial indicators created to capture market dynamics:

| Feature | Description | Formula |
|---------|-------------|---------|
| `pulse_ratio` | Volatility trend direction | RV_short / RV_long |
| `stress_spread` | Asset vs market stress | RV_short - Market_VIX |
| `put_call_ratio_volume` | Immediate market stress | Put_Volume / Call_Volume |
| `put_call_ratio_oi` | Long-term risk structure | Put_OI / Call_OI |
| `liquidity_ratio` | Market depth | Total_Volume / Total_OI |
| `option_dispersion` | Market uncertainty | Strike_Dispersion / Total_Contracts |
| `put_low_strike` | Downside protection density | Strike_Dispersion / Put_OI |
| `put_proportion` | Hedging vs speculation | Put_Volume / Total_Volume |
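As an illustration, the ratio features in the table can be computed per observation roughly as below. The helper `engineer_features` and the small `eps` division guard are assumptions for this sketch, not project code (the project does this in R):

```python
def engineer_features(row):
    """Compute the ratio features from the table above for one observation.

    `row` is a dict keyed by the raw column names used in this project.
    Illustrative sketch only.
    """
    eps = 1e-9  # guard against division by zero in illiquid names
    total_volume = row["call_volume"] + row["put_volume"]
    total_oi = row["call_oi"] + row["put_oi"]
    return {
        "pulse_ratio": row["realized_vol_short"] / (row["realized_vol_long1"] + eps),
        "stress_spread": row["realized_vol_short"] - row["market_vol_index"],
        "put_call_ratio_volume": row["put_volume"] / (row["call_volume"] + eps),
        "put_call_ratio_oi": row["put_oi"] / (row["call_oi"] + eps),
        "liquidity_ratio": total_volume / (total_oi + eps),
        "option_dispersion": row["strike_dispersion"] / (row["total_contracts"] + eps),
        "put_low_strike": row["strike_dispersion"] / (row["put_oi"] + eps),
        "put_proportion": row["put_volume"] / (total_volume + eps),
    }
```

A `pulse_ratio` above 1 flags short-horizon volatility running above its long-horizon level, i.e. a rising volatility regime.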

---

::BackgroundTitle{title="Models Implemented"}
::

### Linear Models

| Model | Description | Best RMSE |
|-------|-------------|-----------|
| **OLS** | Ordinary Least Squares | 11.26 |
| **Ridge** | L2 regularization | 12.48 |
| **Lasso** | L1 regularization (variable selection) | 12.03 |
| **Elastic Net** | L1 + L2 combined | ~12.03 |
| **PLS** | Partial Least Squares (on PCA) | 12.79 |

### Linear Mixed-Effects Models (LMM)

Advanced panel data models accounting for asset-specific effects:

| Model | Features | RMSE |
|-------|----------|------|
| LMM Baseline | All variables + Random Intercept | 8.77 |
| LMM Reduced | Collinearity removal | ~8.77 |
| LMM Interactions | Financial interaction terms | ~8.77 |
| LMM + Quadratic | Convexity terms (vol of vol) | 8.41 |
| **LMM + Random Slopes (mod_lmm_5)** | Asset-specific betas | **8.10** ⭐ |

### Tree-Based Models

| Model | Strategy | Validation RMSE | Training RMSE |
|-------|----------|-----------------|---------------|
| **XGBoost** | Level-wise, Bayesian tuning | 10.70 | 0.57 |
| **LightGBM** | Leaf-wise, feature regularization | **10.61** ⭐ | 10.90 |
| Random Forest | Bagging | DNF* | - |

*DNF: Did Not Finish (computational constraints)

### Neural Networks

| Model | Architecture | Status |
|-------|--------------|--------|
| MLP | 128-64 units, tanh activation | Failed to converge |

---

::BackgroundTitle{title="Results Summary"}
::

### Model Comparison

```
RMSE Performance (Lower is Better)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Linear Mixed-Effects (LMM5)       8.38  ████████████████████  Best Linear
Linear Mixed-Effects (LMM4)       8.41  ███████████████████
Linear Mixed-Effects (Baseline)   8.77  ██████████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
LightGBM                         10.61  ███████████████       Best Non-Linear
XGBoost                          10.70  ██████████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OLS (with interactions)          11.26  █████████████
Lasso                            12.03  ███████████
OLS (baseline)                   12.01  ███████████
Ridge                            12.48  ██████████
PLS                              12.79  █████████
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

### Key Findings

1. **Best Linear Model:** LMM with Random Slopes (RMSE = 8.38)
   - Captures asset-specific volatility sensitivities
   - Includes quadratic terms for convexity effects

2. **Best Non-Linear Model:** LightGBM (RMSE = 10.61)
   - Superior generalization vs XGBoost
   - Feature regularization prevents overfitting

3. **Interpretability Insights (SHAP Analysis):**
   - `realized_vol_mid` dominates (57% of gain)
   - Volatility clustering confirmed as primary driver
   - Non-linear regime switching in `stress_spread`

---

::BackgroundTitle{title="Repository Structure"}
::

```
PROJECT/
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd   # Main analysis (Quarto)
├── Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.html  # Rendered report
├── packages.R                                       # R dependencies installer
├── Train_ISF.csv                                    # Training data (~1.9M rows)
├── Test_ISF.csv                                     # Test data (~1.25M rows)
├── hat_y.csv                                        # Final predictions
├── README.md                                        # This file
└── results/
    ├── lightgbm/                                    # LightGBM model outputs
    └── xgboost/                                     # XGBoost model outputs
```

---

::BackgroundTitle{title="Getting Started"}
::

### Prerequisites

- **R** ≥ 4.0
- Required packages (auto-installed via `packages.R`)

### Installation

```r
# Install all dependencies
source("packages.R")
```

Or manually install key packages:

```r
install.packages(c(
  "tidyverse", "tidymodels", "caret", "glmnet",
  "lme4", "lmerTest", "xgboost", "lightgbm",
  "ranger", "pls", "shapviz", "rBayesianOptimization"
))
```

### Running the Analysis

1. **Open the Quarto document:**

   ```r
   # In RStudio
   rstudioapi::navigateToFile("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
   ```

2. **Render the document:**

   ```r
   quarto::quarto_render("Projet_MRC_DANJOU_LEGRAND_MERIC_VONSIEMENS.qmd")
   ```

3. **Or run specific sections interactively** using the code chunks in the `.qmd` file.

---

::BackgroundTitle{title="Technical Details"}
::

### Data Split Strategy

- **Chronological split** at the 80th percentile of dates
- Prevents look-ahead bias and data leakage
- Training: ~1.53M observations
- Validation: ~376K observations
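The chronological split above can be sketched as follows; `chronological_split` is a hypothetical helper for illustration (the project performs this split in R), keeping every observation at or before the 80th-percentile date in training:

```python
import numpy as np

def chronological_split(dates, q=0.80):
    """Boolean train/validation masks for a look-ahead-safe split.

    Observations dated at or before the q-th percentile date go to
    training; strictly later ones go to validation. Illustrative sketch.
    """
    day_numbers = dates.astype("datetime64[D]").astype(int)
    cutoff = np.quantile(day_numbers, q)
    train_mask = day_numbers <= cutoff
    return train_mask, ~train_mask
```

Because the cutoff is a date rather than a random draw, no validation row can leak information from the future of any training row.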

### Hyperparameter Tuning

- **Method:** Bayesian Optimization (Gaussian Processes)
- **Acquisition:** Expected Improvement (UCB)
- **Objective:** minimize RMSE (implemented as maximizing its negative)

### Evaluation Metric

**Exponential RMSE** on the original scale:

$$
RMSE_{real} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \exp(\hat{y}_{\log, i}) - y_i \right)^2}
$$

Models are trained on the log-transformed target for variance stabilization.
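The metric above maps log-scale predictions back to the original scale before computing the error; a minimal sketch (the function name is an assumption, the project computes it in R):

```python
import numpy as np

def rmse_original_scale(y_hat_log, y):
    """RMSE of log-scale predictions mapped back to the original scale,
    matching the exponential RMSE formula above."""
    return float(np.sqrt(np.mean((np.exp(y_hat_log) - y) ** 2)))
```

Perfect log-scale predictions yield zero error on the original scale, and a model that is accurate in log-space but biased for large volatilities is penalized accordingly.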

---

::BackgroundTitle{title="Key Concepts"}
::

### Financial Theories Applied

1. **Volatility Clustering** – Past volatility predicts future volatility
2. **Variance Risk Premium** – Spread between implied and realized volatility
3. **Fear Gauge** – Put-call ratio as a sentiment indicator
4. **Mean Reversion** – Volatility tends to return to its long-term average
5. **Liquidity Premium** – Illiquid assets command higher volatility

### Statistical Methods

- Panel data modeling with fixed and random effects
- Principal Component Analysis (PCA)
- Bayesian hyperparameter optimization
- SHAP values for model interpretability

---

::BackgroundTitle{title="Authors"}
::

**Team:**
- Arthur DANJOU
- Camille LEGRAND
- Axelle MERIC
- Moritz VON SIEMENS

**Course:** Classification and Regression (M2)
**Academic Year:** 2025-2026

---

::BackgroundTitle{title="Notes"}
::

- **Computational Constraints:** Some models (Random Forest, MLP) failed due to hardware limitations (16GB RAM, CPU-only)
- **Reproducibility:** Set `seed = 2025` for consistent results
- **Language:** Analysis documented in English, course materials in French

---

::BackgroundTitle{title="References"}
::

Key R packages used:
- `tidymodels` – Modern modeling framework
- `glmnet` – Regularized regression
- `lme4` / `lmerTest` – Mixed-effects models
- `xgboost` / `lightgbm` – Gradient boosting
- `shapviz` – Model interpretability
- `rBayesianOptimization` – Hyperparameter tuning
@@ -16,13 +16,15 @@ tags:
icon: i-ph-shield-check-duotone
---

## The Setting: Fort de Mont-Valérien
::BackgroundTitle{title="The Setting: Fort de Mont-Valérien"}
::

This was not a typical university hackathon. Organized by the **Commissariat au Numérique de Défense (CND)**, the event took place over three intense days within the walls of the **Fort de Mont-Valérien**, a highly secured military fortress.

Working in this environment underscored the real-world stakes of the mission. Our **team of six**, representing **Université Paris-Dauphine**, competed against several elite engineering schools to solve critical defense-related data challenges.

## The Mission: Classifying the "Invisible"
::BackgroundTitle{title="The Mission: Classifying the Invisible"}
::

The core task involved processing poorly labeled and noisy firewall logs. In a defense context, a "missing" log or a mislabeled entry can be the difference between a minor system bug and a coordinated intrusion.

@@ -38,13 +40,15 @@ In military cybersecurity, the cost of a **False Negative** (an undetected attac

> **Key Achievement:** Our model significantly reduced the rate of undetected threats compared to the baseline configurations provided at the start of the challenge.

## Deployment & Interaction
::BackgroundTitle{title="Deployment & Interaction"}
::

To make our findings operational, we built a **Streamlit-based command center**:
* **On-the-Fly Analysis:** Security officers can paste a single log line to get an immediate "Bug vs. Attack" probability score.
* **Bulk Audit:** The interface supports CSV uploads, allowing for the rapid analysis of entire daily log batches to highlight high-risk anomalies.

## Technical Stack
::BackgroundTitle{title="Technical Stack"}
::
* **Language:** Python
* **ML Library:** Scikit-learn, XGBoost
* **Deployment:** Streamlit

@@ -16,13 +16,15 @@ tags:
icon: i-ph-database-duotone
---

## The Challenge
::BackgroundTitle{title="The Challenge"}
::

Organized by **Natixis**, this hackathon followed a high-intensity format: **three consecutive Saturdays** of on-site development, bridged by two full weeks of remote collaboration.

Working in a **team of four**, our goal was to bridge the gap between non-technical stakeholders and complex financial databases by creating an autonomous "Data Talk" agent.

## Core Features
::BackgroundTitle{title="Core Features"}
::

### 1. Data Engineering & Schema Design
Before building the AI layer, we handled a significant data migration task. I led the effort to:

@@ -39,14 +41,16 @@ Data is only useful if it’s readable. Our Nuxt application goes beyond raw tab
* **Dynamic Charts:** The agent automatically determines the best visualization type (Bar, Line, Pie) based on the query result and renders it using interactive components.
* **Narrative Explanations:** A final LLM pass summarizes the data findings in plain English, highlighting anomalies or key trends.

## Technical Stack
::BackgroundTitle{title="Technical Stack"}
::

* **Frontend/API:** **Nuxt 3** for a seamless, reactive user interface.
* **Orchestration:** **Vercel AI SDK** to manage streams and tool-calling logic.
* **Inference:** **Ollama** for running LLMs locally, ensuring data privacy during development.
* **Storage:** **PostgreSQL** for the converted data warehouse.

## Impact & Results
::BackgroundTitle{title="Impact & Results"}
::

This project demonstrated that a modern stack (Nuxt + local LLMs) can drastically reduce the time needed for data discovery. By the final Saturday, our team presented a working prototype capable of handling multi-table joins and generating real-time financial dashboards from simple chat prompts.
@@ -18,14 +18,16 @@ icon: i-ph-money-wavy-duotone

This project focuses on building machine learning models to predict loan approval outcomes and assess default risk. The objective is to develop robust classification models that identify creditworthy applicants.

## 📊 Project Objectives
::BackgroundTitle{title="Project Objectives"}
::

- Build and compare multiple classification models for loan prediction
- Identify key factors influencing loan approval decisions
- Evaluate model performance using appropriate metrics
- Optimize model parameters for better predictive accuracy

## 🔍 Methodology
::BackgroundTitle{title="Methodology"}
::

The study employs a range of machine learning approaches:

@@ -35,7 +37,8 @@ The study employs a range of machine learning approaches:
- **Hyperparameter Tuning** - Optimizing model performance
- **Cross-validation** - Ensuring robust generalization

## 📄 Detailed Report
::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/loan-ml.pdf" width="100%" height="1000px">
</iframe>
@@ -19,7 +19,8 @@ icon: i-ph-dice-five-duotone

This report presents the Monte Carlo Methods Project completed as part of the **Monte Carlo Methods** course at Paris-Dauphine University. The goal was to implement a range of Monte Carlo methods and algorithms in R.

## 🛠️ Methods and Algorithms
::BackgroundTitle{title="Methods and Algorithms"}
::

- Plotting graphs of functions
- Inverse CDF random variation simulation

@@ -28,11 +29,13 @@ This report presents the Monte Carlo Methods Project completed as part of the **
- Cumulative density function
- Empirical quantile function

## 📚 Resources
::BackgroundTitle{title="Resources"}
::

You can find the code here: [Monte Carlo Project Code](https://go.arthurdanjou.fr/monte-carlo-code)

## 📄 Detailed Report
::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/monte-carlo.pdf" width="100%" height="1000px">
</iframe>
@@ -6,7 +6,7 @@ description: An academic project exploring the automation of GenAI workflows usi
shortDescription: Automating GenAI workflows with n8n and Ollama in a self-hosted environment.
publishedAt: 2026-03-15
readingTime: 2
status: In progress
status: Completed
tags:
- n8n
- Gemini
@@ -17,11 +17,13 @@ tags:
icon: i-ph-plugs-connected-duotone
---

## Overview
::BackgroundTitle{title="Overview"}
::

This project focuses on designing and implementing autonomous workflows that leverage Large Language Models (LLMs) to streamline productivity and academic research. By orchestrating Generative AI through a self-hosted infrastructure on my **[ArtLab](/projects/artlab)**, I built a private ecosystem that acts as both a personal assistant and a specialized research agent.

## Key Workflows
::BackgroundTitle{title="Key Workflows"}
::

### 1. Centralized Productivity Hub
I developed a synchronization engine that bridges **Notion**, **Google Calendar**, and **Todoist**.

@@ -35,7 +37,8 @@ To stay at the forefront of AI research, I built an automated pipeline for acade
* **Knowledge Base:** Relevant papers and posts are automatically stored in a structured Notion database.
* **Interactive Research Agent:** I integrated a chat interface within n8n that allows me to query this collected data. I can request summaries, ask specific technical questions about a paper, or extract the most relevant insights for my current thesis work.

## Technical Architecture
::BackgroundTitle{title="Technical Architecture"}
::

The environment is built to handle complex multi-step chains, moving beyond simple API calls to create context-aware agents.

@@ -44,7 +47,8 @@ The environment is built to handle complex multi-step chains, moving beyond simp
* **Data Sources:** RSS feeds and Notion databases.
* **Notifications & UI:** Gmail for briefings and Discord for real-time system alerts.

## Key Objectives
::BackgroundTitle{title="Key Objectives"}
::

1. **Privacy-Centric AI:** Ensuring that sensitive academic data and personal schedules remain within a self-hosted or controlled environment.
2. **Academic Efficiency:** Reducing the "noise" of information overload by using AI to surface only the most relevant research papers.
119 content/projects/rl-tennis-atari-game.md Normal file
@@ -0,0 +1,119 @@

---
slug: rl-tennis-atari-game
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: Completed
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---

Comparison of Reinforcement Learning algorithms on Atari Tennis (`ALE/Tennis-v5` via Gymnasium/PettingZoo).

- **GitHub Repository:** [Tennis-Atari-Game](https://github.com/ArthurDanjou/Tennis-Atari-Game)

::BackgroundTitle{title="Overview"}
::

This project implements and compares five RL agents playing Atari Tennis against the built-in AI and in head-to-head tournaments.

::BackgroundTitle{title="Algorithms"}
::

| Agent | Type | Policy | Update Rule |
|-------|------|--------|-------------|
| **Random** | Baseline | Uniform random | None |
| **SARSA** | TD(0), on-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Q-Learning** | TD(0), off-policy | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (r + \gamma \max_{a'} \hat{q}(s', a') - \hat{q}(s, a)) \cdot \phi(s)$ |
| **Monte Carlo** | First-visit MC | ε-greedy | $W_a \leftarrow W_a + \alpha \cdot (G_t - \hat{q}(s, a)) \cdot \phi(s)$ |
| **DQN** | Deep Q-Network | ε-greedy | MLP (256→256) with experience replay & target network |

::BackgroundTitle{title="Architecture"}
::

- **Linear agents** (SARSA, Q-Learning, Monte Carlo): $\hat{q}(s, a; \mathbf{W}) = \mathbf{W}_a^\top \phi(s)$ with $\phi(s) \in \mathbb{R}^{128}$ (RAM observation)
- **DQN**: MLP network (128 → 128 → 64 → 18) trained with the Adam optimizer, Huber loss, and periodic target network sync
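The linear value function and the off-policy Q-Learning update from the tables above can be sketched as follows. This is an illustrative re-implementation for the Q-Learning agent only, not the notebook's code; names like `W`, `q_values`, and the hyperparameter defaults are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, N_FEATURES = 18, 128     # Atari action set, RAM observation size

W = np.zeros((N_ACTIONS, N_FEATURES))  # one weight vector W_a per action

def q_values(phi):
    """Linear value estimate q_hat(s, a) = W_a . phi(s) for all actions."""
    return W @ phi

def epsilon_greedy(phi, eps=0.1):
    """Behaviour policy: explore with probability eps, else act greedily."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(phi)))

def q_learning_update(phi, a, r, phi_next, alpha=0.01, gamma=0.99):
    """Off-policy TD(0) update with linear function approximation (sketch)."""
    td_target = r + gamma * np.max(q_values(phi_next))
    td_error = td_target - q_values(phi)[a]
    W[a] += alpha * td_error * phi
```

The SARSA variant differs only in the target, which uses the action actually taken next ($\hat{q}(s', a')$) instead of the max over actions.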

::BackgroundTitle{title="Environment"}
::

- **Game**: Atari Tennis via PettingZoo (`tennis_v3`)
- **Observation**: RAM state (128 features)
- **Action Space**: 18 discrete actions
- **Agents**: 2 players (`first_0` and `second_0`)

::BackgroundTitle{title="Project Structure"}
::

```
.
├── Project_RL_DANJOU_VON-SIEMENS.ipynb   # Main notebook
├── README.md                             # This file
├── checkpoints/                          # Saved agent weights
│   ├── sarsa.pkl
│   ├── q_learning.pkl
│   ├── montecarlo.pkl
│   └── dqn.pkl
└── plots/                                # Training & evaluation plots
    ├── SARSA_training_curves.png
    ├── Q-Learning_training_curves.png
    ├── MonteCarlo_training_curves.png
    ├── DQN_training_curves.png
    ├── evaluation_results.png
    └── championship_matrix.png
```

::BackgroundTitle{title="Key Results"}
::

### Win Rate vs Random Baseline

| Agent | Win Rate |
|-------|----------|
| SARSA | 88.9% |
| Q-Learning | 41.2% |
| Monte Carlo | 47.1% |
| DQN | 6.2% |

### Championship Tournament

Full round-robin tournament where each agent faces every other agent in both positions (`first_0`/`second_0`).

::BackgroundTitle{title="Notebook Sections"}
::

1. **Configuration & Checkpoints** — Incremental training workflow with pickle serialization
2. **Utility Functions** — Observation normalization, ε-greedy policy
3. **Agent Definitions** — `RandomAgent`, `SarsaAgent`, `QLearningAgent`, `MonteCarloAgent`, `DQNAgent`
4. **Training Infrastructure** — `train_agent()`, `plot_training_curves()`
5. **Evaluation** — Match system, random baseline, round-robin tournament
6. **Results & Visualization** — Win rate plots, matchup matrix heatmap

::BackgroundTitle{title="Known Issues"}
::

- **Monte Carlo & DQN**: Checkpoint loading issues — saved weights may not restore properly during evaluation (training works correctly)

::BackgroundTitle{title="Dependencies"}
::

- Python 3.13+
- `numpy`, `matplotlib`
- `torch`
- `gymnasium`, `ale-py`
- `pettingzoo`
- `tqdm`

::BackgroundTitle{title="Authors"}
::

- Arthur DANJOU
- Moritz VON SIEMENS
@@ -1,52 +0,0 @@
---
slug: rl-tennis
title: Reinforcement Learning for Tennis Strategy Optimization
type: Academic Project
description: An academic project exploring the application of reinforcement learning to optimize tennis strategies. The project involves training RL agents on Atari Tennis (ALE) to evaluate strategic decision-making through competitive self-play and baseline benchmarking.
shortDescription: Reinforcement learning algorithms applied to Atari tennis matches for strategy optimization and competitive benchmarking.
publishedAt: 2026-03-13
readingTime: 3
status: In progress
tags:
- Reinforcement Learning
- Python
- Gymnasium
- Atari
- ALE
icon: i-ph-lightning-duotone
---

## Overview

This project serves as a practical application of theoretical Reinforcement Learning (RL) principles. The goal is to develop and train autonomous agents capable of mastering the complex dynamics of **Atari Tennis**, using the **Arcade Learning Environment (ALE)** via the Farama Foundation's Gymnasium.

Instead of simply reaching a high score, this project focuses on **strategy optimization** and **comparative performance** through a multi-stage tournament architecture.

## Technical Objectives

The project is divided into three core phases:

### 1. Algorithm Implementation
I am implementing several key RL algorithms covered during my academic curriculum to observe their behavioral differences in a high-dimensional state space:
* **Value-Based Methods:** Deep Q-Networks (DQN) and its variants (Double DQN, Dueling DQN).
* **Policy Gradient Methods:** Proximal Policy Optimization (PPO) for more stable continuous action control.
* **Exploration Strategies:** Implementing epsilon-greedy and entropy-based exploration to handle the sparse reward signals in tennis rallies.

### 2. The "Grand Slam" Tournament (Self-Play)
To determine the most robust strategy, I developed a competitive framework:
* **Agent vs. Agent:** Different algorithms (e.g., PPO vs. DQN) are pitted against each other in head-to-head matches.
* **Evolutionary Ranking:** Success is measured not just by points won, but by the ability to adapt to the opponent's playstyle (serve-and-volley vs. baseline play).
* **Winner Identification:** The agent with the highest win rate and most stable policy is crowned the "Optimal Strategist."

### 3. Benchmarking Against Atari Baselines
The final "Boss Level" involves taking my best-performing trained agent and testing it against the pre-trained, high-performance algorithms provided by the Atari/ALE benchmarks. This serves as a validation step to measure the efficiency of my custom implementations against industry-standard baselines.

## Tech Stack & Environment

* **Environment:** [ALE (Arcade Learning Environment) - Tennis](https://ale.farama.org/environments/tennis/)
* **Frameworks:** Python, Gymnasium, PyTorch (for neural network backends).
* **Key Challenges:** Handling the long-horizon dependency of a tennis match and the high-frequency input of the Atari RAM/Pixels.

---

*This project is currently in the training phase. I am fine-tuning the reward function to discourage "passive" play and reward aggressive net approaches.*
@@ -18,11 +18,13 @@ icon: i-ph-city-duotone

This report presents the Schelling Segregation Model project completed as part of the **Projet Numérique** course at Paris-Saclay University. The goal was to implement the Schelling Segregation Model in Python and analyze the results using statistics and data visualization.

## 📚 Resources
::BackgroundTitle{title="Resources"}
::

You can find the code here: [Schelling Segregation Model Code](https://go.arthurdanjou.fr/schelling-code)

## 📄 Detailed Report
::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/schelling.pdf" width="100%" height="1000px">
</iframe>
@@ -20,13 +20,15 @@ icon: i-ph-dog-duotone

Committed to digital innovation, Sevetys leverages centralized data systems to optimize clinic operations, improve patient data management, and enhance the overall client experience. This combination of medical excellence and operational efficiency supports veterinarians in delivering high-quality care nationwide.

## 🎯 Internship Objectives
::BackgroundTitle{title="Internship Objectives"}
::

During my two-month internship as a Data Engineer, I focused primarily on cleaning and standardizing customer and patient data, a critical task because this data is extensively used by clinics, Marketing, and Performance teams. Ensuring data quality was essential to the company's operations.

Additionally, I revised and enhanced an existing data quality report designed to evaluate the effectiveness of my cleaning processes. The report covered 47 detailed metrics assessing data completeness and consistency, providing valuable insights that helped maintain high standards across the organization.

## ⚙️ Technology Stack
::BackgroundTitle{title="Technology Stack"}
::

- **[Microsoft Azure Cloud](https://azure.microsoft.com/)**: Cloud infrastructure platform
- **[PySpark](https://spark.apache.org/docs/latest/api/python/)**: Distributed data processing framework
@@ -17,7 +17,8 @@ icon: i-ph-heart-half-duotone

This project was carried out as part of the **Statistical Learning** course at Paris-Dauphine PSL University. The objective is to identify the most effective model for predicting or explaining the presence of breast cancer based on a set of biological and clinical features.

## 📊 Project Objectives
::BackgroundTitle{title="Project Objectives"}
::

Develop and evaluate several supervised classification models to predict the presence of breast cancer based on biological features extracted from the Breast Cancer Coimbra dataset, provided by the UCI Machine Learning Repository.

@@ -27,7 +28,8 @@ The dataset contains 116 observations divided into two classes:

There are 9 explanatory variables, including clinical measurements such as age, insulin levels, leptin, and insulin resistance, among others.

## 🔍 Methodology
::BackgroundTitle{title="Methodology"}
::

The project follows a comparative approach between several algorithms:

@@ -40,11 +42,13 @@ Model evaluation is primarily based on the F1-score, which is more suitable in a

This project illustrates a concrete application of data science techniques to a public health issue, while implementing a rigorous methodology for supervised modeling.

## 📚 Resources
::BackgroundTitle{title="Resources"}
::

You can find the code here: [Breast Cancer Detection](https://go.arthurdanjou.fr/breast-cancer-detection-code)

## 📄 Detailed Report
::BackgroundTitle{title="Detailed Report"}
::

<iframe src="/projects/breast-cancer.pdf" width="100%" height="1000px">
</iframe>
28 package.json
@@ -18,11 +18,11 @@
},
"dependencies": {
"@libsql/client": "^0.17.0",
"@nuxt/content": "3.11.2",
"@nuxt/eslint": "1.15.1",
"@nuxt/ui": "^4.4.0",
"@nuxthub/core": "0.10.6",
"@nuxtjs/mdc": "0.20.1",
"@nuxt/content": "3.12.0",
"@nuxt/eslint": "1.15.2",
"@nuxt/ui": "4.5.1",
"@nuxthub/core": "0.10.7",
"@nuxtjs/mdc": "0.20.2",
"@nuxtjs/seo": "3.4.0",
"@vueuse/core": "^14.2.1",
"@vueuse/math": "^14.2.1",
@@ -30,23 +30,23 @@
"drizzle-kit": "^0.31.9",
"drizzle-orm": "^0.45.1",
"nuxt": "4.3.1",
"nuxt-studio": "1.3.2",
"vue": "3.5.28",
"vue-router": "5.0.2",
"nuxt-studio": "1.4.0",
"vue": "3.5.30",
"vue-router": "5.0.3",
"zod": "^4.3.6"
},
"devDependencies": {
"@iconify-json/devicon": "1.2.58",
"@iconify-json/devicon": "1.2.59",
"@iconify-json/file-icons": "^1.2.2",
"@iconify-json/logos": "^1.2.10",
"@iconify-json/ph": "^1.2.2",
"@iconify-json/twemoji": "1.2.5",
"@iconify-json/vscode-icons": "1.2.42",
"@types/node": "25.2.3",
"@iconify-json/vscode-icons": "1.2.45",
"@types/node": "25.4.0",
"@vueuse/nuxt": "14.2.1",
"eslint": "10.0.0",
"eslint": "10.0.3",
"typescript": "^5.9.3",
"vue-tsc": "3.2.4",
"wrangler": "4.65.0"
"vue-tsc": "3.2.5",
"wrangler": "4.71.0"
}
}
BIN public/projects/climate-issues.pdf Normal file
Binary file not shown.