artsite/content/projects/glm-financial-assets.md at 5a4a4f380fee728f159e139584e8d4ae4b2d1091

mirror of https://github.com/ArthurDanjou/artsite.git synced 2026-03-16 03:09:44 +01:00


Arthur DANJOU 5a4a4f380f feat: Add CLAUDE.md for project guidance and update project files

- Created CLAUDE.md to provide development commands, architecture overview, and environment variables for the Nuxt 3 portfolio website.
- Refactored project pages to remove unused color mappings and improve project filtering logic.
- Updated content.config.ts to enforce stricter project type definitions and added short descriptions for projects.
- Deleted outdated project files and added new projects related to hackathons and academic research.
- Enhanced existing project descriptions with short summaries for better clarity.

2026-02-16 19:48:31 +01:00

3.1 KiB


📊 Dataset & Scale

The modeling is performed on a large dataset: more than 1.2 million observations described by a small set of features.

Target Variable: implied_vol_ref (implied volatility).
Features: Option strike price (K), underlying asset price (S), and time to maturity (τ).
Volume: A training set of 1,251,307 rows and a test set of identical size.

🛠️ Modeling Methodology

The project follows a statistical pipeline that compares two modeling philosophies, supported by a feature-engineering step:

1. The Statistical Baseline (GLM)

Using R's GLM framework, I implement models with targeted link functions and error distributions (such as Gamma or Inverse Gaussian) to capture the global structure of the volatility surface. These models serve as the benchmark for transparency and stability.
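As a rough illustration of what such a baseline can look like, the sketch below fits a Gamma GLM and an Inverse Gaussian GLM with a log link using base R's `glm()`. Everything about the data is an assumption: the real `train_ISF.csv` is not reproduced here, so the example simulates a toy smile-shaped surface, and the formula (`m + I(m^2) + sqrt(tau)`) is a hypothetical specification rather than the one used in the project.

```r
# Synthetic stand-in data (assumption): the real training file is not used here.
set.seed(2025)
n   <- 2000
S   <- runif(n, 80, 120)        # underlying price, assumed range
K   <- S * runif(n, 0.8, 1.2)   # strike scattered around the spot
tau <- runif(n, 0.05, 2)        # time to maturity, assumed to be in years
m   <- K / S                    # moneyness

# Toy smile-shaped mean with Gamma noise, so the target is strictly positive.
mu  <- exp(-1.6 + 1.5 * (m - 1)^2 - 0.1 * sqrt(tau))
toy <- data.frame(
  implied_vol_ref = rgamma(n, shape = 50, rate = 50 / mu),
  m = m,
  tau = tau
)

# Gamma GLM with a log link: predictions stay positive, effects are multiplicative.
fit_gamma <- glm(implied_vol_ref ~ m + I(m^2) + sqrt(tau),
                 family = Gamma(link = "log"), data = toy)

# Inverse Gaussian alternative with the same (hypothetical) linear predictor.
fit_ig <- glm(implied_vol_ref ~ m + I(m^2) + sqrt(tau),
              family = inverse.gaussian(link = "log"), data = toy)

# type = "response" returns predictions on the original volatility scale.
pred_gamma <- predict(fit_gamma, type = "response")
pred_ig    <- predict(fit_ig, type = "response")
```

The log link is one plausible choice because implied volatility is positive; other links supported by these families (for example `inverse` or `identity`) would be fitted the same way.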

2. The Black-Box Challenge

To capture local non-linearities such as the volatility smile and skew, I explore more complex architectures. Performance is evaluated by Root Mean Squared Error (RMSE) relative to the GLM baselines.

3. Feature Engineering

Key financial indicators are derived from the raw data:

Moneyness: Calculated as the ratio K/S.
Temporal Dynamics: Transformations of time to maturity to linearize the term structure.
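The two derived features above could be computed along these lines. This is a minimal sketch: the column names `K`, `S`, and `tau` are assumed (the actual CSV headers are not shown in the source), and the specific transformations of time to maturity (`sqrt` and `log`) as well as the log-moneyness column are illustrative guesses, not the project's confirmed choices.

```r
# Hypothetical helper: adds moneyness and maturity transformations to a data frame.
add_features <- function(df) {
  # Column names are assumptions; adjust them to the real file headers.
  stopifnot(all(c("K", "S", "tau") %in% names(df)))

  df$moneyness     <- df$K / df$S          # ratio K/S, as described in the text
  df$log_moneyness <- log(df$moneyness)    # symmetric around 0 at the money (illustrative)
  df$sqrt_tau      <- sqrt(df$tau)         # one candidate maturity transformation
  df$log_tau       <- log(df$tau)          # another candidate maturity transformation
  df
}

# Small usage example on made-up rows.
toy  <- data.frame(K = c(90, 100, 110), S = c(100, 100, 100), tau = c(0.25, 1, 2))
feat <- add_features(toy)
```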

📈 Evaluation & Reproducibility

Performance is measured strictly via RMSE on the original scale of the target variable. To ensure reproducibility and precise comparisons across model iterations, a fixed random seed is maintained throughout the entire workflow.

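# Fix the seed so that any sampling or resampling is repeatable across runs.
set.seed(2025)

# Load the training and test files.
TrainData <- read.csv("train_ISF.csv", stringsAsFactors = FALSE)
TestX <- read.csv("test_ISF.csv", stringsAsFactors = FALSE)

# Root mean squared error, computed on the original scale of the target.
rmse_eval <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}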
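One way to use `rmse_eval` during model iteration is to hold out part of the training data, since the name `TestX` suggests the test file may not include the target (an assumption). The sketch below uses simulated data in place of `TrainData`, an 80/20 split, and a hypothetical Gamma GLM; the split ratio and the model formula are illustrative choices only.

```r
set.seed(2025)

rmse_eval <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Synthetic stand-in for TrainData (assumption): moneyness m and maturity tau.
n   <- 1000
toy <- data.frame(m = runif(n, 0.8, 1.2), tau = runif(n, 0.05, 2))
mu  <- exp(-1.6 + 1.5 * (toy$m - 1)^2 - 0.1 * sqrt(toy$tau))
toy$implied_vol_ref <- rgamma(n, shape = 50, rate = 50 / mu)

# 80/20 hold-out split; the fixed seed makes the split repeatable.
idx   <- sample(seq_len(n), size = floor(0.8 * n))
train <- toy[idx, ]
valid <- toy[-idx, ]

# Hypothetical baseline model fitted on the training part only.
fit <- glm(implied_vol_ref ~ m + I(m^2) + sqrt(tau),
           family = Gamma(link = "log"), data = train)

# type = "response" back-transforms from the link scale, so the RMSE is
# measured on the original scale of implied volatility.
pred      <- predict(fit, newdata = valid, type = "response")
rmse_glm  <- rmse_eval(valid$implied_vol_ref, pred)

# Naive reference: always predict the training mean.
rmse_mean <- rmse_eval(valid$implied_vol_ref, mean(train$implied_vol_ref))
```

Comparing `rmse_glm` against `rmse_mean` gives a quick sanity check that the model adds information beyond a constant prediction.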

🔍 Critical Analysis

Beyond pure prediction, the project addresses:

Model Limits: Identifying market regimes where models fail (e.g., deep out-of-the-money options).
Interpretability: Quantifying the trade-off between complexity and practical utility in a risk management context.
Future Extensions: Considering richer dynamics, such as historical volatility or skew-specific targets.
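To locate the regimes where a model breaks down, a simple diagnostic is to compute the RMSE separately per moneyness bucket. The sketch below is one possible way to do that in base R; the helper name `rmse_by_bucket`, the bucket boundaries, and the sample numbers are all hypothetical.

```r
# Hypothetical helper: RMSE within each moneyness bucket.
rmse_by_bucket <- function(actual, predicted, moneyness,
                           breaks = c(0, 0.8, 0.95, 1.05, 1.2, Inf)) {
  # Assign each observation to a moneyness interval (boundaries are illustrative).
  bucket <- cut(moneyness, breaks = breaks, include.lowest = TRUE)
  sq_err <- (actual - predicted)^2
  # Mean squared error per bucket, then the square root.
  sqrt(tapply(sq_err, bucket, mean))
}

# Made-up values: errors are largest in the two extreme buckets.
actual    <- c(0.30, 0.22, 0.20, 0.21, 0.35)
predicted <- c(0.25, 0.21, 0.20, 0.22, 0.25)
moneyness <- c(0.70, 0.90, 1.00, 1.10, 1.40)

res <- rmse_by_bucket(actual, predicted, moneyness)
```

A bucket whose RMSE is far above the overall figure points to a region of the surface that the model does not capture well. Buckets with no observations come back as `NA`.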
