artsite/content/projects/hackathon-cnd.md at 5a4a4f380fee728f159e139584e8d4ae4b2d1091

mirror of https://github.com/ArthurDanjou/artsite.git synced 2026-03-16 07:09:20 +01:00


Arthur DANJOU 5a4a4f380f feat: Add CLAUDE.md for project guidance and update project files

- Created CLAUDE.md to provide development commands, architecture overview, and environment variables for the Nuxt 3 portfolio website.
- Refactored project pages to remove unused color mappings and improve project filtering logic.
- Updated content.config.ts to enforce stricter project type definitions and added short descriptions for projects.
- Deleted outdated project files and added new projects related to hackathons and academic research.
- Enhanced existing project descriptions with short summaries for better clarity.

2026-02-16 19:48:31 +01:00

3.2 KiB



The Setting: Fort de Mont-Valérien

This wasn't your typical university hackathon. Organized by the Commissariat au Numérique de Défense (CND), the event took place over three intense days within the walls of the Fort de Mont-Valérien, a high-security military fortress.

Working in this environment underscored the real-world stakes of the mission. Our team of six, representing Université Paris-Dauphine, competed against several elite engineering schools to solve critical defense-related data challenges.

The Mission: Classifying the "Invisible"

The core task involved processing poorly labeled, noisy firewall logs. In a defense context, a missing log or a mislabeled entry can mean that a coordinated intrusion gets written off as a minor system bug.

1. Tactical Log Translation

Firewall logs are often cryptic and inconsistent. We developed a preprocessing pipeline to:

Feature Extraction: Parse raw logs into structured data (headers, flags, payloads).
Contextual Labeling: Distinguish between routine system "bugs" (non-malicious failures) and actual "attacks" (malicious intent).
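As a rough illustration of the feature-extraction step, here is a minimal sketch using only the standard library. The line layout (`<timestamp> <host> <action> KEY=VALUE ...`), the field names, and the `parse_log_line` helper are all assumptions made for this example; the actual challenge logs and our pipeline code are not reproduced here.

```python
import re
from typing import Optional

# Hypothetical layout: "<timestamp> <host> <action> KEY=VALUE ..."
# (an assumption for illustration, not the real challenge format).
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<action>[A-Z]+)\s*(?P<rest>.*)$"
)
PAIR_RE = re.compile(r"(\w+)=(\S+)")


def parse_log_line(line: str) -> Optional[dict]:
    """Turn one raw log line into a flat feature dict, or None if it is malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # malformed lines are reported separately, not silently scored
    # Collect KEY=VALUE pairs, lower-casing keys so lookups are consistent.
    fields = {key.lower(): value for key, value in PAIR_RE.findall(match["rest"])}
    port = fields.get("dpt", "")
    return {
        "timestamp": match["timestamp"],
        "host": match["host"],
        "action": match["action"],
        "src": fields.get("src"),
        "dst": fields.get("dst"),
        "proto": fields.get("proto"),
        "dst_port": int(port) if port.isdigit() else None,
        "syn_only": fields.get("flags", "") == "SYN",
    }


sample = (
    "2026-01-12T08:15:02Z fw01 DROP SRC=10.0.0.5 DST=192.168.1.10 "
    "PROTO=TCP SPT=44321 DPT=22 FLAGS=SYN"
)
record = parse_log_line(sample)
```

Returning `None` for lines that do not match keeps inconsistent entries out of the feature table while still letting the caller count them, which matters when a gap in the logs is itself a signal.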

2. Strategic Goal: Recalling the Threat

In military cybersecurity, the cost of a False Negative (an undetected attack) is catastrophic.

Model Priority: We optimized our classifiers specifically for Recall. We would rather investigate a few system bugs (False Positives) than let a single attack slip through the net.
Techniques: We used ensemble methods (XGBoost/Random Forest) combined with advanced resampling to handle the heavy class imbalance typical of network traffic.
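The recall-first idea can be sketched without any ML library. The example below shows two simple stand-ins, both assumptions for illustration rather than the exact methods we used: naive random oversampling of the attack class, and picking the highest decision threshold that still reaches a target recall. The function names, the toy scores, and the labels (1 = attack, 0 = bug) are hypothetical; in practice the scores would come from the trained classifier.

```python
import random


def oversample_attacks(rows, labels, seed=0):
    """Randomly duplicate attack rows (label 1) until they match the bug count."""
    rng = random.Random(seed)
    attacks = [row for row, y in zip(rows, labels) if y == 1]
    bugs = [row for row, y in zip(rows, labels) if y == 0]
    if not attacks or len(attacks) >= len(bugs):
        return list(rows), list(labels)  # nothing to rebalance
    extra = [rng.choice(attacks) for _ in range(len(bugs) - len(attacks))]
    return list(rows) + extra, list(labels) + [1] * len(extra)


def recall_at(threshold, scores, labels):
    """Recall = TP / (TP + FN) when flagging every score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn) if tp + fn else 0.0


def threshold_for_recall(scores, labels, target_recall=0.95):
    """Highest threshold whose recall on the validation data meets the target."""
    # Recall can only grow as the threshold drops, so scan from high to low.
    for threshold in sorted(set(scores), reverse=True):
        if recall_at(threshold, scores, labels) >= target_recall:
            return threshold
    return 0.0  # no threshold reaches the target: flag everything


# Toy validation scores (hypothetical), label 1 = attack.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]
strict = threshold_for_recall(scores, labels, target_recall=1.0)
```

On this toy data the default cut-off of 0.5 misses the attack scored 0.35, while the recall-driven threshold catches it at the cost of one extra false positive (the bug scored 0.40). That is the trade-off described above.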

Key Achievement: Our model significantly reduced the rate of undetected threats compared to standard baseline configurations provided at the start of the challenge.

Deployment & Interaction

To make our findings operational, we built a Streamlit-based command center:

On-the-Fly Analysis: Security officers can paste a single log line to get an immediate "Bug vs. Attack" probability score.
Bulk Audit: The interface supports CSV uploads, allowing for the rapid analysis of entire daily log batches to highlight high-risk anomalies.
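A minimal sketch of how the two modes could be wired together is shown below. Everything named here is an assumption for illustration: `score_line` is a keyword-based placeholder standing in for the trained model, the `raw_log` column name is hypothetical, and `render_app` is only one possible Streamlit layout, not the interface we shipped.

```python
import csv
import io


def score_line(line: str) -> float:
    """Placeholder scorer (hypothetical): stands in for the model's attack probability."""
    suspicious = ("DROP", "DENY", "FLAGS=SYN")
    hits = sum(token in line for token in suspicious)
    return hits / len(suspicious)


def audit_csv(csv_text: str, column: str = "raw_log", top_n: int = 20) -> list:
    """Score every row of a CSV batch and return the highest-risk rows first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    scored = [{**row, "attack_score": score_line(row[column])} for row in reader]
    scored.sort(key=lambda row: row["attack_score"], reverse=True)
    return scored[:top_n]


def render_app() -> None:
    """Streamlit front end; call it at the bottom of a script run with `streamlit run`."""
    import streamlit as st  # lazy import so the helpers above work without Streamlit

    st.title("Bug vs. Attack")
    line = st.text_input("Paste a single log line")
    if line:
        st.metric("Attack score", f"{score_line(line):.2f}")
    upload = st.file_uploader("Upload a daily log batch", type=["csv"])
    if upload is not None:
        st.dataframe(audit_csv(upload.getvalue().decode("utf-8")))


batch = "raw_log\nfw01 ACCEPT DPT=443\nfw01 DROP DPT=22 FLAGS=SYN\n"
ranked = audit_csv(batch)
```

Keeping the scoring helpers free of any Streamlit dependency means the same functions can be exercised from a test or a batch job, with the UI acting as a thin layer on top.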

Technical Stack

Language: Python
ML Libraries: scikit-learn, XGBoost
Deployment: Streamlit
Environment: High-security on-site military infrastructure

Representing Dauphine in such a specialized environment was a highlight of my academic year.
