ArtStudies/M2/Reinforcement Learning/project
Arthur DANJOU 63ebb3ec8d Enhance Monte Carlo agent with performance optimizations and memory efficiency
- Updated weight and feature storage to use float32 for reduced memory bandwidth.
- Implemented compact storage for raw observations as uint8, batch-normalized at episode end.
- Introduced vectorized return computation and chunk-based weight updates using einsum.
- Reduced weight sanitization to once per episode instead of per-step.
- Refactored action selection and return calculation for improved efficiency.
2026-03-04 18:25:43 +01:00
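
The changes listed in this commit can be illustrated with a small NumPy sketch. This is not the code from the commit: it assumes a linear action-value function with one weight row per action, pixel-like observations that are scaled by 255, and hypothetical names (`discounted_returns`, `update_from_episode`, `chunk`, `clip`) chosen only for illustration. The cumulative-sum trick used for the returns is one possible vectorization and can lose precision on very long episodes with a small discount factor.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Return G_t = sum_k gamma**k * r_{t+k} for every step, without a Python loop."""
    r = np.asarray(rewards, dtype=np.float64)       # float64 only for the reduction
    discounts = gamma ** np.arange(r.shape[0])      # gamma**t for t = 0..T-1
    # Reverse cumulative sum gives sum_{k>=t} gamma**k * r_k; dividing by
    # gamma**t re-bases each tail sum so that it starts at step t.
    tail_sums = np.cumsum((r * discounts)[::-1])[::-1]
    return (tail_sums / discounts).astype(np.float32)


def update_from_episode(weights, obs_u8, actions, rewards,
                        gamma=0.99, alpha=0.01, chunk=256, clip=1e6):
    """Every-visit Monte Carlo update of float32 weights with shape (n_actions, n_features).

    obs_u8 holds the raw observations of one episode as uint8, shape (T, n_features).
    """
    returns = discounted_returns(rewards, gamma)
    actions = np.asarray(actions)
    identity = np.eye(weights.shape[0], dtype=np.float32)

    for start in range(0, len(actions), chunk):
        sl = slice(start, start + chunk)
        # Normalize one chunk at a time so only a small float32 copy exists.
        phi = obs_u8[sl].astype(np.float32) / np.float32(255.0)
        one_hot = identity[actions[sl]]                           # (t, n_actions)
        # Q(s_t, a_t) for each step in the chunk, using the current weights.
        q = np.einsum("td,td->t", weights[actions[sl]], phi)
        err = returns[sl] - q
        # Sum the per-step gradients into the row of the action that was taken.
        weights += np.float32(alpha) * np.einsum("ta,t,td->ad", one_hot, err, phi)

    # Sanitize once per episode rather than after every step.
    np.nan_to_num(weights, copy=False, nan=0.0, posinf=clip, neginf=-clip)
    np.clip(weights, -clip, clip, out=weights)
    return weights


if __name__ == "__main__":
    w = np.zeros((2, 3), dtype=np.float32)
    obs = np.array([[255, 0, 0], [0, 255, 0]], dtype=np.uint8)
    print(update_from_episode(w, obs, [0, 1], [1.0, 2.0], gamma=0.5, alpha=0.5))
```

Within a chunk, all errors are computed against the same weights, so the result can differ slightly from a strict step-by-step update; a smaller `chunk` moves it closer to the sequential behaviour at the cost of more loop iterations.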