CERN Particle Physics ML Pipeline

Neural network pipeline for ATLAS detector signal separation

2024-2025 | 1 year | California State University, Stanislaus | Student Researcher, Software Engineer
Python, Keras, TensorFlow

Problem Statement

CERN's ATLAS detector generates massive amounts of collision data, but signal events (the interesting physics) are buried in background noise. Traditional analysis methods struggled with the high dimensionality and complexity of the data. The project needed an automated way to separate signal from background with high accuracy.

Solution

Developed a complete neural network pipeline in Python using Keras and scikit-learn. Built a data extraction pipeline that converts CERN ROOT files to Apache Parquet for efficient analysis. Implemented univariate and multivariate feature selection methods with statistical power scoring. Created visualization tools to evaluate model performance and feature importance.
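As a rough illustration of what univariate feature scoring can look like, the sketch below ranks features by how far apart their signal and background distributions sit. The scoring formula (a standardized mean difference), the function names, and the toy feature names and values are all assumptions made for this example; the project's actual statistical power score may be defined differently.

```python
from math import sqrt
from statistics import mean, pvariance


def separation_score(signal, background):
    """Standardized mean difference between signal and background values.

    Assumed scoring rule for illustration only; not necessarily the
    statistic used in the original pipeline.
    """
    pooled_std = sqrt((pvariance(signal) + pvariance(background)) / 2)
    if pooled_std == 0:
        return 0.0  # a constant feature carries no separation information
    return abs(mean(signal) - mean(background)) / pooled_std


def rank_features(signal_events, background_events):
    """Return (feature, score) pairs sorted from most to least separating.

    Both arguments map a feature name to a list of values.
    """
    scores = {
        name: separation_score(signal_events[name], background_events[name])
        for name in signal_events
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy values, made up purely for illustration (hypothetical feature names).
signal = {
    "lepton_pt": [50.0, 55.0, 60.0, 65.0],
    "jet_eta": [0.1, -0.2, 0.3, 0.0],
}
background = {
    "lepton_pt": [20.0, 25.0, 30.0, 35.0],
    "jet_eta": [0.0, 0.2, -0.1, 0.1],
}

ranking = rank_features(signal, background)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A score like this looks at one variable at a time, so it can miss features that only separate signal from background in combination. That gap is what multivariate selection is meant to cover.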

Impact

Successfully trained a neural network to separate signal from background in ATLAS collision data. The feature selection methods refined the training data, improving classification performance over the baseline. The pipeline is reusable and can be adapted for other particle physics experiments.

Technical Highlights

  • Built end-to-end ML pipeline: preprocessing → training → evaluation
  • Converted CERN ROOT format to Parquet for 10x faster data loading
  • Implemented univariate + multivariate feature selection with statistical validation
  • Integrated Keras models with scikit-learn for robust cross-validation
  • Automated hyperparameter tuning with grid/random search
  • Created visualization suite for model evaluation and feature analysis
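
The random search mentioned in the highlights above can be sketched in a few lines of plain Python. This is a minimal sketch and not the project's code: `random_search`, the search space, and `toy_score` are hypothetical names. In the real pipeline the score would presumably be a cross-validated metric from the Keras model, not a toy formula.

```python
import random


def random_search(score_fn, search_space, n_trials=20, seed=0):
    """Sample hyperparameter combinations at random and keep the best one.

    score_fn: callable taking a dict of hyperparameters, returning a
              score where higher is better.
    search_space: dict mapping a hyperparameter name to candidate values.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one candidate value per hyperparameter.
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the values are illustrative only.
search_space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "hidden_units": [16, 32, 64, 128],
    "dropout": [0.0, 0.2, 0.5],
}


def toy_score(params):
    """Stand-in for a cross-validated model score (assumption)."""
    return -abs(params["hidden_units"] - 64) / 64 - abs(params["dropout"] - 0.2)


best_params, best_score = random_search(toy_score, search_space, n_trials=50)
print(best_params, best_score)
```

Grid search would loop over every combination in `search_space` instead of sampling. Random search trades that exhaustive coverage for a fixed evaluation budget, which matters when each evaluation means training a network.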

Key Metrics

  • Model Performance: Improved vs baseline
  • Features Evaluated: 20+ variables
  • Data Format: ROOT → Parquet

Future Improvements

  • Implement gradient boosting (XGBoost) for comparison
  • Add SHAP values for model interpretability
  • Create web dashboard for interactive model exploration
  • Parallelize preprocessing for larger datasets