Methods Comparison — ML for REIMS

Summary Results

Task	Baseline (OPLS-DA)	Best Deep Learning Model	Improvement (percentage points)
Fish Species	96.39%	MoE Transformer: 100.00%	+3.61
Fish Body Part	51.17%	Ensemble Transformer: 74.13%	+22.96
Oil Contamination	26.43%	TL MoE Transformer: 49.10%	+22.67
Cross-species Adulteration	79.96%	Pre-trained Transformer: 91.97%	+12.01
Batch Detection	53.19%	SpectroSim (Transformer): 70.80%	+17.61

Model Details

MoE Transformer (Gone Phishing)

Fish Species

100.00%

A Mixture of Experts (MoE) Transformer that replaces the feed-forward (FFN) layers with gated expert sub-networks. The gate routes different spectral regions to specialized experts.
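
The gated routing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the thesis implementation: the softmax gate, the top-k selection (top_k=2), the toy experts, and every name (moe_layer, gate_weights, experts) are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def moe_layer(token, gate_weights, experts, top_k=2):
    """Route one token embedding through the top_k highest-scoring experts.

    Assumption: a linear gate followed by softmax, with the selected
    gate probabilities renormalised to sum to 1.
    """
    gate_probs = softmax([dot(w, token) for w in gate_weights])
    chosen = sorted(range(len(experts)), key=lambda i: gate_probs[i], reverse=True)[:top_k]
    norm = sum(gate_probs[i] for i in chosen)
    output = [0.0] * len(token)
    for i in chosen:
        expert_out = experts[i](token)
        for d, value in enumerate(expert_out):
            output[d] += (gate_probs[i] / norm) * value
    return output, chosen

# Toy experts (hypothetical): each one simply scales the token by a constant.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
output, chosen = moe_layer([2.0, 0.5], gate_weights, experts, top_k=2)
```

In a real MoE Transformer each expert would be a small feed-forward network and the gate weights would be learned jointly with the rest of the model.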

See Chapter 4 →

Ensemble Transformer (Autobots)

Fish Body Part

74.13%

Multi-scale stacked ensemble: three Transformers with 2, 4, and 8 layers act as level-0 classifiers, combined by a meta-learner.
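
A minimal sketch of the stacking step, assuming the level-0 models emit class probabilities and the meta-learner is a linear classifier (the source does not say which meta-learner is used). The probabilities, weights, and function names below are hypothetical.

```python
def stack_features(base_probs):
    """Concatenate the class-probability vectors of all level-0 models."""
    return [p for probs in base_probs for p in probs]

def meta_predict(features, weights, bias):
    """Linear meta-learner: one weight row per class, prediction is the arg-max score."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical outputs of the 2-, 4- and 8-layer Transformers for one spectrum (2 classes).
base_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
features = stack_features(base_probs)

# Hypothetical meta-learner weights that sum each class's probability across models.
weights = [[1, 0, 1, 0, 1, 0],
           [0, 1, 0, 1, 0, 1]]
prediction = meta_predict(features, weights, bias=[0.0, 0.0])
```

In practice the meta-learner weights would be fitted on held-out predictions of the level-0 models rather than set by hand.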

See Chapter 4 →

TL MoE Transformer

Oil Contamination

49.10%

Transfer learning applied to the MoE Transformer: the model is pre-trained on species identification, then fine-tuned for ordinal oil contamination detection.

See Chapter 5 →

Pre-trained Transformer

Cross-species Adulteration

91.97%

Transformer pre-trained with Masked Spectra Modelling (MSM), then fine-tuned for 3-class adulteration detection (Hoki/Mackerel/mixture).
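
The pre-training objective can be illustrated with a small sketch. It assumes that MSM hides a random subset of spectral bins and scores the reconstruction only at the hidden positions; the mask ratio, the zero mask value, the mean-squared-error loss, and all names are assumptions for illustration, not details taken from the source.

```python
import random

def mask_spectrum(spectrum, mask_ratio=0.15, mask_value=0.0, rng=None):
    """Return a copy of the spectrum with a random subset of bins replaced by mask_value."""
    rng = rng or random.Random()
    n_masked = max(1, round(mask_ratio * len(spectrum)))
    masked_idx = sorted(rng.sample(range(len(spectrum)), n_masked))
    masked = list(spectrum)
    for i in masked_idx:
        masked[i] = mask_value
    return masked, masked_idx

def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed only over the masked bins."""
    return sum((prediction[i] - target[i]) ** 2 for i in masked_idx) / len(masked_idx)

# Hypothetical 10-bin intensity vector standing in for a mass spectrum.
spectrum = [0.1, 0.9, 0.3, 0.0, 0.5, 0.7, 0.2, 0.4, 0.6, 0.8]
masked, idx = mask_spectrum(spectrum, mask_ratio=0.2, rng=random.Random(0))

# A perfect reconstruction has zero loss on the masked bins.
perfect_loss = masked_mse(spectrum, spectrum, idx)
```

During pre-training the model would receive the masked vector and be trained to lower this loss; fine-tuning then replaces the reconstruction head with a 3-class classifier.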

See Chapter 5 →

SpectroSim (Transformer)

Batch Detection

70.80%

Self-supervised contrastive framework (a SimCLR adaptation) that uses a Transformer encoder to learn pairwise similarity between mass spectra without labels.
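
For orientation, the sketch below computes the NT-Xent loss used by SimCLR on a handful of embeddings. Whether SpectroSim uses exactly this loss is an assumption; the temperature, the pairing convention (rows 2k and 2k+1 are two views of the same spectrum), and the toy embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    num = sum(a * b for a, b in zip(u, v))
    return num / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nt_xent(embeddings, temperature=0.5):
    """NT-Xent loss; embeddings[2k] and embeddings[2k+1] form a positive pair."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of the positive partner
        logits = [cosine(embeddings[i], embeddings[k]) / temperature
                  for k in range(n) if k != i]
        positive = cosine(embeddings[i], embeddings[j]) / temperature
        log_denominator = math.log(sum(math.exp(z) for z in logits))
        total += log_denominator - positive
    return total / n

# Positive pairs that point in similar directions give a low loss...
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# ...while mismatched pairs give a higher one.
shuffled = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
loss_aligned = nt_xent(aligned)
loss_shuffled = nt_xent(shuffled)
```

In a SimCLR-style setup the embeddings would come from the Transformer encoder applied to two augmented views of each spectrum, so no class labels are needed.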

See Chapter 6 →

Common Methods

All models are evaluated using Balanced Classification Accuracy (BCA) to handle class imbalance. The baseline throughout is OPLS-DA (Orthogonal Partial Least Squares Discriminant Analysis), the standard chemometrics method. Deep learning architectures are all based on the Transformer with various enhancements (MoE, ensembling, transfer learning, contrastive learning).
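
To make the metric concrete, here is a small sketch that assumes BCA follows the usual definition of balanced accuracy, the mean of per-class recall. The labels and predictions are made-up examples, not results from the thesis.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced test set: 8 samples of one class, 2 of the other.
y_true = ["hoki"] * 8 + ["mackerel"] * 2
# A classifier that always predicts the majority class.
y_pred = ["hoki"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bca = balanced_accuracy(y_true, y_pred)
```

Here plain accuracy is 0.8 while balanced accuracy is 0.5, which shows why the balanced variant is the fairer score when classes are unevenly represented.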

Explainability is provided via LIME and Grad-CAM, identifying which m/z spectral features drive each prediction.