Methods Comparison

Cross-chapter comparison of all analytical tasks, baselines, and best-performing models from the thesis.

Summary Results

| Task | Baseline (OPLS-DA) | Best Deep Learning Model | Improvement (percentage points) |
|---|---|---|---|
| Fish Species | 96.39% | MoE Transformer: 100.00% | +3.61 |
| Fish Body Part | 51.17% | Ensemble Transformer: 74.13% | +22.96 |
| Oil Contamination | 26.43% | TL MoE Transformer: 49.10% | +22.67 |
| Cross-species Adulteration | 79.96% | Pre-trained Transformer: 91.97% | +12.01 |
| Batch Detection | 53.19% | SpectroSim (Transformer): 70.80% | +17.61 |
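The Improvement figures are absolute differences between the two accuracies, that is, percentage points rather than relative gains. A small check of that arithmetic, using only the numbers from the summary results (the variable and function names are illustrative):

```python
# (task, OPLS-DA baseline %, best deep learning model %) copied from the summary results
results = [
    ("Fish Species", 96.39, 100.00),
    ("Fish Body Part", 51.17, 74.13),
    ("Oil Contamination", 26.43, 49.10),
    ("Cross-species Adulteration", 79.96, 91.97),
    ("Batch Detection", 53.19, 70.80),
]


def improvement_pp(baseline, best):
    """Absolute gain in percentage points, rounded to two decimals."""
    return round(best - baseline, 2)


improvements = {task: improvement_pp(base, best) for task, base, best in results}

for task, gain in improvements.items():
    print(f"{task}: +{gain:.2f} pp")
```

A relative gain would instead divide the difference by the baseline, which would give much larger numbers for the low-baseline tasks such as Oil Contamination.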

Model Details

MoE Transformer (Gone Phishing)
Fish Species
100.00%

A Mixture-of-Experts (MoE) Transformer that replaces the feed-forward network (FFN) layers with gated expert sub-networks. The gating mechanism routes different spectral regions to specialized experts.
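The routing idea can be illustrated with a minimal, dependency-free sketch of one gated expert layer. This is not the thesis implementation: the number of experts, the top-1 routing, the weight initialisation, and every name below are assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_expert(rng, dim, hidden):
    """One expert: a small two-layer feed-forward network with ReLU."""
    w1 = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(dim)]

    def expert(x):
        h = [max(0.0, v) for v in linear(w1, x)]
        return linear(w2, h)

    return expert


def moe_layer(x, gate_weights, experts, top_k=1):
    """Route one token embedding to its top_k experts and mix their outputs."""
    gate_probs = softmax(linear(gate_weights, x))
    ranked = sorted(range(len(experts)), key=lambda i: gate_probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(gate_probs[i] for i in chosen)  # renormalise over the chosen experts
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)
        out = [o + (gate_probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out, chosen


rng = random.Random(0)
dim, hidden, num_experts = 4, 8, 3  # assumed sizes, not from the thesis
experts = [make_expert(rng, dim, hidden) for _ in range(num_experts)]
gate_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]

token = [0.2, -0.1, 0.5, 0.3]  # hypothetical embedding of one spectral region
output, routed_to = moe_layer(token, gate_weights, experts)
```

Because only the selected experts run for each token, different regions of a spectrum can end up being processed by different sub-networks, which is the specialization the description refers to.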

See Chapter 4.

Ensemble Transformer (Autobots)
Fish Body Part
74.13%

A multi-scale stacked ensemble: three Transformers with 2, 4, and 8 layers act as level-0 classifiers, and a meta-learner combines their predictions.
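A rough sketch of the stacking step, assuming the meta-learner is a linear model over the concatenated level-0 class probabilities. The type of meta-learner used in the thesis is not stated here, so the weights, probabilities, and function names below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_predict(level0_probs, meta_weights, meta_bias):
    """Combine level-0 class probabilities with a linear meta-learner.

    level0_probs: one probability vector per level-0 model, for a single sample.
    meta_weights: one row of weights per output class, over the concatenated features.
    """
    features = [p for probs in level0_probs for p in probs]  # concatenate model outputs
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(meta_weights, meta_bias)
    ]
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs


# Hypothetical outputs of the 2-, 4-, and 8-layer Transformers for one sample, two classes.
level0 = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]

# Hypothetical meta-learner weights that trust the deeper models more.
meta_weights = [
    [0.5, 0.0, 1.0, 0.0, 1.5, 0.0],  # class 0
    [0.0, 0.5, 0.0, 1.0, 0.0, 1.5],  # class 1
]
meta_bias = [0.0, 0.0]

predicted, probs = stack_predict(level0, meta_weights, meta_bias)
```

In standard stacking the meta-learner is fitted on level-0 predictions for samples the level-0 models did not train on (for example, out-of-fold predictions), so that it does not simply learn to trust overfitted outputs.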

See Chapter 4.

TL MoE Transformer
Oil Contamination
49.10%

Transfer learning applied to the MoE Transformer: the model is pre-trained on species identification, then fine-tuned for ordinal oil contamination detection.
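The weight hand-over between the two tasks can be sketched as follows. This is a toy parameter store, not the thesis code: the "encoder"/"head" split, the class counts, and all names are placeholders, and the fine-tuning loop itself is omitted.

```python
import random


def init_model(rng, num_features, num_classes):
    """Hypothetical parameter store: a shared 'encoder' and a task-specific 'head'."""
    return {
        "encoder": [rng.gauss(0, 0.1) for _ in range(num_features)],
        "head": [
            [rng.gauss(0, 0.1) for _ in range(num_features)]
            for _ in range(num_classes)
        ],
    }


def transfer(pretrained, rng, num_classes):
    """Reuse the pre-trained encoder weights and re-initialise the head for the new task."""
    num_features = len(pretrained["encoder"])
    target = init_model(rng, num_features, num_classes)
    target["encoder"] = list(pretrained["encoder"])  # copy, so fine-tuning leaves the source intact
    return target


rng = random.Random(0)
NUM_SPECIES = 2      # placeholder class count, not taken from the thesis
NUM_OIL_LEVELS = 7   # placeholder class count, not taken from the thesis

species_model = init_model(rng, num_features=16, num_classes=NUM_SPECIES)
# ... species_model would be trained on species identification here ...
oil_model = transfer(species_model, rng, num_classes=NUM_OIL_LEVELS)
```

Only the output head is task-specific in this sketch; whether the encoder is frozen or updated during fine-tuning is a separate design choice that the description above does not specify.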

See Chapter 5.

Pre-trained Transformer
Cross-species Adulteration
91.97%

Transformer pre-trained with Masked Spectra Modelling (MSM), then fine-tuned for 3-class adulteration detection (Hoki/Mackerel/mixture).
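By analogy with masked language modelling, the pre-training objective can be pictured as hiding part of a spectrum and scoring the model on how well it reconstructs the hidden values. The sketch below assumes a 20% mask fraction, zero as the mask value, and a mean-squared-error loss on masked positions only; none of these details are confirmed by this summary, and the spectrum is made up.

```python
import random


def mask_spectrum(intensities, mask_fraction, rng, mask_value=0.0):
    """Hide a random subset of bins; return the masked copy and the hidden indices."""
    n = len(intensities)
    num_masked = max(1, round(n * mask_fraction))
    masked_idx = sorted(rng.sample(range(n), num_masked))
    masked = list(intensities)
    for i in masked_idx:
        masked[i] = mask_value
    return masked, masked_idx


def reconstruction_loss(predicted, original, masked_idx):
    """Mean squared error computed on the masked positions only."""
    return sum((predicted[i] - original[i]) ** 2 for i in masked_idx) / len(masked_idx)


rng = random.Random(0)
# Hypothetical binned intensities for one mass spectrum.
spectrum = [0.0, 0.8, 0.1, 0.0, 0.4, 0.9, 0.2, 0.0, 0.3, 0.6]

masked, idx = mask_spectrum(spectrum, mask_fraction=0.2, rng=rng)

# A model would predict the full spectrum from `masked`; here a perfect and a
# trivial (all-zero) prediction are compared to show how the loss behaves.
perfect_loss = reconstruction_loss(spectrum, spectrum, idx)
zero_loss = reconstruction_loss([0.0] * len(spectrum), spectrum, idx)
```

Because the targets come from the spectra themselves, this stage needs no class labels; the labels are only required later, during fine-tuning.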

See Chapter 5.

SpectroSim (Transformer)
Batch Detection
70.80%

Self-supervised contrastive framework (SimCLR adaptation) using a Transformer encoder to learn pairwise mass spectra similarity without labels.
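SimCLR trains its encoder with the normalized temperature-scaled cross-entropy (NT-Xent) loss, which pulls two views of the same item together and pushes all other items in the batch apart. The following is a plain-Python sketch of that loss; how SpectroSim builds its positive pairs and which temperature it uses are not stated here, so the embeddings and the value 0.5 are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss; view_a[i] and view_b[i] embed two views of the same spectrum."""
    z = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same spectrum
        denom = sum(
            math.exp(cosine(z[i], z[k]) / temperature)
            for k in range(2 * n)
            if k != i
        )
        numer = math.exp(cosine(z[i], z[pos]) / temperature)
        total += -math.log(numer / denom)
    return total / (2 * n)


# Hypothetical 2-D embeddings of two spectra, each seen through two views.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matching_views = [[1.0, 0.1], [0.1, 1.0]]    # each view lies close to its own anchor
mismatched_views = [[0.1, 1.0], [1.0, 0.1]]  # views swapped between spectra

aligned_loss = nt_xent(anchors, matching_views)
swapped_loss = nt_xent(anchors, mismatched_views)
```

The loss is lower when each embedding is most similar to its own paired view, so minimizing it yields an encoder whose cosine similarity can be read as a learned spectrum-to-spectrum similarity.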

See Chapter 6.

Common Methods

All models are evaluated using Balanced Classification Accuracy (BCA) to handle class imbalance. The baseline throughout is OPLS-DA (Orthogonal Partial Least Squares Discriminant Analysis), the standard chemometrics method. Deep learning architectures are all based on the Transformer with various enhancements (MoE, ensembling, transfer learning, contrastive learning).
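Balanced accuracy is commonly defined as the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The sketch below assumes that definition is the one meant by BCA here; the labels are invented to show the effect of imbalance.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def plain_accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented, imbalanced labels: a classifier that always predicts the majority class.
y_true = ["hoki"] * 8 + ["mackerel"] * 2
y_pred = ["hoki"] * 10

acc = plain_accuracy(y_true, y_pred)
bca = balanced_accuracy(y_true, y_pred)
```

In this example plain accuracy is 0.8 while balanced accuracy is 0.5, which is why a majority-class guesser cannot score well under BCA.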

Explainability is provided via LIME and Grad-CAM, identifying which m/z spectral features drive each prediction.