Plankton Classification on Imbalanced Dataset via Hybrid Resample Method with LightBGM

Yiran Liu, Xu Qiao, Rui Gao

2021 International Conference on Image, Vision and Computing

Abstract

Plankton monitoring plays an essential role in protecting the marine ecological environment: effective identification of plankton species and their quantities makes it possible to assess the health of the marine ecosystem. Thus, it is valuable to build an automatic classification system for plankton. However, plankton data naturally exhibit an imbalanced class distribution, so the class-imbalance problem must be taken into account in plankton classification. In this paper, we propose a classification model based on a hybrid resample method with a LightGBM classifier. Our hybrid resample method combines borderline-SMOTE oversampling with Fuzzy C-means cluster-based undersampling (BSFCM) and can handle both within-class and between-class imbalance. In addition, to eliminate irrelevant factors, dataset preprocessing and feature dimension reduction are applied to the in situ plankton images. The F1-measure and G-mean are used as the evaluation criteria to assess classification performance. The experimental results show that our BSFCM method with the LightGBM classifier outperforms the compared benchmark methods and achieves good performance on the imbalanced plankton dataset.
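
The abstract names the F1-measure and G-mean as evaluation criteria but does not say how they are averaged across classes. The sketch below is one common reading, offered as an assumption rather than the authors' implementation: macro-averaged F1 and G-mean taken as the geometric mean of per-class recall. The function names (per_class_counts, macro_f1, g_mean) and the toy labels are hypothetical.

```python
from collections import Counter
from math import prod


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label in y_true or y_pred."""
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class gained a false alarm
    return {c: (tp[c], fp[c], fn[c]) for c in labels}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed multi-class definition)."""
    recalls = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        if tp + fn == 0:
            continue  # label never occurs in y_true, so recall is undefined
        recalls.append(tp / (tp + fn))
    return prod(recalls) ** (1 / len(recalls))


# Toy imbalanced example: 8 samples of class "a", 2 of class "b".
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]  # one minority sample is misclassified

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f}")
print(f"macro F1={macro_f1(y_true, y_pred):.3f}")
print(f"G-mean={g_mean(y_true, y_pred):.3f}")
```

In this toy case accuracy stays at 0.9 even though half of the minority class is missed, while both macro F1 and G-mean drop noticeably, which is why such metrics are preferred on imbalanced data.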

BibTeX
@inproceedings{Liu2021Plankton,
  author = {Liu, Yiran and Qiao, Xu and Gao, Rui},
  booktitle = {2021 6th International Conference on Image, Vision and Computing (ICIVC)},
  title = {Plankton Classification on Imbalanced Dataset via Hybrid Resample Method with LightBGM},
  year = {2021},
  pages = {191--195},
  abstract = {Plankton monitoring plays an essential role in protecting the marine ecological environment: effective identification of plankton species and their quantities makes it possible to assess the health of the marine ecosystem. Thus, it is valuable to build an automatic classification system for plankton. However, plankton data naturally exhibit an imbalanced class distribution, so the class-imbalance problem must be taken into account in plankton classification. In this paper, we propose a classification model based on a hybrid resample method with a LightGBM classifier. Our hybrid resample method combines borderline-SMOTE oversampling with Fuzzy C-means cluster-based undersampling (BSFCM) and can handle both within-class and between-class imbalance. In addition, to eliminate irrelevant factors, dataset preprocessing and feature dimension reduction are applied to the in situ plankton images. The F1-measure and G-mean are used as the evaluation criteria to assess classification performance. The experimental results show that our BSFCM method with the LightGBM classifier outperforms the compared benchmark methods and achieves good performance on the imbalanced plankton dataset.},
  keywords = {Dimensionality reduction;Deep learning;Technological innovation;Image recognition;Biological system modeling;Ecosystems;Benchmark testing;plankton classification;imbalanced data;FCM;cluster-based undersampling;borderline-SMOTE;LightBGM},
  doi = {10.1109/ICIVC52351.2021.9526988},
  month = {July},
}