BIO-01 Harmful Algal Blooms
Prediction of appearances of plankton species in Mombetsu, Hokkaido, Japan, using AI technology (Invited)
Satoshi Nagai*, Fisheries Technology Institute, Japan Fisheries Research and Education Agency, Yokohama, Japan
Satoshi Tazawa, AXIOHELIX Co. Ltd, Chiyoda-ku, Tokyo, Japan
Noriko Nishi, AXIOHELIX Co. Ltd, Chiyoda-ku, Tokyo, Japan
Sirje Sildever, Tallinn University of Technology, Akadeemia tee 15A, 12618 Tallinn, Estonia
Hiromi Kasai, Fisheries Resources Institute, Japan Fisheries Research and Education Agency, Hokkaido, Japan
Junya Hirai, Atmosphere and Ocean Research Institute, The University of Tokyo, Tokyo, Japan
Akihiro Shiomoto, Tokyo University of Agriculture, Hokkaido, Japan
Taisei Kikuchi, University of Miyazaki, Miyazaki, Japan
Seiji Katakura, City of Mombetsu, Hokkaido, Japan
Fumito Maruyama, Hiroshima University, Hiroshima, Japan

To study the relationship between changes in biodiversity and long-term changes in environmental parameters, weekly monitoring was carried out at one location off the coast of Mombetsu City from April 2012 to March 2020 (n = 445). We performed metabarcoding analyses using universal eukaryotic primers targeting 18S and 28S ribosomal RNA (two primer sets) to detect as many taxa as possible. We identified 2,983 species: 909, 684, 296, 250, 188, 186, and 470 species in Fungi, Metazoa, Bacillariophyceae, Dinophyceae, Ciliophora, Archaeplastida, and other eukaryotes, respectively. We then applied AI technologies to predict occurrences of plankton species from seven environmental parameters (water temperature, salinity, NO2, NO3, PO4, SiO2, and Chl a). We adopted long short-term memory (LSTM), a recurrent neural network (RNN) architecture used in deep learning. LSTM networks are well suited to classifying, processing, and making predictions from time-series data because they can accommodate lags of unknown duration between important events in a series. All data from the first six years (n = 335) were used to train the LSTM model, and the abundance of each species was reproduced. We then attempted to predict the appearance of each species in the remaining data set (n = 110) using only the environmental data. Only species with >30 sequences during the monitoring period were included. With this approach, 110 of 726 species (15.2%) based on 18S and 71 of 476 species (14.9%) based on 28S appeared to be predictable. We also applied the timeseries_forecaster model in AutoKeras, which can try 20 different patterns with six options across LSTM and GRU models and automatically selects the best model. With this model, 111 of 726 species (15.3%) based on 18S and 64 of 476 species (13.4%) based on 28S appeared to be predictable.
We have started analyzing the 10-year dataset and will show the results including HAB species at the conference.
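
The data handling described above (read-count filter, chronological split into 335 training and 110 test samples, and environment-only inputs for a recurrent model) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names, the 4-week look-back window, and the synthetic data are all assumptions made for the example, and the model itself (LSTM or GRU) is left out.

```python
from typing import Dict, List, Sequence, Tuple

# Seven environmental parameters named in the abstract (labels are illustrative).
ENV_PARAMS = ["temperature", "salinity", "NO2", "NO3", "PO4", "SiO2", "Chl_a"]
N_TRAIN, N_TEST = 335, 110  # sample counts stated in the abstract
LOOKBACK = 4                # ASSUMPTION: window length is not given in the abstract


def filter_species(read_counts: Dict[str, Sequence[int]], min_reads: int = 30) -> List[str]:
    """Keep species whose total read count over the monitoring exceeds min_reads."""
    return [sp for sp, counts in read_counts.items() if sum(counts) > min_reads]


def chronological_split(rows: Sequence, n_train: int) -> Tuple[Sequence, Sequence]:
    """Split a time-ordered series without shuffling: earliest rows train, the rest test."""
    return rows[:n_train], rows[n_train:]


def make_windows(env: Sequence[Sequence[float]], target: Sequence[float], lookback: int):
    """Pair each run of `lookback` consecutive environment rows with the
    species abundance observed at the last step of that run."""
    X, y = [], []
    for end in range(lookback, len(env) + 1):
        X.append(list(env[end - lookback:end]))
        y.append(target[end - 1])
    return X, y


def percent_predictable(n_predictable: int, n_species: int) -> float:
    """Share of species judged predictable, in percent, rounded to one decimal."""
    return round(100.0 * n_predictable / n_species, 1)


# Synthetic stand-in data (NOT real measurements): 445 weekly samples x 7 parameters.
n_samples = N_TRAIN + N_TEST
env = [[float(t + k) for k in range(len(ENV_PARAMS))] for t in range(n_samples)]
abundance = [float(t % 5) for t in range(n_samples)]

train_env, test_env = chronological_split(env, N_TRAIN)
train_y, test_y = chronological_split(abundance, N_TRAIN)

# Windows are built separately per split so no test-period data leaks into training.
X_train, y_train = make_windows(train_env, train_y, LOOKBACK)
X_test, y_test = make_windows(test_env, test_y, LOOKBACK)

print(len(X_train), len(X_test))          # number of training and test windows
print(percent_predictable(110, 726))      # 18S, LSTM result reported in the abstract
```

Each element of `X_train` has shape (look-back, 7), which is the layout a recurrent layer typically expects for one sample. Building windows within each split means the first few test weeks have no complete window; carrying over the last training weeks as context would be an alternative design.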