A team led by the Institut de Ciències del Mar (ICM-CSIC) in Barcelona, in collaboration with the Monterey Bay Aquarium Research Institute (MBARI) in California, the Universitat Politècnica de Catalunya (UPC), and the Universitat de Girona (UdG), has demonstrated for the first time that reinforcement learning enables autonomous vehicles and underwater robots to locate and track marine objects and animals. Reinforcement learning is a machine learning approach in which a system, here a neural network, learns from rewards to make the best decisions. The findings have been published in the journal Science Robotics.
Underwater robotics is becoming increasingly important for exploring the oceans, which present many challenges. Underwater vehicles can reach depths of up to 4,000 meters and provide valuable in-situ data that complement satellite data. This technology enables the study of small-scale phenomena, such as CO2 capture by marine organisms, which helps regulate the climate.
This new research reveals that reinforcement learning, commonly used in control and robotics as well as in natural language processing tools like ChatGPT, enables underwater robots to learn the optimal actions to achieve specific goals. These action policies can match or even outperform traditional analytical methods in certain situations.
Ivan Masmitjà, the lead author of the study, explains, “This type of learning allows us to train a neural network to optimize a specific task that would otherwise be very difficult to achieve. For example, we have demonstrated that we can optimize the trajectory of a vehicle to locate and track moving objects underwater.” Masmitjà has worked between ICM-CSIC and MBARI on this project.
This breakthrough “will allow us to further study ecological phenomena, such as migration or movement of marine species, using autonomous robots on small and large scales. Additionally, these advancements will enable real-time monitoring of other oceanographic instruments through a network of robots, where some can be on the surface transmitting the actions performed by other robotic platforms on the seabed via satellite,” says ICM-CSIC researcher Joan Navarro, who also participated in the study.
For this research, the team used acoustic ranging techniques, which estimate the position of an object from distance measurements taken at different points. However, the accuracy of the localization depends on where these acoustic range measurements are taken. This is where artificial intelligence, specifically reinforcement learning, becomes vital: it identifies the optimal trajectory for the robot by determining the best measurement points.
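As an illustration of why the measurement points matter, the following is a minimal sketch of range-based localization in two dimensions. It is not the method used in the study: it assumes a static target, noise-free ranges, and a simple linearized least-squares solver, and the function name locate_target and all coordinates are hypothetical.

```python
import math


def locate_target(points, ranges):
    """Estimate a 2D target position from ranges measured at known points.

    Illustrative only: subtracting the first range equation from the others
    cancels the quadratic terms and leaves a linear system in (x, y), which
    is solved here through its 2x2 normal equations.
    """
    if len(points) < 3 or len(points) != len(ranges):
        raise ValueError("need at least three points and one range per point")

    (x0, y0), r0 = points[0], ranges[0]
    saa = sab = sbb = sac = sbc = 0.0
    for (xi, yi), ri in zip(points[1:], ranges[1:]):
        # Row of the linear system: a*x + b*y = c
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = r0**2 - ri**2 + xi**2 + yi**2 - x0**2 - y0**2
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c

    det = saa * sbb - sab * sab
    if abs(det) < 1e-9:
        # All measurement points lie on one line: the geometry cannot
        # tell which side of the line the target is on.
        raise ValueError("measurement points are collinear")

    x = (sac * sbb - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return x, y


if __name__ == "__main__":
    target = (30.0, 40.0)  # hypothetical target position, in meters
    waypoints = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
    measured = [math.dist(p, target) for p in waypoints]
    print(locate_target(waypoints, measured))
```

With well-spread measurement points the estimate recovers the target, whereas points along a single straight line leave the position ambiguous and the sketch raises an error. Choosing where to measure is the part of the problem that the study hands to reinforcement learning.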
The neural networks were trained on the computer cluster at the Barcelona Supercomputing Center (BSC-CNS), which houses Spain’s most powerful supercomputer, one of the most powerful in Europe. “This allowed us to adjust the parameters of various algorithms much faster compared to using conventional computers,” explains Prof. Mario Martin, from the Computer Science Department of the UPC and one of the authors of the study.
Once trained, the algorithms were tested on different autonomous vehicles, including the AUV Sparus II developed by VICOROB. The tests were conducted in experimental missions in the port of Sant Feliu de Guíxols, in the Baix Empordà, and in Monterey Bay, California, in collaboration with MBARI’s principal investigator of the Bioinspiration Lab, Kakani Katija.
“Our simulation environment incorporates the control architecture of real vehicles, which allowed us to efficiently implement the algorithms before going to sea,” explains Narcís Palomeras from UdG.
In future research, the team plans to explore the use of the same algorithms for more complex missions. These include using multiple vehicles to locate objects and to cooperatively detect fronts, thermoclines, or algae upwelling through multi-platform reinforcement learning techniques.
This research was made possible by the European Marie Curie Individual Fellowship won by Ivan Masmitjà in 2020 and by the BITER project, which is funded by the Ministry of Science and Innovation of the Government of Spain and is currently under way.