Each spring, river herring leave Massachusetts’ coastal waters and begin their annual journey up rivers and streams to freshwater spawning habitats. River herring populations have declined severely over the past few decades, and their migration is widely monitored throughout the region, primarily through traditional visual counts and volunteer-based programs.
Monitoring fish movements and understanding population dynamics are vital to inform conservation efforts and support fisheries management. With the start of the annual herring run this month, researchers and resource managers are once again taking up the challenge of counting and estimating migratory fish populations as accurately as possible.
A team of scientists from Woodwell Climate Research Center, MIT Sea Grant, MIT Computer Science and Artificial Intelligence Lab (CSAIL), MIT Lincoln Laboratory and Intuit investigated a novel monitoring method using underwater video and computer vision to complement citizen science efforts. The researchers – Zhongqi Chen and Linda Deegan of the Woodwell Climate Research Center, Robert Vincent and Kevin Bennett of MIT Sea Grant, Sara Beery and Timm Haucke of MIT CSAIL, Austin Powell of Intuit, and Lydia Zuehsow of MIT Lincoln Laboratory – published a paper describing this work in the journal in February.
The paper, titled “From snapshots to continuous estimates: enhancing citizen science with computer vision for fish monitoring,” describes how recent advances in computer vision and deep learning, from object detection and tracking to species classification, offer promising real-world solutions for automating fish counts with improved performance and data quality.
Traditional monitoring methods are limited by time, environmental conditions and labor intensity. Visual counts performed by volunteers are restricted to short daytime sampling periods, so they miss nighttime movements and brief migratory bursts when hundreds of fish pass by in a matter of minutes. While technologies such as passive acoustic monitoring and imaging sonar have improved continuous fish monitoring in certain conditions, the most promising low-cost option – manual review of underwater video – is still labor-intensive and time-consuming. With demand for automated video processing growing, this study presents a scalable, cost-effective and efficient deep learning-based system for reliable automated fish monitoring.
The team built a comprehensive pipeline – from underwater field cameras to video labeling and model training – to achieve automated computer-assisted fish counting. Videos were collected from three rivers in Massachusetts: the Coonamessett River in Falmouth, the Ipswich River in Ipswich, and the Santuit River in Mashpee.
To prepare the training dataset, the team selected video clips showing variations in lighting, water clarity, fish species and densities, time of day, and time of year to ensure that the computer vision model would perform reliably in a variety of real-world scenarios. They used an open-source web platform to manually mark videos frame-by-frame with bounding boxes to track fish movement. In total, they labeled 1,435 video clips and annotated 59,850 frames.
The scientists compared and verified the computer vision counts against human video surveys, streamside visual counts, and passive integrated transponder (PIT) tagging data. They concluded that models trained on diverse data from multiple locations and years performed best and provided high-resolution, full-season counts consistent with traditionally established estimates. Going a step further, the system provided insight into migratory behavior, timing and movement patterns linked to environmental factors. Using video footage of the 2024 Coonamessett River migration, the system counted 42,510 river herring and revealed that upstream migration peaked at dawn, while downstream migration took place mainly at night, with fish taking advantage of darker and quieter periods to avoid predators.
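As a rough illustration of how tracked fish can be turned into directional, time-of-day counts, the sketch below counts net crossings of a virtual line and bins them by hour. It is not the method from the paper, whose counting logic is not described here. The `Track` structure, the `count_crossings` function, the normalized `x` coordinate and the convention that `x` increases in the upstream direction are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Track:
    """One tracked fish (hypothetical structure).

    samples holds (hour_of_day, x_position) pairs in time order, with x
    normalized to the 0..1 width of the video frame.
    """
    fish_id: int
    samples: list[tuple[int, float]]


def count_crossings(tracks, line_x=0.5):
    """Count net crossings of a virtual line, binned by hour of day.

    Assumes x increases in the upstream direction. Only the first and last
    positions of a track are compared, so a fish that mills back and forth
    around the line is counted at most once.
    """
    upstream, downstream = Counter(), Counter()
    for track in tracks:
        if len(track.samples) < 2:
            continue  # a single detection cannot show movement
        first_x = track.samples[0][1]
        last_hour, last_x = track.samples[-1]
        if first_x < line_x <= last_x:
            upstream[last_hour] += 1
        elif last_x < line_x <= first_x:
            downstream[last_hour] += 1
    return upstream, downstream


# Made-up example tracks, not data from the study.
tracks = [
    Track(1, [(5, 0.2), (5, 0.6), (5, 0.9)]),   # moves upstream at dawn
    Track(2, [(23, 0.8), (23, 0.3)]),           # moves downstream at night
    Track(3, [(5, 0.4), (5, 0.6), (5, 0.45)]),  # crosses and returns
]
up, down = count_crossings(tracks)
print(dict(up), dict(down))
```

Comparing only the endpoints of each track is a deliberate simplification: it avoids double counting fish that linger near the line, at the cost of relying on the tracker to keep each fish's identity stable across frames.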
With this real-world application, researchers aim to improve computer vision in fisheries management and provide a framework and best practices for incorporating this technology into conservation efforts for a wide range of aquatic species. “MIT Sea Grant has been funding work on this topic for some time, and this excellent work by Zhongqi Chen and colleagues will expand fisheries monitoring capabilities and improve fish population assessments for fisheries managers and conservation groups,” says Vincent. “It will also provide education and training to students, the public and citizen science groups, supporting ecologically and culturally important river herring populations along our coasts.”
Still, traditional monitoring is essential to maintain the consistency of long-term data sets until fisheries management agencies fully implement automated counting systems. Even then, computer vision and citizen science should be viewed as complementary. Volunteers will be needed to maintain the cameras and directly participate in the computer vision process, from video annotation to model verification. Scientists predict that combining citizen observations and computer vision-generated data will help create a more comprehensive and holistic approach to environmental monitoring.
This work was funded by MIT Sea Grant, with additional support provided by the Northeast Climate Adaptation Science Center, the MIT Abdul Latif Jameel Water and Food Systems Seed Grant, the Global Center for AI and Biodiversity Change (with support from the National Science Foundation and the Natural Sciences and Engineering Research Council of Canada), and the MIT Undergraduate Research Opportunities Program.
