Enhancing Conservation Through Machine Learning: Digitizing the Seeds of Hawaii
Seed banks are critical resources for ex-situ conservation and biological research, and they are among the most effective means of conserving the plant material needed for habitat restoration. Despite their value for research, access to them is limited. Constraints on travel, such as time and funding, can prevent researchers from visiting seed collections in person. Further, many collections across seed banks and herbaria are not digitized (i.e., imaged or databased) or are photographed in too little detail for identification. Technological advances now help address these gaps: Z-stacking software enables high-resolution imaging, and machine learning can be applied to quantify specific trait data. The Harold L. Lyon Arboretum in Honolulu, Hawaiʻi, initiated a project to take multi-focal images of all 178 genera (spanning 81 families) in the Seed Conservation Lab. These images and detailed metadata are hosted on SeedsOfHawaii.org, providing the foundation for large language models within a Retrieval-Augmented Generation framework to assist researchers in identifying species, modeling population dynamics, and predicting trajectories of adaptation. By making our collection digitally accessible and integrating our website into the global compendium of botanical data, we contribute to the collective understanding of plant biodiversity and improve future conservation management strategies. Seed banks that digitize their collections through modern approaches can extend their utility as reference resources beyond what physical collections alone allow, because online availability increases their visibility.