Data Portal

Explore and download the Museum’s research and collections data.

Coral Reef Sand Data for Training SandE

This repository contains all the data used to train the Sediment Analysis Neural-network Data-engine (SAND-e). SAND-e was created as part of the PhD thesis of G. William M. Harrison, under the supervision of Ken Johnson and Willem Renema, and the master’s thesis of Teigan Collins, under the supervision of Ken Johnson and Nadia Santodomingo, with major contributions from Allia Rosedy. This study is part of the Reef Refugia project (NE/R011044/1), funded by NERC. William Harrison’s PhD is being conducted within the 4D-REEF project, funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 813360.

The images used to train the segmenter, together with the .json files containing the corresponding masks for each image, can be found in “segmenter_training_data.zip”. The classifier was trained with the images in “classifier_training_data.zip”, which are sorted into separate folders by type. The “AI vs Human testing.zip” file contains the raw images input into SAND-e (“raw images”), the labelled versions of those images given to human testers (“labelled images”), the outputs produced by SAND-e (“Segmenter_Classifier_Outputs.csv”), the answers given by the humans (“Human_annotations_anonymized.csv”), and a version in which SAND-e’s outputs have been aligned grain-to-grain with the human answers (“Human_AI_Grain_IDS.csv”). Finally, the “Deployment” folder contains the raw images fed into SAND-e for the analysis part of this study, in the “raw images” folder, and SAND-e’s outputs, in the “AI_ouputs.zip” file.
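As a quick check after downloading, the folder-per-type layout of the classifier training data can be summarised without unpacking the archive. The following is a minimal sketch, not part of the dataset or of SAND-e: the function name `count_images_per_class` and the set of image extensions are assumptions made for illustration, and the actual folder names and file formats inside the archive are not documented here.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: the images use one of these extensions; adjust if the archive differs.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def count_images_per_class(zip_path):
    """Count image files per parent folder inside a zip archive.

    The immediate parent folder name is treated as the class label, which
    assumes the "separate folders by type" layout described above.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            path = PurePosixPath(info.filename)  # zip member names use "/"
            if path.suffix.lower() not in IMAGE_EXTENSIONS:
                continue  # ignore non-image files
            # Files stored at the archive root have no class folder.
            label = path.parent.name or "(root)"
            counts[label] += 1
    return counts
```

Once the archive has been downloaded, calling `count_images_per_class("classifier_training_data.zip")` should return a mapping from folder name to image count, which gives a first look at class balance before training.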

Data and Resources

Cite this as

George William Harrison; Teigan Georgia Collins; Nadiezhda Santodomingo; Kenneth Johnson et al. (2025). Coral Reef Sand Data for Training SandE [Data set]. Natural History Museum. https://doi.org/10.5519/tqbpc3pr
Retrieved: 15:41 11 Oct 2025 (UTC)

Additional Info

Primary contributors:
Harrison, George William (ORCID 0000-0002-2563-7695);
Collins, Teigan Georgia (ORCID 0009-0009-0106-849X);
Santodomingo, Nadiezhda (ORCID 0000-0003-1392-2672);
Johnson, Kenneth (ORCID 0000-0002-4666-1213);
Renema, Willem (ORCID 0000-0002-1627-5995)
Other contributors:
Last updated: 16 September 2025
Last resource update: 16 September 2025 (segmenter_training_data.zip)
Created: 15 September 2025
License: Creative Commons Attribution