





















Abstract: Cluster closure, defined as the progressive filling of gaps between the berries in a grape bunch, is a key trait in vineyard management that affects disease risk. However, traditional visual scoring methods are labor-intensive, subjective, and lack temporal resolution. Existing datasets rarely support fine-grained berry-level analysis, which limits the development of robust deep learning models. In this work, we present ViViD-5k, a large-scale in-field Vineyard Vision Dataset containing 5,000 images with dense annotations, including over 648,000 berry centroids and cluster segmentation masks spanning 13 grape varieties. Building on this dataset, we introduce GrapeSAM, a two-stage visual pipeline that combines point-based berry localization with prompt-based segmentation using Segment Anything, followed by transformer-based cluster segmentation. The pipeline enables automated, in-field estimation of cluster closure with minimal supervision. Quantitative results demonstrate strong segmentation and counting accuracy across diverse conditions, while visualizations confirm robustness on both in-domain and out-of-domain samples. This work provides a scalable and objective alternative to manual compactness scoring and supports high-throughput grape phenotyping with enhanced spatial detail.
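
To make the notion of cluster closure concrete, the following is a minimal sketch of one possible closure score: the fraction of cluster-mask pixels covered by at least one berry mask. This ratio is an illustrative assumption, not the metric defined in the paper, and the function name `closure_ratio` and the toy masks are hypothetical. The berry and cluster masks stand in for the outputs of the berry and cluster segmentation stages described above.

```python
def closure_ratio(cluster_mask, berry_masks):
    """Fraction of cluster pixels covered by at least one berry mask.

    Assumed, illustrative definition (not taken from the paper).
    cluster_mask: 2D list of 0/1 values marking the cluster region.
    berry_masks: list of 2D lists (same shape), one 0/1 mask per berry.
    """
    cluster_pixels = 0
    covered_pixels = 0
    for r, row in enumerate(cluster_mask):
        for c, inside in enumerate(row):
            if not inside:
                continue  # pixel lies outside the cluster; ignore it
            cluster_pixels += 1
            # Count the pixel once even if several berry masks overlap here.
            if any(mask[r][c] for mask in berry_masks):
                covered_pixels += 1
    if cluster_pixels == 0:
        raise ValueError("cluster mask is empty")
    return covered_pixels / cluster_pixels


# Toy 4x4 example: a 12-pixel cluster with two berries covering 6 of its pixels.
cluster = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
berry_a = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
berry_b = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
ratio = closure_ratio(cluster, [berry_a, berry_b])
print(f"closure ratio: {ratio:.2f}")
```

Under this assumed definition, a value near 1 would indicate a tightly closed cluster with few visible gaps, while lower values would indicate a looser bunch. Tracking the value for the same cluster over time is one way such a score could provide the temporal resolution that manual scoring lacks.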
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Other Quantitative Biology (q-bio.OT) |
| Cite as: | arXiv:2605.24353 [cs.CV] (or arXiv:2605.24353v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.24353 (arXiv-issued DOI via DataCite, pending registration) |
From: Yu Jiang
[v1] Sat, 23 May 2026 02:30:02 UTC (34,059 KB)