TY - GEN
T1 - Rep-MedSAM
T2 - International Challenge on Segment Anything in Medical Images on Laptop held in conjunction with the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
AU - Wei, Muxin
AU - Chen, Shuqing
AU - Wu, Silin
AU - Xu, Dabin
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Medical image segmentation is a pivotal step in clinical practice, enabling more precise analysis of medical images. MedSAM, a foundation model for medical image segmentation, has significantly extended the ability of SAM to segment a broad spectrum of medical imaging modalities and achieves excellent performance compared with specialist models. However, because of its heavy image encoder, MedSAM is too slow for clinical use. The CVPR 2024 Segment Anything In Medical Images On Laptop Challenge therefore targets both performance and efficiency in a setting where the model runs inference on CPU only. To this end, we propose Rep-MedSAM, which replaces the image encoder in MedSAM with RepViT, a mobile-friendly CNN that incorporates the efficient designs of lightweight ViTs. Our method is simple but effective, comprising knowledge distillation from the pretrained MedSAM, whole-pipeline training, and fine-tuning with extra datasets. We conduct all experiments on the challenge. Our method achieved an average DSC of 85.90% and an average NSD of 87.07% on validation. In terms of time cost, our method is markedly faster than the baseline on validation: the average time for 2D and 3D cases is 0.47 s and 22.47 s, respectively, with an average of 2.41 s per case. Our code is available on GitHub.
KW - MedSAM
KW - Medical Images
KW - Rep-ViT
UR - https://www.scopus.com/pages/publications/86000272029
U2 - 10.1007/978-3-031-81854-7_4
DO - 10.1007/978-3-031-81854-7_4
M3 - Conference contribution
AN - SCOPUS:86000272029
SN - 9783031818530
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 57
EP - 69
BT - Medical Image Segmentation Foundation Models. CVPR 2024 Challenge
A2 - Ma, Jun
A2 - Zhou, Yuyin
A2 - Wang, Bo
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 17 June 2024 through 21 June 2024
ER -