The researchers evaluated Meta AI's Segment Anything Model 2 (SAM 2) for zero-shot segmentation of surgical tools across different types of surgical videos.
Given only a few prompts (such as manually marked labels on certain tools in the first frame of a video), the model automatically identified and segmented surgical tools in subsequent frames, without having seen these videos or tools before.
The study selected different types of surgical videos for evaluation, including:
- Endoscopic surgery videos (e.g. tested with the EndoNeRF, EndoVis'17, and SurgToolLoc datasets), which are usually captured while performing surgery in vivo using an endoscope.
- Microsurgery videos, such as microscope footage of ear surgery (e.g., cochlear implant surgery).
The study showed that SAM 2 performed well on endoscopic and microsurgery videos, accurately segmenting surgical tools, but its performance may deteriorate on long videos or in complex environments (e.g., blur or occlusion).
Features of SAM 2
- Zero-shot segmentation: Without pre-labeled data, SAM 2 can identify and segment surgical tools directly from surgical videos. Given cues (e.g., points, boxes, masks) in the first frame of a video, the model applies that information to the entire video for tool tracking and segmentation.
- Multi-scenario applicability: The tool is suitable for different types of surgical videos, including endoscopic surgery and microsurgery. It can handle scenes containing multiple surgical tools, as well as surgical procedures with different video lengths.
- Efficient segmentation: The SAM 2 model was trained on the extensive Segment Anything Video (SA-V) dataset, which enables efficient segmentation of surgical tools in videos, reducing reliance on manual annotation and improving segmentation accuracy.
- Built-in memory: The model integrates a memory bank that propagates the initial cues across the frames of the video, ensuring the continuity and accuracy of the segmentation.
- Coping with complex environments: Despite challenges such as blur, occlusion, and bleeding, SAM 2 can still provide reliable tool segmentation in these cases, and additional cues can further improve the results.
- Suitable for real-time applications: Due to its combination of zero-shot segmentation capabilities and memory bank, the tool has the potential to be used in real-time surgical scenarios to help surgeons identify and track surgical tools more accurately.
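The prompt-and-propagate idea behind these features can be illustrated with a self-contained toy sketch. This is not the SAM 2 API: all function names here are illustrative, and it assumes frames arrive as pre-labeled integer arrays with 0 as background. A point prompt selects a region in the first frame, and a simple "memory" of the previous frame's mask carries the selection forward by best overlap.

```python
import numpy as np

def mask_from_point(seg_labels, point):
    # Toy stand-in for promptable first-frame segmentation:
    # select the labeled region containing the point prompt.
    y, x = point
    return seg_labels == seg_labels[y, x]

def iou(a, b):
    # Intersection over Union of two boolean masks.
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def propagate(frames, point):
    # Toy memory-style propagation: segment the first frame from a
    # point prompt, then in each later frame keep the candidate region
    # that best overlaps the previous frame's mask.
    masks = [mask_from_point(frames[0], point)]
    for labels in frames[1:]:
        prev = masks[-1]
        candidates = [labels == k for k in np.unique(labels) if k != 0]
        masks.append(max(candidates, key=lambda m: iou(m, prev)))
    return masks
```

In the real model the per-frame segmentation is learned and the memory is a feature bank rather than a raw mask, but the control flow, prompt once, then track forward frame by frame, is the same.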
Main Uses of SAM 2
Improving surgical precision and safety:
By automatically identifying and segmenting tools in surgical videos, the tool gives surgeons a clearer view of the surgical scene, reducing the risk of errors during the operation and thereby improving its accuracy and safety.
Reduce the workload of manual annotation:
Traditional surgical video analysis requires extensive manual annotation, whereas the tool's zero-shot segmentation capability can be applied with little or no annotation, significantly reducing time and labor costs.
Support diverse surgical scenarios:
The tool can be applied to different types of surgical videos, including endoscopic surgery and microsurgery, and can adapt to changes in the number of tools and length of surgical procedures, making it widely applicable.
Helping medical research and training:
Automatic segmentation and recognition of surgical tools gives medical researchers a powerful aid that can accelerate the analysis and study of surgical procedures. It can also be used in medical education to help medical students and junior doctors learn surgical procedures more intuitively.
Promoting the development of surgical robots:
In the field of surgical robotics, this tool can be used to enhance the robot's visual system, enabling it to more accurately identify tools in the surgical environment, thereby better assisting doctors in performing surgery.
Laying the foundation for future surgical AI systems:
The development and application of this tool demonstrates the potential of artificial intelligence in surgical procedures. In the future, it can be further developed into a smarter and more automated surgical support system, and even enable partially automated surgery.
Experimental Results of SAM 2
In this study, experiments and performance evaluation focused on two areas: endoscopic surgery datasets and microsurgery datasets. The researchers used multiple public datasets to test the segmentation performance of Segment Anything Model 2 (SAM 2) on these surgical videos.
1. Endoscopic Surgery Dataset
Dataset used:
- EndoNeRF: Contains two surgical video clips of 63 and 156 frames, respectively.
- EndoVis'17: Contains 8 robotic surgery videos, each containing 255 frames, and the corresponding ground truth segmentation masks.
- SurgToolLoc: Contains 24,695 video clips, each lasting 30 seconds and captured at 60 frames per second, all from the da Vinci robotic surgery system.
Evaluation results:
- In the endoscopic surgery dataset, SAM 2 demonstrates strong tool segmentation capabilities, especially in multi-tool scenarios and with varying video lengths.
- Quantitative evaluation using the EndoVis'17 dataset shows that SAM 2 outperforms other mainstream segmentation methods such as U-Net, UNet++, and TransUNet in terms of Dice score, IoU (Intersection over Union), and MAE (Mean Absolute Error).
Quantitative evaluation data:
- Dice score: SAM 2 reaches 0.937, which is significantly higher than U-Net (0.894), UNet++ (0.909) and TransUNet (0.904).
- IoU: The IoU value of SAM 2 is 0.890, which is also better than other methods.
- MAE: SAM 2 has the best performance with a mean absolute error (MAE) of 0.018.
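The three metrics in this comparison are standard for binary segmentation masks and are straightforward to compute. A minimal sketch (the function names are mine, not from the paper's code):

```python
import numpy as np

def dice(pred, gt):
    # Dice coefficient: 2|P ∩ G| / (|P| + |G|) for binary masks.
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0

def iou(pred, gt):
    # Intersection over Union: |P ∩ G| / |P ∪ G|.
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def mae(pred, gt):
    # Mean absolute error over the masks treated as 0/1 images.
    return np.abs(pred.astype(float) - gt.astype(float)).mean()
```

Note that Dice and IoU reward overlap (higher is better), while MAE counts per-pixel disagreement (lower is better), which is why SAM 2's 0.018 MAE is the best score in the table.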
2. Microsurgery Dataset
Dataset used:
- The authors' own dataset: Cochlear implant surgery videos collected from Vanderbilt University Medical Center and the Medical University of South Carolina, including surgical clips of varying lengths (2 to 10 seconds) covering different stages of the surgery (such as drilling and implantation).
Evaluation results:
- SAM 2 performs well in microsurgery videos, especially providing reliable segmentation results in both single-tool and multi-tool scenarios.
- SAM 2's segmentation results are best when the surgical scene is well lit and tool motion is smooth and clearly visible.
Conclusion of SAM 2
Model performance:
- SAM 2 demonstrates significant performance advantages in zero-shot video segmentation tasks, especially in accurately segmenting tools in surgical videos under good lighting conditions and high-quality tool motion.
- In various surgical scenarios (e.g., endoscopic and microsurgery), SAM 2 is able to generate reliable tool segmentation results by providing point cues only in the first frame of the video, demonstrating its strong versatility and adaptability.
Limitations of the model:
- Segmentation challenges for long video sequences: As the video sequence gets longer, the segmentation accuracy of SAM 2 decreases, particularly for fine details in the later parts of the video. This performance degradation is a significant challenge for real-time surgical video applications and requires further improvement.
- Impact of complex surgical environments: Complex factors in the surgical environment, such as blur, bleeding, and tool occlusion, significantly affect the segmentation accuracy of SAM 2. Especially in microsurgery, the model is prone to losing segmentation fineness due to the limitations of the microscope camera and the interaction of tools with the surgical surface.
Coping strategies:
- Introducing additional cues (such as when a new tool enters the scene) can improve the segmentation accuracy to a certain extent, especially when dealing with complex or dynamically changing surgical scenes.
- To address these challenges, future research directions should focus on how to improve the performance of the model in long video sequences and how to enhance its robustness in complex environments by fine-tuning the model.
SAM 2, as the second-generation Segment Anything Model, performs well on surgical videos, effectively segmenting surgical tools even under zero-shot conditions. Its versatility and adaptability across a variety of surgical scenarios make it a strong candidate for future surgical video analysis and real-time assistance tools.
Although SAM 2 demonstrates significant performance improvement in the surgical tool segmentation task, its limitations in long video processing and complex surgical environments still need further study. Future work should focus on optimizing the model to ensure its higher practicality and reliability in various clinical settings.
Overall, this study is the first to evaluate the potential of SAM 2 in surgical videos, demonstrate its effectiveness in a variety of surgical scenarios, and lay the foundation for the development of future surgical AI systems. The results provide important references for the application of SAM 2 in clinical practice, especially in improving surgical precision and safety.
- Author: KCGOD
- URL: https://kcgod.com/surgery-with-sam-2
- Copyright: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA. Please credit the source!