Meet Segment Any RGBD: A Toolkit for Segmenting Rendered Depth Images Based on SAM


Researchers at NTU recently presented Segment Any RGBD (SAD), a toolkit for segmenting rendered depth images with SAM. SAD can segment any 3D object from RGBD inputs, or from rendered depth images alone.

The approach is motivated by the observation that people can easily identify objects from a visualization of a depth map. The depth map, of shape (H, W), is first mapped to RGB space, of shape (H, W, 3), through a colormap function, and the rendered depth image is then sent to SAM. Compared with an RGB image, the rendered depth image carries less texture and more geometry. In SAM-based projects such as SSA, Anything-3D, and SAM 3D, the input images are all RGB images. The researchers describe their work as the first to use SAM to extract geometric information directly.
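The depth-to-RGB mapping described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the toolkit's own code: the article does not say which colormap the authors use, so the anchor colors and the `render_depth` name below are assumptions chosen only to show the (H, W) to (H, W, 3) step.

```python
import numpy as np

# Hypothetical anchor colors (dark blue -> cyan -> yellow -> red).
# The colormap actually used by the toolkit is not stated in the article.
ANCHORS = np.array(
    [[0, 0, 128], [0, 255, 255], [255, 255, 0], [255, 0, 0]],
    dtype=np.float64,
)


def render_depth(depth: np.ndarray) -> np.ndarray:
    """Map a depth map of shape (H, W) to a uint8 RGB image of shape (H, W, 3)."""
    depth = np.asarray(depth, dtype=np.float64)
    if depth.ndim != 2:
        raise ValueError("expected a 2-D depth map of shape (H, W)")

    # Normalize depth to [0, 1]; a constant map becomes all zeros.
    lo, hi = depth.min(), depth.max()
    span = hi - lo
    norm = (depth - lo) / span if span > 0 else np.zeros_like(depth)

    # Piecewise-linear interpolation between the anchors, one channel at a time.
    stops = np.linspace(0.0, 1.0, len(ANCHORS))
    channels = [np.interp(norm, stops, ANCHORS[:, c]) for c in range(3)]
    return np.stack(channels, axis=-1).round().astype(np.uint8)


if __name__ == "__main__":
    depth = np.array([[0.5, 1.0], [2.0, 4.0]])  # toy depth values
    rgb = render_depth(depth)
    print(rgb.shape, rgb.dtype)
```

The resulting array has the same layout as an ordinary RGB image, which is why it can be passed to a model that expects RGB input.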

The researchers use OVSeg for zero-shot semantic segmentation. Users can choose either the raw RGB image or the rendered depth image as the input to SAM. In either case, the user gets back semantic masks (where each color represents a different category) and the SAM masks associated with each category.
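The article does not say how a category gets attached to each SAM mask. One simple way to do it, shown here purely as an assumption, is a majority vote: each SAM mask takes the most frequent class among the semantic-segmentation pixels it covers. The function names, the `-1` label for empty masks, and the palette are all hypothetical.

```python
import numpy as np


def label_masks(sam_masks, semantic_map, num_classes):
    """Give each boolean (H, W) mask the most frequent class id beneath it.

    Majority voting is an assumed strategy, not the toolkit's documented one.
    Empty masks receive the label -1.
    """
    labels = []
    for mask in sam_masks:
        votes = np.bincount(semantic_map[mask], minlength=num_classes)
        labels.append(int(votes.argmax()) if votes.sum() > 0 else -1)
    return labels


def colorize(semantic_map, palette):
    """Turn an (H, W) class-id map into an (H, W, 3) image, one color per class."""
    return palette[semantic_map]


if __name__ == "__main__":
    semantic = np.array([[0, 0, 1], [0, 1, 1], [2, 2, 1]])  # toy class ids
    left = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]], dtype=bool)
    right = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 1]], dtype=bool)
    palette = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)
    print(label_masks([left, right], semantic, num_classes=3))
    print(colorize(semantic, palette).shape)
```

The same voting works whether the SAM masks came from the RGB image or from the rendered depth image, since only the mask shapes and the semantic map are involved.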


Results

RGB images mainly carry texture information, while depth images carry geometry information, so RGB images are far more colorful than their rendered depth counterparts. As the accompanying figure shows, SAM produces many more masks for RGB inputs than for depth inputs.

The rendered depth image reduces SAM's over-segmentation. In the accompanying figure, for example, the table is split into four parts on the RGB image, and the semantic segmentation labels one of those parts as a chair. On the depth image, the table is treated as a single object and categorized correctly. The blue circles in the figure mark part of a person's head that was misclassified as wall on the RGB image but correctly identified on the depth image.

Depth input has its own failure case. The chair circled in red in the depth image is actually two chairs standing so close together that they are treated as a single object. Here the texture information in the RGB image is essential for telling the objects apart.

Repo and tool

Visit https://huggingface.co/spaces/jcenaa/Segment-Any-RGBD to see the repository.

The repository is open source and builds on OVSeg, which is distributed under the Creative Commons Attribution-NonCommercial 4.0 International License. Some parts of the project are covered by different licenses: CLIP and ZSSEG are both under the MIT license.

The tool can be tried out at https://huggingface.co/spaces/jcenaa/Segment-Any-RGBD.

The demo needs a GPU. Instead of waiting in the queue, one can duplicate the Space and upgrade its settings to use a GPU. A run takes a while, because the demo has to start up, process the SAM masks, run the zero-shot semantic segmentation, and generate the 3D results. Final results are available in about 2-5 minutes.




Dhanshree Shenwai is a Computer Science Engineer with solid experience at FinTech companies across the finance, cards, payments, and banking domains, and a keen interest in AI applications. She is passionate about exploring new technologies and developments that make everyone's life easier.

