Meta’s new AI model tags and tracks every object in your videos

Meta has a new AI model that can label and follow any object in a video as it moves around. The Segment Anything Model 2 (SAM 2) extends the capabilities of its predecessor, SAM, which was limited to images, opening up new opportunities for video editing and analysis.

SAM 2’s real-time segmentation is a potentially huge technical leap. It showcases how AI can process moving images and distinguish among the elements on screen even as they move around or out of the frame and back in again.

Segmentation is the term for how software determines which pixels in an image belong to which objects. An AI model that can do this makes it a lot easier to process or edit complicated images. That was the breakthrough of Meta’s original SAM, which has helped segment sonar images of coral reefs, parsed satellite images to aid disaster relief efforts, and even analyzed cellular images to detect skin cancer.
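To make the idea concrete, here is a minimal sketch of what a segmentation result can look like as data. It is a toy illustration, not output from SAM or SAM 2: the grid, the object IDs, and the helper names binary_mask and pixel_counts are all made up for this example.

```python
from collections import Counter

# Toy 4x6 "image": each cell holds the ID of the object that pixel belongs to.
# 0 = background, 1 and 2 = two separate objects (hypothetical, hand-written labels).
label_map = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]


def binary_mask(labels, object_id):
    """Return a True/False grid marking the pixels that belong to object_id."""
    return [[cell == object_id for cell in row] for row in labels]


def pixel_counts(labels):
    """Count how many pixels are assigned to each label."""
    return Counter(cell for row in labels for cell in row)


counts = pixel_counts(label_map)
mask_for_object_2 = binary_mask(label_map, 2)

print(dict(counts))           # pixels per label
print(mask_for_object_2[1])   # second row of the mask for object 2
```

Once every pixel carries an object ID, isolating one object for editing is just a matter of selecting the pixels whose ID matches.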

SAM 2 extends that capability to video, which is no small feat and would not have been feasible until very recently. As part of SAM 2’s debut, Meta shared a database of 50,000 videos created to train the model. That’s on top of the 100,000 other videos Meta mentioned employing. Beyond the training data, real-time video segmentation takes a significant amount of computing power, so while SAM 2 is open and free at the moment, it likely won’t stay that way forever.

Segment Success

Using SAM 2, video editors could isolate and manipulate objects within a scene more easily than current editing software allows, and with far less effort than manually adjusting each frame. Meta envisions SAM 2 revolutionizing interactive video, too. Users could select and manipulate objects within live videos or virtual spaces thanks to the AI model.

Meta thinks SAM 2 could also play a crucial role in the development and training of computer vision systems, particularly in autonomous vehicles. Accurate and efficient object tracking is essential for these systems to interpret and navigate their environments safely. SAM 2’s capabilities could expedite the annotation process of visual data, providing high-quality training data for these AI systems.
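As a rough sketch of how per-frame masks could feed an annotation pipeline, the snippet below turns the masks for one tracked object into bounding-box records. Everything here is an assumption for illustration: the masks are hand-written rather than produced by SAM 2, and the "car" label, the mask_to_bbox helper, and the record layout are hypothetical, not a format Meta describes.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) around the object pixels, or None if the mask is empty."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None  # object is out of frame or fully hidden in this frame
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical masks for one tracked object across three frames (1 = object pixel).
frames = [
    [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],  # object near the top left
    [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],  # object has moved down and right
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],  # object has left the frame
]

# One annotation record per frame; the layout is illustrative only.
annotations = [
    {"frame": index, "label": "car", "bbox": mask_to_bbox(mask)}
    for index, mask in enumerate(frames)
]

for record in annotations:
    print(record)
```

A frame where the object is absent yields a box of None, which lets a labeling tool record the gap instead of guessing a position.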

A lot of the AI video hype is around generating videos from text prompts. Tools like OpenAI’s Sora, Runway, and Google Veo get a lot of attention for a reason. Still, the kind of editing ability provided by SAM 2 might play an even bigger role in embedding AI in video creation.

And while Meta might have an edge now, other AI video developers are keen to produce their own versions. For instance, Google’s recent research has led to video summarization and object recognition features that it is testing on YouTube. Adobe’s Firefly AI tools also focus on photo and video editing and include content-aware fill and auto-reframe features.
