Cheap AI “video scraping” can now extract data from any screen recording

October 18, 2024

Video scraping is just one of many new tricks made possible because the latest large language models (LLMs), such as Google’s Gemini and OpenAI’s GPT-4o, are actually “multimodal” models that accept audio, video, image, and text input. These models translate any multimedia input into tokens (chunks of data), which they use to make predictions about which tokens should come next in a sequence.

A term like “token prediction model” (TPM) might be more accurate than “LLM” these days for AI models with multimodal inputs and outputs, but a generalized alternative term hasn’t really taken off yet. Whatever you call it, an AI model that can take video inputs has interesting implications, both good and potentially bad.

Breaking down input barriers

Willison is far from the first person to feed video into AI models to achieve interesting results (more on that below, and here’s a 2015 paper that uses the “video scraping” term), but as soon as Gemini launched its video input capability, he began to experiment with it in earnest.

In February, Willison demonstrated another early application of AI video scraping on his blog, where he took a seven-second video of the books on his bookshelves, then got Gemini 1.5 Pro to extract all of the book titles it saw in the video and put them in a structured, or organized, list.
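To make the "structured list" step concrete, here is a minimal Python sketch of how a model's reply might be turned into clean records. It is not Willison's code: the prompt wording, the `Book` record, and the `ask_model` callable (a stand-in for whatever function sends a video and a prompt to a multimodal model and returns its text reply) are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical prompt wording; the article does not quote the actual prompt.
PROMPT = (
    "List every book title visible in this video. Respond with only a JSON "
    "array of objects that have the keys 'title' and 'author'."
)


@dataclass(frozen=True)
class Book:
    title: str
    author: str


def parse_books(response_text: str) -> List[Book]:
    """Turn a model reply containing a JSON array into de-duplicated records."""
    # The reply may include chatter around the JSON, so keep only the
    # outermost [...] span before parsing.
    start, end = response_text.find("["), response_text.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model response")
    items = json.loads(response_text[start:end + 1])

    books: List[Book] = []
    seen = set()
    for item in items:
        if not isinstance(item, dict):
            continue  # skip anything that is not a JSON object
        title = str(item.get("title", "")).strip()
        author = str(item.get("author", "")).strip()
        key = title.casefold()
        # The same spine can show up in several frames, so drop repeats.
        if not title or key in seen:
            continue
        seen.add(key)
        books.append(Book(title, author))
    return books


def scrape_books(video_path: str, ask_model: Callable[[str, str], str]) -> List[Book]:
    """ask_model is an assumed callable: (video_path, prompt) -> reply text."""
    return parse_books(ask_model(video_path, PROMPT))
```

Keeping the model call behind a plain callable means the parsing and de-duplication logic can be exercised with a canned reply, without committing to any particular vendor's SDK.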

Converting unstructured data into structured data is important to Willison, because he’s also a data journalist. Willison has created tools for data journalists in the past, such as the Datasette project, which lets anyone publish data as an interactive website.

To every data journalist’s frustration, some sources of data prove resistant to scraping (capturing data for analysis) due to how the data is formatted, stored, or presented. In these cases, Willison delights in the potential for AI video scraping because it bypasses these traditional barriers to data extraction.
