Multimodal AI becomes accessible: new model runs on your laptop

November 1, 2023

A new open-source artificial intelligence model named Obsidian, announced in an Oct. 30 Reddit post, represents a breakthrough in multimodal AI accessibility. Obsidian is the first 3B-parameter multimodal AI model, compact enough to run efficiently on a regular laptop.

Multimodal AI refers to AI systems that can process and connect data from different modes, such as text, images, audio, and video. In this case, the model accepts text and pictures as input, much like GPT-4V, the vision-enabled version of OpenAI’s GPT-4. While multimodal AI models like DALL-E 3 and GPT-4 have shown impressive capabilities, their enormous size makes them resource-intensive to run, requiring expensive high-end hardware. Their weights are also a closely guarded secret, so you could not run them yourself even if you had the necessary specialized hardware.

The AI model Obsidian packs multimodal intelligence into a standard laptop’s memory

Obsidian changes this by packing multimodal intelligence into a model small enough to fit into a standard laptop’s memory and run at practical speeds. At 3 billion parameters, Obsidian builds upon the Capybara-3B model architecture, which achieves state-of-the-art performance compared to similarly sized models. The developer also announced on Reddit that a multimodal model based on the highly praised open-source Mistral 7B model will soon follow.
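To see why 3 billion parameters is laptop-sized, a rough back-of-the-envelope calculation helps. The sketch below is illustrative only: it assumes the common storage sizes of 4, 2, 1, and 0.5 bytes per parameter for 32-bit, 16-bit, 8-bit, and 4-bit weights, and it counts only the weights themselves, not the extra working memory needed at run time. The function name and the choice of precisions are assumptions for illustration, not details from the Obsidian announcement.

```python
# Back-of-the-envelope estimate of how much memory a model's weights occupy.
# Assumption: only weights are counted; run-time overhead (activations,
# caches, the image encoder's working memory) is ignored.

# Bytes needed to store one parameter at each precision (illustrative set).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # 3 billion parameters, the size quoted for Obsidian.
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(3e9, precision)
        print(f"3B parameters at {precision}: ~{gb:.1f} GB")
```

Under these assumptions, the weights of a 3-billion-parameter model take roughly 6 GB at 16-bit precision and less at lower precisions, which is within reach of many consumer laptops. Actual memory use depends on the runtime and settings used.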

Obsidian’s compact size is thanks to techniques adapted from the LLaMA model architecture. According to the Reddit post announcing Obsidian, it was pre-trained on a diverse synthesized multimodal dataset, including text paired with corresponding images. This training methodology allowed it to develop strong language and vision capabilities despite its small parameter count.

The result is an AI assistant with conversational skills and visual understanding that can fit in your backpack. Obsidian breaks down barriers to accessing AI, opening up new possibilities for on-device intelligence.

While still an early version, Obsidian’s efficient form factor sets an exciting precedent. It demonstrates that multimodal AI does not have to be locked up in giant data centers but can be made compact enough to be distributed widely.

Featured Image Credit: From Image Creation at Aimesoft; Thank you!

Radek Zielinski

Radek Zielinski is an experienced technology and financial journalist with a passion for cybersecurity and futurology.
