
Meta Releases Llama 4: A New Generation of Flagship AI Models

Read Time: 2 minutes

Meta’s Llama 4 models introduce major upgrades in efficiency, visual understanding, and responsiveness. The Scout and Maverick models are now available via Llama.com and Hugging Face, while Behemoth remains in training. These models aim to match or exceed current market leaders in various AI tasks.

Meta has unveiled Llama 4, a new suite of AI models designed to push the boundaries of open-source multimodal AI. Released on a Saturday, the Llama 4 family introduces three models: Llama 4 Scout, Llama 4 Maverick, and the in-training Llama 4 Behemoth, each showcasing advanced performance in reasoning, multimodality, and cost-efficient architecture.

A Fast-Paced Launch with Global Rollout

The Scout and Maverick models are now publicly accessible via Llama.com and platforms like Hugging Face. Meta AI, the assistant embedded in apps like WhatsApp and Instagram, has been updated to use Llama 4 in 40 countries. However, full multimodal capabilities are currently limited to U.S. users in English.

Notably, usage is restricted in the European Union due to regional governance and privacy regulations. In addition, companies with more than 700 million monthly active users (MAUs) must obtain a special license from Meta.

Built on Mixture-of-Experts (MoE) Architecture

Llama 4 represents Meta’s first deployment of a Mixture-of-Experts (MoE) architecture, enabling improved compute efficiency. MoE models allocate specific tasks to specialized “expert” networks, allowing for faster and more context-aware responses.

  • Maverick: 400B total parameters, 17B active, 128 experts.
  • Scout: 109B total parameters, 17B active, 16 experts.

Both are optimized for tasks like summarization, code reasoning, multilingual support, and creative writing.
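To make the "total vs. active parameters" distinction concrete, here is a minimal, illustrative sketch of top-k expert routing, the gating step at the heart of an MoE layer. The expert count mirrors Scout's 16 experts, but the top-k value, gating scheme, and all function names are assumptions for illustration, not Llama 4's actual implementation.

```python
import math
import random

NUM_EXPERTS = 16   # e.g. Scout's expert count
TOP_K = 1          # number of experts activated per token (assumed)

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_logits):
    """Pick the TOP_K highest-scoring experts and their gate weights.

    Only the chosen experts run for this token, which is why a model's
    active parameter count stays far below its total parameter count.
    """
    probs = softmax(token_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:TOP_K]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

# Example: route one token with random router scores.
random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
print(route(logits))  # [(expert_index, gate_weight)]
```

Under this kind of routing, each token touches only TOP_K of the NUM_EXPERTS expert networks per layer, which is how a 109B-parameter model like Scout can run with only 17B parameters active per token.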

Behemoth: Meta’s Largest AI to Date

Currently in training, Behemoth features nearly 2 trillion total parameters and 288B active parameters, with 16 experts. Early internal tests show that Behemoth outperforms competitors like GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro in STEM-related benchmarks.

Behemoth will require advanced infrastructure, such as multiple Nvidia H100 DGX units, for inference and training.

Performance and Responsiveness

  • Maverick outperforms GPT-4o and Gemini 2.0 in coding, reasoning, and image tasks but slightly trails newer models like Gemini 2.5 Pro and Claude 3.7 Sonnet.
  • Scout features an exceptionally large context window (10 million tokens), allowing it to process extremely long documents and images.

Additionally, Scout can operate on a single Nvidia H100 GPU, offering accessibility to smaller developers.

Shifting the Content Moderation Paradigm

Llama 4 models have been tuned to provide more nuanced responses to politically or socially charged prompts. Meta claims that the new models strike a better balance between responsiveness and neutrality, enabling discussions across diverse viewpoints.

This is a deliberate shift following criticism from political stakeholders, particularly in the U.S., about potential bias in large language models. Meta asserts that Llama 4 “responds more often to debated topics” and is “dramatically more balanced.”

Challenges with Licensing and Fair Use

Despite being billed as open source, Llama 4's licensing terms prohibit usage in the EU and by ultra-large-scale companies without prior approval. These limitations may constrain adoption among enterprise and European developers, reigniting debates around the meaning of "open" in open-source AI.
