Databricks DBRX LLM – A Groundbreaking New Open LLM

Updated on March 29, 2024

The world of artificial intelligence is buzzing with excitement over DBRX, an open large language model (LLM) introduced by Databricks, a leader in data and AI solutions. The DBRX model arrives at a time when demand for advanced natural language processing capabilities is booming across industries.

What is DBRX: Databricks’ Groundbreaking Open LLM?

DBRX is a transformer-based, decoder-only LLM developed by Databricks’ team. With 132 billion total parameters, of which 36 billion are active on any given input, DBRX stands apart with its innovative mixture-of-experts (MoE) architecture.

Unlike traditional dense models, DBRX uses a fine-grained MoE approach: it maintains 16 experts and dynamically routes each input to the best combination of four. This yields 65 times more possible expert combinations than existing open MoE models like Mixtral and Grok-1, which route to 2 of 8 experts, translating into exceptional performance gains.
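The 65x figure falls out of simple combinatorics: choosing 4 of 16 experts gives C(16, 4) = 1,820 possible expert subsets per input, versus C(8, 2) = 28 for the 2-of-8 routing used by Mixtral and Grok-1. A quick sanity check in Python:

```python
# Verify the "65x more expert combinations" claim.
from math import comb

dbrx_combos = comb(16, 4)     # 1820 possible 4-expert subsets
mixtral_combos = comb(8, 2)   # 28 possible 2-expert subsets
print(dbrx_combos / mixtral_combos)  # 65.0
```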

DBRX was pretrained on a carefully curated dataset of 12T tokens, comprising text and code data. This new dataset, developed using Databricks’ suite of tools, is estimated to be at least twice as effective token-for-token compared to the data used to pretrain the company’s previous LLM family, MPT.

Also Read: All About Multimodal LLM

Key Features of DBRX

DBRX’s standout performance is a result of its innovative architecture, optimized pretraining data, and Databricks’ expertise in developing cutting-edge LLMs. Here are some of the key differentiators and features that set DBRX apart:

Mixture-of-Experts (MoE) Architecture: DBRX’s fine-grained MoE design is a game-changer, selecting 4 of 16 experts per input to unlock far more expert combinations than existing open MoE models. This approach translates into performance gains across a wide range of benchmarks.

Optimized Pretraining Data: DBRX was pretrained on a meticulously curated dataset of 12T tokens, comprising text and code data. This dataset is estimated to be at least twice as effective token-for-token compared to the data used for Databricks’ previous LLM family, MPT.

Efficiency: With MoE architecture, training a smaller DBRX variant required 1.7 times fewer floating-point operations (FLOPs) than LLaMA2-13B to reach comparable performance levels. Databricks’ end-to-end LLM pretraining pipeline has become nearly four times more compute-efficient in the past ten months.

Inference Speed: When hosted on Databricks’ optimized Mosaic AI Model Serving infrastructure, DBRX can generate text at up to 150 tokens per second per user, two to three times faster than comparable dense models.

Computational comparison with other models:

| Metric | DBRX | Benefit |
| --- | --- | --- |
| Inference speed | 2x faster than LLaMA2-70B | Faster computations |
| Model size (total parameters) | 40% of Grok-1 | More manageable deployment |
| Model size (active parameters) | 40% of Grok-1 | More manageable deployment |
| Inference throughput | 2-3x higher than a 132B non-MoE model | Better efficiency trade-off |

How Does DBRX Compare with Other LLMs?

[Figure: LLM benchmark comparisons. Credit: Databricks]

DBRX’s exceptional performance is evident across a wide range of standard benchmarks, consistently outperforming established open-source LLMs and setting new records.

| Benchmark | DBRX Instruct | Mixtral Instruct | Grok-1 |
| --- | --- | --- | --- |
| MMLU | 73.7% | 71.4% | 73.0% |
| HumanEval | 70.1% | 54.8% | 63.2% |
| GSM8K | 66.9% | 61.1% | 62.9% |

As the table illustrates, DBRX Instruct surpasses the competition, outperforming Mixtral Instruct and even the specialized CodeLLaMA-70B Instruct model on programming tasks like HumanEval.

Also Read: Mistral Large: Mistral AI’s New LLM Outshines GPT4, Claude and ChatGPT

According to the creators’ reported scores, DBRX Instruct outperforms GPT-3.5 on general knowledge, common-sense reasoning, and mathematical reasoning, while remaining competitive with Gemini 1.0 Pro and Mistral Medium across various benchmarks.

On long-context tasks, DBRX Instruct outperforms GPT-3.5 Turbo at all context lengths and across most of the sequence. On retrieval-augmented generation (RAG) benchmarks, it is also competitive with open models like Mixtral Instruct and with the current version of GPT-3.5 Turbo.

How to Access DBRX

DBRX’s open nature is a game-changer. It empowers both the AI community and businesses to take advantage of the model’s capabilities without the restrictions typical of proprietary models. By releasing DBRX on the Hugging Face platform under an open license, Databricks has signaled its intention to support an open AI ecosystem. This openness fosters collaboration and innovation, as developers and researchers can build on DBRX, create new tools, and push AI efficiency further.
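As a concrete illustration, here is a minimal sketch of loading DBRX Instruct from its Hugging Face repository (databricks/dbrx-instruct) with the transformers library. It assumes you have accepted the model license on the Hub and have hardware that can hold the weights (roughly 264 GB in bf16), so treat it as illustrative rather than a production recipe:

```python
# Minimal sketch: load DBRX Instruct from Hugging Face and generate a reply.
# Assumes license acceptance on the Hub and sufficient GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "databricks/dbrx-instruct", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    device_map="auto",            # shard the weights across available GPUs
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```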

Databricks provides a number of ways to use DBRX. These include:

Foundation Model APIs: Databricks’ Foundation Model APIs let users communicate with DBRX through an easy-to-use interface, making it straightforward to integrate the model into applications and workflows (see the sketch after this list).

AI Playground Chat Interface: For rapid testing and experimentation, users can access DBRX via the AI Playground chat interface, a user-friendly environment for interacting with the model and exploring its capabilities.
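To make the first option concrete, here is a minimal sketch of querying DBRX Instruct through the Foundation Model APIs, which expose an OpenAI-compatible chat endpoint. The workspace URL, the access token, and the served model name databricks-dbrx-instruct are assumptions to verify against your own Databricks workspace:

```python
# Hedged sketch: call DBRX Instruct via Databricks' OpenAI-compatible
# Foundation Model API endpoint. Placeholders must be filled in.
from openai import OpenAI

client = OpenAI(
    api_key="<your Databricks personal access token>",
    base_url="https://<your-workspace>.cloud.databricks.com/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",  # served model name; verify in your workspace
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "What makes DBRX's MoE design different?"},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```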

In general, customers can easily integrate DBRX into their workflows and tap its potential for a wide range of applications using the tools and resources Databricks offers.

Also Read: Google Gemma: Open Source AI Model

Conclusion

The introduction of DBRX by Databricks marks a significant milestone in the field of artificial intelligence. This powerful DBRX model, with its innovative mixture-of-experts architecture and exceptional performance on language understanding, programming, and math benchmarks, is setting new standards for what’s achievable in the world of open large language models.

By making the DBRX Base and DBRX Instruct models available on the Hugging Face platform under an open license, Databricks has demonstrated its dedication to open-source collaboration. This move enables enterprises to take control of their data in the generative AI era using Databricks’ DBRX LLM.
