
Databricks Data + AI Summit 2025 Recap
The “Databricks Data + AI Summit 2025,” held from June 9 to 12 in San Francisco, California, was one of the world’s largest conferences dedicated to data and AI. With over 22,000 attendees from around the globe, it showcased the cutting edge of the industry. This blog post provides a detailed recap of the Summit, highlighting key themes including data products, AI engineering, and advanced self-service analytics.
The Data + AI Summit 2025, Databricks’ annual global conference, reached record scale this year. Held over four days, it brought together data scientists, engineers, and business leaders from around the world for lively discussions of the latest AI technologies and data strategies.
One key theme across keynotes and sessions was the concept of the “Data Product,” which we’ll explore next.
A “data product” is not just a collection of data—it is a curated, quality-assured data asset that delivers value to specific consumers or use cases.
Traditionally, enterprise data was managed in departmental silos, with each team independently cleansing, transforming, and validating data. This led to poor reusability, data inconsistency, and slow decision-making across the organization.
With the data product approach, business teams become consumers of pre-packaged, reusable data components—complete with defined use cases, quality standards, metadata, and monitoring capabilities. This fosters organization-wide data democratization and scalable usage. Achieving this requires a fully integrated data analytics foundation.
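To make the concept concrete, a data product’s contract can be captured as machine-readable metadata. The sketch below is purely illustrative; the field names and values are hypothetical rather than any Databricks API.

```python
# Minimal sketch of a data product "contract" expressed as metadata.
# Every field name and value here is illustrative, not a Databricks API.
customer_churn_features = {
    "name": "customer_churn_features",
    "owner": "analytics-engineering@example.com",
    "use_cases": ["churn prediction", "retention campaign targeting"],
    "quality_standards": {
        "freshness": "updated daily by 06:00 UTC",
        "completeness": "customer_id is never null",
    },
    "schema": {
        "customer_id": "string",
        "tenure_months": "int",
        "churned": "boolean",
    },
    "monitoring": {"alert_channel": "#data-products", "sla_hours": 24},
}
```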
The most noteworthy data platform announcements at the Summit were the release of “Lakebase” and enhancements to “Unity Catalog.” These initiatives are closely tied to Databricks’ strategic acquisitions of Tabular in 2024 and Neon in 2025, each valued at approximately $1 billion. Together, they represent roughly $2 billion invested in advancing Databricks’ vision of unifying data and AI.
Lakebase introduces a new architecture that removes the boundary between operational (OLTP) and analytical (OLAP) databases. This serverless, PostgreSQL-compatible database is integrated with the lakehouse and separates storage from compute for scalable resource allocation.
For example, application data can be analyzed in real time to inform business operations, without traditional ETL or reverse-ETL processes. It is a significant step toward a fully unified data platform.
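To make the idea concrete, here is a minimal sketch of an application writing through the PostgreSQL-compatible endpoint while the same rows are queried analytically moments later. The connection details, table name, and the Databricks spark session are hypothetical placeholders.

```python
import psycopg2  # standard PostgreSQL driver; Lakebase speaks Postgres

# Hypothetical connection details for a Lakebase instance.
conn = psycopg2.connect(
    host="my-lakebase.example.cloud",
    dbname="appdb",
    user="app_user",
    password="...",
)

# Operational write from the application (the OLTP side).
with conn, conn.cursor() as cur:
    cur.execute(
        "INSERT INTO orders (order_id, customer_id, amount) VALUES (%s, %s, %s)",
        ("o-1001", "c-42", 99.50),
    )

# Moments later the same rows can be queried analytically (the OLAP side)
# without any ETL or reverse-ETL pipeline in between.
# `spark` assumes a Databricks notebook session; the table name is a placeholder.
revenue_by_customer = spark.sql(
    "SELECT customer_id, SUM(amount) AS revenue FROM orders GROUP BY customer_id"
)
revenue_by_customer.show()
```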
Previously known as a metadata catalog for managing data and AI models within Databricks, Unity Catalog has undergone significant evolution this year.
Notably, Unity Catalog Metrics and Unity Catalog Discovery were introduced to address a common issue: inconsistent KPI definitions across departments, which led to discrepancies in tools such as Power BI and Tableau. By centrally defining business metrics at the Lakehouse layer, enterprises can ensure consistency across analytics and business intelligence tools. Enhancements to metric definitions, dashboard governance, and approval workflows further improve data integrity and trust.
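The idea can be sketched as follows: define the KPI once at the Lakehouse layer and let every downstream tool read that single definition. A plain SQL view stands in for the actual Unity Catalog Metrics syntax here, and all catalog, schema, and column names are hypothetical.

```python
# One central KPI definition instead of per-tool re-implementations.
# A plain SQL view stands in for Unity Catalog Metrics syntax; all
# names are hypothetical. `spark` assumes a Databricks notebook session.
spark.sql("""
    CREATE OR REPLACE VIEW main.metrics.monthly_close_rate AS
    SELECT
        date_trunc('MONTH', closed_at)          AS month,
        count_if(status = 'closed') / count(*)  AS close_rate
    FROM main.support.cases
    GROUP BY date_trunc('MONTH', closed_at)
""")

# Power BI, Tableau, and notebooks can all read this one definition:
spark.table("main.metrics.monthly_close_rate").show()
```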
One of the key highlights of the Summit was the focus on “AI Engineering”—a discipline that brings software engineering principles to the development and deployment of AI systems, ensuring greater stability, maintainability, and quality.
In 2024, attention centered on “Compound AI Systems,” in which multiple AI agents collaborate to complete tasks. In 2025, the focus shifted to how such systems can be built and operated at production quality.
During the event, several solutions supporting AI engineering were showcased; the highlights below cover MLflow 3.0, DSPy, and AgentBricks. Together, these tools automate and streamline the entire AI lifecycle, raising the bar for enterprise AI implementation.
A central piece of MLflow 3.0 is the “LLM Judge,” a framework in which one AI model evaluates the output quality of another. Unlike traditional evaluation, which relies mainly on comparing outputs to predefined answers, the LLM Judge is built for the generative AI era, where flexible assessments such as “Is this output appropriate?” or “Is this a better response?” are essential.
The system uses judgment prompts to evaluate each input-output pair, determining the validity and relevance of the result. These evaluations are then used to refine and calibrate prompts for better performance. In some cases, human experts are still expected to review output quality and suggest improvements.
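As a rough sketch of the judge pattern, the following uses MLflow’s LLM-judged metrics in the mlflow.evaluate style; MLflow 3.0’s dedicated GenAI APIs differ in detail, and the judge model and data below are placeholder assumptions.

```python
import mlflow
import pandas as pd
from mlflow.metrics.genai import answer_relevance

# Placeholder evaluation data: each row pairs an input with the
# generative model's recorded output.
eval_df = pd.DataFrame({
    "inputs": ["How do I reset my password?"],
    "outputs": ["Click 'Forgot password' on the sign-in page and follow the emailed link."],
})

# An LLM acting as judge (GPT-4 here, as an assumption) scores how
# relevant each output is to its input, instead of comparing against a
# single predefined answer. Requires OPENAI_API_KEY to be set.
relevance_judge = answer_relevance(model="openai:/gpt-4")

results = mlflow.evaluate(
    data=eval_df,
    predictions="outputs",
    extra_metrics=[relevance_judge],
)
print(results.metrics)  # aggregate judge scores, e.g. mean relevance
```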
DSPy, an open-source framework, enables automatic prompt optimization based on performance evaluations. Using feedback labeled by LLM Judges as successful or failed, DSPy intelligently adjusts prompts by adding few-shot examples, fine-tuning phrasing, or avoiding known failure patterns. It can even suggest new evaluation data to improve prompt performance further, automating much of the work of prompt design and tuning.
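Here is a minimal sketch of that loop in DSPy. The model choice, the training example, and the stub metric standing in for an LLM Judge’s verdict are assumptions for illustration.

```python
import dspy

# Placeholder model choice; any LM supported by DSPy works here.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# A tiny program to optimize: answer a support question.
qa = dspy.Predict("question -> answer")

# Success/failure signal per example. A simple stub stands in for an
# LLM Judge's verdict; in practice this function would call one.
def judged_successful(example, prediction, trace=None):
    return example.answer.lower() in prediction.answer.lower()

# Placeholder training examples, labeled as in the judge feedback loop.
trainset = [
    dspy.Example(
        question="What is the capital of France?", answer="Paris"
    ).with_inputs("question"),
]

# BootstrapFewShot rewrites the prompt by adding few-shot
# demonstrations drawn from runs the metric marked successful.
optimizer = dspy.BootstrapFewShot(metric=judged_successful)
optimized_qa = optimizer.compile(qa, trainset=trainset)

print(optimized_qa(question="What is the capital of Japan?").answer)
```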
These capabilities represent a new paradigm in which AI can evaluate and improve itself, reducing dependence on any individual’s expertise. As a result, AI development becomes faster, more scalable, and consistently high in quality.
Databricks also introduced AgentBricks, a unified development environment for AI agents that supports natural language-driven development.
Running entirely on the Databricks platform, AgentBricks integrates generative, evaluation, and orchestration AI components, creating highly sophisticated, collaborative AI systems. While currently geared toward AI engineers, it is expected eventually to let business users build agents with no code: a true democratization of AI.
Databricks highlighted significant progress toward its long-standing goal of “data democratization.” Two demos stood out:
Lakeflow Designer is a no-code ETL tool that lets users build data pipelines using natural language commands. In the demo, the presenter used contact center call logs and simply asked the system to add a new column indicating whether each case was closed, then calculated the close rate by agent, all through natural language instructions.
The system automatically visualized the entire data flow, allowing users to inspect inputs and outputs at each step in the process. The most impressive moment occurred when the speaker uploaded an image showing the desired aggregation layout and said to the platform, “I want to summarize the data like this.” Instantly, the tool generated the necessary SQL and processing steps, prompting audible reactions of surprise from the audience.
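For illustration only, a request like the close-rate example above might compile to SQL along these lines; the table and column names are hypothetical, and Lakeflow Designer generates such steps itself.

```python
# Hypothetical sketch of what a prompt like "add a column that flags
# whether each case was closed, then compute the close rate by agent"
# might generate. Table and column names are placeholders.
# `spark` assumes a Databricks notebook session.
close_rate_by_agent = spark.sql("""
    WITH flagged AS (
        SELECT
            agent_id,
            CASE WHEN status = 'closed' THEN 1 ELSE 0 END AS is_closed
        FROM main.contact_center.call_logs
    )
    SELECT agent_id, AVG(is_closed) AS close_rate
    FROM flagged
    GROUP BY agent_id
""")
close_rate_by_agent.show()
```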
Another demo showcased AI-enhanced business intelligence. The use case involved a line chart displaying hourly marketing impressions, with a visible spike. When the presenter asked in natural language, “Why is there a spike here?” the AI automatically searched relevant datasets and responded with an explanation, such as, “There’s an increase in marketing activity in the APAC region.”
Not only did it generate the insight in plain language, but it also surfaced the supporting tables and data to provide full context. This level of analysis would previously have required assistance from a data engineer, but now even complex queries can be performed interactively with just a natural language prompt in a single session. This marks a significant leap forward in intuitive, self-service data exploration capabilities.
From dotData’s perspective, three major trends stood out: the unification of data and AI on a single platform, the rise of AI engineering as a discipline, and tangible progress toward data democratization.
That said, the path to “data democratization” is not without challenges. Despite advances in models and tools such as AutoML and analytics platforms, many business users still struggle to derive meaningful insights from varied data. Achieving the intuitive querying shown in the demos requires well-prepared data, yet data preparation demands both skill and effort. Even with clean data, asking the right questions, crafting the right prompts, and digging deeper into insights remains a challenge.
To address these challenges, dotData offers two key solutions. Together, they connect seamlessly with existing business intelligence and machine learning platforms, creating a truly user-friendly analytics environment. For more information or to schedule a live demo, please don’t hesitate to contact us.