

January 20, 2026
GigaOm Radar for Data Warehouses v6
Andrew J. Brust
Analyst at GigaOm
1. Executive Summary
Data warehouses assemble, organize, cleanse, and optimize enterprise data from disparate sources so it can be used in an organization’s data analytics, reporting, dashboards, and AI workloads. These systems centralize physical data, access governance, and analytics capabilities into a single solution, simplifying operations and reducing complexity. They provide an integrated, high-quality, consolidated view of a company's data, giving decision-makers a foundation for directing business operations and planning for the future.
Data warehouses are distinct from operational or transactional systems, which store the record-by-record data from the individual transactions and activities of the business. Analyzing data typically means examining it in the aggregate rather than row by row, and operational databases are rarely called upon to retrieve large batches of data in response to analytical queries, nor are they well suited to doing so. Data warehouses, by contrast, use specific technologies, storage architectures, and design choices that optimize them for fast retrieval and analysis of large volumes of aggregate data, representing a paradigm shift from the operational to the analytic. Data warehouses provide significant value to organizations by serving as a "single source of truth" for organizational data. They consolidate disparate data, cleanse and prepare it, and ultimately enable it to be used across diverse analytics workloads.
Data warehouses are important to any end user seeking to derive insights from organizational data through analytics or to interpret the results of that analysis. They are particularly important to the decision-makers within an enterprise, who rely upon this insight to understand the current state of the business, direct operations, and set an effective strategy going forward.
Data warehouses benefit organizations by positioning them to remain competitive in a fast-paced business environment. The capabilities of data warehouses deliver returns on investment through operational excellence, faster time to insight, and reduced risk. Data warehouses reduce data sprawl and improve and centralize governance. They enhance the quality of an organization's data and provide semantic standardization, improving the trustworthiness of data and analytics results. All of these benefits, in turn, foster a data-driven culture and ensure confident decisions can be made regarding the operations and future of the business.
From a CEO and COO perspective, data warehouses provide a reliable data foundation that empowers these executives to translate analytical insights into action, direct day-to-day business operations, and confidently develop a strategy for the future grounded in data-backed evidence. From a CFO perspective, data warehouses provide a high-quality, accurate, enterprise-wide view of financial data, improving transparency and enabling them to more accurately determine current performance and predict future performance. From a CIO and CTO perspective, data warehouses reduce complexity and technology sprawl by consolidating data physically and enabling diverse analytics workloads to run on a single platform.
This is our sixth year evaluating the data warehouse space in the context of our Key Criteria and Radar reports. This report builds on our previous analysis and considers how the market has evolved over the last year.
This GigaOm Radar report examines 13 of the top data warehouse solutions and compares offerings against the capabilities (table stakes, key features, and emerging features) and nonfunctional requirements (business criteria) outlined in the companion Key Criteria report. Together, these reports provide an overview of the market, identify leading data warehouse offerings, and help decision-makers evaluate these solutions so they can make a more informed investment decision.
GIGAOM KEY CRITERIA AND RADAR REPORTS
The GigaOm Key Criteria report provides a detailed decision framework for IT and executive leadership assessing enterprise technologies. Each report defines relevant functional and nonfunctional aspects of solutions in a sector. The Key Criteria report informs the GigaOm Radar report, which provides a forward-looking assessment of vendor solutions in the sector.
2. Market Categories and User Segments
To help prospective customers find the best fit for their use case and business requirements, we assess how well data warehouse solutions are designed to serve specific market categories and user segments (Table 1).
For this report, we recognize the following market categories:
Small-to-medium business (SMB): In this category, we assess solutions on their ability to meet the needs of organizations ranging from small businesses to medium-sized companies. Also assessed are departmental use cases in large enterprises where ease of use and deployment are more important than extensive management functionality, data mobility, and feature set.
Large enterprise: Here, offerings are assessed on their ability to support large and business-critical projects. Optimal solutions in this category have a strong focus on flexibility, performance, data services, and features to improve security and data protection. Scalability is another big differentiator, as is the ability to deploy the same service in different environments.
Specialized: Here, solutions are assessed on their ability to support more niche, specialized scenarios. Optimal solutions are designed for specific workloads and use cases such as big data analytics and high-performance computing (HPC).
In addition, we recognize the following user segments:
Business user: Business users are typically beginners in the realm of data and analytics. While these employees may occasionally need to use analytical tools to perform self-service exploration and analysis, they rely on others to handle the technical aspects of configuring and provisioning them.
Business analyst: These users have some knowledge of data analysis tasks and are familiar with using self-service tools to perform analytics. They evaluate data to derive business insights and make recommendations for improvements, such as enhancing performance or reducing costs.
Data analyst: These users review data to identify trends and patterns that benefit organizations at the corporate level. While not as technical as data engineers, data analysts possess knowledge of data preparation, visualization, and analytics that can inform organizational strategy.
Data engineer: Data engineers are technically well versed and apply their specialized knowledge to prepare, organize, and model data, transforming it into actionable information for the organizations they support.
Table 1. Vendor Positioning: Market Categories and User Segments
Table 1 components are evaluated in a binary yes/no manner and do not factor into a vendor’s designation as a Leader, Challenger, or Entrant on the Radar chart (Figure 1).
“Target market” reflects which use cases each solution is recommended for, not simply whether that group can use it. For example, if an SMB could use a solution but doing so would be cost-prohibitive, that solution would be rated “no” for SMBs.
3. Decision Criteria Comparison
All solutions included in this Radar report meet the following table stakes—capabilities widely adopted and well implemented in the sector:
Massively parallel processing
Analytics optimizations
Support for cloud-based operation
Business intelligence (BI) platform integrations
Scalability and elasticity
Security and access controls
Tables 2, 3, and 4 summarize how each vendor in this research performs in the areas we consider differentiating and critical in this sector. The objective is to give the reader a snapshot of the technical capabilities of available solutions, define the perimeter of the relevant market space, and gauge the potential impact on the business.
Key features differentiate solutions, highlighting the primary criteria to be considered when evaluating a data warehouse solution.
Emerging features show how well each vendor implements capabilities that are not yet mainstream but are expected to become more widespread and compelling within the next 12 to 18 months.
Business criteria provide insight into the nonfunctional requirements that factor into a purchase decision and determine a solution’s impact on an organization.
These decision criteria are summarized below. More detailed descriptions can be found in the corresponding report, “GigaOm Key Criteria for Evaluating Data Warehouse Solutions.”
Key Features
Managed services: Managed service offerings abstract the complexities of platform administration for tasks such as configuration, setup, resource provisioning, and maintenance. By taking care of these functions on behalf of the customer, these solutions are designed to reduce the customer’s burden of operating the platform.
Support for open table formats: Open table formats (Apache Iceberg, Delta Lake, and, to a lesser extent, Apache Hudi) are layered on top of open file formats such as Apache Parquet. They provide metadata management, fast data manipulation operations, and ACID (atomicity, consistency, isolation, and durability) guarantees for the data stored in the underlying file formats. A brief sketch of reading such a table appears after this list.
Streaming and real-time data ingestion: This criterion refers to a data warehouse platform's ability to ingest data in real time or near-real time. This support can take the form of change data capture (CDC), data replication, mirroring, or streaming data ingestion capabilities.
Metadata management and data cataloging: Metadata management and data cataloging capabilities provide context and schema for data stored in the warehouse. This predefined schema contributes to the analytics query performance that data warehouses are able to achieve, improves data access and discovery, reduces inefficiency, and helps create a well-managed data foundation for BI, advanced analytics, and AI workloads.
Concurrency optimizations: Concurrency optimizations enable data warehouses to support simultaneous users running multiple queries or workloads without impacting performance or increasing processing times to an unreasonable degree. In today’s business environment, concurrency handling is especially relevant to help data warehouse platforms meet the needs of large enterprise customers, which often have large numbers of concurrent users and potentially resource-intensive queries.
Integrated and/or in-database machine learning (ML) and predictive analytics: Certain data warehouse platforms offer features that enable users to perform tasks in the ML lifecycle directly within the data warehouse. These capabilities eliminate the need to extract data and move it to a separate environment for these tasks, streamlining ML workflows and reducing data access governance risks and complexity.
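To make the open table format layering described above concrete, here is a minimal sketch of reading an Iceberg table from Python with the PyIceberg library. The catalog endpoint, warehouse path, and table name are hypothetical, and the snippet assumes a reachable Iceberg REST catalog.

```python
# Minimal sketch: reading an Iceberg table with PyIceberg.
# The catalog URI, warehouse bucket, and table name are hypothetical.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "analytics",                                # hypothetical catalog name
    uri="https://catalog.example.com",          # assumed Iceberg REST catalog
    warehouse="s3://example-bucket/warehouse",  # Parquet data files live here
)

table = catalog.load_table("sales.orders")      # hypothetical namespace.table

# The table format's metadata layer (snapshots, schema, partition specs)
# lets engines prune files and read a consistent, ACID snapshot of the
# underlying Parquet data.
batch = table.scan(row_filter="order_date >= '2025-01-01'").to_arrow()
print(batch.num_rows)
```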
Table 2. Key Features Comparison
Emerging Features
Generative AI (GenAI) automation and assistance: This criterion assesses data warehouse features involving generative AI models (including those powering agents) to automate and assist with analytics, data engineering, and platform administration tasks. Examples of capabilities in these areas include analytics code generation assistance (SQL, Python), natural language querying interfaces, assistance with generating visualizations and reports, and assistance with creating and maintaining data pipelines to move data into the data warehouse.
Semistructured and unstructured data storage and processing: This criterion evaluates the degree to which the solution supports storing and processing semistructured or unstructured data, in addition to conventional structured, scalar data used for BI workloads. This functionality is important because it enables users to analyze this data in data warehouses and eliminates the need to move data from other systems for processing, increasing efficiency and reducing infrastructure costs.
Table 3. Emerging Features Comparison
Business Criteria
Ecosystem and integrations: This criterion evaluates a warehouse’s openness to integrating with third-party platforms for analytics and data management. This quality is important because data warehouses will interact with all (or nearly all) of an organization’s other applications and platforms.
Ease of administration and maintenance: Beyond provisioning and scaling, some solutions are designed to further reduce the customer’s involvement in operating the platform. This characteristic is important for organizations that prefer a data warehouse offering specifically designed to support this overall low-touch operation.
Analytics workload diversity: This criterion refers to the variety of analytics workloads that can be handled from the data warehouse platform. Beyond the core BI and reporting use case, these workloads can include support for developing customer-facing embedded analytics applications, AI and ML, real-time or near-real-time analytics, and analysis of geospatial, graph, and time-series data.
Hybrid enablement: This criterion refers to unifying an organization’s data estate across both on-prem and cloud environments. Interoperability across environments helps organizations comply with data sovereignty and residency requirements by ensuring data remains physically stored within specific regional or national boundaries, on-prem and in the cloud.
AI readiness: This criterion evaluates the overall degree to which a data warehouse offering enables a business’s AI readiness. AI readiness encompasses AI, ML, GenAI, and agentic AI. It refers to capabilities and frameworks that help a business leverage its data to implement and improve AI across the organization.
Resiliency, fault tolerance, and high availability: This criterion assesses the system's overall resilience and availability. This characteristic enables data warehouse customers to maintain continuous access to their critical data, minimize financial losses from system downtime, and maintain the integrity and reliability of their data.
Table 4. Business Criteria Comparison
4. GigaOm Radar
The GigaOm Radar plots vendor solutions across a series of concentric rings with those positioned closer to the center being judged as having the most complete solutions. The chart characterizes each vendor on two axes—balancing Maturity versus Innovation and Feature Play versus Platform Play—while providing an arrowhead that projects each solution’s expected evolution over the coming 12 to 18 months.
Figure 1. GigaOm Radar for Data Warehouses
As you can see in Figure 1, the majority of vendors are on the Platform Play half of the Radar chart, indicating that most approach the data warehouse space with broad, multiworkload platforms rather than narrowly specialized products. The majority of vendors are also in the Maturity half of the chart, reflecting the decades of development that underlie these offerings. However, the Leaders circle is nearly balanced between Maturity and Innovation, meaning the top data warehouse offerings aren’t limited to those that prize stability or those that embrace disruptive change; vendors taking either approach deliver top products.
Of the small subset of vendors that are both Leaders and Outperformers, the majority are on the Feature Play side of the Radar chart. The highest-performing offerings, therefore, are those that form one component of an end-to-end platform and benefit from all the integrated capabilities of the other modules, including data integration, data engineering, data science, streaming data ingestion and processing, and BI.
In reviewing solutions, it’s important to keep in mind that there are no universal “best” or “worst” offerings; every solution has aspects that might make it a better or worse fit for specific customer requirements. Prospective customers should consider their current and future needs when comparing solutions and vendor roadmaps.
INSIDE THE GIGAOM RADAR
To create the GigaOm Radar graphic, key features, emerging features, and business criteria are scored and weighted. Key features and business criteria receive the highest weighting and have the most impact on vendor positioning on the Radar graphic. Emerging features receive a lower weighting and have a lower impact on vendor positioning on the Radar graphic. The resulting chart is a forward-looking perspective on all the vendors in this report, based on their products’ technical capabilities and roadmaps.
Note that the Radar is technology-focused, and business considerations such as vendor market share, customer share, spend, recency or longevity in the market, and so on are not considered in our evaluations. As such, these factors do not impact scoring and positioning on the Radar graphic.
For more information, please visit our Methodology.
5. Solution Insights
AWS: Amazon Redshift
Solution Overview
Amazon Redshift is a fully managed cloud data warehouse designed to support complex analytics across large volumes of data. It leverages data warehouse optimizations, such as massively parallel processing (MPP), columnar storage, and data compression, to improve query performance on large datasets.
Amazon Redshift is architected for scalability and elasticity. It offers a serverless option, Redshift Serverless, that can automatically adjust capacity to meet increasing workload demands and is designed to simplify platform administration and maintenance. Redshift Serverless also includes an AI-powered scaling and optimization capability that helps dynamically allocate compute resources as workloads increase, maintaining query performance within specified price-performance objectives.
Amazon Redshift integrates with multiple AWS services, including AWS Glue for data preparation and technical metadata management. The Redshift Spectrum feature allows users to query data in Amazon S3 object storage without loading it into Redshift.
AWS is positioned as a Challenger and Fast Mover in the Maturity/Platform Play quadrant of the data warehouse Radar chart.
Strengths
AWS received high scores on a number of decision criteria, including:
Managed services: Redshift Serverless is designed to simplify data warehouse infrastructure, enabling customers to analyze data without needing to set up, tune, and manage Amazon Redshift clusters.
Integrated and/or in-database ML and predictive analytics: Amazon Redshift ML is a capability within Amazon Redshift that enables users to create, train, and deploy ML models using SQL commands. This capability is intended to simplify the end-to-end ML workflow, reducing the time and effort required to integrate with an external ML service. The AWS product documentation notes that certain Redshift ML functions incur additional costs; examples are those that use Amazon SageMaker AI for model training or integrate with Amazon Bedrock for building GenAI applications. A brief sketch of this SQL-first workflow appears after this list.
GenAI automation and assistance: AWS offers several features for GenAI assistance, including the Amazon Q generative SQL coding assistant in the Redshift query editor v2. This assistant responds to natural language prompts to generate SQL statements and assist with data analysis.
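As an illustration of the SQL-first workflow Redshift ML enables, the sketch below trains and applies a model from Python via the redshift_connector driver. The cluster endpoint, IAM role, and schema, table, and column names are hypothetical; CREATE MODEL options should be checked against the AWS documentation.

```python
# Sketch: training and invoking a Redshift ML model entirely in SQL.
# Endpoint, credentials, IAM role, and schema/column names are hypothetical.
import redshift_connector

conn = redshift_connector.connect(
    host="demo.abc123.us-east-1.redshift.amazonaws.com",
    database="dev", user="analyst", password="secret")
cur = conn.cursor()

# CREATE MODEL hands training off to Amazon SageMaker behind the scenes
# (note: this can incur additional SageMaker costs).
cur.execute("""
    CREATE MODEL demo.churn_model
    FROM (SELECT age, tenure, monthly_spend, churned FROM demo.customers)
    TARGET churned
    FUNCTION predict_churn
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftMLRole'
    SETTINGS (S3_BUCKET 'example-ml-bucket')
""")

# The trained model is exposed as a SQL function for in-database scoring.
cur.execute("SELECT customer_id, predict_churn(age, tenure, monthly_spend) "
            "FROM demo.new_customers")
print(cur.fetchall())
```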
Opportunities
AWS has room for improvement in a few decision criteria, including:
Semistructured and unstructured data storage and processing: Existing capabilities for semistructured data include the SUPER data type and the ability to query data in object storage (such as Amazon S3) directly via Redshift Spectrum. According to AWS’s literature, Amazon Redshift doesn’t appear to support storing unstructured data in the database. Amazon Redshift requires data to be structured by a defined schema. Unstructured data needs to be processed first, typically through extract, transform, and load (ETL) steps, before it can be loaded into the system.
Hybrid enablement: While Redshift can query data in other storage systems, it is a cloud-native solution that can be deployed only in the cloud.
Purchase Considerations
Amazon Redshift is available in two options: Provisioned and Serverless. For the Provisioned option, customers can choose between on-demand instances with hourly billing or reserved instances.
Provisioned Amazon Redshift includes multiple options to help optimize and strike a balance between cost and performance. Elastic Resize allows users to adjust the amount of provisioned compute capacity. Resize Scheduler allows users to add and remove nodes on a daily or weekly basis. Concurrency Scaling provisions additional compute capacity based on workload needs, charged on a per-second, on-demand basis for what’s used in excess of accumulated free credits.
In Redshift Serverless, capacity is automatically provisioned and managed based on workload needs. Users are charged for compute capacity on a per-second basis and not charged during idle periods. There is an option to prepurchase serverless reservations in the form of a committed number of Redshift Processing Units (a measure of data warehouse capacity). These can be purchased annually at a discount off the on-demand rates.
Use Cases
Some of the core use cases of Amazon Redshift include traditional data warehousing for BI and reporting, predictive analytics and ML-based forecasting, and analysis of log data to identify inefficiencies or monitor performance. The ability to ingest streaming data also allows Amazon Redshift to support real-time analytics and dashboards. The Amazon Redshift MCP server enables AI agents to perform tasks such as querying data in Amazon Redshift. Amazon Redshift can work with other AWS services to enable a variety of additional use cases, such as integrating with Amazon Bedrock to power GenAI applications and retrieval-augmented generation (RAG) workflows. One consideration to weigh before taking advantage of these integrations is the potential for these additional services to incur additional charges or require licenses.
Cloudera
Solution Overview
Cloudera is an end-to-end data and AI platform that provides a wide range of fully integrated data, analytics, and AI services (including data warehousing, AI and ML, data ingestion and preparation, data engineering, security and governance, a data lakehouse, data observability, an operational database, data catalog, and data visualization) from a single product.
Cloudera Data Warehouse consists of virtual data warehouses built from on-prem clusters or fleets of container-based compute instances. These warehouses access data organized in underlying data catalogs, which multiple warehouses may access concurrently. Users can size the virtual warehouse instances, configure autoscaling, and set thresholds for auto-suspend or timeout. Cloudera Data Warehouse is powered by and uses multiple engines and services, including Apache Impala, Apache Hive, Apache Spark, Apache Flink, Apache Kafka, Apache NiFi, and Trino. All of these combine to help Cloudera support self-service and advanced analytics workloads, end-to-end AI and ML lifecycles, and many other use cases.
Cloudera Shared Data Experience (SDX) provides a unified control plane across any deployment scenario, centralizing metadata management, security and access controls, data sharing, and system monitoring.
Cloudera is positioned as a Leader and Outperformer in the Maturity/Feature Play quadrant of the data warehouse Radar chart.
Strengths
Cloudera scored well on a number of decision criteria, including:
Support for open table formats: The Apache Iceberg open table format is the default format for data in the platform, and Cloudera can access data stored in Delta Lake or Hudi tables. Cloudera can also migrate data stored in these formats into Iceberg tables, leveraging the platform’s support for Iceberg.
Metadata management and data cataloging: Cloudera excelled in this key feature through numerous capabilities, including its integrated Data Catalog service. Cloudera offers data discovery, support for managing and serving data across an organization, and robust data lineage tracking and impact analysis (the latter capabilities derived from its acquisition of the data lineage solution Octopai). This reduces the potential complexity of integrating with multiple, discrete third-party tools to achieve equivalent capabilities. Cloudera also added support for the Iceberg REST catalog specification, which exposes a server-side catalog through a REST API, allowing greater flexibility for performing cataloging operations across tools and languages; a brief sketch appears below.
Integrated and/or in-database ML and predictive analytics: Cloudera offers multiple capabilities and interfaces in this area. Cloudera AI, formerly known as Cloudera Machine Learning, supports the full ML lifecycle: exploring data, creating and deploying models, and monitoring performance in production. Cloudera supports a wide range of ML models and algorithms. Cloudera AI Workbench provides an interactive environment for data scientists to explore data, build models, and use familiar tools such as Jupyter notebooks and RStudio. The Cloudera AI Inference service is a production-grade environment for model serving and deployment that runs in the customer’s cloud. It is powered by integration with NVIDIA Inference Microservices and is especially beneficial for enterprises concerned about sending proprietary data to third-party service providers. Finally, enhancements to support the development of GenAI solutions were in preview as of the research phase of this report (October 2025). These tools include RAG Studio, a no-code application for creating RAG chatbots, and Agent Studio, a service that helps users design, deploy, and orchestrate agentic AI workflows to improve automation and efficiency.
Cloudera is classified as an Outperformer because of its ambitious roadmap, including tools to support development of GenAI solutions. Cloudera’s relatively quick pace of development of capabilities such as enhancements to data cataloging and support for the end-to-end ML lifecycle also contribute to this designation.
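To illustrate why the Iceberg REST catalog specification aids cross-tool flexibility, the sketch below lists namespaces and tables using nothing but HTTP. The endpoint URL and namespace are hypothetical; the routes follow the public Iceberg REST specification, and real deployments add authentication.

```python
# Sketch: browsing an Iceberg REST catalog with plain HTTP.
# The endpoint URL and namespace are hypothetical; production catalogs
# require an Authorization header.
import requests

BASE = "https://catalog.example.com/api/v1"  # assumed REST catalog root

# Standard spec routes -- any engine or language can speak them.
namespaces = requests.get(f"{BASE}/namespaces").json()
tables = requests.get(f"{BASE}/namespaces/sales/tables").json()

print(namespaces)  # e.g., {"namespaces": [["sales"], ["marketing"]]}
print(tables)      # e.g., {"identifiers": [{"namespace": ["sales"], "name": "orders"}]}
```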
Opportunities
Cloudera has room for improvement in a few decision criteria, including:
Managed services: Cloudera isn’t a fully autonomous or serverless platform. However, it offers many features and options for customers who prefer a more low-touch platform operation. Cloudera provides a managed services offering through a partner, Nexus Cognitive. It also provides many strong features to simplify the administrative experience, including an integrated Cloudera Observability component.
Ease of administration and maintenance: While Cloudera is not a fully autonomous or serverless platform, it provides capabilities that abstract the complexity of platform administration and maintenance. These include its Cloudera Observability module, assistance with upgrades, and a central governance framework through its Shared Data Experience (SDX) control plane.
Purchase Considerations
Cloudera’s licensing is based on usage or capacity, with options for a pay-as-you-go service or prepaid credits. Cloudera is available in many deployment options, including any public or private cloud, on-prem, as a managed SaaS offering through a partner (Nexus Cognitive), or as a customizable, customer-managed deployment.
Use Cases
As a fully integrated, end-to-end platform, Cloudera supports a wide range of workloads beyond data warehousing. However, some of the key use cases relevant to this report include BI support, data integration, data sharing and collaboration, data exploration and discovery, streaming and IoT data analysis, customer 360 and personalization, and ML, AI, and predictive analytics. Specific vertical use cases supported include advertising personalization, supply chain optimization, and credit risk assessment, among others.
Google Cloud: BigQuery
Solution Overview
BigQuery is Google Cloud’s serverless, cloud-native enterprise data and analytics platform. BigQuery provides capabilities for core analytics workloads, including business intelligence, real-time analytics, ML, and geospatial analysis, helping streamline organizations’ data management and analytics processes. As a serverless platform, BigQuery eliminates the need for customers to provision or manually size clusters, helping to simplify platform administration and maintenance.
BigQuery is architected with decoupled storage and compute, allowing each to be scaled independently for performance and cost optimization benefits. BigQuery integrates with other Google Cloud services to support additional workloads. It also integrates with many third-party tools (such as Tableau) and a variety of notebooks and code-based analysis tools.
Google Cloud is positioned as a Leader and Fast Mover in the Innovation/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Google Cloud scored well on a number of decision criteria, including:
Managed services: BigQuery is specifically designed to abstract the details of and minimize customer involvement in infrastructure management and platform operations. BigQuery automatically allocates compute resources as workloads require, eliminating the need for customers to provision and manage these resources.
Integrated and/or in-database ML and predictive analytics: BigQuery ML allows users to create, evaluate, and run ML models with SQL commands and store them in BigQuery datasets. This reduces infrastructure complexity by eliminating the need to move data into a separate environment, and it minimizes the need for specialized knowledge of programming languages and ML frameworks, making these analytics workloads more accessible to personas such as business analysts and data analysts. A brief sketch of the workflow appears after this list.
GenAI automation and assistance: Gemini in BigQuery provides GenAI assistance to generate SQL or Python code in response to natural language prompts and can refine or explain a SQL query in natural language. Gemini in BigQuery data canvas provides a graphical interface for searching for data assets and a natural language interface for generating SQL queries to describe and create custom visualizations. Gemini in BigQuery can also provide GenAI-powered assistance with data preparation, including suggesting transformations or steps that users can preview and apply. Users can make natural language suggestions for alternatives or refine existing suggestions.
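As a minimal sketch of the SQL-first BigQuery ML flow referenced above, the snippet below trains and scores a model through the official google-cloud-bigquery client. The dataset, table, and column names are hypothetical.

```python
# Sketch: BigQuery ML's SQL-first workflow via the official Python client.
# Dataset, table, and column names are hypothetical; the client uses
# application-default credentials.
from google.cloud import bigquery

client = bigquery.Client()

# Train a classification model entirely inside BigQuery.
client.query("""
    CREATE OR REPLACE MODEL demo.churn_model
    OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
    SELECT age, tenure, monthly_spend, churned
    FROM demo.customers
""").result()

# Score new rows with ML.PREDICT -- the data never leaves the warehouse.
rows = client.query("""
    SELECT *
    FROM ML.PREDICT(MODEL demo.churn_model,
                    (SELECT age, tenure, monthly_spend FROM demo.new_customers))
""").result()
for row in rows:
    print(row)
```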
Opportunities
Google Cloud has room for improvement in a few decision criteria, including:
Semistructured and unstructured data storage and processing: BigQuery’s support for this emerging feature is poised to improve when an upcoming unstructured data type, ObjectRef, now in preview, becomes generally available. According to the product’s literature, this capability would allow unstructured data to be integrated into standard tables by using ObjectRef values. BigQuery’s existing capabilities in this area include features such as a dedicated JSON data type and Object Tables, which currently provide read-only tables over unstructured data objects in Cloud Storage.
Hybrid enablement: While the platform can query data in other storage systems, it is a cloud-native solution that can only be deployed in the cloud.
Purchase Considerations
Pricing in BigQuery is divided into two main components: compute and storage. Compute pricing covers the cost of processing queries, such as SQL statements, user-defined functions, and scripts. Queries can be charged on an on-demand basis per number of bytes processed, or based on the compute capacity (measured in slots, or virtual CPUs) used to run them. Additionally, for the capacity-based pricing model, users can prepurchase slots (dedicated capacity commitments) at a lower price. For storage, users are charged for tables and table partitions as either active storage (modified in the last 90 days) or long-term storage (not modified for 90 consecutive days), with long-term storage billed at approximately half the active rate. A back-of-the-envelope comparison of the two compute models appears below.
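The following sketch compares the two compute models for a hypothetical workload. The rates are illustrative placeholders only, not published prices; actual BigQuery rates vary by region and edition and should be taken from Google Cloud's current price list.

```python
# Back-of-the-envelope comparison of BigQuery's two compute pricing models.
# The rates below are ILLUSTRATIVE ASSUMPTIONS, not published prices.
ON_DEMAND_PER_TIB = 6.25   # assumed $ per TiB scanned
SLOT_HOUR_RATE = 0.06      # assumed $ per slot-hour

# Hypothetical workload: 40 TiB scanned per month on demand...
on_demand_cost = 40 * ON_DEMAND_PER_TIB

# ...versus reserving 100 slots for 200 hours of query processing.
capacity_cost = 100 * 200 * SLOT_HOUR_RATE

print(f"on-demand: ${on_demand_cost:,.2f}/mo, capacity: ${capacity_cost:,.2f}/mo")
# on-demand: $250.00/mo, capacity: $1,200.00/mo
```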
Use Cases
BigQuery supports a wide variety of use cases, including the core data warehousing use case for BI and reporting, as well as the visualization tools and applications described above. Additionally, BigQuery BI Engine is a built-in, in-memory analysis engine that uses vectorized processing to accelerate SQL queries and BI workloads. End-to-end ML workflows and predictive analytics are supported through BigQuery ML. BigQuery also supports ad hoc and exploratory analysis and, through support for geographic data types, geospatial analytics. Integration with Vertex AI also enables AI-related use cases such as text summarization, sentiment analysis, and embedding generation for vector search.
IBM: IBM Db2 Warehouse, IBM Db2 Warehouse SaaS*
Solution Overview
IBM Db2 Warehouse is a data warehouse offering based on IBM’s foundational Db2 database, architected for high-performance analytics on large volumes of data. It is designed with a separation of compute and storage and leverages analytics optimizations such as columnar storage and in-memory processing. In addition to supporting fast analytics over large volumes of data, it provides built-in end-to-end MLOps support via SQL directly in the database.
The database offers deployment options to suit the needs of a variety of customers, including large enterprises, smaller businesses and teams, and developers seeking to build test applications. These options are detailed further in the Purchase Considerations section below.
A recent enhancement to its database administration experience, called Db2 Intelligence Center, brings AI-powered optimization and assistance to traditional database administration tasks, including management, monitoring, troubleshooting, and performance tuning.
IBM is positioned as a Leader and Fast Mover in the Maturity/Platform Play quadrant of the data warehouse Radar chart.
Strengths
IBM scored well on a number of decision criteria, including:
Concurrency optimizations: IBM received top marks because of multiple capabilities, including the ability to create service classes (allowing administrators to allocate resources and assign priorities to specific workloads) and to set thresholds for resource usage. Additionally, AI-powered workload management capabilities (like query optimization and performance tuning) are provided through Db2 Intelligence Center, an enhanced management console designed to simplify and automate operations and management.
Integrated and/or in-database ML and predictive analytics: There are several reasons for IBM’s high mark in this key feature. With SQL stored procedures, this offering enables the creation, training, and application of ML models directly within the database environment, eliminating the need to extract data to external environments for model development and improving governance and efficiency. Db2 Warehouse supports building end-to-end ML workflows (including data transformation, model building, and evaluation) entirely within the database, using familiar SQL. A brief sketch of this pattern appears after this list.
GenAI automation and assistance: Db2 Database Assistant, a GenAI-powered assistant, can help with tasks such as monitoring the Db2 instance and providing information about resource usage, active connections, SQL activity, and backups. The vendor also claims it’s trained on the official product documentation and can help answer technical questions about the platform.
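As a minimal sketch of the stored-procedure pattern referenced above, the snippet below calls IBM's IDAX in-database ML procedures through the ibm_db driver. The connection string, tables, and columns are hypothetical, and the procedure names and parameters should be verified against IBM's Db2 Warehouse ML documentation.

```python
# Sketch: in-database ML in Db2 Warehouse via SQL stored procedures.
# Connection string, table, and column names are hypothetical; verify the
# IDAX procedure parameters against IBM's documentation.
import ibm_db

conn = ibm_db.connect(
    "DATABASE=BLUDB;HOSTNAME=db2.example.com;PORT=50000;"
    "UID=analyst;PWD=secret;", "", "")

# Train a regression model where the data lives -- no extract step.
ibm_db.exec_immediate(conn, """
    CALL IDAX.LINEAR_REGRESSION('model=spend_model, intable=DEMO.CUSTOMERS,
                                 id=CUSTOMER_ID, target=MONTHLY_SPEND')
""")

# Score new rows with the trained model, still inside the database.
ibm_db.exec_immediate(conn, """
    CALL IDAX.PREDICT_LINEAR_REGRESSION('model=spend_model,
                                         intable=DEMO.NEW_CUSTOMERS,
                                         outtable=DEMO.SPEND_SCORES')
""")
```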
Opportunities
IBM has room for improvement in a few decision criteria, including:
Streaming and real-time data ingestion: As of the research phase of this report (October 2025), an upcoming integration with IBM StreamSets was in Technology Preview. Current capabilities for real-time data ingestion include the Db2 Ingest utility. Data replication can be achieved through capabilities that provide near-real-time asynchronous data replication from a primary database server to one or more standby replicas.
Semistructured and unstructured data storage and processing: IBM’s offering doesn’t include a dedicated JSON data type, unlike some other offerings in this report. It can store JSON data in the BSON (binary JSON) format. For unstructured data, IBM Db2 Warehouse provides a native VECTOR data type that stores a structured representation of unstructured data for use in various workloads detailed below.
Purchase Considerations
IBM Db2 Warehouse can be deployed on-prem or in hybrid configurations. IBM Db2 Warehouse SaaS can also be deployed in the cloud on IBM Cloud, AWS, and Azure. IBM also provides the free Db2 Community Edition, described as a “base image of Db2 software distributed via trial download sites and Docker hub.” Db2 Community Edition allows users to explore the full features of Db2 without committing to a purchase contract and helps developers create and test applications for deployment in production or nonproduction environments.
Use Cases
In addition to BI and reporting, IBM Db2 Warehouse supports data science, ML, and predictive analytics with strong in-database capabilities. It also supports geospatial analysis. With support for a new, native VECTOR data type, IBM Db2 Warehouse supports workloads such as similarity search and retrieval-augmented generation (RAG), which grounds LLM output in enterprise data.
Microsoft: Microsoft Fabric
Solution Overview
Microsoft Fabric’s data warehousing workload is one of many modules in this fully integrated, end-to-end data, analytics, and AI solution. Microsoft Fabric abstracts, enhances, and unifies several previously separate services into a single product. All services operate on the Microsoft OneLake unified data lake. In OneLake, data is stored by default in the Delta Lake open table format, the platform's native data format.
While Microsoft Fabric was released to general availability just two years ago, the origins of the data warehousing component in Fabric stretch back much further to Azure SQL Data Warehouse, a previous, standalone product. Azure SQL Data Warehouse was built on technology compatible with Microsoft’s on-prem data warehouse offering, Analytics Platform System (APS), formerly known as Parallel Data Warehouse. While Fabric abstracts and unifies its data warehousing service with the other modules and technologies in the platform, it’s useful to keep this decades-long development history in mind.
Microsoft is positioned as a Leader and Outperformer in the Innovation/Feature Play quadrant of the data warehouse Radar chart.
Strengths
Microsoft scored well on a number of decision criteria, including:
Support for open table formats: The Delta Lake open table format is the default format for data in this solution. According to the Microsoft Fabric documentation, the flexibility to work interoperably with data stored in Delta Lake and in the Iceberg open table format is enabled through metadata virtualization.
Streaming and real-time data ingestion: Microsoft Fabric received a high score in this key feature because of multiple capabilities. One example is the eventstreams feature in Fabric Real-Time Intelligence, which allows users to connect to streaming data sources such as Azure Event Hubs, Apache Kafka, and various CDC sources, optionally transform the data, and route it to destinations, including the Fabric Lakehouse. It can be used in data warehousing workloads through the Fabric Data Warehouse SQL analytics endpoint.
Integrated and/or in-database ML and predictive analytics: Microsoft Fabric received a high score for this key feature due to multiple in-platform capabilities, including a dedicated data science and ML module. The Fabric Data Warehouse Spark connector enables users to access and work with data from a Fabric warehouse in their ML workflows, eliminating the need for data movement involving a separate environment. A brief sketch of this connector pattern appears below.
Microsoft is classified as an Outperformer because of its forward-looking roadmap and rapid release of new features, such as the ability to convert Iceberg tables to Delta Lake through metadata virtualization, its significant investments in research and development, and its industry thought leadership.
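As an illustration of the warehouse-to-ML path described above, the sketch below reads a Fabric warehouse table into Spark using the connector's synapsesql reader. The warehouse, schema, and table names are hypothetical, and the snippet assumes it runs in a Fabric Spark notebook where the connector is preinstalled.

```python
# Sketch: pulling Fabric Data Warehouse data into a Spark ML workflow.
# Assumes a Microsoft Fabric Spark runtime (connector preinstalled);
# warehouse/schema/table names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # supplied by the Fabric runtime

# Read the warehouse table directly -- no export or copy step.
df = spark.read.synapsesql("SalesWH.dbo.customers")

# Hand the DataFrame to any downstream ML pipeline.
train, test = df.randomSplit([0.8, 0.2], seed=42)
print(train.count(), test.count())
```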
Opportunities
Microsoft has room for improvement in a few decision criteria, including:
GenAI automation and assistance: Copilot in Microsoft Fabric Data Warehouses was still in preview in October 2025. However, BI and visualizations are core use cases for data warehousing, and Copilot for Power BI is generally available. It provides GenAI assistance in many tasks, including generating reports and summaries, and creating and refining visualizations.
Hybrid enablement: Microsoft Fabric is a purely cloud-based platform, limiting its ability to support hybrid cloud and on-prem scenarios. However, it can include data stored in external storage systems via the shortcuts feature and mirroring. Additionally, an on-prem data gateway enables access to on-prem data sources via Data Factory, Fabric's data integration module.
Purchase Considerations
In Microsoft Fabric, users access all component services through a single interface and pay for them as a single SaaS offering. This single-platform approach is intended to simplify both billing and analytics.
Use Cases
Microsoft Fabric supports a wide variety of use cases. Beyond data warehousing for BI and reporting, it provides data integration, data science and ML, real-time analytics, and operational database workloads. Microsoft Fabric has the potential to provide a strong data foundation for workloads involving AI agents. This offering also has a host of features and functions in development that are expected to enhance its ability to support GenAI and agentic AI workloads.
Ocient: The Ocient Hyperscale Data Warehouse
Solution Overview
The Ocient Hyperscale Data Warehouse is an enterprise data warehouse optimized for high-performance, complex analytics over large volumes of data. Architecturally, Ocient colocates compute and storage, to which it attributes its ability to enable high-concurrency, low-latency analytics over huge volumes of data.
Ocient blends elements of real-time online analytical processing (OLAP) databases with those of enterprise SQL data warehouses, coupling them with data preparation, in-platform ML, and geospatial analytics. This functionality enables the warehouse to support a wide variety of complex analytics with potential time and geography dimensions at scale, enabling decision-making across industries such as telecommunications, AdTech, data providers, and government agencies.
Ocient is positioned as a Challenger and Fast Mover in the Innovation/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Ocient scored well on a number of decision criteria, including:
Streaming and real-time data ingestion: Ocient supports high-speed ingestion and makes data available for querying in near-real time. Ocient can ingest data from streaming data sources such as Kafka and supports continuous file loading from object storage such as S3.
Concurrency optimizations: Ocient’s workload management capabilities allow multiple concurrent workloads to operate on the same data. Some specific features that enable Ocient to achieve this include the ability to set resource priorities and limits.
Integrated and/or in-database ML and predictive analytics: Ocient received a high score for this key feature due to multiple capabilities, most notably its in-database ML module. These capabilities allow customers to build, train, and deploy ML models on full datasets directly in the database. These mechanisms reduce complexity and governance concerns by enabling such workloads to be performed in a single environment.
Opportunities
Ocient has room for improvement in a few decision criteria, including:
Support for open table formats: As of the research phase of this report (October 2025), Ocient doesn’t possess any native support for open table formats such as Apache Iceberg or Delta Lake.
GenAI automation and assistance: As of the research phase of this report (October 2025), Ocient did not have any native or in-database GenAI assistance. Through APIs or ODBC/JDBC interfaces, it has the potential to be integrated with other tools and applications as part of a larger workflow to build GenAI or agentic AI applications that provide this assistance.
Purchase Considerations
Ocient supports multiple deployment options to suit a variety of customer needs. For those who would prefer more customization and control, it supplies a customer-managed option that can be deployed on-prem or in the public cloud. For those who would prefer the additional vendor-provided support, Ocient offers a fully managed solution, OcientCloud.
Use Cases
Ocient’s high-concurrency, low-latency analytics over huge data volumes, paired with its ML and predictive, time-series, and geospatial analytics, enables a number of potential workloads. In addition to powering real-time dashboards and customer-facing embedded analytics, it’s well suited for time-critical decision-making and analysis in domains such as telecommunications, cybersecurity and threat detection, and AdTech.
OpenText: OpenText Analytics Database (Vertica)
Solution Overview
OpenText Analytics Database, formerly Vertica, is a high-performance, columnar, massively parallel processing data warehouse. It forms the data foundation and core analytics engine for the OpenText Analytics Cloud platform. The solution powers a number of analytics workloads, including BI and reporting, training and operationalizing ML models, and embedded analytics. Architecturally, OpenText Analytics Database provides a separation of compute and storage through its Eon Mode. This option enables administrators to scale up or down to meet workload demands.
Vertica was one of the first databases to emphasize columnar storage and massively parallel processing. The OpenText Analytics Database retains that core and, along with other architectural advancements and a suite of advanced analytics functions, can power use cases such as predictive maintenance, personalization, and real-time decision-making.
OpenText is positioned as a Challenger and Fast Mover in the Maturity/Feature Play quadrant of the data warehouse Radar chart.
Strengths
OpenText scored well on a number of decision criteria, including:
Streaming and real-time data ingestion: OpenText offers Apache Kafka integration for streaming data ingestion, along with support for data replication and mirroring capabilities. OpenText Analytics Database can also integrate with third-party solutions, such as BryteFlow, for change data capture.
Integrated and/or in-database ML and predictive analytics: OpenText integrates with Jupyter notebooks, enabling users to run ML workloads via SQL or Python. The VerticaPy feature supports end-to-end ML workflows through a Python interface, as sketched after this list. Users can also operationalize models trained elsewhere by importing them in the Predictive Model Markup Language (PMML) format.
Semistructured and unstructured data storage and processing: OpenText Analytics Database can access and query semistructured data in external storage systems in place without moving the data. Additionally, OpenText says the solution can work with another element of its portfolio, OpenText IDOL, through natural language processing and ML, to analyze unstructured data. The extracted data can be loaded into “flex tables,” which are designed to handle unstructured or semistructured data with an unknown or varying schema, such as JSON or log data.
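The sketch below illustrates the VerticaPy pattern referenced above: a Python interface that pushes model training down into the database as SQL. Connection details and table/column names are hypothetical, and API specifics should be verified against the VerticaPy documentation.

```python
# Sketch: in-database ML with VerticaPy. Connection settings and the
# table/column names are hypothetical.
import verticapy as vp
from verticapy.machine_learning.vertica import LinearRegression

vp.new_connection(
    {"host": "vertica.example.com", "port": "5433",
     "user": "dbadmin", "password": "secret", "database": "analytics"},
    name="demo")

# fit() issues SQL so training executes inside the database;
# only summary results travel back to the client.
model = LinearRegression()
model.fit("public.customers", X=["age", "tenure"], y="monthly_spend")
print(model.report())  # regression metrics computed in-database
```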
Opportunities
OpenText has room for improvement in a few decision criteria, including:
Support for open table formats: OpenText Analytics Database doesn’t support storing data in open table formats directly as tables in the database. However, it can access and query data stored in the Iceberg open table format in existing external storage systems without moving the data.
GenAI automation and assistance: OpenText Intelligence Aviator is not generally available as of the research phase of this report (October 2025). This upcoming feature will assist in generating visualizations and in exploring and querying data through natural language. Although it is a separate product from OpenText Analytics Database, it is expected to be available as an add-on.
Purchase Considerations
OpenText Analytics Database can be deployed in a variety of options, including the three major public clouds and on-prem. The vendor also provides the OpenText Core Analytics Database, a fully managed SaaS offering built on this database.
Licensing for OpenText Analytics Database is subscription-based, available on a pay-as-you-go model, with hourly usage billing, or via an annual, term-based subscription. Customers can choose to license by storage capacity or by number of cores. OpenText also provides a free community edition of OpenText Analytics Database, designed to let interested parties explore the platform's capabilities without a purchase commitment.
Use Cases
OpenText Analytics Database provides the data foundation for a number of workloads, including customer 360 and personalization. Predictive analytics of data from IoT sensors enables support for predictive maintenance in industries such as manufacturing and transportation. Analysis of real-time data ingested into the platform enables high-speed decision-making in industries such as e-commerce and telecom.
Oracle: Oracle Autonomous AI Database
Solution Overview
Oracle Autonomous AI Database is a self-managing, highly scalable database service, built to blend the advantages of both fast query performance and a streamlined, low-touch platform administration experience. Autonomous AI Database users are not required to configure or manage any infrastructure. The database automatically handles resource provisioning, backups, patching, and upgrades. It automatically manages increasing and decreasing resource usage, such as storage and compute capacity, in response to workload demands.
Oracle Autonomous AI Database includes built-in support for the Oracle APEX application development platform and Oracle REST Data Services, which includes Database Actions, a web-based interface for monitoring, administration, data management, and analytics. It also includes Oracle Machine Learning, which provides notebook interfaces, multiple APIs, and support for end-to-end ML workflows. This offering is available in numerous forms and deployment options to suit a variety of customer and team needs, as detailed in the Purchase Considerations section below.
Oracle is positioned as a Leader and Fast Mover in the Innovation/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Oracle scored well on a number of decision criteria, including:
Managed services: The autonomous nature of Oracle’s solution means the platform handles many database administration tasks automatically on behalf of the customer, including resource provisioning, memory configuration for performance optimization, backups, and patching.
Metadata management and data cataloging: Oracle enables unified data discovery, data sharing, data lineage and impact analysis, and natural language querying through Select AI, a GenAI-powered feature. Oracle has also added support for multiple catalogs for data stored in the Iceberg open table format, including Unity Catalog, AWS Glue, and Snowflake Open Catalog. This capability enables its offering to “search, discover, preview, and query any Iceberg table in any Iceberg catalog, on any cloud.”
GenAI automation and assistance: Select AI provides multiple capabilities that simplify and automate tasks for users, including an interface that converts natural language prompts into SQL queries, as sketched below. Select AI also automates the generation of vector embeddings for similarity search using a vector store, which expedites RAG workflows.
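The following minimal sketch shows the Select AI pattern from Python via the python-oracledb driver. The connection details, AI profile name, and prompt are hypothetical; the snippet assumes an AI profile has already been created with the DBMS_CLOUD_AI package, and the syntax should be verified against Oracle's documentation.

```python
# Sketch: natural language querying with Select AI from Python.
# Connection details, profile name, and prompt are hypothetical; an AI
# profile must already exist in the Autonomous AI Database.
import oracledb

conn = oracledb.connect(user="analyst", password="secret",
                        dsn="adb.example.com/demo_high")
cur = conn.cursor()

# Point the session at a preconfigured AI profile (LLM + allowed schema).
cur.callproc("DBMS_CLOUD_AI.SET_PROFILE", ["DEMO_PROFILE"])

# SELECT AI converts the prompt to SQL over the profile's tables and runs it.
cur.execute("SELECT AI how many orders were placed last month")
for row in cur:
    print(row)
```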
Opportunities
Oracle has room for improvement in a few decision criteria, including:
Support for open table formats: Oracle Autonomous AI Database doesn’t support any open table formats as native data formats for the platform. However, it can query data in Iceberg tables stored in external object storage and cataloged by services, including Unity Catalog, Polaris, and AWS Glue. The Data Transforms capability supports loading data from SQL-based data sources into Apache Iceberg as a target.
Streaming and real-time data ingestion: Through the Data Studio in Oracle Autonomous AI Database, the Live Feeds feature handles continuous data ingestion from the OCI Object Store on either a push or pull basis. Support for streaming data ingestion and CDC is also provided through separate but tightly integrated Oracle services: streaming data ingestion via integration with Oracle Cloud Infrastructure (OCI) Streaming, and CDC for Oracle and non-Oracle sources and targets via integration with the Oracle GoldenGate enterprise CDC solution.
Purchase Considerations
There are many deployment options for Oracle Autonomous AI Database, including a fully managed, cloud-based offering on OCI and on-prem deployment in a customer’s datacenter through Oracle Exadata Cloud@Customer, an option that also supports hybrid deployment. Oracle also partners with the three major public cloud providers to colocate Oracle Exadata hardware in their datacenters (Oracle Database@AWS, Oracle Database@Azure, and Oracle Database@Google Cloud) to improve interoperability and simplify multicloud deployments. In addition, users can spin up low-cost instances of the solution for development and testing, as well as short-term free instances with limited backup, recovery, and logging functionality for exploring the solution’s features.
Use Cases
Oracle Autonomous AI Database supports a wide variety of use cases across multiple industries and verticals, such as retail and e-commerce, manufacturing, supply chain management, and telecommunications. Through partnerships with third-party tools and integrations with Oracle applications, such as APEX and Oracle Analytics Cloud, it supports BI and reporting, along with application development. The solution's built-in ML component supports end-to-end MLOps. Vector search capabilities enable similarity search and the creation of RAG workflows that ground LLM output in enterprise data and improve its accuracy.
SAP: SAP Business Data Cloud*
Solution Overview
SAP Business Data Cloud (BDC) is a fully managed SaaS solution that combines SAP HANA Cloud, SAP Datasphere, SAP BW, SAP Databricks, and SAP Analytics Cloud into a single platform for data and analytics, accessible through a single subscription. SAP BDC integrates these solutions to provide a unified data experience and enable customers to access, manage, and derive insights from all their data across SAP and non-SAP sources.
SAP Datasphere is the data warehousing component of SAP BDC. It combines capabilities for semantic modeling, data cataloging, data preparation and integration, and data federation and virtualization. It provides the data foundation and the main data integration and data modeling capabilities for data used in SAP BDC. It is integrated with the other services, such as SAP Databricks for ML and predictive analytics and SAP Analytics Cloud for visualizations.
SAP is positioned as a Challenger and Fast Mover in the Maturity/Feature Play quadrant of the data warehouse Radar chart.
Strengths
SAP scored well on a number of decision criteria, including:
Managed services: SAP Datasphere is a core component of the fully managed SAP BDC solution. SAP BDC also helps improve efficiency by consolidating data and analytics workloads under a single platform.
Metadata management and data cataloging: The data cataloging capabilities of SAP Datasphere include data discovery, a business glossary, and data lineage tracking and impact analysis. SAP Datasphere also provides data product support and domain-oriented architecture through its “Spaces” virtual work environments feature, as well as capabilities for sharing data through a marketplace feature.
Integrated and/or in-database ML and predictive analytics: SAP Datasphere can leverage the ML and predictive analytics libraries and algorithms in the embedded HANA database to support ML workflows. A partnership with Databricks also enables organizations to use Databricks’ sophisticated AI and ML capabilities for these workloads without data movement.
Opportunities
SAP has room for improvement in a few decision criteria, including:
Support for open table formats: SAP Datasphere doesn’t support open table formats such as Apache Iceberg or Delta Lake as native or default formats for Datasphere itself. However, Delta Lake is the default format for data in the Databricks platform, and the Databricks component of SAP BDC enables integration with and exposure of Databricks data for direct consumption by SAP analytics applications.
GenAI automation and assistance: As of the research phase of this report, integration of the SAP Joule assistant's capabilities into SAP Datasphere is in the “early adopter” phase and not yet generally available in the data warehouse. In SAP Analytics Cloud, another component of SAP BDC, a feature called “Just Ask” allows users to query data in conversational English.
Purchase Considerations
Billing can be on a pay-as-you-go model based on consumption of capacity units, or via a longer-term enterprise license agreement. Services that are described as contributing to capacity units include compute and storage, data integration, data catalog (including crawling and storage), use of the SAP BW bridge service, and data lake storage.
Use Cases
SAP Datasphere excels in many use cases, including data integration, data modeling, traditional BI and reporting, and ML and predictive analytics. It also supports the data fabric and data mesh approaches to data architecture and management. Some industry-specific examples include support for customer 360-degree views in retail, personalized marketing campaigns, and predictive maintenance in manufacturing and supply chain management.
Snowflake: The Snowflake Platform*
Solution Overview
The Snowflake Platform is a cloud-native, fully managed SaaS data warehouse and AI solution. As a fully managed platform, Snowflake abstracts the complexity of platform administration and maintenance from customers. Separating storage and compute allows each resource to scale independently for cost and performance optimization. Snowflake enables multiple workloads from a single platform, including application development, operational workloads, AI and ML, GenAI, and, looking to the future, agentic AI.
To further its capabilities and strategy for evolving agentic AI workloads, Snowflake announced its acquisition of Crunchy Data in June 2025. Crunchy Data is a provider of PostgreSQL operational/transactional database technology. The acquired technology is intended to power Snowflake Postgres, a platform component that provides a foundation for building and deploying AI agents and applications, as well as in-house PostgreSQL functionality.
Snowflake is positioned as a Leader and Outperformer in the Innovation/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Snowflake scored well on a number of decision criteria, including:
Support for open table formats: Snowflake supports the Iceberg open table format as a native format for data. Snowflake also supports reading data from Delta Lake tables stored in external storage locations through a feature called external volumes. These objects implement an identity and access management (IAM) entity for a specific storage location; the IAM entity securely connects Snowflake to the underlying storage to access table information, including metadata and schema (a brief sketch follows below).
Integrated and/or in-database ML and predictive analytics: Snowflake ML provides a suite of capabilities to support and scale end-to-end ML workflows. These include mechanisms for preparing data, capabilities for creating and using ML features via the Snowflake Feature Store, and constructs for training, operationalizing, deploying, and monitoring ML models. At the enterprise pricing tier, Snowflake also provides ML lineage capabilities, allowing customers to track data throughout the ML lifecycle.
GenAI automation and assistance: Snowflake incorporates several features that provide GenAI assistance and automation. Snowflake Copilot enables users to ask open-ended questions to explore data, generate SQL queries via natural language, preview query results, and edit queries before running them. The assistant iterates on and builds complex queries through conversational interaction and follow-up questions, and it also helps evaluate query efficiency, optimize SQL queries, and explain queries in natural language.
Snowflake is classified as an Outperformer due to its progressive roadmap and investments in developing new, cutting-edge capabilities.
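The following minimal sketch, using the snowflake-connector-python driver, shows the external volume and Iceberg table pattern described above; the account, role ARN, bucket, and table names are placeholders.

```python
# Create an external volume over S3, then a Snowflake-managed Iceberg
# table whose data and metadata live in that external location.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myorg-myaccount", user="admin", password="secret",
    warehouse="ANALYTICS_WH", database="DEMO", schema="PUBLIC",
)
cur = conn.cursor()

cur.execute("""
    CREATE EXTERNAL VOLUME IF NOT EXISTS iceberg_vol
      STORAGE_LOCATIONS = ((
        NAME = 'us-east-1-loc'
        STORAGE_PROVIDER = 'S3'
        STORAGE_BASE_URL = 's3://my-bucket/iceberg/'
        STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access'))
""")

cur.execute("""
    CREATE ICEBERG TABLE IF NOT EXISTS sales (
        id INT, amount NUMBER(10, 2), sold_at TIMESTAMP_NTZ)
      CATALOG = 'SNOWFLAKE'
      EXTERNAL_VOLUME = 'iceberg_vol'
      BASE_LOCATION = 'sales/'
""")
```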
Opportunities
Snowflake has room for improvement in a few decision criteria, including:
Semistructured and unstructured data storage and processing: As of this writing, multiple functions for processing images and audio, as well as for multimodal analytics, were in preview. Existing capabilities in this area include storing semistructured data in the database using the VARIANT data type and storing unstructured data using the FILE data type (see the sketch after this list). Snowflake also provides full-text search and automated document processing capabilities.
Hybrid enablement: While the solution can query data in other storage systems, it is a cloud-native solution that can only be deployed in the cloud.
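As a brief illustration of the VARIANT pattern, the sketch below stores raw JSON and queries into it with Snowflake's colon-path notation; the events table and its fields are illustrative, and the connection parameters are the same placeholders as in the earlier sketch.

```python
# Store semistructured JSON in a VARIANT column and query nested fields
# without defining a schema up front.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myorg-myaccount", user="admin", password="secret",
    warehouse="ANALYTICS_WH", database="DEMO", schema="PUBLIC",
)
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS events (payload VARIANT)")
cur.execute("""
    INSERT INTO events
    SELECT PARSE_JSON('{"user": "u42", "action": "login",
                        "device": {"os": "ios", "version": "17.2"}}')
""")

# Colon-path notation reaches into the JSON; :: casts to a SQL type.
cur.execute("""
    SELECT payload:user::STRING      AS user_id,
           payload:device.os::STRING AS os
      FROM events
     WHERE payload:action::STRING = 'login'
""")
print(cur.fetchall())
```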
Purchase Considerations
Snowflake can be deployed on any of the major public clouds, with pricing influenced by factors such as the chosen Snowflake edition, data storage and transfer, compute usage, Snowflake credits, and virtual warehouse sizes. The standard tier includes core platform features, fully managed elastic compute, data security and encryption, Snowpark access, data sharing, optimized compressed storage, and time travel capabilities. The enterprise tier builds on this with multicluster compute, more granular data governance and privacy controls, and longer time travel windows. The business critical tier adds enhanced security, failover/failback, and disaster recovery to support mission-critical enterprise workloads. For maximum isolation and control over data, the Virtual Private Snowflake option provides the capabilities of the business critical edition within a fully dedicated Snowflake environment.
Use Cases
Snowflake can support a variety of analytics and AI workloads across industries such as advertising, financial services, healthcare, and manufacturing. In addition to data integration, ML, and geospatial data analysis, Snowflake provides functions for sentiment extraction, semantic search, and document processing (such as text extraction, summary generation, and document analysis). A set of capabilities called Cortex Agent (currently in preview) could help customers orchestrate and monitor AI agents.
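To make the in-platform AI functions concrete, here is a hedged sketch of two Snowflake Cortex AI SQL functions; the product_reviews table is hypothetical, and model availability varies by region and account.

```python
# Run Cortex AI functions in SQL, so text analytics happens where the
# data lives rather than in an external service.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myorg-myaccount", user="admin", password="secret",
    warehouse="ANALYTICS_WH", database="DEMO", schema="PUBLIC",
)
cur = conn.cursor()

# Sentiment scoring (roughly -1 to 1), applied row by row in-platform.
cur.execute("""
    SELECT review_id,
           SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS sentiment
      FROM product_reviews
     LIMIT 10
""")
print(cur.fetchall())

# Free-form completion against a Cortex-hosted LLM, also without data movement.
cur.execute("""
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        'Summarize this review in one sentence: ' || review_text)
      FROM product_reviews
     LIMIT 3
""")
print(cur.fetchall())
```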
Teradata: Teradata VantageCloud
Solution Overview
Teradata VantageCloud is a modern data, analytics, and AI platform designed to support enterprise-scale workloads on large data volumes. The cloud-native platform provides a browser-based UI intended to support exploratory, ad hoc, departmental, and production workloads. Data can be stored in VantageCloud in cloud object storage or in the platform’s object file system.
VantageCloud is available in many deployment options, including on-prem and in the public cloud as a managed SaaS offering. Teradata provides flexible tooling so users can work with the languages and tools of their choice, such as SQL, Python, and R.
Teradata pioneered many of the core data warehouse optimizations and architectures that form the table stakes described in the companion Key Criteria report, including MPP. With over four decades of data warehouse experience, Teradata is a major incumbent and data warehouse specialist that brings its perspective to bear for large enterprises and smaller businesses alike.
Teradata is positioned as a Leader and Fast Mover in the Maturity/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Teradata scored well on a number of decision criteria, including:
Streaming and real-time data ingestion: In addition to a wide ecosystem that includes data integration platforms and streaming data solutions such as Amazon Kinesis, Teradata provides multiple options for data ingestion at scale. The Kafka Access Module for the Teradata Parallel Transporter data ingestion service provides additional support for streaming data within the platform. Teradata can integrate with third-party solutions such as Fivetran and Airbyte.
Concurrency optimizations: Teradata offers strong enterprise workload management capabilities. Teradata Active System Management (TASM) provides database administrators the ability to dynamically and automatically manage workloads according to their SLAs. This includes capabilities for workload prioritization, performance tuning, and system monitoring and management.
Integrated and/or in-database ML and analytics: Teradata excelled in this key feature through its ClearScape Analytics suite, which provides a portfolio of in-database functions and tools to help customers perform complex analytics and manage the full ML lifecycle. This includes building, training, and deploying ML models directly within the database. ClearScape Analytics lets users work with their preferred tools, such as Python, R, and SQL, to run analytics functions and execute code (a brief sketch follows this list).
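As a hedged illustration, the sketch below uses Teradata's teradataml Python client, which compiles DataFrame operations and analytic function calls down to SQL that runs in Vantage; the host, table, and column names are hypothetical, and the KMeans wrapper's exact signature should be checked against the teradataml release in use.

```python
# In-database clustering from Python: the transactions table is never
# pulled to the client, and the model is computed inside Vantage.
from teradataml import DataFrame, KMeans, create_context, remove_context

create_context(host="vantage.example.com", username="dba", password="secret")

txns = DataFrame("transactions")  # lazy reference to an in-database table

segments = KMeans(data=txns,
                  id_column="account_id",
                  target_columns=["spend", "frequency"],
                  num_clusters=4)

# Cluster assignments, materialized in-database and fetched on demand.
print(segments.result.head())

remove_context()
```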
Opportunities
Teradata has room for improvement in a few decision criteria, including:
Metadata management and data cataloging: Teradata doesn’t appear to have native GenAI-powered capabilities in this area, such as AI-generated documentation for data assets. However, Teradata has a strong ecosystem of integrations and partners with third-party solutions, such as Alation, to provide these advanced capabilities. Looking ahead, Teradata says it is investing in an open global catalog and semantic mapping to enable the creation of a unified knowledge base.
GenAI automation and assistance: While Teradata’s ask.ai GenAI chatbot assistant became generally available on AWS in 2024, as of October 2025, it remained exclusive to that cloud. Teradata recently announced a host of capabilities and frameworks enabling agentic AI and on-prem AI development. This marks an evolution from an earlier approach centered on in-platform GenAI assistance, signaling a broader vision that moves beyond a single chatbot toward a comprehensive framework for AI enablement and readiness for its customers.
Purchase Considerations
Teradata supports deployment on-prem, across the three major public clouds, in a customer’s virtual private cloud, and in hybrid scenarios. Teradata pricing is based on compute, storage, and add-ons such as AI, data transfer, apps, or services.
Use Cases
Beyond traditional BI and reporting, Teradata supports in-database ML and advanced analytics while introducing a variety of capabilities that align with its broader vision of comprehensive GenAI and Agentic AI enablement. Some specific industries that can benefit from the Teradata solution include financial services for fraud protection and regulatory compliance, telecommunications for personalized customer experiences and improving operational efficiency, manufacturing for process optimization and predictive maintenance, and healthcare for patient-360 and disease-onset management. Teradata also recently released the Teradata Enterprise Vector Store, an in-database solution that supports storing, managing, and processing vector embeddings. It helps combine unstructured data with structured data from the warehouse to improve the output of GenAI and agentic AI applications.
VMware (Broadcom): Tanzu Data Intelligence, Tanzu Greenplum
Solution Overview
VMware Tanzu Greenplum is an enterprise-scale, high-performance data warehousing and analytics database offering. It is based on PostgreSQL, a relational database, and is optimized for complex analytics over large volumes of data.
Tanzu Greenplum is a component of the Tanzu Data Intelligence (TDI) platform, VMware’s modern data analytics and AI platform. TDI encompasses multiple services for data ingestion, a data lakehouse, data warehousing, ML and AI, and data governance. Within this context, Tanzu Greenplum functions as a core, high-performance, highly optimized advanced analytics database engine. Other components of TDI include Tanzu Data Lake, Tanzu GemFire (an in-memory, real-time operational data store), Tanzu RabbitMQ (real-time messaging and event streaming), Tanzu Data Flow (a data pipeline orchestration service), Tanzu for Postgres and Tanzu for MySQL (curated, enterprise-grade relational databases), and Tanzu for Valkey (a caching data engine).
VMware is positioned as a Challenger and Fast Mover in the Maturity/Platform Play quadrant of the data warehouse Radar chart.
Strengths
VMware scored well on a number of decision criteria, including:
Streaming and real-time data ingestion: Tanzu Greenplum supports multiple capabilities for streaming and real-time data ingestion. Tanzu Greenplum Streaming Server (GPSS) integrates with streaming data solutions such as Apache Kafka and the Tanzu RabbitMQ component of TDI, enabling continuous data ingestion into Tanzu Greenplum. The solution also supports data replication. The Platform Extension Framework (PXF) enables querying external streaming frameworks or federated datasets without ingesting the data.
Concurrency optimizations: Tanzu Greenplum provides multiple capabilities that enable it to support high levels of concurrent users and process large volumes of concurrent queries. These features include row-level locking and multi-version concurrency control (MVCC), which support multiple simultaneous transactions without conflicts, and resource groups, which allow CPU, memory, and I/O resources to be allocated to critical workloads while maintaining system performance. Dynamic workload management rules adjust resource allocation in real time based on query complexity, system load, and user role, complemented by workload scheduling and prioritization.
Integrated and/or in-database ML and predictive analytics: Tanzu Greenplum supports the open source Apache MADlib library, which provides algorithms for regression, classification, clustering, dimensionality reduction, topic modeling, graph analytics, and time-series forecasting. It lets users run ML workflows without exporting data to a separate environment (the sketch after this list illustrates both PXF access and in-database MADlib training).
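The following minimal psycopg2 sketch combines two patterns from this list: querying external Parquet data through PXF, then training a MADlib linear regression where the data lives; the bucket, PXF server, and table names are hypothetical.

```python
# One session against Greenplum: federate Parquet data via PXF, then
# train a regression model in-database with MADlib.
import psycopg2

conn = psycopg2.connect(host="greenplum.example.com", port=5432,
                        dbname="analytics", user="gpadmin", password="secret")
conn.autocommit = True
cur = conn.cursor()

# External table over Parquet files in S3, read through PXF at query time.
cur.execute("""
    CREATE EXTERNAL TABLE ext_sales (
        id INT, region TEXT, units INT, discount NUMERIC, amount NUMERIC)
    LOCATION ('pxf://sales-bucket/2025/?PROFILE=s3:parquet&SERVER=s3srv')
    FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import')
""")

# MADlib linear regression; the model lands in the sales_model table
# rather than being exported to the client.
cur.execute("""
    SELECT madlib.linregr_train(
        'ext_sales', 'sales_model', 'amount', 'ARRAY[1, units, discount]')
""")
cur.execute("SELECT coef FROM sales_model")
print(cur.fetchone())
```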
Opportunities
VMware has room for improvement in a few decision criteria, including:
Managed services: VMware doesn’t currently offer Tanzu Greenplum as a managed, cloud-based SaaS offering. However, as of this writing, VMware says it is actively working on developing a formal managed services provider (MSP) reseller program in collaboration with Broadcom. This would give customers a wider range of choices, from the control and customizability of a customer-managed implementation to the professional support and infrastructure management of a managed services offering.
Support for open table formats: According to VMware, adding support for the Apache Iceberg open table format is a near-term roadmap item. Tanzu Greenplum currently supports querying and accessing data in formats such as Parquet files stored in external systems, including HDFS and cloud object storage. This support is achieved through PXF.
GenAI automation and assistance: As of this writing, Tanzu Greenplum does not support any in-platform features for GenAI automation or assistance, such as copilot-like assistants, AI-powered code generation, natural language query interfaces, GenAI-powered visualizations and reports, or GenAI assistance in creating and maintaining the data pipelines that move and transform data into the warehouse. However, Tanzu Greenplum offers strong capabilities for GenAI-related workloads, including support for storing vector embeddings, performing vector similarity searches, and running RAG queries. Native integration with PostgresML and HuggingFace enables in-database execution of LLMs, and vector search can be combined with SQL-based filtering for enhanced precision and relevance of search results (a brief sketch follows this list).
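Here is a hedged sketch of the vector-search-plus-SQL-filtering pattern, assuming the pgvector-compatible vector extension is enabled in Tanzu Greenplum; the doc_chunks table and the four-dimensional embeddings are purely illustrative (real embeddings are typically hundreds of dimensions).

```python
# RAG-style retrieval in Greenplum: nearest neighbors by cosine distance
# (the <=> operator), narrowed by an ordinary SQL predicate.
import psycopg2

conn = psycopg2.connect(host="greenplum.example.com", dbname="analytics",
                        user="gpadmin", password="secret")
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("""
    CREATE TABLE IF NOT EXISTS doc_chunks (
        id INT, product TEXT, chunk TEXT, embedding vector(4))
""")

query_vec = "[0.12, 0.48, 0.33, 0.91]"  # stand-in for a model-generated embedding
cur.execute("""
    SELECT chunk
      FROM doc_chunks
     WHERE product = 'greenplum'          -- SQL filter for precision
     ORDER BY embedding <=> %s::vector    -- cosine distance ranking
     LIMIT 5
""", (query_vec,))
print(cur.fetchall())
```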
Purchase Considerations
Tanzu Greenplum is licensed on a subscription basis. Pricing is per core for a specific term, such as a year. Tanzu Greenplum can be deployed as a public cloud image in any of the three major public clouds, as a software-only distribution, as a virtual VMware vSphere appliance, or as a customer-managed bare metal implementation. A managed SaaS offering is stated to be on the vendor’s near-term roadmap.
Use Cases
VMware supports a wide variety of analytics workloads, including BI, reporting, and dashboards. In-database ML capabilities support training, deployment, and management of ML models within a single environment. Tanzu Greenplum delivers real-time data processing and analytics for scenarios that require decisions to be made as events occur in the real world, such as facilitating cybersecurity threat detection or financial services market trading and analysis.
Yellowbrick: Yellowbrick Data Warehouse*
Solution Overview
Yellowbrick Data Warehouse is an MPP data analytics platform based on PostgreSQL, supporting large-scale analytics across mixed workloads. It supports enterprise data warehousing, streaming data analytics, and the development of analytics applications. Yellowbrick’s primary emphasis and key differentiators include cost/performance optimization and efficiency, personalized support, and flexible deployment options to meet data sovereignty and residency requirements.
A hybrid row/column store helps the platform better handle high-speed ingestion of streaming and real-time data, as well as historical batch data. Yellowbrick has invested in abstracting the complexities of platform administration and maintenance for the customer. It seeks to provide customers with a “SaaS-like” experience while still giving them control over where their data resides.
Yellowbrick is positioned as a Challenger and Fast Mover in the Maturity/Platform Play quadrant of the data warehouse Radar chart.
Strengths
Yellowbrick scored well on a number of decision criteria, including:
Streaming and real-time data ingestion: Yellowbrick’s PostgreSQL compatibility enables any PostgreSQL-compatible data connector to work with Yellowbrick. This allows Yellowbrick to integrate with a number of streaming data platforms and CDC tools. The solution also provides native connectors for streaming data sources, such as Kafka, and has a hybrid row/column store structure to optimize its ability to handle real-time streaming and historical batch data.
Concurrency optimizations: Yellowbrick offers multiple mechanisms that contribute to strong workload management capabilities as well as high-concurrency, low-latency interactive queries. These include load balancing across compute clusters and management of mixed workloads on a single cluster through automatically triggered workload management policies. Policy-based resource management helps route queries to the correct cluster, and Yellowbrick also includes features intended to prevent runaway queries from disrupting other users’ workloads.
Opportunities
Yellowbrick has room for improvement in a few decision criteria, including:
Support for open table formats: Yellowbrick doesn’t appear to provide support for data stored in open table formats directly within the solution. However, it has a strong partnership with Databricks, where the Delta Lake open table format is the default format for data storage, and Yellowbrick can connect to and leverage Databricks’ capabilities for workloads that involve data stored in this format. That said, relying on a second platform can introduce at least some additional complexity.
Analytics workload diversity: As of the research phase of this report, support for geospatial functions is in public preview. Once these functions are generally available, they will further enhance Yellowbrick’s existing capabilities in this area.
Purchase Considerations
Yellowbrick supports multiple deployment options, including the public cloud, a customer’s private cloud, on-prem, or any combination of these. Yellowbrick pricing allows customers to select from several choices. One-year or three-year subscriptions, with pricing per vCPU per year, are described as providing predictable value and a guaranteed multiyear price. There is also an on-demand option, billed monthly for consumption per vCPU per hour.
Use Cases
Yellowbrick supports a number of key use cases, including enterprise BI and reporting, application development, and real-time data analysis. Yellowbrick also supports customers with migration from legacy systems. Specific use cases across verticals include financial services (customer 360, portfolio monitoring, credit risk analysis, real-time fraud detection and analytics), retail (inventory management and demand forecasting), public sector (supply chain management and logistics), and insurance (financial reporting and compliance, underwriting, and claims ratio calculation).
6. Analyst’s Outlook
The data warehouse market is an established and mature sector. Vendors have worked for decades to develop, refine, and enhance their platforms. Some have been doing so since the technology was invented.
Continued development and evolution are driven by several key factors, reflecting the major considerations that influence a potential purchase decision. Organizations need solutions that can handle a growing diversity of analytics workloads from a single data platform. This need continues to push data warehouses to expand by adding support for open table formats, integrating and adding in-database features that can support ML and predictive analytics, and supporting semistructured and unstructured data storage and processing.
Customers need to make decisions faster, and this continues to push data warehouses to develop streaming and real-time data ingestion capabilities. Enterprise organizations must build and improve their AI readiness, and this innovation imperative continues to drive the addition of in-database features to enable AI-related workloads. It also encourages the development of stronger metadata management and data cataloging capabilities, along with the capacity to enrich data with semantic business context. These features ensure a high-quality, well-organized, and well-managed data foundation.
For a potential customer seeking to understand the data warehouse landscape, it’s important to know that the choice of a data warehouse platform today is influenced not only by query performance and architectural considerations but also by how well the offering enables diverse workloads and helps businesses position themselves for the future.
For IT decision-makers seeking to adopt a data warehouse, the following short-term, actionable items can help best position you for success:
Define key use cases: Determine and articulate the organization’s priorities for critical workloads before reviewing any specific vendor offerings. These priorities might include BI and reporting for internal dashboards or customer-facing embedded analytics, AI and ML, and real-time analytics. This is a key step that helps establish and maintain a business objective-oriented perspective for viewing and evaluating vendor platforms.
Demonstrate ROI with a proof of concept: Creating a proof of concept for a controlled, limited-scope scenario can help evaluate solutions in practice, build confidence in the data warehouse's ability to solve business problems, and generate enthusiasm among stakeholders.
Lay a strong AI-ready foundation: It’s worth investing up-front effort to plan and implement a data model that can also enrich organizational data with the appropriate business context. It’s also worth the effort to set up pipelines to transform and prepare high-quality data for consumption in AI-related and agentic workloads, helping position the organization well for the future.
Data warehouses continue to develop in response to the needs of enterprise customers. Vendors continue to expand support for handling BI, ML, and agentic AI workloads from a single platform. As organizations’ data estates continue to expand, data warehouses’ ability to support hybrid and multicloud architectures remains essential. Organizations will also need to continue to invest in training and upskilling data teams and personnel, encouraging continued involvement in analytics and helping users obtain the information they need. These key takeaways are important starting points for IT and business leadership on their quest to understand the potential of data warehouses to support their business objectives.
7. Methodology
*Vendors marked with an asterisk did not participate in our research process for the Radar report, and their capsules and scoring were compiled via desk research.
For more information about our research process for Radar reports, please visit our Methodology.
8. About Andrew J. Brust
Andrew Brust has held developer, CTO, analyst, research director, and market strategist positions at organizations ranging from the City of New York and Cap Gemini to GigaOm and Datameer. He has worked with small, medium, and Fortune 1000 clients in numerous industries and with software companies ranging from small ISVs to large clients like Microsoft. The understanding of technology and the way customers use it that resulted from this experience makes his market and product analyses relevant, credible, and empathetic.
Andrew has tracked the Big Data and Analytics industry since its inception, as GigaOm’s Research Director and as ZDNet’s original blogger for Big Data and Analytics. Andrew co-chairs Visual Studio Live!, one of the nation’s longest-running developer conferences, and currently covers data and analytics for The New Stack and VentureBeat. As a seasoned technical author and speaker in the database field, Andrew understands today’s market in the context of its extensive enterprise underpinnings.
9. About GigaOm
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.
10. Copyright
© Knowingly, Inc. 2026 "GigaOm Radar for Data Warehouses" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.