SimplicityBI

Wednesday, 20 June 2018 / Published in Data Virtualization

Data Virtualization and Database Migration and Acceleration

Author: Rick F. van der Lans
Date: June 2018

Not every organization is happy with the reporting and analytical performance of its data warehouse environment. One customer indicated that some of their online reports take at least ten minutes to complete. Ten minutes is a long time to wait for a report to show up on your screen. In some organizations the problem is not query performance but the time it takes to load new data into the data warehouse or data marts. Another customer mentioned that refreshing their data warehouse takes a full weekend plus Monday morning.

But what can we do about this tenacious performance problem? That is the topic of this fourth article in a series on use cases of data virtualization [link to third article].

The Performance Struggle

Getting the right performance is a real struggle for many organizations. Their database administrators continuously strive to improve the query and load performance. Sometimes it’s hard to find a way to accelerate a query or a load process. And sometimes, if they find a way to improve something, the law of conservation of misery comes into play: if you speed up something, you’re very likely slowing down something else. It’s almost never a win-win situation.

The Need for a Faster Database Platform

So, why don’t organizations switch to another, faster database platform? Why not migrate to one of the many really fast database platforms that have been introduced in the last couple of years, such as Snowflake in the cloud, MapD on a GPU-based platform, or Impala on a Hadoop cluster? All of them have been optimized and tuned specifically to support data warehouse workloads. Regrettably, life is not that easy. Several aspects complicate migration to another database server:

  • Different SQL dialects: The SQL dialects implemented by the vendors in their SQL database servers are not the same. Some support SQL window functions, others don’t; some support advanced analytical functions, others don’t; and some support recursive queries, others don’t. Therefore, queries cannot always be migrated without rewriting them.
  • Different internal architectures: Although SQL database servers look very similar on the outside, they can be very different on the inside. For example, the internal architecture of a GPU-based SQL database server is very different from a classic SQL product, which in turn is very different from a SQL-on-Hadoop engine. So, queries have to be reformulated to fully exploit these products.
  • Different data structures: Some database servers are good at running queries on normalized table structures, while others prefer star schemas or denormalized data structures. This implies that even if the new platform accepts our queries unchanged, we may still have to rewrite them, because the data structures have been changed to make optimal use of this platform.
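The dialect problem in the list above can be made concrete with a small sketch. The function name and the dialect keys below are hypothetical and chosen only for illustration; the point is that one and the same request ("give me the first n rows") has to be spelled differently depending on the target database server.

```python
# Illustrative sketch only: top_n_query and the dialect keys are hypothetical
# names, not part of any data virtualization product.

def top_n_query(table: str, order_by: str, n: int, dialect: str) -> str:
    """Render a 'first n rows' query in the syntax of the given SQL dialect."""
    if dialect == "sqlserver":
        # SQL Server limits rows with SELECT TOP (n).
        return f"SELECT TOP ({n}) * FROM {table} ORDER BY {order_by}"
    if dialect == "postgresql":
        # PostgreSQL (like MySQL and SQLite) uses a trailing LIMIT clause.
        return f"SELECT * FROM {table} ORDER BY {order_by} LIMIT {n}"
    if dialect == "standard":
        # The ISO SQL row-limiting clause.
        return f"SELECT * FROM {table} ORDER BY {order_by} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unknown dialect: {dialect}")


for d in ("sqlserver", "postgresql", "standard"):
    print(d, "->", top_n_query("sales", "amount", 10, d))
```

A report that hard-codes any one of these forms has to be rewritten when the database server changes, which is exactly the migration cost described above.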

 

Data Virtualization and Database Migration

Data virtualization can come to the rescue when migrating to another database. Placing a data virtualization server between the reports and the database server eases switching to another database server. If data is migrated to another database server, the data virtualization server hides that this new product supports a (slightly) different SQL dialect. The reports work with the SQL dialect of the data virtualization server, and the latter tries to push down most of that SQL query to the underlying database server. What is pushed down differs for each database server.
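How a virtualization layer might decide what to push down can be sketched as follows. This is a deliberately simplified model and not how any specific product works: the backend names, the capability sets, and the split_plan function are all assumptions made for illustration. A query is modeled as an ordered list of operations, and the longest leading run that the backend supports is pushed down; the remainder is executed by the virtualization layer itself.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical capability catalog: which operations each backend can execute.
BACKEND_CAPABILITIES: Dict[str, Set[str]] = {
    "legacy_db": {"filter", "join", "aggregate"},
    "new_db": {"filter", "join", "aggregate", "window"},
}


def split_plan(operations: List[str], backend: str) -> Tuple[List[str], List[str]]:
    """Split an ordered plan into (pushed down, executed in the virtualization layer)."""
    supported = BACKEND_CAPABILITIES[backend]
    pushed: List[str] = []
    for op in operations:
        if op not in supported:
            break  # everything from here on must run above the database
        pushed.append(op)
    return pushed, operations[len(pushed):]


plan = ["filter", "aggregate", "window"]
print(split_plan(plan, "legacy_db"))  # the window step stays in the virtualization layer
print(split_plan(plan, "new_db"))     # the whole plan is pushed down
```

The report submits the same plan in both cases; only the split between database server and virtualization layer changes.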

Data virtualization servers also allow for a gradual and stepwise migration. There is no need to migrate all the tables in one go. Instead, tables can be migrated one by one or group by group. Data virtualization servers will use their data federation capabilities to hide the fact that the data is distributed across the old and the new database.
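The stepwise migration described above can be illustrated with a minimal sketch. It uses two in-memory SQLite databases as stand-ins for the old and the new database server, and a plain dictionary as a stand-in for the catalog of a data virtualization server; the table names, the catalog, and all function names are hypothetical.

```python
import sqlite3

old_db = sqlite3.connect(":memory:")  # stand-in for the existing database server
new_db = sqlite3.connect(":memory:")  # stand-in for the new, faster platform

SCHEMAS = {
    "customers": "CREATE TABLE customers (id INTEGER, name TEXT)",
    "orders": "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
}
old_db.execute(SCHEMAS["customers"])
old_db.execute(SCHEMAS["orders"])
old_db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
old_db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 20.0)])

# Hypothetical catalog: maps each virtual table to the database that holds it.
catalog = {"customers": old_db, "orders": old_db}


def read_virtual_table(name):
    """Read a virtual table from wherever the catalog says it currently lives."""
    return catalog[name].execute(f"SELECT * FROM {name}").fetchall()


def migrate_table(name):
    """Copy one table to the new database and repoint the catalog entry."""
    rows = read_virtual_table(name)
    new_db.execute(SCHEMAS[name])
    if rows:
        placeholders = ", ".join("?" * len(rows[0]))
        new_db.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)
    catalog[name] = new_db


def revenue_per_customer():
    """The 'report': joins two virtual tables without knowing where they are stored."""
    names = dict(read_virtual_table("customers"))
    totals = {}
    for customer_id, amount in read_virtual_table("orders"):
        key = names[customer_id]
        totals[key] = totals.get(key, 0.0) + amount
    return totals


before = revenue_per_customer()   # both tables in the old database
migrate_table("orders")           # migrate a single table
after = revenue_per_customer()    # federated across old and new database
print(before == after)
```

After migrate_table("orders") the customers table still lives in the old database and the orders table in the new one, yet the report code is unchanged.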

As indicated in the first article of this series, data virtualization servers support caching; see Figure 1. With caching, the content of a virtual table is determined and stored in a database. This feature can be used to accelerate query processing. For example, if access to a certain set of physical tables in a data mart is constantly slow, the virtual tables pointing to those physical tables can be cached in another, faster database platform. When the virtual tables are cached, query performance is determined by this new platform. Reports won’t have to be modified when caching is deployed.
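The caching idea can be sketched in a few lines. Again, two in-memory SQLite databases stand in for the slow data mart and the faster cache platform, and every name below (the sales table, the sales_by_region virtual table, the functions, and the state dictionary) is an assumption made for illustration rather than a description of a particular product.

```python
import sqlite3

source = sqlite3.connect(":memory:")    # stand-in for the slow data mart
cache_db = sqlite3.connect(":memory:")  # stand-in for the faster platform

source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 10.0), ("EU", 5.0), ("US", 7.0)],
)

# Definition of the hypothetical virtual table 'sales_by_region'.
VIRTUAL_TABLE_SQL = (
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
)

state = {"cached": False, "source_hits": 0}


def refresh_cache():
    """Determine the virtual table's content once and store it in the cache database."""
    state["source_hits"] += 1
    rows = source.execute(VIRTUAL_TABLE_SQL).fetchall()
    cache_db.execute("DROP TABLE IF EXISTS sales_by_region")
    cache_db.execute("CREATE TABLE sales_by_region (region TEXT, total REAL)")
    cache_db.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)
    state["cached"] = True


def read_sales_by_region():
    """What the report calls; it is identical with and without caching."""
    if state["cached"]:
        return cache_db.execute(
            "SELECT region, total FROM sales_by_region ORDER BY region"
        ).fetchall()
    state["source_hits"] += 1
    return source.execute(VIRTUAL_TABLE_SQL).fetchall()


uncached_rows = read_sales_by_region()  # answered by the slow source
refresh_cache()
cached_rows = read_sales_by_region()    # answered by the cache, source untouched
print(cached_rows == uncached_rows, state["source_hits"])
```

Once the cache is filled, reads no longer touch the source, so their performance depends only on the cache platform. A real deployment also needs a refresh policy, which this sketch leaves out.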

Figure 1:

The Long-Term Perspective of Database Migration

If we look at data migration from a more long-term perspective, another advantage becomes clear. When a data virtualization server is used to decouple the data consumers from the data stores, it’s much easier to benefit from all the new database platforms that have been introduced lately and are probably going to be introduced in the coming years. With a data virtualization layer, organizations are not stuck with the database technology they selected a long time ago. It’s worth noting that in recent years we have seen an avalanche of new database technologies. Just think about all the Hadoop-related technologies, the GPU-based database products, the NoSQL products, the in-memory products, and the translytical database servers.

Summary

Data virtualization servers can be used to develop entire data warehouse projects, such as the logical data warehouse architecture (see Article 3), but also for more down-to-earth use cases, such as database migration. Data warehouses are becoming bigger and bigger, and the reporting and analytical workloads are continuously expanding. Eventually, for many organizations the performance of their existing database server will no longer be sufficient. Data virtualization servers may be the way to migrate smoothly and seamlessly, using a low-risk, stepwise approach.

In the fifth article of this series [link to next article], we focus on a popular topic: the data lake. Data virtualization supports the development of more practical data lake architectures.

GET IN TOUCH

T: +1 (800) 308 8114
Email: contact@simplicitybi.com

SimplicityBI
407 2nd Street SW, Calgary
Alberta, Canada


© 2022. All rights reserved. Powered by Instalogic Marketing
