16
 min read

Maintaining Materialized Views on MySQL with LeapDB

Learn different ways to maintain materialized views including LeapDB.

How to maintain Materialized Views on MySQL with LeapDB.

A materialized view is a table that the database keeps up to date with the results of another query. This allows you to have denormalized data strategically for speeding up certain queries without the performance and data correctness challenges and non maintainability of a denormalized schema.

So for a very simple example, let’s say you want a table of active users and their transaction count:

START TRANSACTION;
TRUNCATE active_users_mv;
INSERT INTO active_users_mv(user_id, name, transaction_cnt) 
SELECT user_id, name, count(*) as transaction_cnt
FROM user
JOIN transaction
USING (user_id)
GROUP BY user.user_id;
COMMIT;


You’ll need a faster solution than the above SQL script to keep that materialized view up to date on a large database.


There are a few ways that people are currently maintaining materialized views of their data. 

Create mysql VIEW’s that make it easier to query your data

  • These will be extremely slow to query
  • Very easy to set up
  • They stay up to date
  • Only useful for very small data sets
  • You will need to spend heavily on hardware as your data grows to keep the system running
  • Transition to LeapDB is an seemless way to achieve speed.
  • Correctness guaranteed.


Create a separate out of date view of your tables (like our example above)

  • Easy and cheap if it’s a single unchanging snapshot.
  • CREATE a table and INSERT SELECT into it from your system.
  • TRUNCATE and re INSERT SELECT later to re sync the data.
  • Easy if you only need to rebuild the snapshot once in a while
  • It may be more out of date than you’d like. 
  • If your data set is too large it may be difficult to generate the data.
  • Fast when it comes to querying it.
  • Correctness guaranteed.


Create a separate up to date view of your tables maintained by your application

  • Difficult to maintain
  • Difficult to set up
  • High performance
  • Not as high performance as LeapDB because typically people do not batch and aggregate their increments.
  • Expensive in engineering time
  • We are tempted to write a tutorial on how to do this, but now that LeapDB is available, a home grown solution is not a great answer. In fact the biggest enthusiasts of LeapDB are those that have manually maintained solutions.
  • This can be synchronous and transactional.
  • Difficult to maintain correctness. Regular auditing recommended.


Use Triggers to maintain your materialized views

  • Fast for your queries
  • Faster than rebuilding your table.
  • Only works for simple single table VIEWS
  • Not so easy to maintain although easier than putting it in the application.
  • Slows down changes to your base tables.
  • No ability to batch changes together. Each update initiates a trigger.
  • This is synchronous and not asynchronous and is even transactional.
  • Difficult to think through but should maintain correctness.


LeapDB

  • Easy to set up
  • High performance
  • Fast for your queries
  • Asynchronous. This keeps your transactions lean. Triggers and native sync materialized views are overhead and stop batching.
  • Incrementally updated. Refreshing does not work well for large views.
  • Correct data is maintained for you by LeapDB.
  • Complex queries supported.
  • Batch and aggregation used for performance.
  • Easy to maintain. Views are defined by a SELECT statement. Define your view with a SELECT statements with JOIN and GROUP BY and HAVING and other materialized views.
  • We are 100% compatible extension of MySQL 5.7 (More versions to follow.)
  • This means you don’t have to deal with some third party engine that may or may not behave exacty like MySQL and which would therefore put your data at risk.
  • This means you team doesn’t have to learn a new obscure technology.

Standard Use Cases for Materialized Views

  • Reporting with aggregation
  • Aggregation requires the database to scan through every single record in each GROUP. Obviously, if groups are large this is inevitably slow. It’s much better to keep the GROUPs in a separate table and let LeapDB maintain it for you. Having sub second results for analytical queries can open up a realm of possibilities.
  • Reporting with complex multi-table criteria
  • If you have queries that involve many tables and no single table is able to filter the results, and the tables are not so small, it often becomes impossible to optimize the query to the sub second performance that is often required. In this case a great solution is to either make a materialized view out of the query, or to get a bit more strategic and make a view that can speed up that whole class of query. Sometimes, you need a little bit of denormalized data to make your database leap.
  • Simplified views of the data
  • Sometimes, it can greatly simplify your schema to have a simplified view of your data. MySQL already has calculated columns but those only work based upon a single row of a single table.

You want to have a normalized schema. The benefits in terms of performance and flexibility and correctness and clarity and maintainability are many. However, you also often need to have strategically denormalized data. This can accelerate queries beyond the theoretical limits of your database and make your system more usable for reporting. We therefore recommend to you that you use LeapDB so that you can have the benefits of normalized data and the benefits of denormalization without the pitfalls of either.