How to maintain Materialized Views on MySQL with LeapDB.
A materialized view is a table that the database keeps up to date with the results of another query. This allows you to have denormalized data strategically for speeding up certain queries without the performance and data correctness challenges and non maintainability of a denormalized schema.
So for a very simple example, let’s say you want a table of active users and their transaction count:
INSERT INTO active_users_mv(user_id, name, transaction_cnt)
SELECT user_id, name, count(*) as transaction_cnt
GROUP BY user.user_id;
You’ll need a faster solution than the above SQL script to keep that materialized view up to date on a large database.
There are a few ways that people are currently maintaining materialized views of their data.
Create mysql VIEW’s that make it easier to query your data
These will be extremely slow to query
Very easy to set up
They stay up to date
Only useful for very small data sets
You will need to spend heavily on hardware as your data grows to keep the system running
Transition to LeapDB is an seemless way to achieve speed.
Create a separate out of date view of your tables (like our example above)
Easy and cheap if it’s a single unchanging snapshot.
CREATE a table and INSERT SELECT into it from your system.
TRUNCATE and re INSERT SELECT later to re sync the data.
Easy if you only need to rebuild the snapshot once in a while
It may be more out of date than you’d like.
If your data set is too large it may be difficult to generate the data.
Fast when it comes to querying it.
Create a separate up to date view of your tables maintained by your application
Difficult to maintain
Difficult to set up
Not as high performance as LeapDB because typically people do not batch and aggregate their increments.
Expensive in engineering time
We are tempted to write a tutorial on how to do this, but now that LeapDB is available, a home grown solution is not a great answer. In fact the biggest enthusiasts of LeapDB are those that have manually maintained solutions.
This can be synchronous and transactional.
Difficult to maintain correctness. Regular auditing recommended.
Use Triggers to maintain your materialized views
Fast for your queries
Faster than rebuilding your table.
Only works for simple single table VIEWS
Not so easy to maintain although easier than putting it in the application.
Slows down changes to your base tables.
No ability to batch changes together. Each update initiates a trigger.
This is synchronous and not asynchronous and is even transactional.
Difficult to think through but should maintain correctness.
Easy to set up
Fast for your queries
Asynchronous. This keeps your transactions lean. Triggers and native sync materialized views are overhead and stop batching.
Incrementally updated. Refreshing does not work well for large views.
Correct data is maintained for you by LeapDB.
Complex queries supported.
Batch and aggregation used for performance.
Easy to maintain. Views are defined by a SELECT statement. Define your view with a SELECT statements with JOIN and GROUP BY and HAVING and other materialized views.
We are 100% compatible extension of MySQL 5.7 (More versions to follow.)
This means you don’t have to deal with some third party engine that may or may not behave exacty like MySQL and which would therefore put your data at risk.
This means you team doesn’t have to learn a new obscure technology.
Standard Use Cases for Materialized Views
Reporting with aggregation
Aggregation requires the database to scan through every single record in each GROUP. Obviously, if groups are large this is inevitably slow. It’s much better to keep the GROUPs in a separate table and let LeapDB maintain it for you. Having sub second results for analytical queries can open up a realm of possibilities.
Reporting with complex multi-table criteria
If you have queries that involve many tables and no single table is able to filter the results, and the tables are not so small, it often becomes impossible to optimize the query to the sub second performance that is often required. In this case a great solution is to either make a materialized view out of the query, or to get a bit more strategic and make a view that can speed up that whole class of query. Sometimes, you need a little bit of denormalized data to make your database leap.
Simplified views of the data
Sometimes, it can greatly simplify your schema to have a simplified view of your data. MySQL already has calculated columns but those only work based upon a single row of a single table.
You want to have a normalized schema. The benefits in terms of performance and flexibility and correctness and clarity and maintainability are many. However, you also often need to have strategically denormalized data. This can accelerate queries beyond the theoretical limits of your database and make your system more usable for reporting. We therefore recommend to you that you use LeapDB so that you can have the benefits of normalized data and the benefits of denormalization without the pitfalls of either.