🪅 What are the differences between OLTP and OLAP databases?
Jan 30, 2023
Author: Robin Tang
When differentiating between databases, one key characteristic is whether the database is OLTP or OLAP. In this blog post, we will explore what they are, their key differences, and some real-world examples.
Online transactional processing (OLTP) databases are characterized by high-throughput, low-latency transactions and are meant to back internal applications.
OLTP databases are known for strong data integrity, transaction support, and indices that support well-known access patterns.
OLTP databases enable applications to serve requests such as:
- Checking whether a user can log in
- Processing a credit card payment
- Connecting a rider with a driver
- Dispensing cash at an ATM
- Checking a guest into a hotel
- Issuing an airline ticket
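To make the access pattern concrete, the login check above boils down to a single-row, index-backed point lookup. Here is a minimal sketch using SQLite as a stand-in for a production OLTP database; the `users` table and its columns are hypothetical:

```python
import sqlite3

# In-memory SQLite as a stand-in for a production OLTP database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")
conn.execute(
    "INSERT INTO users (email, password_hash) VALUES (?, ?)",
    ("rider@example.com", "bcrypt$2b$..."),
)

# The login check: a single-row point lookup served by the index.
# This is the millisecond-scale access pattern OLTP is built around.
row = conn.execute(
    "SELECT id, password_hash FROM users WHERE email = ?",
    ("rider@example.com",),
).fetchone()
print(row)  # (1, 'bcrypt$2b$...')
```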
Online analytical processing (OLAP) databases primarily consist of aggregated data from various data sources, where the dataset is a historical snapshot of the upstream sources. OLAP databases are primarily used for data analysis, reports, and other types of analytical queries. While performance is still important, it is typically much more forgiving than for OLTP databases.
Examples of such OLAP queries include:
- How many customers are signing up per month?
- What is the VIP customer demographics breakdown?
- What is our paid acquisition channel performance (cost per thousand impressions, cost per click, customer acquisition cost, customer lifetime value)?
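Questions like these translate into aggregations that scan and group a large slice of history rather than fetching individual rows. Here is a minimal sketch of the first question, again using SQLite purely for illustration (in practice this would run against a data warehouse; the `signups` table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (customer_id INTEGER, signed_up_at TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [(1, "2023-01-05"), (2, "2023-01-19"), (3, "2023-02-02")],
)

# "How many customers are signing up per month?" -- a scan-and-group
# aggregation over the full history, typical of OLAP workloads.
for month, count in conn.execute(
    """
    SELECT strftime('%Y-%m', signed_up_at) AS month, COUNT(*) AS signups
    FROM signups
    GROUP BY month
    ORDER BY month
    """
):
    print(month, count)  # 2023-01 2, then 2023-02 1
```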
| Category | OLTP | OLAP |
| --- | --- | --- |
| Access patterns | Static and well established. Queries are optimized by creating indices to support these access patterns. Querying an unindexed field would be extremely slow, since it is an anti-pattern for this database type and requires a full table scan. | Dynamic. The database allows ad hoc queries across a multitude of datasets, allowing maximum flexibility when performing data analysis. |
| Type of workload | Reading, updating, and deleting data. Most suitable for applications and transaction processing. | Primarily queries (reads). Most suitable for data analysis, machine learning applications that use offline datasets for training, and other read-only workloads. |
| Relational vs. NoSQL | OLTP databases can be either relational or NoSQL. | OLAP databases are strictly relational, as they are optimized for data analysis via SQL. |
| Main users | Applications. It's advisable for each application to have its own database instance. | Internal users such as data analysts, data scientists, and business stakeholders. |
| Performance | Operations against an OLTP database should be measured in milliseconds. Performance is hypercritical. | Performance is much more forgiving, as most of the workload is internal facing. |
| Examples | Relational: PostgreSQL, MySQL. NoSQL: MongoDB. | Snowflake, BigQuery, Redshift. |

A typical system architecture diagram found at most companies.
In this diagram, the customer would place an order, which would result in data being saved in the Order Postgres DB. The order service would also make an API call to the payment service, which has its own database as well. There would be internal processes (typically ETL jobs) that run periodically to synchronize the data from each service's transactional database into the company's data warehouse. Separately, there would be processes to synchronize external data from Zendesk, Stripe, and Mailchimp; the exact methodology would depend on each vendor's capabilities. Some vendors have native integrations with selected data warehouses, while others require custom integrations via their APIs. Companies like Fivetran, Airbyte, and Rivery provide connectors that integrate with these tools' APIs and load the datasets into your data warehouse.
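To make the batch approach concrete, here is a minimal sketch of what one of these periodic ETL jobs might do. This is an illustrative assumption, not any vendor's actual implementation: the `orders` table, the `updated_at` watermark column, and the use of SQLite for both the source and the warehouse are hypothetical stand-ins.

```python
import sqlite3

# Stand-ins: in reality the source would be e.g. Postgres and the
# destination a data warehouse reached over its own driver.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "paid", "2023-01-30T10:00:00"), (2, "pending", "2023-01-30T11:30:00")],
)
warehouse.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)")

def sync_orders(last_synced_at: str) -> str:
    """One batch ETL run: copy rows changed since the last watermark."""
    rows = source.execute(
        "SELECT id, status, updated_at FROM orders WHERE updated_at > ?",
        (last_synced_at,),
    ).fetchall()
    # Upsert so reruns of the same window stay idempotent.
    warehouse.executemany(
        "INSERT INTO orders VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET status = excluded.status, updated_at = excluded.updated_at",
        rows,
    )
    warehouse.commit()
    return max((r[2] for r in rows), default=last_synced_at)  # new watermark

watermark = sync_orders("1970-01-01T00:00:00")
print(warehouse.execute("SELECT COUNT(*) FROM orders").fetchone())  # (2,)
```

The watermark pattern is what makes these jobs inherently batch-oriented: data freshness is bounded by the scheduler's interval, which is exactly the gap that streaming change data capture closes.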
Artie Transfer is able to drastically improve this internal process by streaming changes and applying only those changes to the downstream data warehouse. By doing so, Transfer is able to provide real-time transactional data within your data warehouse.
In order to do this, Artie Transfer has the following features built in:
- Automatic retries & idempotency. We take reliability seriously and it's feature 0. Latency reduction is nice, but doesn't matter if the data is wrong. We provide automatic retries and idempotency such that we will always achieve eventual consistency.
- Automatic table creation. Transfer will create the table in the designated database if the table doesn't exist.
- Error reporting. Provide your Sentry API key and errors from data processing will appear in your Sentry project.
- Schema detection. Transfer will automatically detect column changes and apply them to the destination.
- Scalable architecture. Transfer's architecture stays the same whether we’re dealing with 1GB or 100+ TB of data.
- Sub-minute latency. Transfer is built with a consumer framework and is constantly streaming messages in the background. Say goodbye to schedulers!
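Conceptually, the streaming approach boils down to consuming an ordered log of change events and merging each one into the destination keyed on its primary key. The sketch below is a toy illustration of that idea, not Artie Transfer's actual code; the event shape and the `apply_change` function are hypothetical.

```python
from typing import Any

# Toy destination keyed by primary key; a real pipeline would merge
# into a warehouse table instead of an in-memory dict.
destination: dict[int, dict[str, Any]] = {}

def apply_change(event: dict[str, Any]) -> None:
    """Apply one change event. Replays are safe: applying the same
    event twice leaves the destination in the same state."""
    pk = event["primary_key"]
    if event["op"] == "delete":
        destination.pop(pk, None)
    else:  # inserts and updates are both handled as an upsert
        destination[pk] = event["after"]

stream = [
    {"op": "insert", "primary_key": 1, "after": {"id": 1, "status": "pending"}},
    {"op": "update", "primary_key": 1, "after": {"id": 1, "status": "paid"}},
    {"op": "update", "primary_key": 1, "after": {"id": 1, "status": "paid"}},  # replayed
]
for event in stream:
    apply_change(event)
print(destination)  # {1: {'id': 1, 'status': 'paid'}}
```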
If you are interested in learning more, check out Transfer's GitHub repo, schedule a demo here, or drop us a note at [email protected]!