What are the differences between OLTP and OLAP databases?
Jan 30, 2023
Author: Robin Tang
One key characteristic that differentiates databases is whether they are OLTP or OLAP. In this blog post, we will explore what each type is, their key differences, and some real-world examples.
Online transactional processing (OLTP) databases are characterized by high-throughput, low-latency transactions and typically back internal applications.
OLTP databases are known for strong data integrity, transaction support, and indices tuned for well-known access patterns.
OLTP databases are used to enable applications to serve requests such as:
- Can this person log in?
- Process a credit card payment
- Connecting a rider with a driver
- Hotel check ins
- Airline ticketing system
Online analytical processing (OLAP) databases primarily consist of aggregated data (coming from various data sources), where the dataset is a historical snapshot of the upstream sources. OLAP databases are primarily used for data analysis, reports, and other analytical queries. While performance still matters, the latency requirements are typically much more forgiving than for OLTP databases.
Examples of such OLAP queries may include:
- How many customers are signing up per month?
- What is the VIP customer demographics breakdown?
- What is our paid acquisition channel performance (cost per thousand impressions, cost per click, customer acquisition cost, customer lifetime value)?
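The first question above maps directly to a scan-and-aggregate query, the bread and butter of OLAP workloads. A minimal sketch, with SQLite standing in for a data warehouse and an invented `customers` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, signed_up_at TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "2023-01-05"), (2, "2023-01-20"), (3, "2023-02-02")],
)

# "How many customers are signing up per month?" as a GROUP BY aggregate.
rows = conn.execute(
    """
    SELECT strftime('%Y-%m', signed_up_at) AS month, COUNT(*) AS signups
    FROM customers
    GROUP BY month
    ORDER BY month
    """
).fetchall()
print(rows)  # [('2023-01', 2), ('2023-02', 1)]
```

Note the shape of the query: it touches every row of a table rather than looking up a single record by key, which is why OLAP databases optimize for scans and aggregation instead of point reads and writes.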
A typical system architecture diagram found at most companies.
In this diagram, the customer places an order, which results in data being saved in the Order Postgres DB. The order service would also make an API call to the payment service, which has its own DB as well. Internal processes (typically ETL jobs) run periodically to synchronize the data in each service's transactional database into the company's data warehouse.
Separately, there would be processes to synchronize external data from Zendesk, Stripe, and Mailchimp. The exact methodology depends on each vendor's capabilities: some vendors have native integrations with selected data warehouses, while others require custom integrations via their APIs. Companies like Fivetran, Airbyte, and Rivery provide connectors that integrate with these tools' APIs and load the datasets into your data warehouse.
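A periodic ETL sync like this is often a cursor-based incremental copy: each run pulls only rows changed since a stored high-water mark, then advances the mark. The following is a toy sketch of that loop; the `SourceDB`, `Warehouse`, and `sync_incremental` names are invented stand-ins, not any vendor's actual API.

```python
# Toy stand-ins for a service database and a warehouse (invented for illustration).
class SourceDB:
    def __init__(self, rows):
        self.rows = rows

    def fetch_rows_since(self, cursor):
        return [r for r in self.rows if r["updated_at"] > cursor]


class Warehouse:
    def __init__(self):
        self.tables = {}

    def upsert(self, table, rows):
        dest = self.tables.setdefault(table, {})
        for r in rows:
            dest[r["id"]] = r  # keyed by primary key, so re-runs don't duplicate


def sync_incremental(source, warehouse, state):
    """One ETL run: copy rows changed since the high-water mark, then advance it."""
    cursor = state.get("last_synced_at", "")
    rows = source.fetch_rows_since(cursor)
    warehouse.upsert("orders", rows)
    if rows:
        state["last_synced_at"] = max(r["updated_at"] for r in rows)
    return len(rows)


source = SourceDB([
    {"id": 1, "status": "paid", "updated_at": "2023-01-01T10:00:00"},
    {"id": 2, "status": "pending", "updated_at": "2023-01-01T11:00:00"},
])
warehouse, state = Warehouse(), {}
print(sync_incremental(source, warehouse, state))  # 2 rows copied on the first run
print(sync_incremental(source, warehouse, state))  # 0 rows on the next run
```

The catch with this batch approach is the schedule: rows only land in the warehouse as often as the job runs, which is the latency gap that streaming approaches close.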
Artie Transfer drastically improves on this internal process by streaming changes and applying only the deltas to the downstream data warehouse. By doing so, Transfer provides real-time transactional data within your data warehouse.
In order to do this, Artie Transfer has the following features built in:
- Automatic retries & idempotency. We take reliability seriously and it's feature 0. Latency reduction is nice, but doesn't matter if the data is wrong. We provide automatic retries and idempotency such that we will always achieve eventual consistency.
- Automatic table creation. Transfer will create the table in the designated database if the table doesn't exist.
- Error reporting. Provide your Sentry API key and errors from data processing will appear in your Sentry project.
- Schema detection. Transfer will automatically detect column changes and apply them to the destination.
- Scalable architecture. Transfer's architecture stays the same whether we’re dealing with 1GB or 100+ TB of data.
- Sub-minute latency. Transfer is built with a consumer framework and is constantly streaming messages in the background. Say goodbye to schedulers!
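To illustrate why idempotency and retries can still yield eventual consistency, consider applying change events keyed by primary key: re-delivering the same event leaves the destination unchanged. This is a toy Python sketch of that idea, not Artie Transfer's actual implementation; the event shape and `replica` dict are invented for illustration.

```python
# Replica table keyed by primary key; applying by key makes re-delivery harmless.
replica = {}

def apply_change(event):
    """Apply one change event idempotently.

    With at-least-once delivery, duplicates are expected; because inserts and
    updates collapse into an upsert on the primary key, applying the same
    event twice leaves the replica in the same state.
    """
    pk = event["id"]
    if event["op"] == "delete":
        replica.pop(pk, None)
    else:  # insert or update both become an upsert
        replica[pk] = event["row"]

events = [
    {"op": "insert", "id": 1, "row": {"id": 1, "status": "pending"}},
    {"op": "update", "id": 1, "row": {"id": 1, "status": "paid"}},
    {"op": "update", "id": 1, "row": {"id": 1, "status": "paid"}},  # retried duplicate
]
for e in events:
    apply_change(e)

print(replica)  # {1: {'id': 1, 'status': 'paid'}}
```

The duplicate delivery of the final update is absorbed silently, which is what lets a pipeline retry aggressively without corrupting the destination.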