Speeding up InfluxDB line protocol

Tancrede Collard

QuestDB Team

InfluxDB is the current market leader in time series, and we thought it would only be fair to take a stab at their ingestion format, the InfluxDB line protocol (“ILP”), and compare data ingestion performance between QuestDB and InfluxDB.

It would not be an overstatement to say that InfluxDB uses a lot of CPU. We set out to build a receiver for ILP that stores data faster than InfluxDB while remaining hardware-efficient.

Why ILP?

Starting with QuestDB 4.0.4, users can ingest data through ILP and use SQL to query it alongside other tables in a relational database, while keeping the flexibility of ILP.
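For reference, an ILP message is a single line of text naming a measurement (which becomes a table), optional tags, fields, and a timestamp. The names below are purely illustrative, as is the SQL query showing how the resulting table can then be queried like any other:

    cpu,host=server-01,region=eu-west usage_user=23.5,usage_system=7.1 1577836800000000000

    SELECT * FROM cpu WHERE host = 'server-01';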

QuestDB's architecture and its API integration

Data loss over UDP

We have conducted our testing over UDP, thus expecting some level of data loss. However, we did not anticipate that InfluxDB would lose so much data.

We have built a sender, which caches outgoing messages in a small buffer before sending them to a UDP socket. It sends data as fast as possible, eventually overpowering the consumers and introducing packet loss. To test different use cases, we throttled the sender by varying the size of its buffer: a smaller buffer results in more frequent network calls and a lower sending rate.
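As a rough sketch of the sender's structure, not our exact benchmark code, the batching logic looks like the Java below; the buffer size, the message contents, and the 9009 port are illustrative assumptions:

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;

    public class IlpUdpSender {
        // Illustrative buffer size; the benchmark varies this to throttle the send rate.
        private static final int BUFFER_SIZE = 1024;

        public static void main(String[] args) throws Exception {
            InetAddress target = InetAddress.getByName("127.0.0.1");
            int port = 9009; // assumed ILP UDP port
            StringBuilder buffer = new StringBuilder(BUFFER_SIZE);
            try (DatagramSocket socket = new DatagramSocket()) {
                for (long i = 0; i < 50_000_000L; i++) {
                    // one illustrative ILP line per message
                    String line = "cpu,host=server-" + (i % 10) + " usage=" + (i % 100) + "i\n";
                    if (buffer.length() + line.length() > BUFFER_SIZE) {
                        send(socket, target, port, buffer); // flush the batch as one datagram
                    }
                    buffer.append(line);
                }
                send(socket, target, port, buffer); // flush the remainder
            }
        }

        private static void send(DatagramSocket socket, InetAddress target, int port,
                                 StringBuilder buffer) throws Exception {
            byte[] payload = buffer.toString().getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(payload, payload.length, target, port));
            buffer.setLength(0);
        }
    }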

The benchmark publishes 50 million messages at various speeds. We then measure the number of entries in each DB after the fact to calculate the implied capture rate.

We use a Dell XPS 15 7590 with 64GB of RAM, a 6-core i9 CPU, and a 1TB SSD. In this experiment, both the sender and the QuestDB/InfluxDB instance run on the same machine, and UDP publishing is over loopback. The OS is Fedora 31, with the OS UDP receive buffer limit (net.core.rmem_max) set to 104,857,600 bytes.
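For anyone reproducing the setup, the receive buffer limit can be raised with sysctl using the value from this experiment:

    sysctl -w net.core.rmem_max=104857600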

It comes down to ingestion speed

Database performance is the bottleneck that results in packet loss: when the database cannot keep up, messages are dropped, and the loss rate is a direct function of the underlying database's ingestion speed.

By sending 50m messages at different speeds, we get the following outcome.

Table showing a comparison of the capture rate between QuestDB and InfluxDB

Capture rate as a function of sender speed

InfluxDB’s capture rate rapidly drops below 50%, eventually converging toward single-digit rates.

Chart showing a comparison of the capture rate between QuestDB and InfluxDB

Capture rate as a function of sending speed.

Chart showing a comparison of the ingestion rate between QuestDB and InfluxDB

Implied ingestion rate as a function of sender speed

QuestDB’s ingestion speed results above are obtained through ILP. Ingestion is considerably faster still when using our native input formats.

Why is the sender’s rate slower for InfluxDB compared to QuestDB?

In this test, we run the sender and the DB on the same machine, and it turns out that InfluxDB slows down our UDP sender by cannibalizing the CPU. Here is what happens to your CPUs while using InfluxDB:

Bar chart showing the CPU usage of InfluxDB, idle vs on load

InfluxDB’s CPU usage when serving requests

When in use, InfluxDB saturates all of the CPU. As a consequence, it slows down any other program running on the same machine.

QuestDB’s secret sauce

We maximize the utilization of each CPU core and extract as much performance as possible from it. For the example below, we compared InfluxDB’s ingestion speed using 12 cores to QuestDB using one CPU core only. Despite utilizing one core instead of 12, QuestDB still outperforms InfluxDB significantly.

If spare CPU capacity arises, QuestDB will execute multiple data ingestion jobs in parallel, leveraging multiple CPUs at the same time, but with one key difference: QuestDB uses work-stealing algorithms to ensure every last bit of CPU capacity is used and no core ever sits idle. Let us illustrate why this is the case, starting with a sketch of the work-stealing idea.
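As a conceptual sketch only, not QuestDB's internal scheduler, Java's ForkJoinPool illustrates the same work-stealing idea: each worker keeps its own task queue, and idle workers steal pending tasks from busy ones, so no core waits while work remains. The parseAndStore unit of work below is hypothetical:

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.TimeUnit;

    public class WorkStealingDemo {
        public static void main(String[] args) throws InterruptedException {
            // Work-stealing pool: one worker per core, each with its own task deque.
            ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
            for (int i = 0; i < 1_000; i++) {
                final int batch = i;
                pool.submit(() -> parseAndStore(batch)); // e.g. one batch of ILP lines
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }

        // Hypothetical unit of ingestion work, used only for illustration.
        private static void parseAndStore(int batch) {
            // parse ILP lines and append rows to storage...
        }
    }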

Modern network cards have far higher throughput than a single receiver can absorb. Being limited to one receiver by design, InfluxDB considerably under-utilizes the network card; the single receiver becomes the limiting factor in the pipeline.

Chart showing InfluxDB's queuing mechanism

All CPU cores share one single receiver, which under-utilizes the network card

Conversely, QuestDB can open parallel receivers (each requiring one core), fully utilizing the network card's capabilities. The following illustration assumes there is spare CPU capacity in other cores to be filled. In such a scenario, we would get QuestDB utilizing 12 cores, each of which is considerably faster than InfluxDB’s combined 12 cores!

Chart showing QuestDB's queuing mechanism

Each CPU core opens an independent receiver working in parallel that fully leverages the network card
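One common way to build such parallel receivers, shown here as a rough sketch rather than QuestDB's actual source, is to have each worker thread bind its own UDP socket to the same port with SO_REUSEPORT (available on Linux with JDK 9+), letting the kernel spread incoming datagrams across the workers; the 9009 port is again an assumption:

    import java.net.InetSocketAddress;
    import java.net.StandardSocketOptions;
    import java.nio.ByteBuffer;
    import java.nio.channels.DatagramChannel;

    public class ParallelReceivers {
        public static void main(String[] args) {
            int workers = Runtime.getRuntime().availableProcessors();
            for (int i = 0; i < workers; i++) {
                new Thread(() -> receive(9009)).start(); // one receiver per core
            }
        }

        private static void receive(int port) {
            try (DatagramChannel channel = DatagramChannel.open()) {
                // SO_REUSEPORT lets every worker bind the same port, so the kernel
                // load-balances incoming datagrams across the receivers.
                channel.setOption(StandardSocketOptions.SO_REUSEPORT, true);
                channel.bind(new InetSocketAddress(port));
                ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
                while (true) {
                    buf.clear();
                    channel.receive(buf);
                    buf.flip();
                    // parse ILP lines from buf and append rows to the table...
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }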

Besides ingestion, InfluxDB also saturates the CPU on queries. The current user cannibalizes the whole CPU, while other users have to wait for their turn.

How the CPU is shared under InfluxDB's load

Users monopolize all CPU cores one after the other

By contrast, QuestDB uses each core separately, allowing multiple users to query or write concurrently without delay. The performance gap between QuestDB and InfluxDB grows significantly as the number of simultaneous users increases.

How the CPU is shared under QuestDB's load

Users share CPU cores and are served concurrently and quickly, while each core is used to the maximum.

Get started

QuestDB supports ILP over UDP multicast and unicast sockets. TCP support will follow shortly. You don’t need to change anything in your application. For Telegraf, you can point its UDP output at QuestDB’s address and port.
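For instance, a Telegraf socket_writer output pointed at QuestDB could look like the snippet below; the address is illustrative and 9009 is assumed as the ILP UDP port:

    [[outputs.socket_writer]]
      # UDP address of the QuestDB instance (illustrative)
      address = "udp://127.0.0.1:9009"
      # keep serializing metrics as InfluxDB line protocol
      data_format = "influx"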

Follow this link to download QuestDB. You can also use our sender against QuestDB and InfluxDB to reproduce the experiment.

You can use JOINs while modifying your data structure on the fly and querying it all in SQL.
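As a small, hypothetical example (table and column names are made up), a table ingested over ILP can be joined against a regular table in plain SQL:

    SELECT c.timestamp, c.usage_user, h.datacenter
    FROM cpu c
    JOIN hosts h ON c.host = h.host;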