Running ClickHouse on small servers with low memory
ClickHouse is designed for big data and large servers with 8+ CPU cores and 32 GB+ of RAM, but sometimes you need to run it on a small server with little memory, for example, on a staging environment. This tutorial explains how to configure ClickHouse to minimize its memory requirements so you can run it on a modest server with just 2 cores and 2 GB of RAM, such as a t3.small instance on AWS.
The problem
When there is not enough memory and ClickHouse can't complete an operation, it will throw an exception like this:
ClickHouse exception, code: 241, host: XXXXXX, port: 8123; Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 9.32 GiB (attempt to allocate chunk of 9440624 bytes), maximum: 9.31 GiB
Often, you can lower memory requirements by tuning ClickHouse settings. But be warned that nothing comes for free and you will pay with slower query processing and sub-optimal merges.
Configuring ClickHouse
If you don't know about the config.xml and users.xml files yet, see How to configure ClickHouse?
max_concurrent_queries
max_concurrent_queries limits the maximum number of simultaneously processed queries:
<!-- config.xml -->
<clickhouse>
<!-- Allow roughly 2 simultaneous queries per CPU core. The default is 100. -->
<max_concurrent_queries>{{ 2 * number_of_cpu_cores }}</max_concurrent_queries>
</clickhouse>
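To see how many queries are executing at any given moment (and whether the limit is a real constraint for your workload), you can query the system.metrics table on a running server. The Query metric name is an assumption based on recent ClickHouse versions:

```sql
-- Number of queries executing right now
-- (assumes the 'Query' metric exists in system.metrics).
SELECT metric, value
FROM system.metrics
WHERE metric = 'Query';
```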
mark_cache_size
mark_cache_size limits the maximum size of the mark cache.
<!-- config.xml -->
<clickhouse>
<!-- Lower the mark cache size from the default 5GB to 1GB. -->
<mark_cache_size>1073741824</mark_cache_size>
</clickhouse>
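To check how much memory the mark cache actually consumes on a running server, you can look at system.asynchronous_metrics. The metric names below (MarkCacheBytes, MarkCacheFiles) are assumptions based on recent ClickHouse versions and may differ in yours:

```sql
-- Current mark cache consumption (metric names may vary by version).
SELECT metric, value
FROM system.asynchronous_metrics
WHERE metric IN ('MarkCacheBytes', 'MarkCacheFiles');
```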
queue_max_wait_ms
queue_max_wait_ms sets how long a query waits in the request queue when the number of concurrent requests exceeds max_concurrent_queries.
<!-- users.xml -->
<clickhouse>
<profiles>
<default>
<!-- Wait up to 1000 ms. -->
<queue_max_wait_ms>1000</queue_max_wait_ms>
</default>
</profiles>
</clickhouse>
max_execution_time
max_execution_time specifies the maximum query execution time in seconds.
<!-- users.xml -->
<clickhouse>
<profiles>
<default>
<!-- Abort the query after 10 seconds. -->
<max_execution_time>10</max_execution_time>
</default>
</profiles>
</clickhouse>
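You can verify that the limit works with a query that never finishes on its own; for example, counting the unbounded system.numbers table should be aborted with a TIMEOUT_EXCEEDED error (code 159) after roughly 10 seconds:

```sql
-- This scans an infinite table, so with max_execution_time = 10
-- it should fail after ~10 seconds instead of running forever.
SELECT count() FROM system.numbers;
```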
Merge settings
As data arrives, ClickHouse writes it into different pieces/subfolders called parts. For optimal performance, ClickHouse must periodically merge many small parts into one big part. Because this process can consume a lot of CPU time and memory, ClickHouse provides various settings that control merge operations.
You can monitor the progress of merges with the following query:
SELECT * FROM system.merges;
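On a memory-constrained server, the memory_usage column of system.merges is the most interesting one. A narrower variant of the monitoring query might look like this (column names assumed from recent ClickHouse versions):

```sql
-- Memory consumed by each running merge, largest first.
SELECT database, table, elapsed, progress, memory_usage
FROM system.merges
ORDER BY memory_usage DESC;
```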
background_pool_size
background_pool_size sets the number of threads performing background operations, for example, merges in MergeTree engine tables.
<!-- users.xml -->
<clickhouse>
<profiles>
<default>
<!--
Choose a value between 1 and 2 times the number of CPU cores.
The default is 16.
-->
<background_pool_size>{{ 2 * number_of_cpu_cores }}</background_pool_size>
</default>
</profiles>
</clickhouse>
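To see how busy the background pool is, you can watch the number of active background tasks (merges, mutations, and so on) on a running server. The BackgroundPoolTask metric name is an assumption based on older ClickHouse versions; newer releases may report it under a different name:

```sql
-- Active background tasks (merges, mutations, etc.);
-- the metric name may differ in newer ClickHouse versions.
SELECT metric, value
FROM system.metrics
WHERE metric = 'BackgroundPoolTask';
```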
merge_max_block_size
merge_max_block_size sets the number of rows read from the merged parts into memory at a time. Lowering it reduces memory requirements at the expense of slower merges:
<!-- config.xml -->
<clickhouse>
<merge_tree>
<!--
Choose a value between 1024 and 4096.
The default is 8192.
-->
<merge_max_block_size>1024</merge_max_block_size>
</merge_tree>
</clickhouse>
max_bytes_to_merge_at_max_space_in_pool
max_bytes_to_merge_at_max_space_in_pool controls the maximum total parts size (in bytes) to be merged into one part.
<!-- config.xml -->
<clickhouse>
<merge_tree>
<!--
Choose a value between 1GB and 4GB.
The default is 150GB.
-->
<max_bytes_to_merge_at_max_space_in_pool>1073741824</max_bytes_to_merge_at_max_space_in_pool>
</merge_tree>
</clickhouse>
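The value is in bytes, so 1073741824 above is exactly 1 GiB. If you'd rather not do the arithmetic by hand, ClickHouse itself can do it, for example with the built-in formatReadableSize function:

```sql
-- Sanity-check the byte values used in the config.
SELECT formatReadableSize(1073741824);      -- should print 1.00 GiB
SELECT 4 * 1024 * 1024 * 1024 AS four_gib;  -- 4294967296
```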
number_of_free_entries_in_pool_to_lower_max_size_of_merge
number_of_free_entries_in_pool_to_lower_max_size_of_merge specifies the number of free entries in the pool at which ClickHouse starts to lower the maximum size of merges to process. Because most of the time you won't have idle threads in the pool, set this value to 0 to let merges proceed:
<!-- config.xml -->
<clickhouse>
<merge_tree>
<number_of_free_entries_in_pool_to_lower_max_size_of_merge>0</number_of_free_entries_in_pool_to_lower_max_size_of_merge>
</merge_tree>
</clickhouse>