🗓️ 03022025 1403
📎

hologres_table_creation_best_practises

Choose the Right hologres_distribution_key
- Select a column with high cardinality and evenly distributed values as the distribution key.
- This spreads rows evenly across shards and prevents data skew. JOINs and GROUP BY operations on the distribution key also run faster, because rows with the same key value are stored in the same shard and need no redistribution.
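
A rough way to compare candidate distribution keys before creating the table is to hash sample values into buckets and look at the spread. The sketch below is only an approximation: MD5 stands in for the real hash function (Hologres uses its own internal hashing, so actual shard assignments will differ), and `shard_skew`, `user_ids`, and `statuses` are hypothetical names and sample data.

```python
import hashlib
from collections import Counter


def shard_skew(values, shard_count):
    """Return (rows per bucket, max/mean ratio) for a candidate key column.

    A ratio near 1.0 means an even spread; a ratio near shard_count
    means almost all rows land in a single bucket.
    """
    counts = Counter()
    for value in values:
        digest = hashlib.md5(str(value).encode("utf-8")).digest()
        # Stand-in hash: first 8 bytes of MD5, modulo the shard count.
        counts[int.from_bytes(digest[:8], "big") % shard_count] += 1
    per_shard = [counts.get(i, 0) for i in range(shard_count)]
    mean = sum(per_shard) / shard_count
    ratio = max(per_shard) / mean if mean else 0.0
    return per_shard, ratio


# Hypothetical sample data: a high-cardinality key vs. a two-value column.
user_ids = range(100_000)
statuses = ["active"] * 95_000 + ["closed"] * 5_000

_, user_ratio = shard_skew(user_ids, 8)
_, status_ratio = shard_skew(statuses, 8)
print(f"user_id skew: {user_ratio:.2f}, status skew: {status_ratio:.2f}")
```

The high-cardinality column should score close to 1.0, while the two-value column concentrates nearly all rows in one or two buckets, which is the skew the bullets above warn about.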
Set an Appropriate Shard Count
- The hologres_shard count should be proportional to the number of hologres_worker_compute_nodes in your instance.
- Avoid setting the shard count higher than the number of computing cores in your instance to prevent resource contention.
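
The two rules above can be combined into a small helper. This is a sketch of the note's guidance, not an official sizing formula: `suggest_shard_count` and the `shards_per_worker` multiplier are hypothetical, and the right multiplier depends on the workload.

```python
def suggest_shard_count(worker_nodes, total_cores, shards_per_worker=1):
    """Scale the shard count with worker nodes, capped at the core count.

    shards_per_worker is an assumed tuning knob, not a Hologres setting.
    """
    if worker_nodes < 1 or total_cores < 1 or shards_per_worker < 1:
        raise ValueError("all arguments must be positive integers")
    candidate = worker_nodes * shards_per_worker  # proportional to workers
    return min(candidate, total_cores)            # never exceed the cores


print(suggest_shard_count(worker_nodes=4, total_cores=64, shards_per_worker=8))
print(suggest_shard_count(worker_nodes=4, total_cores=16, shards_per_worker=8))
```

In the second call the proportional value (32) exceeds the available cores, so the cap applies.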
Assign Tables to the Correct hologres_table_group
- Place related tables in the same table group to enable local joins, which avoid moving data between shards. A join is local only when both tables are also distributed on the join columns.
- Avoid creating unnecessary table groups, as each shard occupies memory, even if unused.
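
As an illustration, the snippet below renders the SQL for one table group shared by two related tables. It assumes the `hg_create_table_group` and `set_table_property` procedure forms as I recall them from the Hologres documentation, so check them against the docs for your version. The helper `table_group_ddl` and all table, column, and group names are hypothetical, and identifiers are interpolated without escaping, so this is for illustration only.

```python
def table_group_ddl(table_group, shard_count, tables):
    """Build SQL that creates one table group and assigns tables to it.

    tables maps a table name to (column definitions, distribution key).
    """
    statements = [f"CALL hg_create_table_group('{table_group}', {shard_count});"]
    for name, (columns, distribution_key) in tables.items():
        # Table properties are set in the same transaction as CREATE TABLE.
        statements.append(
            "BEGIN;\n"
            f"CREATE TABLE {name} ({columns});\n"
            f"CALL set_table_property('{name}', 'table_group', '{table_group}');\n"
            f"CALL set_table_property('{name}', 'distribution_key', '{distribution_key}');\n"
            "COMMIT;"
        )
    return statements


# Both tables share a table group and a distribution key, so a join on
# customer_id can be served locally within each shard.
ddl = table_group_ddl("orders_tg", 8, {
    "orders": ("order_id bigint, customer_id bigint", "customer_id"),
    "customers": ("customer_id bigint, name text", "customer_id"),
})
print("\n\n".join(ddl))
```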
Consider Data Volume and Query Patterns
- For small datasets (e.g., thousands of rows), place the table in a table group with a shard count of 1 to minimize overhead.
- For large datasets or high-concurrency workloads, increase the shard count to improve parallelism.

References