šļø 03022025 1107
š
hologres_segment_key
If you want to perform UPDATE operations based on primary keys or perform queries that involve range filter conditions on large-sized datasets, you can configure event time columns for tables. The system sorts data files based on the event time columns and combine the data files. This reduces duplicate files and filters out files as many as possible. This way, the query efficiency is improved. Proper settings of event time columns help improve the processing efficiency, query speed, and overall performance of databases. This topic describes how to configure event time columns for tables in Hologres.
Overviewā
In Hologres V0.9, the name of the segment_key property is renamed event_time_column by default. However, the segment_key property can still be used in Hologres V0.9 and earlier.
Event time columns are applicable to the following scenarios:
-
Queries that contain range filter conditions, including equivalent conditions.
-
UPDATE operations based on primary keys.
An event time column needs to be specified for a table when you create the table. The following syntax is used to configure an event time column:
-- Syntax supported in Hologres V2.1 and later
CREATE TABLE <table_name> (...) WITH (event_time_column = '[<columnName>[,...]]');
-- Syntax supported in all Hologres versions
BEGIN;
CREATE TABLE <table_name> (...);
call set_table_property('<table_name>', 'event_time_column', '[<columnName> [,...]]');
COMMIT;
The following table describes parameters in the preceding syntax.
Parameter | Description |
table_name | The name of the table for which you want to configure the event time column. |
columnName | The name of the column that you want to configure as the event time column. |
Usage notesā
-
We recommend that you configure the columns where data monotonically increases or decreases, such as timestamp columns, as event time columns. The event_time_column property is applicable to columns that are strongly correlated with time, such as log and traffic columns. Appropriate setting of this property can improve query performance. After multiple files are merged, if data in the event time column is unordered, the data in each file has no distinguishing features and remains unordered. In this case, the data filtering effect is poor.
-
If a table does not contain columns where data monotonically increases or decreases, you can add anĀ
update_time
Ā column and use it as the event time column. Each time you perform the UPSERT operation, the current time is written to the update_time column. -
Data queries based on event time columns follow the leftmost matching principle. Therefore, we recommend that you do not specify multiple event time columns for a table. Otherwise, data queries based on the event time columns cannot be accelerated in some scenarios. In most cases, we recommend that you specify one or two event time columns for a table.
Limitsā
-
The event time columns of a table must meet the NOT NULL constraint. In Hologres V1.3.20 to V1.3.27, you can specify a column that contains null values as an event time column. In Hologres V1.3.28 and later, you cannot specify a column that contains null values as an event time column. This is because if an event time column contains null values, data accuracy may be adversely affected. If you must specify a column that contains null values as an event time column based on your business requirements, you can add the following SET command before the CREATE TABLE statement:
set hg_experimental_enable_nullable_segment_key = true;
You can execute SQL statements to check whether event time columns contain null values. Sample statements:
WITH t_base AS (
SELECT
*
FROM
hologres.hg_table_info
WHERE
collect_time::date = CURRENT_DATE
),
t1 AS (
SELECT
db_name,
schema_name,
table_name,
jsonb_array_elements(table_meta::jsonb -> 'columns') cols
FROM
t_base
),
t2 AS (
SELECT
db_name,
schema_name,
table_name,
cols ->> 'name' col_name
FROM
t1
WHERE
cols -> 'nullable' = 'true'::jsonb
),
t3 AS (
SELECT
db_name,
schema_name,
table_name,
regexp_replace(regexp_split_to_table(table_meta::jsonb ->> 'segment_key', ','), ':asc|:desc$', '') segment_key_col
FROM
t_base
WHERE
table_meta::jsonb -> 'segment_key' IS NOT NULL
)
SELECT
CURRENT_DATE,
t3.db_name,
t3.schema_name,
t3.table_name,
jsonb_build_object('nullable_segment_key_column', string_agg(t3.segment_key_col, ',')) as nullable_segment_key_column
FROM
t2,
t3
WHERE
t3.db_name = t2.db_name
AND t3.schema_name = t2.schema_name
AND t3.table_name = t2.table_name
AND t2.col_name = t3.segment_key_col
GROUP BY
t3.db_name,
t3.schema_name,
t3.table_name; -
You cannot modify the event_time_column property. If you want to modify the property, create another table.
-
You cannot configure the event_time_column property for a row-oriented table.
-
By default, the first non-null column of the TIMESTAMP or TIMESTAMPTZ data type in a column-oriented table is used as an event time column. If such a column does not exist, the first non-null column of the DATE data type is used as an event time column by default. In versions earlier than Hologres V0.9, no event time column is specified for a table by default.
-
The columns of the DECIMAL, NUMERIC, FLOAT, DOUBLE, ARRAY, JSON, JSONB, BIT, and MONEY data types and other complex data types are not supported.
How it worksā
The following figure shows how data in a shard is written.