DDL with Index and Hash Partition on Postgres Aurora

Question

Hi, I am trying to create a table in PostgreSQL Aurora with hash partitioning on OBJECT_ID. I also want to create an index on CUSTOMER_ID, OBJECT_TYPE, OBJECT_ID, PT_EVENT_ID.

CREATE TABLE event_test(
   ID varchar(255) PRIMARY KEY NOT NULL,
   VERSION int(11) NOT NULL,
   ORDER_TYPE varchar(255) NOT NULL,
   EVENT_TYPE varchar(255) NOT NULL,
   CUSTOMER_ID varchar(255) DEFAULT NULL,
   DETAILS text,
   OBJECT_TYPE varchar(255) NOT NULL,
   UTC_DATE_TIME date DEFAULT NULL,
   EVENT_TO_UTC_DT date DEFAULT NULL,
   GROUP_ID varchar(255) DEFAULT NULL,
   OBJECT_NAME  varchar(2001) DEFAULT NULL,
   OBJECT_ID  varchar(255) DEFAULT NULL,
   USER_NAME  varchar(1500) DEFAULT NULL,
   USER_ID  varchar(255) DEFAULT NULL,
   PT_EVENT_ID  varchar(255) DEFAULT NULL,
   CUSTOM_NOTES  varchar(1000) DEFAULT NULL,
   SUMMARY  varchar(4000) DEFAULT NULL
);

Can someone please help me with the DDL?

Note that varchar(255) is not by any means more efficient than varchar(257). The 255 limit does not open up any magic performance optimizations, if that is what you expected.
– user330315, Commented Mar 9, 2021 at 9:05
Are those IDs numbers? If yes, you had better store them in integer or bigint columns. Don't store everything in varchar(255). And `int(11)` is not valid for Postgres to begin with.
– user330315, Commented Mar 9, 2021 at 9:08
@a_horse_with_no_name the ID is a GUID. Thanks for the quick help.
– Atharv Thakur, Commented Mar 9, 2021 at 9:10
@LaurenzAlbe I want to use hash partitioning because most of my access is based on object_id, which is again a GUID. The table is 10 TB in size and I do not have any other column on which the data can be distributed.
– Atharv Thakur, Commented Mar 9, 2021 at 9:17

score 2 · Accepted Answer · 2021-03-09 09:27:04Z

If all those IDs are in fact UUIDs, the columns should be defined with the uuid type.

The supposedly "magic" limit of 255 does not enable any hidden performance or storage optimizations (at least in Postgres), so blindly using varchar(255) doesn't really make sense. Of course, if you have a valid business requirement that a value for order_type or event_type may never be longer than 255 characters, then keep that constraint.

As documented in the manual there is also no "length" parameter for the integer data type (and it's not a value restriction in MySQL either, so it's pretty much useless to begin with).

So the DDL should be something like this:

CREATE TABLE event_test(
   ID              uuid PRIMARY KEY NOT NULL,
   VERSION         integer NOT NULL,
   ORDER_TYPE      varchar(255) NOT NULL,
   EVENT_TYPE      varchar(255) NOT NULL,
   CUSTOMER_ID     uuid DEFAULT NULL,
   DETAILS         text,
   OBJECT_TYPE     varchar(255) NOT NULL,
   UTC_DATE_TIME   date DEFAULT NULL,
   EVENT_TO_UTC_DT date DEFAULT NULL,
   GROUP_ID        uuid DEFAULT NULL,
   OBJECT_NAME     varchar(2001) DEFAULT NULL,
   OBJECT_ID       uuid DEFAULT NULL,
   USER_NAME       varchar(1500) DEFAULT NULL,
   USER_ID         uuid DEFAULT NULL,
   PT_EVENT_ID     uuid DEFAULT NULL,
   CUSTOM_NOTES    varchar(1000) DEFAULT NULL,
   SUMMARY         varchar(4000) DEFAULT NULL
);

To create an index, you use create index:

create index on event_test (customer_id,object_type,object_id,pt_event_id);

If "most of your access" is through the object_id then you need an index where that is the leading column:

create index on event_test (object_id);

Hash partitioning won't really help you there to make things faster.

You can use partitioning for the table, but this is hardly a performance tool. Due to the limitations of the Postgres partitioning implementation, you will also be forced to include the id column in the partitioning key if you want to keep that as the primary key. But given your statement that "most access is through object_id", the partitioning key (id, object_id) wouldn't help you at all.
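If you decide to partition anyway, the following is a minimal sketch of what a hash-partitioned variant could look like in vanilla Postgres 11 or later. It is untested on Aurora. It assumes that object_id can be declared NOT NULL and that the primary key is widened to (object_id, id), because a primary key on a partitioned table has to contain the partitioning column. The partition count of 4 and the names event_test_p0 to event_test_p3 are arbitrary placeholders, not recommendations.

```sql
-- Sketch only: assumes object_id is never NULL and a composite primary key is acceptable.
CREATE TABLE event_test (
   id              uuid NOT NULL,
   version         integer NOT NULL,
   order_type      varchar(255) NOT NULL,
   event_type      varchar(255) NOT NULL,
   customer_id     uuid,
   details         text,
   object_type     varchar(255) NOT NULL,
   utc_date_time   date,
   event_to_utc_dt date,
   group_id        uuid,
   object_name     varchar(2001),
   object_id       uuid NOT NULL,          -- assumption: part of the key, so NOT NULL
   user_name       varchar(1500),
   user_id         uuid,
   pt_event_id     uuid,
   custom_notes    varchar(1000),
   summary         varchar(4000),
   PRIMARY KEY (object_id, id)             -- must include the partitioning column
) PARTITION BY HASH (object_id);

-- One child table per remainder; the modulus 4 is an arbitrary placeholder.
CREATE TABLE event_test_p0 PARTITION OF event_test FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE event_test_p1 PARTITION OF event_test FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE event_test_p2 PARTITION OF event_test FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE event_test_p3 PARTITION OF event_test FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- An index created on the partitioned table is created on every partition as well.
CREATE INDEX ON event_test (customer_id, object_type, object_id, pt_event_id);
```

With this layout the primary key index already has object_id as its leading column, so a separate index on (object_id) alone should not be needed for lookups by object_id.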

edited Mar 9, 2021 at 9:27

answered Mar 9, 2021 at 9:18

user330315

4 Comments

user330315 Over a year ago

@LaurenzAlbe: true, I changed it to relate to performance

Atharv Thakur Over a year ago

@a_horse_with_no_name you are right about performance. But I have this single table with 10 TB of data and no partitioning. Also, the limitation does not apply to Postgres 12.4 on Aurora, so we can create a partition on a column which is not unique and not part of the primary key.

user330315 Over a year ago

@AtharvThakur: I have no experience with Aurora, but with "vanilla" Postgres you cannot use partition by hash (object_id) as long as id is defined as the primary key. You would need to define the primary key as (object_id, id) if you want partition by hash (object_id, id), but then object_id must be declared as not null, which you currently don't have. It's still totally unclear what problem exactly you are trying to solve with "partitioning".

Atharv Thakur Over a year ago

@a_horse_with_no_name there are other challenges we are facing with this big table. If we have to delete rows from it, partitioning can help. Also, aggregation queries on this table are very, very slow, so I think I can run long-running queries partition-wise.
