Hi, I am trying to create a DDL in PostgreSQL Aurora with hash partitioning on OBJECT_ID. I also want to create an index on CUSTOMER_ID, OBJECT_TYPE, OBJECT_ID, PT_EVENT_ID.

CREATE TABLE event_test(
   ID varchar(255) PRIMARY KEY NOT NULL,
   VERSION int(11) NOT NULL,
   ORDER_TYPE varchar(255) NOT NULL,
   EVENT_TYPE varchar(255) NOT NULL,
   CUSTOMER_ID varchar(255) DEFAULT NULL,
   DETAILS text,
   OBJECT_TYPE varchar(255) NOT NULL,
   UTC_DATE_TIME date DEFAULT NULL,
   EVENT_TO_UTC_DT date DEFAULT NULL,
   GROUP_ID varchar(255) DEFAULT NULL,
   OBJECT_NAME  varchar(2001) DEFAULT NULL,
   OBJECT_ID  varchar(255) DEFAULT NULL,
   USER_NAME  varchar(1500) DEFAULT NULL,
   USER_ID  varchar(255) DEFAULT NULL,
   PT_EVENT_ID  varchar(255) DEFAULT NULL,
   CUSTOM_NOTES  varchar(1000) DEFAULT NULL,
   SUMMARY  varchar(4000) DEFAULT NULL
);

Can someone please help me with the DDL?

  • Note that varchar(255) is not by any means more efficient than varchar(257): the 255 limit does not open up any magic performance optimizations, if that is what you expected. Commented Mar 9, 2021 at 9:05
  • Are those IDs numbers? If so, you should store them in integer or bigint columns. Don't store everything in varchar(255). And `int(11)` is not valid in Postgres to begin with. Commented Mar 9, 2021 at 9:08
  • @a_horse_with_no_name The ID is a GUID. Thanks for the quick help. Commented Mar 9, 2021 at 9:10
  • Then you should use the uuid data type. Commented Mar 9, 2021 at 9:11
  • @LaurenzAlbe I want to use hash partitioning because most of my access is based on object_id, which is again a GUID. The table is 10 TB in size and I do not have any column on which the data can be distributed. Commented Mar 9, 2021 at 9:17

1 Answer

If all those IDs are in fact UUIDs, the columns should be defined with the uuid type.

The supposedly "magic" limit of 255 does not enable any hidden performance or storage optimizations (at least in Postgres), so blindly using varchar(255) doesn't really make sense. Of course, if you have a valid business requirement that a value for order_type or event_type may never be longer than 255 characters, then keep that constraint.

As documented in the manual there is also no "length" parameter for the integer data type (and it's not a value restriction in MySQL either, so it's pretty much useless to begin with).

So the DDL should be something like this:

CREATE TABLE event_test(
   ID              uuid PRIMARY KEY NOT NULL,
   VERSION         integer NOT NULL,
   ORDER_TYPE      varchar(255) NOT NULL,
   EVENT_TYPE      varchar(255) NOT NULL,
   CUSTOMER_ID     uuid DEFAULT NULL,
   DETAILS         text,
   OBJECT_TYPE     varchar(255) NOT NULL,
   UTC_DATE_TIME   date DEFAULT NULL,
   EVENT_TO_UTC_DT date DEFAULT NULL,
   GROUP_ID        uuid DEFAULT NULL,
   OBJECT_NAME     varchar(2001) DEFAULT NULL,
   OBJECT_ID       uuid DEFAULT NULL,
   USER_NAME       varchar(1500) DEFAULT NULL,
   USER_ID         uuid DEFAULT NULL,
   PT_EVENT_ID     uuid DEFAULT NULL,
   CUSTOM_NOTES    varchar(1000) DEFAULT NULL,
   SUMMARY         varchar(4000) DEFAULT NULL
);

To create an index, you use create index:

create index on event_test (customer_id,object_type,object_id,pt_event_id);

If "most of your access" is through the object_id then you need an index where that is the leading column:

create index on event_test (object_id);

Hash partitioning won't really help you there to make things faster.
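
As a rough illustration of the kind of lookup that index serves, here is a minimal sketch. The UUID literal and the selected columns are made-up placeholders, and whether the planner actually chooses the index depends on the table's statistics and data distribution.

```sql
-- Hypothetical lookup by object_id; the UUID value is a placeholder.
-- EXPLAIN shows the chosen plan without running the query.
EXPLAIN
SELECT id, event_type, utc_date_time
FROM event_test
WHERE object_id = '00000000-0000-0000-0000-000000000001';
```

If the plan shows a sequential scan on a large table, running ANALYZE event_test and checking again is a reasonable first step.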


You can use partitioning for the table, but this is hardly a performance tool. Due to the limitations of the Postgres partitioning implementation, you will also be forced to include the id column in the partitioning key if you want to keep that as the primary key. But given your statement that "most access is through object_id", the partitioning key (id, object_id) wouldn't help you at all.
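
If you decide to partition anyway, the following is a minimal sketch for "vanilla" PostgreSQL 11 or later; I have not verified it on Aurora. It is a variant, not the key discussed above: it partitions by object_id alone, and because a primary key on a partitioned table must contain every partition key column, the primary key becomes (object_id, id). The partition names and the count of four are arbitrary choices for illustration, and declaring object_id as NOT NULL is an assumption about your data. It replaces the unpartitioned definition rather than adding to it.

```sql
-- Sketch only: vanilla PostgreSQL 11+, not verified on Aurora.
CREATE TABLE event_test (
   id              uuid NOT NULL,
   version         integer NOT NULL,
   order_type      varchar(255) NOT NULL,
   event_type      varchar(255) NOT NULL,
   customer_id     uuid,
   details         text,
   object_type     varchar(255) NOT NULL,
   utc_date_time   date,
   event_to_utc_dt date,
   group_id        uuid,
   object_name     varchar(2001),
   object_id       uuid NOT NULL,  -- assumption: must not be null to be part of the primary key
   user_name       varchar(1500),
   user_id         uuid,
   pt_event_id     uuid,
   custom_notes    varchar(1000),
   summary         varchar(4000),
   -- The primary key has to include the partition key column.
   PRIMARY KEY (object_id, id)
) PARTITION BY HASH (object_id);

-- Four partitions is an arbitrary, illustrative count.
CREATE TABLE event_test_p0 PARTITION OF event_test
   FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE event_test_p1 PARTITION OF event_test
   FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE event_test_p2 PARTITION OF event_test
   FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE event_test_p3 PARTITION OF event_test
   FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- Created on the parent, so every partition gets a matching index.
CREATE INDEX ON event_test (customer_id, object_type, object_id, pt_event_id);
```

Note that with this layout id on its own is no longer guaranteed to be unique; only the combination (object_id, id) is enforced.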

4 Comments

@LaurenzAlbe: true, I changed it to relate to performance
@a_horse_with_no_name You are right about performance. But I have this single 10 TB table without any partitioning. Also, the limitation does not apply to Postgres 12.4 Aurora, so we can create a partition on a column that is not unique and not part of the primary key.
@AtharvThakur: I have no experience with Aurora, but with "vanilla" Postgres you cannot use partition by hash (object_id) as long as id is defined as the primary key. You would need to define the primary key as (object_id, id) if you want partition by hash (object_id, id), but then object_id must be declared as not null, which you currently don't have. It's still totally unclear what problem exactly you are trying to solve with "partitioning".
@a_horse_with_no_name There are other challenges we are facing with this big table: if we have to delete rows from it, partitioning can help. Also, aggregation queries on this table are very, very slow, so I think I can run long-running queries partition-wise.
