Preambule
I have MSSQL 2016 instance with CDC enabled. And I stream changes by Debezium kafka connector. Connector in logs and topics refer Log Sequence Number (LSN) in form of 000000aa:af9c3f2d:0f5e1. In MSSQL I have tables like cdc.dbo_Entity_cdc_CT where present columns __$start_lsn, __$start_lsn, __$end_lsn but it has type varbinary(10).
I would match values.
Based mostly on information in Day 11 of 31 Days of Disaster: Converting LSN Formats article which give hint it should be converted in chain convert to binary style 1 -> bigint, I wrote helper functions for that: https://github.com/Hubbitus/sql/commit/4ccf96492f2aba9796ccaecb1b74e9ba53a45895#diff-e495aa591f98a29c28153735379ad6fd
It mostly works. Contains descriptions and even testing conversion doing right.
But there is one problem I can't solve now.
8 byte part exceeded bigint
When I convert input varbinary(16) value, I split it into two parts: 2 and 8 bytes length, each convert into bigint, and then into varchar, f.e.:
DECLARE @lsn VARBINARY(10) ...
-- Fill by \0 at left
SET @lsn = CONVERT(varbinary, REPLICATE(char(0), 10 - LEN(@lsn))) + @lsn
DECLARE @lsn_s varchar(max) = CONVERT(varchar, CONVERT(bigint, SUBSTRING(@lsn, 1, 2), 1)) + CONVERT(varchar, CONVERT(bigint, SUBSTRING(@lsn, 3, 8), 1))
It works in most cases (f.e. for test table I see only 4 records with error), f.e. for value like:
DECLARE @lsn varbinary(10) = CONVERT(varbinary(10), '0x025D020F35B80001', 1)
You may check it like:
SELECT dbo.__tmp_td_cdc_decode_check(CONVERT(varbinary(10), '0x025D020F35B80001', 1))
But fails, said on:
DECLARE @lsn varbinary(10) = CONVERT(varbinary(10), '0x0043275D010200440020', 1)
SELECT dbo.__tmp_td_cdc_decode_check(@sql)
Indeed, MSSQL have not unsigned datatypes.
The main question
I do not understand how to work around this problem in line:
DECLARE @lsn_s varchar(max) = CONVERT(varchar, CONVERT(bigint, SUBSTRING(@lsn, 1, 2), 1)) + CONVERT(varchar, CONVERT(bigint, SUBSTRING(@lsn, 3, 8), 1))
How I should correctly transform varbinary(10) into varchar in form of decimal digits for any cases?
What I also do not understand, sometimes I got error "data truncation" on conversions, but sometimes it happened silently. Please look:
DECLARE @lsn varbinary(10) = CONVERT(varbinary(10), '0x0043275D010200440020', 1) -- Bigint overflow
SELECT
dbo.__tmp_td_cdc_decode_check(@lsn) as ToCheck
,CONVERT(varchar, @lsn, 1) as orig_lsn_hex
,CONVERT(bigint, @lsn, 1) as orig_lsn_bigint -- In many cases it will lead overflow!
,dbo.__tmp_td_cdc_binary_decode_to_string(@lsn) as string
,dbo.__tmp_td_cdc_string_decode_to_binary_DECSTRING(dbo.__tmp_td_cdc_binary_decode_to_string(@lsn)) as decimal_string_reverse
Output:
┌─────────┬────────────────────────┬─────────────────────┬─────────────────────────┬────────────────────────┐
│ ToCheck │ orig_lsn_hex │ orig_lsn_bigint │ string │ decimal_string_reverse │
├─────────┼────────────────────────┼─────────────────────┼─────────────────────────┼────────────────────────┤
│ Error │ 0x0043275D010200440020 │ 2836424448419299360 │ 000012e6:2f707ba4:02580 │ 4838079590083609600 │
└─────────┴────────────────────────┴─────────────────────┴─────────────────────────┴────────────────────────┘
Hexadecimal value 0x0043275D010200440020 should be 1_238_768_277_386_959_257_632 and not 2_836_424_448_419_299_360 what I get from CONVERT(bigint, @lsn, 1)
P.S. Also interesting and helpful materials:
DECLARE @lsn varbinary(10) = CONVERT(varbinary(10), '0x0043275D010200440020', 1)can't generate an Overflow error for abigint; there are nobigints in that statement. Also, in some of your code you're declaring avarcharwithout a length; which is a really bad idea.bigintwould be 1,238,768,277,386,959,257,632. The largest number you can store in abigintis 9,223,372,036,854,775,807. So yes, your hex values is over 100 times too large.varbinary(10)would be0xFFFFFFFFFFFFFFFFFFFF, which as an integer would be 1,208,925,819,614,629,174,706,175. If you do need to store values as large as that, yours going to need adecimal(25,0).4838079590083609600intobinary(10)? My variant works almost all times, but in rare cases therebigintoverflow happened (and conversion silently incorrect).0x0043275D010200440020is too big for abigint, you'll need to use a different datatype likedecimal. However,binaryvalues of adecimalare not the same as aint. Thedecimal1238768277386959257632as abinary(10), for example, is0x19000001200044000201.