I am trying to write a regex for this task, but I can't really get a grasp of regex beyond very simple cases :-(
The problem: I have this (SQL-like) query:
SELECT tcmcs003.*, tccom130.nama, tccom705.dsca, tcmcs052.dsca, tccom100.nama
FROM tcmcs003, tccom130,tccom705,tcmcs052,tccom100
WHERE tcmcs003.cadr REFERS TO tccom130
AND tcmcs003.casi REFERS TO tccom705
AND tcmcs003.cprj REFERS TO tcmcs052
AND tcmcs003.bpid REFERS TO tccom100
ORDER BY tcmcs003._index1
I want to "extract" all the table names and column names, and then simply append my own characters to each of them. For example, replace:
SELECT tcmcs003.*, tccom130.nama
with:
SELECT tcmcs003XXX.*, tccom130XXX.namaYYY
So far, the "best" regex I have is this:
(?<gselect>SELECT\s+)*(?<tname>\w{5}\d{3})*(?<spaces>[\.\,\s])+(?<colname>\w{4})*
And the replacement pattern:
${gselect}${tname}XXX${spaces}${colname}YYY
The output is really terrible :-(
SELECT tcmcs003.
m130
.nama
m705
.dsca
s052
.dsca
m100
.nama
FROM
s003
m130
,m705
,s052
,m100
WHER
s003
.cadr
REFE
m130
s003
How can I write the regex?
I want to repeatedly capture something like:
[(any string)(table name)(\. a dot, or not)(column name)(any string)] (repeated N times)
EDIT
I am writing in C#.
The pattern should be a bit more general than: \b(tc(?:mcs|com)\d{3}XXX.\w+)\b
in the sense that a table name is 5 characters (the first is always a t, followed by 4 arbitrary characters) followed by 3 arbitrary digits,
and a column name is 4 arbitrary characters.
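To make the rules above concrete, here is a minimal sketch of the kind of replacement I am after. It is only an illustration under some assumptions, not a verified solution: I assume the "arbitrary characters" are word characters (\w), that a column name is always exactly 4 word characters directly after the dot, and that anything else after the dot (such as * or _index1) must be left untouched. The names QuerySuffixer and AddSuffixes are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class QuerySuffixer
{
    // Table name: 't' + 4 word chars + 3 digits, as a whole word.
    // Optionally followed by a dot and a 4-character column name.
    // Assumption: "arbitrary characters" means \w (letters, digits, underscore).
    private static readonly Regex TableAndColumn = new Regex(
        @"\b(?<tname>t\w{4}\d{3})\b(?:\.(?<colname>\w{4})\b)?");

    public static string AddSuffixes(string query, string tableSuffix, string columnSuffix)
    {
        // A MatchEvaluator is used instead of a replacement string so that
        // the column suffix is only added when a column name was captured.
        return TableAndColumn.Replace(query, m =>
        {
            string result = m.Groups["tname"].Value + tableSuffix;
            if (m.Groups["colname"].Success)
            {
                result += "." + m.Groups["colname"].Value + columnSuffix;
            }
            return result;
        });
    }
}
```

It would be called as QuerySuffixer.AddSuffixes(query, "XXX", "YYY"). Making the column part an optional group keeps the table name and its column in a single match, so "tcmcs003.*" and "tcmcs003._index1" only get the table suffix.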