I have several thousand *.sql files, all containing differently formatted T-SQL text, and I want to match all the schema names and object names in those files.
I have identified the following possible formats of schema name + object name:
- dbo.[MainPart.SubPartA.SubPartB.SubPartC]
- [dbo].[MainPart.SubPartA.SubPartB.SubPartC]
- specialschema.[MainPart.SubPartA.SubPartB.SubPartC]
- specialschema.MainPart
- [MainPart.SubPartA.SubPartB.SubPartC]
I already created a regex that handles the first four cases:
(\[{0,1}(?<schema>\b\w*?\b){0,1}\]{0,1}\.){0,1}\[{0,1}(?<object>(\w|\.)+)\]{0,1}
It creates two named groups, "schema" and "object", for every match.
The problem is the last case: there the regex reports schema="MainPart" and object="SubPartA.SubPartB.SubPartC", whereas the whole bracketed name should be the object and the schema should be empty.
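To make the misgrouping easy to reproduce outside my tooling, here is a minimal sketch in Python. It assumes the pattern behaves the same once the .NET-style `(?<name>...)` groups are rewritten as Python's `(?P<name>...)`; the helper name `split_name` is made up for this illustration.

```python
import re

# Python translation of the pattern above: (?<name>...) becomes (?P<name>...).
ORIGINAL = re.compile(
    r"(\[{0,1}(?P<schema>\b\w*?\b){0,1}\]{0,1}\.){0,1}"
    r"\[{0,1}(?P<object>(\w|\.)+)\]{0,1}"
)

def split_name(text):
    """Return (schema, object) as captured by the original pattern."""
    m = ORIGINAL.match(text)
    return m.group("schema"), m.group("object")

# Cases 1-4 split at the dot that follows the schema, as intended.
print(split_name("dbo.[Table]"))
# Case 5 has no schema, but the optional schema part still grabs the
# first dotted segment inside the brackets.
print(split_name("[MainPart.SubPartA.SubPartB.SubPartC]"))
```

The optional schema prefix is free to consume `[MainPart.`, because nothing in the pattern ties an opening bracket to its closing bracket.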
At the moment I am considering breaking the regex into several parts to make it simpler (and more readable), because I already have the match; only the groups are wrong.
Or is there another regex technique to get the correct groups for all five cases (and still maintain or even improve the readability)?
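One direction I can imagine, sketched in Python and not verified against the real files: use alternation so that a bracketed name must close its own bracket, and only allow dots inside a bracketed object. The group names `schema_q`, `schema_b`, `object_q`, `object_b` and the helper `parse_name` are hypothetical; Python needs distinct names per alternative, so they are merged afterwards. The sketch is meant for a single candidate name, not for scanning a whole file.

```python
import re

NAME_RE = re.compile(r"""
    (?:                               # optional schema part, ended by a dot
        (?: \[ (?P<schema_q>\w+) \]   #   bracketed schema:  [dbo]
          | (?P<schema_b>\w+) )       #   bare schema:       dbo
        \.
    )?
    (?: \[ (?P<object_q>[\w.]+) \]    # bracketed object, dots allowed inside
      | (?P<object_b>\w+) )           # bare object, no dots
""", re.VERBOSE)

def parse_name(text):
    """Return (schema, object) for one candidate name; schema may be None."""
    m = NAME_RE.fullmatch(text)
    if m is None:
        return None
    schema = m.group("schema_q") or m.group("schema_b")
    obj = m.group("object_q") or m.group("object_b")
    return schema, obj

for candidate in (
    "dbo.[MainPart.SubPartA.SubPartB.SubPartC]",
    "[dbo].[MainPart.SubPartA.SubPartB.SubPartC]",
    "specialschema.[MainPart.SubPartA.SubPartB.SubPartC]",
    "specialschema.MainPart",
    "[MainPart.SubPartA.SubPartB.SubPartC]",
):
    print(candidate, "->", parse_name(candidate))
```

Because `[MainPart` is never followed directly by `]`, the schema alternative fails for the fifth case and the whole bracketed text falls through to the object group. The verbose layout with comments is also an attempt at the readability part of the question.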
An example of a SQL file:
/******************************************************************
Comment block
******************************************************************/
CREATE PROCEDURE [MainPart.SubPartA.SubPartB.SubPartC]
@Param INT = NULL
AS
BEGIN
SET NOCOUNT ON
SELECT * FROM dbo.[Table] WHERE fldParam = @Param
SET NOCOUNT OFF
END
GO