
Description:

  1. Background: I am new to Flink and have no prior experience with big data. My knowledge of Java or Linux is also limited.

  2. Requirement: I want to use Flink CDC to run a simple table-synchronization test between SQL Server databases, specifically using a YAML pipeline definition instead of Java code.

  3. Environment Information:

  Name               Version
  Flink              flink-1.19.1
  Flink CDC          flink-cdc-3.1.0
  SQL Server         SQL Server 2022
  JDK                OpenJDK 11.0.23
  Operating system   CentOS 7.9

Steps and Issues Encountered:

  1. First, I verified the environment: using the Flink SQL client, I executed CREATE TABLE and INSERT INTO statements and successfully synchronized data from TEST_FOR_FLINK.dbo.orders to DW.dbo.orders_dw.

  2. Then, following the documentation, I extracted the following archive into the corresponding directories under /flink-1.19.1:

  • flink-cdc-3.1.0-bin.tar.gz

  3. Copied the following files to /flink-1.19.1/lib/:

  • flink-connector-jdbc-3.1.2-1.18.jar
  • mssql-jdbc-12.6.3.jre11.jar
  • flink-sql-connector-sqlserver-cdc-3.1.0.jar

  4. Created the following file, mssql-to-mssql-test01.yaml, in /flink-1.19.1/pipeline/:
source:
  type: sqlserver-cdc
  hostname: x.x.x.x
  port: 1433
  username: sa
  password: xxxxxx
  database: TEST_FOR_FLINK
  tables: dbo.orders
  server-time-zone: UTC

sink:
  type: jdbc
  driver: com.microsoft.sqlserver.jdbc.SQLServerDriver
  url: jdbc:sqlserver://x.x.x.x:1433;databaseName=DW
  username: sa
  password: xxxxxx
  table-name: dbo.orders_dw

pipeline:
  name: Sync MSSQL Database to MSSQL
  parallelism: 2
  5. Execute the following command:
[root@master pipeline]# flink-cdc.sh mssql-to-mssql-test01.yaml

Error Message:


Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "sqlserver-cdc" in the classpath.

Available factory classes are:

        at org.apache.flink.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:62)
        at org.apache.flink.cdc.composer.flink.translator.DataSourceTranslator.translate(DataSourceTranslator.java:47)
        at org.apache.flink.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:101)
        at org.apache.flink.cdc.cli.CliExecutor.run(CliExecutor.java:71)
        at org.apache.flink.cdc.cli.CliFrontend.main(CliFrontend.java:71)

I hope someone can point out what I did wrong. The official documentation does not provide an ETL example for SQL Server to SQL Server using a YAML file. If YAML files cannot be used for this kind of data transfer, where can I find documentation or a Java demo project that uses the Java + Table API for data synchronization?

1 Answer

The sqlserver-cdc pipeline connector is not supported yet; you can follow the PR https://github.com/apache/flink-cdc/pull/3445
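Until that PR lands, the same sync can be expressed with the Table API on top of the existing sqlserver-cdc source connector and JDBC sink (the jars you already placed in /flink-1.19.1/lib/). Below is a minimal sketch, not a definitive implementation: the orders schema is invented for illustration, and the connector option names are assumed from the sqlserver-cdc SQL connector and JDBC connector docs (some CDC versions use a separate 'schema-name' option instead of a schema-qualified 'table-name'), so check them against your version.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MssqlToMssqlSync {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // CDC source over TEST_FOR_FLINK.dbo.orders.
        // The column list is illustrative -- adapt it to the real table.
        tEnv.executeSql(
                "CREATE TABLE orders_src ("
                        + "  order_id INT,"
                        + "  customer STRING,"
                        + "  amount DECIMAL(10, 2),"
                        + "  PRIMARY KEY (order_id) NOT ENFORCED"
                        + ") WITH ("
                        + "  'connector' = 'sqlserver-cdc',"
                        + "  'hostname' = 'x.x.x.x',"
                        + "  'port' = '1433',"
                        + "  'username' = 'sa',"
                        + "  'password' = 'xxxxxx',"
                        + "  'database-name' = 'TEST_FOR_FLINK',"
                        + "  'table-name' = 'dbo.orders'"
                        + ")");

        // JDBC upsert sink into DW.dbo.orders_dw; needs the JDBC connector
        // and the mssql-jdbc driver on the classpath.
        tEnv.executeSql(
                "CREATE TABLE orders_dw ("
                        + "  order_id INT,"
                        + "  customer STRING,"
                        + "  amount DECIMAL(10, 2),"
                        + "  PRIMARY KEY (order_id) NOT ENFORCED"
                        + ") WITH ("
                        + "  'connector' = 'jdbc',"
                        + "  'url' = 'jdbc:sqlserver://x.x.x.x:1433;databaseName=DW',"
                        + "  'driver' = 'com.microsoft.sqlserver.jdbc.SQLServerDriver',"
                        + "  'table-name' = 'dbo.orders_dw',"
                        + "  'username' = 'sa',"
                        + "  'password' = 'xxxxxx'"
                        + ")");

        // INSERT INTO on a streaming source submits a continuous sync job.
        tEnv.executeSql("INSERT INTO orders_dw SELECT * FROM orders_src");
    }
}
```

Package this with the Flink and connector dependencies marked as provided, then submit it with bin/flink run; it replicates what the YAML pipeline would do once the pipeline connector is available.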
