Use a view
I've had to tackle exactly this before, and the most reliable way to do it is to replicate to different tables, then create a view to union it all together.
- When configuring replication, you can set each article (on the publisher side) to use a different destination name at the subscriber.
- You can redirect the tables from each location into their own schema. Depending on your existing use of schemas, you could replicate dbo.Transactions to LocationA.Transactions or LocA_dbo.Transactions.
- You can use this feature to rename articles such that dbo.Transactions becomes dbo.Transactions_LocationA.
- As an alternative to renaming, you could replicate each publisher into its own database, which entirely avoids naming conflicts, but potentially introduces some permissions headaches related to cross-database ownership chaining.
- Create a view that does a UNION ALL of all the separate tables (see the sketch after this list).
- This is really just partitioning without using the feature of the same name.
- In the view definition, you can add a constant column identifying the source location, so each row in the resulting view can be traced back to its publisher.
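As a minimal sketch, assuming the LocationA/LocationB schema layout from above and hypothetical columns TransactionID, TransactionDate, and Amount:

```sql
CREATE VIEW dbo.AllTransactions
AS
SELECT 'LocationA' AS SourceLocation,  -- constant column identifying the source
       TransactionID,
       TransactionDate,
       Amount
FROM   LocationA.Transactions
UNION ALL
SELECT 'LocationB' AS SourceLocation,
       TransactionID,
       TransactionDate,
       Amount
FROM   LocationB.Transactions;
```

Note the explicit column lists; more on that below.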
Some caution
In the above plan, I'd suggest you make sure you avoid SELECT * in the view, for all the usual reasons. If a schema change is made to different publishers at different times, the view will likely be broken from the time the first table's schema is changed until the final one is changed. Instead, explicitly list the columns and only update the view once the schema change is everywhere.
The same schema change considerations need to be made when replicating into a single table, as well. Though in that case, it's more likely to break replication delivery itself, rather than just breaking the view.
Many-to-one Replication
The way the Snapshot Agent works is that it essentially just automates using BCP to export from the publisher and import to the subscriber. The default option is to truncate and reload when you re-initialize a publication. You can also change it to use delete instead of truncate, but that will issue a single, unbatched DELETE statement, which can cause blocking and transaction log bloat.
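The pre-snapshot behavior is set per article. A sketch using sp_addarticle, with placeholder publication and article names:

```sql
EXEC sp_addarticle
     @publication      = N'PubLocationA',  -- placeholder
     @article          = N'Transactions',  -- placeholder
     @source_owner     = N'dbo',
     @source_object    = N'Transactions',
     @pre_creation_cmd = N'delete';        -- vs. N'truncate', N'drop', N'none'
```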
If your multiple publishers have overlapping PKs, then you'll need to uniquify them, much as you suggest. However, this can impact performance, potentially at significant cost. In addition to the size consideration of adding the column to every PK, if your PK is also your clustered index, the uniquifying column will also be included in every nonclustered index.
You'll also need to ensure the uniquifying column is added to the END of the PK definition, so as not to break SARGability of existing queries. However, even if you do this, you may notice changes in query plans that cause performance regressions.
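A sketch of that change on each publisher, with a hypothetical LocationID column and illustrative constraint names:

```sql
-- Add the distinguishing column; the default value differs per publisher.
ALTER TABLE dbo.Transactions
    ADD LocationID tinyint NOT NULL
        CONSTRAINT DF_Transactions_LocationID DEFAULT (1);

-- Rebuild the PK with the new column LAST, so existing seeks on
-- TransactionID alone remain SARGable.
ALTER TABLE dbo.Transactions
    DROP CONSTRAINT PK_Transactions;

ALTER TABLE dbo.Transactions
    ADD CONSTRAINT PK_Transactions
        PRIMARY KEY CLUSTERED (TransactionID, LocationID);
```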
The query optimizer knows that if ID is a single-column PK, then ID = @id will return at most a single row. The same cardinality rules are used during optimization of set-based queries and joins. Thus you may start seeing changes in query plans where a 1:1 join is now interpreted as a 1:many join. This can be further mitigated by adding a unique index on the "old" PK. You may even choose to keep the "old" PK as a unique clustered index and make the "new" PK nonclustered.
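Sketching that alternative on a publisher (where TransactionID is still unique on its own):

```sql
-- Keep the narrow key clustered and unique, so the optimizer still
-- treats TransactionID = @id as returning at most one row...
CREATE UNIQUE CLUSTERED INDEX UQ_Transactions_TransactionID
    ON dbo.Transactions (TransactionID);

-- ...and carry the widened key as a nonclustered PK.
ALTER TABLE dbo.Transactions
    ADD CONSTRAINT PK_Transactions
        PRIMARY KEY NONCLUSTERED (TransactionID, LocationID);
```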
The various challenges with replicating from many publishers to a single subscriber table make it a difficult solution to implement and operate. It requires significant changes to the publisher databases. I would recommend against this option, except in greenfield development where the schema and performance can be taken into consideration from the start.
Additionally, the inevitable need to re-snapshot a publisher means carefully deleting just the appropriate rows from the subscriber. Using partitioning with a partition per publisher can help here (see the sketch below), but introduces a different set of complications. IMHO, the pseudo-partitioned view is the easier solution to manage long term.
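For completeness, a sketch of the partition-per-publisher cleanup, assuming SQL Server 2016+ and the hypothetical LocationID column (the subscriber table would need to be created on the partition scheme):

```sql
-- One partition per publisher, keyed on LocationID.
CREATE PARTITION FUNCTION pfLocation (tinyint)
    AS RANGE LEFT FOR VALUES (1, 2, 3);

CREATE PARTITION SCHEME psLocation
    AS PARTITION pfLocation ALL TO ([PRIMARY]);

-- SQL Server 2016+ can truncate a single partition, avoiding the
-- unbatched DELETE when re-snapshotting one publisher
-- (LocationID = 1 maps to partition 1 here).
TRUNCATE TABLE dbo.Transactions WITH (PARTITIONS (1));
```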
Replicating to unique targets ensures that the publishers don't need significant changes and testing, and it eases the ongoing support burden involved in maintaining a single many-to-one replication target.