KS DB Merge Tools logo KS DB Merge Tools
Documentation
KS DB Merge Tools for SQLite logo for SQLite
 
KS DB Merge Tools for Oracle logo
KS DB Merge Tools for MySQL logo
MssqlMerge logo
KS DB Merge Tools for PostgreSQL logo
AccdbMerge logo
KS DB Merge Tools for Cross-DBMS logo

Id-Remap merge dialog

This dialog is opened to specify mapping of auto-generated primary key values during merge.

All the examples on this help page are based on the following database schema:

for SQLite, Id-Remap Merge demo schema

All tables except OrderLogToOrderItem have auto-increment primary key and for most of the tables that primary key is used as a foreign key for other tables. For example, tables Client and Order are defined as:

CREATE TABLE "Client" (
	"ClientID" INTEGER PRIMARY KEY AUTOINCREMENT,
	"Name" TEXT NOT NULL,
	"Address" TEXT NOT NULL
);
CREATE TABLE "Order" (
	"OrderID" INTEGER PRIMARY KEY AUTOINCREMENT,
	"ClientID" INTEGER NOT NULL REFERENCES "Client" ("ClientID"),
	"Number" INTEGER NOT NULL,
	"Date" DateTime NOT NULL
);

Application uses Parent/Child terminology to describe relation between these tables. In the example above Client is a parent table for table Order because the last one has foreign key on "ClientID" REFERENCES "Client" ("ClientID"). In turn, the Order table is a parent table for child tables OrderItem and OrderLog.

That's a pretty common design for database model and approach to use auto-incremented or identity columns for primary keys. However, it can produce some complexities if we want to migrate some data from one database to another. For example, we need to import client ABC with ClientID = 3 and all his orders from the source database to a target database which already has ClientID = 3 used for a different client XYZ. We can't just update Name in the Client table because it will damage existing data for XYZ client. New Client table row needs to be created for ABC. Let's say it will get the next available ClientID = 7. Then when we import Order records for that client, we need to change ClientID from 3 to 7 respectively. New Order records will get the new IDs as well and we'll have to change the same way source OrderID values for tables OrderLog and OrderItem. And finally, OrderLogToOrderItem table will have to get new IDs of new OrderLog and OrderItem rows in the target database.

The Id-Remap Merge logic is designed to help with all these old-id-to-new-id mappings during import.

The basic idea of Id-Remap Merge is the following:

  • for each merged table we create a mapping table in the target database, with columns SourceID and TargetID
  • we INSERT parent row without specifying its ID, retrieve its auto-generated ID and INSERT new mapping table row with SourceID equals to parent ID value from source database and TargetID equals to newly generated parent row ID in the target database
  • then we INSERT child table row with replacing parent ID value from original/source to new/target using our mapping table

Such approach has a number of advantages:

  • the result of merge is a reusable script which can be tested first on production-copy database
  • if we leave mapping tables, then we can split merge into multiple steps, for example we can merge the reference/lookup tables as a step #1, verify its result, and then merge operation/activity tables as step #2. In our database model example, we can merge tables User, Client, Product with ProductLog as step #1 and then merge all Order tables as step #2, using mapping tables left from step #1
  • after merge completion, mapping tables can be used to recognize merge rows in the target database, to compare these rows with original rows in the source database, or to cleanup merge result without need to restore the full database backup

Id-Remap Merge is based on the table foreign key dependency levels (DL). Tables on this diagram are divided into groups by DL:

  • tables User, Client and Product have no foreign keys - they have DL = 0
  • tables Order and ProductLog have DL = 1 because they have only DL = 0 parents
  • tables OrderLog and OrderItem have DL = 2 because they are children for parent tables with DL 0 and 1
  • table OrderLogToOrderItem has DL = 3 because it is child for parent tables with DL = 2

Now let's consider some typical scenarios.

Scenario #1: Basic merge of parent+child table pair

Let's merge some rows from parent table Product, with depending rows in child table ProductLog.

Application has two places that can be used to initiate Id-Remap Merge, it is Data diff tab and Batch data diff tab. The difference is:

  • Data Diff tab allows to specify individual parent table records to merge, for example we can merge only ProductIDs 2, 3 and 5. Application will generate a script to INSERT Product rows only for these three IDs and all ProductLog rows related to these IDs
  • Batch Data Diff does not allow to filter by specific IDs but it allows to generate script for multiple parent tables. For example, you can select DL=0 tables User, Client and Product and generate Id-Remap Merge script for the whole database, including Order and Order-dependent data

So, to merge Product rows with dependant ProductLog rows, you need to choose one of two ways:

  • open Data Diff tab for Product table, select required products and click Merge
  • open Batch Data Diff tab, select Product table and click Merge

In both cases the application detects that you're going to merge rows with auto-generated IDs which are used as foreign keys, meaning that you may need to perform Id-Remap Merge. Application asks you about it and you need to select option 'Use Id-Remap Merge' in the opened dialog and click OK. Application the opens Id-Remap Merge configuration dialog:

for SQLite, Id-Remap basic merge

The first table list is a list of all the tables depending on the Product table. That's the 'Merge tables' list. You can see that it contains not only Product and ProductLog but it also contains OrderItem child of the Product parent table (it has column ProductID referencing to Product) and OrderLogToOrderItem as child of OrderItem. To help with understanding of these relations, this grid contains "Direct children" column. Tables in the list are ordered by DL then by name.

The dialog can also contain the list of 'Other parents' table list, in our example it is Order and OrderLog. This configuration allows the use of previously generated mappings. The Merge tables list explains that Order mapping can be required to merge OrderItem, and OrderLog mapping can be required to merge OrderLogToOrderItem table. See Scenario #2 to get more information about how to use Other parents. In this scenario we don't need to merge any information about orders, so we need to unselect the OrderItem table. OrderLogToOrderItem table is unselected automatically because it is depending on OrderItem. All the rules of automatic parent/child selection are listed in the dialog before the Merge tables list.

Dialog mentions that new mapping tables are going to be created and contains an option to remove these tables. Let's leave this option unchecked - we're going to leave these tables for Scenario #2.

Here is the example of the script which can be generated for ProductID = 2 having only one dependant row in the ProductLog:

-- 1) generate mapping tables:

CREATE TABLE "~map_240123_191534_Product" (
	"SourceID" INTEGER PRIMARY KEY, 
	"TargetID" INTEGER NOT NULL);

CREATE TABLE "~map_240123_191534_ProductLog" (
	"SourceID" INTEGER PRIMARY KEY, 
	"TargetID" INTEGER NOT NULL);

-- 2) INSERT parent row and save its ID in the mapping table:

INSERT INTO "Product" (
	"Name")
VALUES (
	'Source-only product 1');
INSERT INTO "~map_240123_191534_Product"
	VALUES (2, last_insert_rowid());

-- 3) INSERT child row and use saved mapping for ProductID:

INSERT INTO "ProductLog" (
	"ProductID",
	"Date",
	"Action")
VALUES (
	(SELECT "TargetID"
		FROM "~map_240123_191534_Product"
		WHERE "SourceID" = 2),
	'2023-01-11',
	'Created');
INSERT INTO "~map_240123_191534_ProductLog"
	VALUES (2, last_insert_rowid());

Generated mapping tables are based on the source table names and have specific table name prefix (~map_YYMMDD_HHMMSS_..), this is done in order to not mix up these tables with others in the tables list. And as we'll see later in Scenario #2, this name format is used to search for suitable existing mapping tables.

Scenario #2: Multi-step merge using existing mapping tables

Now let's assume that both source and target databases have the same User and Client rows, and required Product rows are already merged with Scenario #1. Now we need to merge orders data - Order table and all dependant children tables - OrderItem, OrderLog and OrderLogToOrderItem.

The same way as in scenario #1, select required Order rows from the Data Diff tab or select Order table in the Batch Data Diff and click Merge. In the opened dialog confirm that you need to use Id-Remap Merge. Opened Id-Remap Merge configuration looks like this:

for SQLite, Id-Remap multi-step merge

'Other parents' list has tables Client, Product and User. Client and User have Mapping table = N/A meaning that there were no mapping tables created previously. These N/A mappings are here to point your attention that the data we're going to merge depends on some other auto-generated IDs that are not going to be merged. If you have forgotten to merge these tables, then the merge script will use ClientID and UserID values from the source database, which potentially can be not good because in the target database these IDs can be missing or correspond to different entities. However in our scenario that's not the case because we agreed that User and Client rows are the same in both DBs.

The Product other parent table has Mapping table dropdown shown as "Not selected, 1 table(s) available". We need to select our mapping table generated by Scenario #1.

Let's click OK and we'll get the script. It also contains new mapping tables and INSERT statements for our data. We will not list all the script here, just let's look closer at one of OrderItem INSERT statements using our Other parent mapping:

INSERT INTO "OrderItem" (
	"OrderID",
	"ProductID",
	"Quantity")
VALUES (
	(SELECT "TargetID"
		FROM "~map_240123_194107_Order"
		WHERE "SourceID" = 1),
	(COALESCE ((SELECT "TargetID"
		FROM "~map_240123_191534_Product"
		WHERE "SourceID" = 2), 2)),
	112);
INSERT INTO "~map_240123_194107_OrderItem"
	VALUES (2, last_insert_rowid());

OrderItem.ProductID value of 2 in the source database is transformed into COALESCE((SELECT .. WHERE "SourceID" = 2), 2) - so inserted value is taken from our mapping from Scenario #1 if found, and used source value as is otherwise.

If we don't specify Other parent mapping, then the same OrderItem INSERT is generated as:

INSERT INTO "OrderItem" (
	"OrderID",
	"ProductID",
	"Quantity")
VALUES (
	(SELECT "TargetID"
		FROM "~map_240123_194258_Order"
		WHERE "SourceID" = 1),
	2,
	112);
INSERT INTO "~map_240123_194258_OrderItem"
	VALUES (2, last_insert_rowid());

Scenario #3: Merge All

In this scenario we assume that we need to do the Id-Remap Merge for all the data in the source database: clients, users, products and orders, with all information from parent and child tables.

Open the Batch Data Diff and select all tables with DL = 0: Client, Product and User (for large databases it would make sense to click on the DL column for target database to sort all tables by target DL and get all DL = 0 tables at the top of the list). Then click Merge, in the opened dialog select 'Use Id-Remap Merge' and choose to merge All rows:

for SQLite, Id-Remap merge all, batch data diff

The opened Id-Remap Merge configuration contains all tables that need to be merged, all tables are selected - that's what we need. And since we merge all tables - there are no Other parents, all parent tables are selected to merge:

for SQLite, Id-Remap merge all

Click OK, wait for the script generation and you'll get the result.

Limitations

As always, there are some limitations of using Id-Remap Merge. At first, it is suitable only for parent tables using auto-generated/identity single-column primary keys and child tables have foreign keys referencing these parent primary keys. Next, there are some limitations caused by our application design:

  • To apply Id-Remap Merge for multiple tables from the Batch Data Diff, only tables having the same DL must be selected. That's because the application then asks which records we want to merge - all/new/changed, and this filter is applied only for the top-DL tables of our merge. For all dependant tables selected in the Id-Remap Merge configuration dialog (different from originally selected in the Batch Data Diff) - for these tables application is processing all rows referencing to the parent tables selected in the Batch Data Diff. This could produce a logical conflict between these filters (all/new/changed filter vs parent id filter).
  • Id-Remap Merge can not be applied for tables without DL. DL is missing when the database has circular foreign key references.

Free version limitations:

  • The whole dialog is not available

Last updated: 2024-01-23