23. Comparing databases

23. Comparing databases
Prev		Next

23.1. Compare two database schemas - WbSchemaDiff
23.2. Compare data across databases - WbDataDiff

There are two SQL Workbench/J specific commands that can compare either the structure of two databases or the data contained in them.

These commands (WbSchemaDiff and WbDataDiff) can be used like any other SQL command as long as they are run using SQL Workbench/J This includes the usage in scripts that are run in batch mode.

23.1. Compare two database schemas - WbSchemaDiff

WbSchemaDiff analyzes two schemas (or a list of tables) and outputs the differences between those schemas as an XML file. The XML file describes the changes that need to be applied to the target schema to have the same structure as the reference schema, e.g. modify column definitions, remove or add tables, remove or add indexes.

The output is intended to be transformed using XSLT (e.g. with the WbXSLT Command). Sample XSLT transformations are stored in the xslt subdirectory of the SQL Workbench/J installation directory. All scripts that are part of the download can also be found on the SQL Workbench/J homepage


	This feature should only be considered as a one-off solution to quickly compare to database schemas. Is not intended to replace a proper schema (script) management. You should consider tools like Liquibase or Flyway to manage a database schema. Those scripts should also be stored in a version control system (Subversion, GIT, ...)

The command supports the following parameters:

Parameter	Description
-referenceProfile	The name of the connection profile for the reference connection. If this is not specified, then the current connection is used.
-referenceGroup	If the name of your reference profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter.
-referenceConnection	Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile). The connection is specified with a comma separated list of key value pairs: `username` - the username for the connection `password` - the password for the connection `url` - the JDBC URL `driver` - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL `driverJar` - the full path to the JDBC driver. This not required if a driver for the specified class is already configured e.g.: `"username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"` For a sample connection string please see the documentation for WbCopy. If this parameter is specified `-referenceProfile` will be ignored.
-targetProfile	The name of the connection profile for the target connection (the one that needs to be migrated). If this is not specified, then the current connection is used. If you use the current connection for reference and target, then you should prefix the table names with schema/user or use the `-referenceschema` and `-targetschema` parameters.
-targetGroup	If the name of your target profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter.
-targetConnection	Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile). The connection is specified with a comma separated list of key value pairs: `username` - the username for the connection `password` - the password for the connection `url` - the JDBC URL `driver` - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL `driverJar` - the full path to the JDBC driver. This not required if a driver for the specified class is already configured e.g.: `"username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"` For a sample connection string please see the documentation for WbConnect. If this parameter is specified `-targetProfile` will be ignored.
-file	The filename of the output file. If this is not supplied the output will be written to the message area
-referenceTables	A (comma separated) list of tables that are the reference tables, to be checked.
-targetTables	A (comma separated) list of tables in the target connection to be compared to the source tables. The tables are "matched" by their position in the list. The first table in the `-referenceTables` parameter is compared to the first table in the `-targetTables` parameter, and so on. Using this parameter you can compare tables that do not have the same name. If you omit this parameter, then all tables from the target connection with the same names as those listed in `-referenceTables` are compared. If you omit both parameters, then all tables that the user can access are retrieved from the source connection and compared to the tables with the same name in the target connection.
-referenceSchema	Compare all tables from the specified schema (user)
-targetSchema	A schema in the target connection to be compared to the tables from the reference schema.
-excludeTables	A comma separated list of tables that should not be compared. If tables from several schemas are compared (using `-referenceTables=schema_one., schema_two.`) then the listed tables must be qualified with a schema, e.g. `-excludeTables=schema_one.foobar, schema_two.fubar`
-encoding	The encoding to be used for the XML file. The default is UTF-8
-includePrimaryKeys	Select whether primary key constraint definitions should be compared as well. The default is `true`. Valid values are `true` or `false`.
-includeForeignKeys	Select whether foreign key constraint definitions should be compared as well. The default is `true`. Valid values are `true` or `false`.
-includeTableGrants	Select whether table grants should be compared as well. The default is `false`.
-includeTriggers	Select whether table triggers are compared as well. The default value is `true`.
-includeConstraints	Select whether table and column (check) constraints should be compared as well. SQL Workbench/J compares the constraint definition (SQL) as stored in the database. The default is to compare table constraints (`true`) Valid values are `true` or `false`.
-useConstraintNames	When including check constraints this parameter controls whether constraints should be matched by name, or only by their expression. If comparing by names the diff output will contain elements for constraint modification otherwise only drop and add entries will be available. The default is to compare by names(`true`) Valid values are `true` or `false`.
-includeViews	Select whether views should also be compared. Note that this comparison is very unreliable, because this compares the source code, not the logical representation of the view definition. The source code is compared the way it is returned by the DBMS is compared. This comparison is case-sensitiv, which means `SELECT * FROM foo;` will be reported as a difference to `select * from foo;` even if they are logically the same. Aslo different indention or empty lines will result in the views being reported as "different". A comparison across different DBMS will not work. The default is `true` Valid values are `true` or `false`.
-includeProcedures	Select whether stored procedures should also be compared. When comparing procedures the source as it is stored in the DBMS is compared. This comparison is case-sensitive. A comparison across different DBMS will also not work! The default is `false` Valid values are `true` or `false`.
-includeIndex	Select whether indexes should be compared as well. The default is to not compare index definitions. Valid values are `true` or `false`.
-includeSequences	Select whether sequences should be compared as well. The default is to not compare sequences. Valid values are `true`, `false`.
-useJdbcTypes	Define whether to compare the DBMS specific data types, or the JDBC data type returned by the driver. When comparing tables from two different DBMS it is recommended to use `-useJdbcType=true` as this will make the comparison a bit more DBMS-independent. When comparing e.g. Oracle vs. PostgreSQL a column defined as `VARCHAR2(100)` in Oracle would be reported as being different to a `VARCHAR(100)` column in PostgreSQL which is not really true As both drivers report the column as java.sql.Types.VARCHAR, they would be considered as identical when using `-useJdbcType=true`. Valid values are `true` or `false`.
-additionalTypes	Select additional object types that are not compared by default (using the `-includeXXX` parameters) such as Oracle `TYPE` definitions. Those objects are compared on source code level (like procedures) rather than on attribute level. Valid values are object type names as shown in the "Type" drop down in the DbExplorer.
-xsltParameter	A list of parameters (key/value pairs) that should be passed to the XSLT processor. When using e.g. the `wbreport2liquibase.xslt` stylesheet, the value of the `author` attribute can be set using `-xsltParameter="authorName=42"`. This parameter can be provided multiple times for multiple parameters, e.g. when using `wbreport2pg.xslt`: `-xsltParameter="makeLowerCase=42" -xsltParameter="useJdbcTypes=true"`

WbSchemaDiff Examples

Compare all tables between two connections, and write the output to the file migrate_prod.xml and convert the XML to a series of SQL statements for PostgreSQL

WbSchemaDiff -referenceProfile="Staging"
             -targetProfile="Production"
             -file=migrate_prod.xml
             -styleSheet=wbdiff2pg.xslt
             -xsltOutput=migrate_prod.sql

Compare a list of matching tables between two databases and write the output to the file migrate_staging.xml ignoring all tables that start with TMP_ and exclude any index definition from the comparison. Convert the output to a SQL script for Oracle

WbSchemaDiff -referenceProfile="Development"
             -targetProfile="Staging"
             -file=migrate_stage.xml
             -excludeTables=TMP_*
             -includeIndex=false
             -styleSheet=wbdiff2oracle.xslt
             -xsltOutput=migrate_stage.sql

23.2. Compare data across databases - WbDataDiff

The WbDataDiff command can be used to generate SQL scripts that update a target database such that the data is identical to a reference database. This is similar to the WbSchemaDiff but compares the actual data in the tables rather than the table structure.

For each table the command will create up to three script files, depending on the needed statements to migrate the data. One file for UPDATE statements, one file for INSERT statements and one file for DELETE statements (if -includeDelete=true is specified)


	As this command needs to read every row from the reference and the target table, processing large tables can take quite some time, especially if `DELETE` statements should also be generated.

WbDataDiff requires that all involved tables have a primary key defined. If a table does not have a primary key, WbDataDiff will stop the processing.

To improve performance (a bit), the rows are retrieved in chunks from the target table by dynamically constructing a WHERE clause for the rows that were retrieved from the reference table. The chunk size can be controlled using the property workbench.sql.sync.chunksize The chunk size defaults to 25. This is a conservative setting to avoid problems with long SQL statements when processing tables that have a PK with multiple columns. If you know that your primary keys consist only of a single column and the values won't be too long, you can increase the chunk size, possibly increasing the performance when generating the SQL statements. As most DBMS have a limit on the length of a single SQL statement, be careful when setting the chunksize too high. The same chunk size is applied when generating DELETE statements by the WbCopy command, when syncDelete mode is enabled.

The command supports the following parameters:

Parameter	Description
-referenceProfile	The name of the connection profile for the reference connection. If this is not specified, then the current connection is used.
-referenceGroup	If the name of your reference profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter. If the profile's name is unique you can omit this parameter
-referenceConnection	Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile). The connection is specified with a comma separated list of key value pairs: `username` - the username for the connection `password` - the password for the connection `url` - the JDBC URL `driver` - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL `driverJar` - the full path to the JDBC driver. This not required if a driver for the specified class is already configured e.g.: `"username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"` For a sample connection string please see the documentation for WbCopy. If this parameter is specified `-referenceProfile` will be ignored.
-targetProfile	The name of the connection profile for the target connection (the one that needs to be migrated). If this is not specified, then the current connection is used. If you use the current connection for reference and target, then you should prefix the table names with schema/user or use the `-referenceschema` and `-targetschema` parameters.
-targetGroup	If the name of your target profile is not unique across all profiles, you will need to specify the group in which the profile is located with this parameter.
-targetConnection	Allows to specify a full connection definition as a single parameter (and thus does not require a pre-defined connection profile). The connection is specified with a comma separated list of key value pairs: `username` - the username for the connection `password` - the password for the connection `url` - the JDBC URL `driver` - the class name for the JDBC driver. If this is not specified, SQL Workbench/J will try to determine the driver from the JDBC URL `driverJar` - the full path to the JDBC driver. This not required if a driver for the specified class is already configured e.g.: `"username=foo,password=bar,url=jdbc:postgresql://localhost/mydb"` For a sample connection string please see the documentation for WbConnect. If this parameter is specified `-targetProfile` will be ignored.
-file	The filename of the main script file. The command creates two scripts per table. One script named `update_<tablename>.sql` that contains all needed `UPDATE` or `INSERT` statements. The second script is named `delete_<tablename>.sql` and will contain all `DELETE` statements for the target table. The main script merely calls (using WbInclude) the generated scripts for each table. You can enable writing a single file that includes all statements for all tables by using the parameter `-singleFile=true`
-singleFile	If this parameter's value is `true`, then only one single file containing all statements will be written.
-referenceTables	A (comma separated) list of tables that are the reference tables, to be checked. You can specify the table with wildcards, e.g. `-referenceTables=P%` to compare all tables that start with the letter `P`.
-targetTables	A (comma separated) list of tables in the target connection to be compared to the source tables. The tables are "matched" by their position in the list. The first table in the `-referenceTables` parameter is compared to the first table in the `-targetTables` parameter, and so on. Using this parameter you can compare tables that do not have the same name. If you omit this parameter, then all tables from the target connection with the same names as those listed in `-referenceTables` are compared. If you omit both parameters, then all tables that the user can access are retrieved from the source connection and compared to the tables with the same name in the target connection.
-referenceSchema	Compare all tables from the specified schema (user)
-targetSchema	A schema in the target connection to be compared to the tables from the reference schema.
-excludeTables	A comma separated list of tables that should not be compared. If tables from several schemas are compared (using `-referenceTables=schema_one., schema_two.`) then the listed tables must be qualified with a schema, e.g. `-excludeTables=schema_one.foobar, schema_two.fubar`
-checkDependencies	Valid values are `true`, `false`. Sorts the generated scripts in order to respect foreign key dependencies for deleting and inserting rows. The default is `true`.
-includeDelete	Valid values are `true`, `false`. Generates `DELETE` statements for rows that are present in the target table, but not in the reference table. Note that enabling this option will result in additional overhead reading the target table's data and will slow down the overal comparison speed. The default is `false`.
-type	Valid values are `sql`, `xml` Defines the type of the generated files.
-encoding	The encoding to be used for the SQL scripts. The default depends on your operating system. It will be displayed when you run `WbDataDiff` without any parameters. You can overwrite the platform default with the property `workbench.encoding` in the file `workbench.settings` XML files are always stored in UTF-8
-sqlDateLiterals	Valid values: `jdbc`, `ansi`, `dbms`, `default` Controls the format in which the values of DATE, TIME and TIMESTAMP columns are written into the generated SQL statements. For a detailed description of the possible values, please refer to the WbExport command.
-ignoreColumns	With this parameter you can define a list of column names that should not be considered when comparing data. You can e.g. exclude columns that store the last access time of a row, or the last update time if that should not be taken into account when checking for changes. They will however be part of generated `INSERT` or `UPDATE` statements unless `-exclueIgnored=true` is also specified.
-excludeIgnored	Valid values: `true`, `false` If this is set to `true`, the ignored columns will also be removed from any `INSERT` or `UPDATE` statement The default is `false`.
-alternateKey	With this parameter alternate keys can be defined for the tables that are compared. The parameter can be repeated multiple times to set the keys for multiple tables in the following format: `-alternateKey='table_1=column_1,column_2'` Note that each value has to be enclosed in either single or double quotes to mask the equals sign embedded in the parameter value. Once an alternate (primary) key has been defined, the primary key columns defined on the tables are ignored. By default the real PK columns will however be included in `INSERT` statement that are generated. To avoid this, set the parameter `-excludeRealPK` to true.
-excludeRealPK	Valid values are `true`, `false`. This parameter controls the usage of the real PK columns in case alternate PK columns are defined. If set to `true` the real PK columns are excluded from generated `INSERT` statements (assuming that a new PK value will be generated during inserting the rows). Note that this parameter will enable/disable the use of the real PK columns for all tables for which alternate key columns were defined. This parameter has no effect if no alternate keys were specified using the `-alternateKey` option.
-showProgress	Valid values: `true`, `false`, `<numeric value>` Control the update frequence in the status bar (when running in GUI mode). The default is every 10th row is reported. To disable the display of the progress specify a value of 0 (zero) or the value `false`. `true` will set the progress interval to 1 (one).

WbDataDiff Examples

Compare all tables between two connections, and write the output to the file migrate_staging.sql, but do not generate DELETE statements.

WbDataDiff -referenceProfile="Production"
           -targetProfile="Staging"
           -file=migrate_staging.sql
           -includeDelete=false

Compare a list of matching tables between two databases and write the output to the file migrate_staging.sql including DELETE statements.

WbDataDiff -referenceProfile="Production"
           -targetProfile="Staging"
           -referenceTables=person,address,person_address
           -file=migrate_staging.sql
           -includeDelete=true

Compare three tables that are differently named in the target database and ignore all columns (regardless in which table they appear) that are named LAST_ACCESS or LAST_UPDATE

WbDataDiff -referenceProfile="Production"
           -targetProfile="Staging"
           -referenceTables=person,address,person_address
           -targetTables=t_person,t_address,t_person_address
           -ignoreColumns=last_access,last_update
           -file=migrate_staging.sql
           -includeDelete=true

Prev		Next
22. Copy data across databases	Home	24. Other SQL Workbench/J specific commands