Data Copy Strategy

Replicate Source Database Structure at Destination

The default configuration used by Insert Bulk Bestia is based on the fact that the structure of the source and destination databases is the same. Thus, for the origin it will generate SELECT clauses for each of the tables that will contain all the fields, the order being defined in "sys.columns". For the destination, it will also generate SELECT clauses in the same way. The field matching performed in the SqlBulkCopy statement will be following the order in which the fields appear in both SELECT clauses, not by the name of the fields.

If structure modifications have been made to the target database due to continuous developments, Insert Bulk Bestia does not have enough knowledge to find out the correspondence between the source and target structures. Thus, it must be the application operator that must indicate the relationships by properly modifying the SELECTs described above.

The default behavior of Insert Bulk Beast is appropriate when the target database is a replica of the source database. When will this not happen? This will usually happen when it can no longer be guaranteed that what exists in "production" is the same as what exists in "test" and "development" environments. In those cases, it is best to remove the "test" and "development" databases and start from scratch to create the databases.

Thus, the first step will be to replicate the structure of the production database in the rest of the environments. To do this, you can make use of the resources that SQL Server Management Studio makes available to the application operator. The following figures describe the process to follow to obtain a replica of the source database structure in the target environments.

Note: Although the images show the SSMS interface in English, on certain occasions the corresponding image in Spanish will also be shown in order to facilitate the selection of options for users who very intelligently choose the use and promotion of this language.

First, it shows how to start the wizard that will help to obtain a script with the structure of the source database:

Generación del Script con la Estructura de la Base de Datos - 1

Generación del Script con la Estructura de la Base de Datos - 1

After selecting the script generation for the entire database, it is recommended to select the advanced options (General tab) shown in the following figure. Some of them are described for their special importance.

Generación del Script con la Estructura de la Base de Datos - 2

Generación del Script con la Estructura de la Base de Datos - 2

Generación del Script con la Estructura de la Base de Datos - 2 - ES

  • Include descriptive headers: It is convenient not to select this option, as it will facilitate the comparison of the generated script with other previously generated ones to verify the modifications made.
  • Script collation: On many occasions, the installations of SQL Server instances are not usually the same in all environments, so it is convenient to ensure that in all environments the definition of the tables is the same .
  • Include system constraint names: Although all elements created by Insert Bulk Beast have names, it is customary for programmers to let SQL Server "free will" assign automatically names certain objects, such as constraints. If the business logic requires the disabling of any of them and later their rehabilitation, it is convenient that the name given in the rest of the environments is the same as the one given in production.
  • Script extended properties: Insert Bulk Bestia makes use of these properties to indicate the version of the objects created and used by the application. If these properties are not included, the objects will be deleted and created again, thus the information stored in them will be lost.
  • Types of data to script: Since you only want to recreate the structure of the database, you must choose the "Schema only" option.

In the "Table/View Options" tab, the recommended selection is as shown below:

Generación del Script con la Estructura de la Base de Datos - 3

Generación del Script con la Estructura de la Base de Datos - 3

Generación del Script con la Estructura de la Base de Datos - 3 - ES

  • Script change tracking: Since Insert Bulk Bestia is going to unmount part of the database structure and remount it, it is recommended to deactivate this option if the user used for this process does not have the proper permissions. The permissions that this user needs will be detailed later, as long as change tracking is not activated.

Generación del Script con la Estructura de la Base de Datos - 4

Generación del Script con la Estructura de la Base de Datos - 4

Once the script is generated, it must be adapted to the environment to which it is directed. In the previous figure it has been wanted to highlight that normally the disk structure of the different environments is not the same, so the paths for creating files must be adapted to the environment.

Finally, indicate that if you want another name for the database, you simply have to find and replace that name in the script with the desired one.

Data Copy Strategy

The idea is quite simple, which always leads to great complications. Broadly speaking, the steps to follow are:

  1. Reproduce the structure of the database to be "copied" in the target environments.
  2. Install and License Insert Bulk Beast on an intermediate machine so that it does not unnecessarily load the source and destination servers.
  3. Make a first copy from the source to the first destination. This copy can be configured so that it does not overload the source server, in the event that this is the production server and is ONLINE.
  4. Make the rest of the copies from the first destination to the rest of the destinations, now taking advantage of the maximum available resources.

Estrategia de Copia de Datos - 1

Estrategia de Copia de Datos - 1

Data Copy Process

As has already been announced, simple ideas, ..., involve a lot of work and effort. Thus, in order for the data copying process to be as automatic as possible, setting up and making the first of the copies is somewhat more complicated than the rest of the copies, since the rest of the copies will be based on this first configuration.

To do this, Insert Bulk Beast is designed to create certain objects in the source database that have their replica in the target database. The objects that Insert Bulk Bestia creates in the origin are:

Finally, there are two objects, which are also created at the source, and whose operating strategy should be detailed so that the operator understands what possibilities they have when programming custom actions so that the data load adjusts to their needs:

  • Load Tables: ibb.Cargar_Tablas, ibb.Cargar_Tablas_Secuencias and auxiliary tables. In the first one, the structure of the source and destination databases necessary for loading data is stored. In the second, the existing sequences in the source and destination are stored.
    • In ibb.Cargar_Tablas, SELECT and WHERE clauses are saved that indicate the data to be copied from the source and where it should be copied to the destination. In addition, it contains the classification of the tables and their segmentation.
    • In ibb.Cargar_Tablas_Secuencias, Existing sequences are listed and how they must be restarted so that once the data is copied, those sequences will work properly.

During the data load, the structure of the source database will be read, it will be saved in the objects of the said database, said information will be transferred to the destination database and there it will be updated with the information of the structure of the target database.

Estructura de Origen y Destino - 1

Estructura de Origen y Destino - 1

  • 1- Reading Source Structure

    Reading the structure of the Origin Database. These steps are executed if in the origin parameterization the option is checked Actualizar Lista de Tablas, and of course the option Crear Objetos en Origen:

    • Inserted in ibb.Cargar_Tablas everything that is not stored. Fill in the SELECT fields for both the source and destination. By default, the field 'Table Classification' is filled with 0.
    • Update on ibb.Cargar_Tablas the SELECT fields of the source with the modifications that the source database has undergone. This step corresponds to the configuration option of the origin Actualizar Cláusula SELECT.
    • Remove from ibb.Cargar_Tablas those records corresponding to tables that have disappeared at the source.
    • Updates the number of rows in each table, the space taken up by the data in each table, and the space taken up by the indexes on each table. Update the field 'ClassificacionTabla' to 255 for those tables with "durability = 1".
  • 2- Copy Source Structure at Destination

    The table is always truncated ibb.Cargar_Tablas_Copia_Origen of destiny.

    If in the parameterization of origin the option Copiar Lista Tablas al Destino:

  • 3- Destination Update with Source Structure

    It must be taken into account that if in the parameterization of the origin, the option has not been marked Crear Objetos en Origen or the option has not been marked Copiar Lista Tablas al Destino, the table ibb.Cargar_Tablas_Copia_Origen will be empty.

    These steps are executed if the destination parameter is marked Actualizar Lista de Tablas. If this is the first time the configuration is run, and the table ibb.Cargar_Tablas_Copia_Origen is empty, absolutely nothing will be done, because the table ibb.Cargar_Tablas it will indicate that there are no tables to transfer when empty. If, on the contrary, that table already contains information, and the option Actualizar Lista de Tablas of the origin and destination is unchecked, said table will not be updated, and the configuration will be executed with the information that said table contains from previous loads.

    • Insert in ibb.Cargar_Tablas the records that correspond to those tables that are in ibb.Cargar_Tablas_Copia_Origen and they are not in ibb.Cargar_Tablas and that belong to the source database. But in addition, these tables must also exist in the destination. This insert includes the SELECT and WHERE fields of the source and the SELECT and WHERE fields of the destination but obtained from the source, that is, the SELECT and WHERE fields of the destination are the same as those of the source.
    • Update in ibb.Cargar_Tablas SELECT and WHERE fields with the contents of ibb.Cargar_Tablas_Copia_Origen for the origin in those tables that these fields are not equal. In other words, on the target, the structure corresponding to the source database that has changed since the last time the configuration was run is updated. This step is only executed if the origin configuration option is checked Actualizar Cláusula SELECT.
    • Remove from ibb.Cargar_Tablas those records corresponding to tables that have disappeared in the ibb.Cargar_Tablas_Copia_Origen, that is, those tables that have been dropped in the source database since the last time the configuration was run.
  • 4- Destination Update with Target Structure

    These steps are executed if the destination parameter is marked Actualizar Lista de Tablas.

    • Insert in ibb.Cargar_Tablas the records that correspond to those tables that are in the target database and are also in the table ibb.Cargar_Tablas_Copia_Origen O well, ibb.Cargar_Tablas_Copia_Origen está vacía. In other words: If ibb.Cargar_Tablas_Copia_Origen it is not empty, it will do nothing. But if that table is empty, it will infer the structure of the source database from the structure of the target database. Bear in mind that it is assumed that both databases have the same structure or are very similar.
    • Update in ibb.Cargar_Tablas the SELECT fields of the target with the modifications that the target database has undergone. This step corresponds to the configuration option for the destination Actualizar Cláusula SELECT
    • Delete from ibb.Cargar_Tablas those records corresponding to tables that have disappeared in the destination.
    • Updates the "ClassificacionTabla" field, the number of rows in each table, the space occupied by the data in each table and the space occupied by the indexes of each table from the content in ibb.Cargar_Tablas_Copia_Origen.

Advantages and Disadvantages

The main advantage is that with this strategy, the operator can have centralized in the main database all the customizations that make the application fit his needs. Once performed, each time you regenerate the structure of the target databases, it will take all the customizations with it, thus avoiding having to modify the behavior of Insert Bulk Bestia each time you execute a load of data.

Conversely, if you cannot modify the main database, you will have to use alternative strategies to preserve your customization, such as saving the scripts with the customizations and running them after a restore of the target database.

This strategy will be modified in the future to accommodate other data sources, such as Oracle, for example.

Data Copy Algorithm

The steps that Insert Bulk Bestia performs during the entire process of executing a given configuration are as follows:

  1. Checks the existence of the objects used by Insert Bulk Beast in the origin.
    • If they don't exist, they are created.
    • If they have a lower version than the one required by the application, update them if possible, or remove them and replace them with the new ones.
  2. Updates the content of these objects with the structure of the source database.
  3. Run the Custom Actions for the Origin.
    • Execute the stored procedure ibb.Cargar_Tablas_Acciones_Programables_Origen.
      • The objective of the actions programmed in this stored procedure must be the automatic "fine" adjustment of the SELECT and WHERE for the source database, as well as the automatic classification of the tables.
  4. Checks the existence of the objects used by Insert Bulk Beast in the destination.
    • If they don't exist, they are created.
    • If they have a lower version than the one required by the application, update them if possible, or remove them and replace them with the new ones.
  5. Truncate the tables ibb.Cargar_Tablas_Secuencias_Copia_Origen and ibb.Cargar_Tablas_Copia_Origen en el destino.
  6. It transfers the structure information from the source database to the destination.
  7. Update the tables ibb.Cargar_Tablas_Secuencias and ibb.Cargar_Tablas of the destination with the structure of the origin and destination as described in the previous section.
  8. For the execution of the configuration if the user has indicated it or an unexpected event has happened or there are new sequences.
    • This will allow the user, if it is the first time it is executed, to review the suitability of the data transfer, to program the custom actions and to indicate how the new sequences should be initialized.
    • If it was not the first time, you should not check the option to stop the execution, and it will only stop if new sequences have appeared, to give the user the opportunity to define their initialization.
    • After carrying out the appropriate actions, the user can re-run the configuration, this time without checking the option of Stop Tras Actualizar Tablas.
  9. Run the Initial Custom Actions on the Destination.
    • Execute the stored procedure ibb.Cargar_Tablas_Acciones_Programables_Destino_Iniciales
      • The objective of the actions programmed in this stored procedure should be the automatic "fine" adjustment of the SELECT and WHERE for the target database, as well as the automatic classification of the tables.
  10. Prepare the table ibb.Cargar_Tablas for the execution of the data transfer.
  11. Creates the objects required to dismount and mount the target database structure.
  12. Unmount the required structure from the target database.
    • Disable triggers.
    • Delete the Foreign Keys.
    • Disable restrictions.
    • Empty the tables to load.
    • Remove the indexes.
  13. Perform the data transfer by loading the tables in parallel.
  14. Mount the database structure that you previously unmounted
    • Enable restrictions.
    • Create the indexes and statistics. You can create them in parallel.
    • Raise the Foreign Keys:
      • If it was indicated to delete records to lift them, it will delete what is necessary to lift the Forenig Keys.
      • If that option was not checked, it will leave those Foreing Keys that it cannot lift un-lifted.
    • Enable triggers.
    • Restart the identity fields.
    • Restart the sequences.
  15. Execute Final Custom Actions on Destination
    • Execute the stored procedure ibb.Cargar_Tablas_Acciones_Programables_Destino_Finales
      • The objective of the actions programmed in this stored procedure must be the automatic "fine" adjustment of the transferred data, such as the hiding of confidential information, "path" adjustments when changing environments, and so on.