Comparing Database Structures Made Easy

When working with legacy systems, you might need to replace them with more modern solutions. However, switching over can be risky. One way to mitigate this risk is by developing a bi-directional syncing tool to keep both systems updated with changes from each other.

Syncing isn't always perfect, so it's useful to have a tool that can detect mismatches in records between the two systems. This might sound simple, but differences in data structures can complicate things. Using the legacy system's database structure as a common format can help, especially if it's denormalized and uses natural keys, making table-by-table comparisons easier.

For instance, if you're using a PostgreSQL database for the new system, you can write SQL queries that are nearly identical for both databases, allowing for row-by-row comparisons.

Here's an example query for the new database:

SELECT
  *
FROM
  (
    SELECT
      customers.identifier AS custid,
      orders.identifier AS ordid,
      stores.identifier AS storid,
      products.identifier AS prodid,
      order_allocations.quantity AS qty
    FROM
      order_allocations
    INNER JOIN stores ON stores.id = order_allocations.store_id
    INNER JOIN customers ON customers.id = stores.customer_id
    INNER JOIN order_products ON order_allocations.order_product_id = order_products.id
    INNER JOIN products ON products.id = order_products.product_id
    INNER JOIN order_groups ON order_groups.id = order_products.order_group_id
    INNER JOIN orders ON orders.id = order_groups.order_id
  ) ordline
WHERE
  custid = 'MMM'
  AND ordid = 'C010917A'
ORDER BY
  prodid,
  storid;

And here's how you can query the legacy Visual FoxPro database:

SELECT
  *
FROM
  ordline
WHERE
  custid = 'MMM'
  AND ordid = 'C010917A'
ORDER BY
  prodid,
  storid;

The trick is to use a complex query as a sub-query, aligning field names with those in the legacy system. This reshapes the new system's data to match the simpler structure of the legacy system.

This approach allows you to query both databases using the same field names, predicates, and record order, making it easier to compare datasets row-by-row, even if one system has a more complex structure.

Updated and derived from the original version of the article authored by me while working at Atomic Object.