When there is a schema change, a new base table is created in your destination. You can use a destination's built-in query and notification tools to get notified of these changes.
Our example details how to do this for BigQuery, but the concept also applies to other destinations. We provide directional advice on how to do this with Snowflake later in this doc.
Check For New Google BigQuery Table Versions
This query checks the project and dataset for new tables created within the past day:
DECLARE table_count INT64;

SET table_count = (
  SELECT COUNT(*)
  FROM `your-project.your-dataset.INFORMATION_SCHEMA.TABLES`
  WHERE table_catalog = 'your-project'
    AND table_schema = 'your-dataset'
    AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
);

IF table_count > 0 THEN
  SELECT ERROR(CONCAT("Over threshold: ", CAST(table_count AS STRING), " new table(s) created in the last day."));
ELSE
  SELECT "No new tables created in the last day.";
END IF;
You will need to replace your-project and your-dataset with the actual names of your Google Cloud project and the dataset you want to monitor.
What does this query do? The query uses the INFORMATION_SCHEMA.TABLES view to count the number of new tables created in the last day. It uses the TIMESTAMP_SUB function to subtract 1 day from the current timestamp and checks whether any tables have a creation_time greater than this value.
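In other words, the cutoff the query compares against is simply the current time minus one day. A rough Python equivalent of that timestamp arithmetic (the function and variable names here are illustrative, not part of the query):

```python
from datetime import datetime, timedelta, timezone

# Equivalent of TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
cutoff = datetime.now(timezone.utc) - timedelta(days=1)

# A table counts as "new" when its creation_time is greater than the cutoff
def is_new(creation_time: datetime) -> bool:
    return creation_time > cutoff
```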
If the count exceeds 0, the query will error with a message that includes the number of new tables created. If the count is 0, the query will return a message indicating no new tables were created.
If no changes are detected, the query returns the message "No new tables created in the last day."
However, if we change the interval to 7 days, the query will detect that a new table was created in that window and raise the "Over threshold" error.
You can adjust the interval as you'd like. However, we suggest leaving the default at 1 day to reduce noise and increase signal.
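For reference, widening the window (for example, to 7 days as in the scenario above) only requires changing the interval in the WHERE clause:

```sql
-- Same check, but flagging tables created in the past 7 days
AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
```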
Scheduling BigQuery Notifications
Google BigQuery allows you to schedule a query to run at preferred intervals. Once you have verified that the query runs as expected, create a schedule to run it:
You can schedule this query to run regularly and configure a notification to be sent if the query errors. For example, you can set it to run every 12 hours.
Next, select the option to receive email notifications. When selected, the owner of the scheduled query will receive email notifications when a scheduled query run fails.
These messages are sent to the user who set up the transfer. The messages are not configurable. If other users should receive transfer-run email notifications, set up email forwarding rules or an email group to distribute the messages.
How would you accomplish a check in Snowflake?
Snowflake's task and email notification functionality can send an email notification when a new table is created in Snowflake. Take a look at the Snowflake Tasks documentation for more information.
Here is a high-level overview of that process.
Create a new Snowflake Task that runs on a schedule every hour.
Use a statement in the Task definition to check the INFORMATION_SCHEMA.TABLES view for new tables created in the last hour. You can do this using a WHERE clause that filters on the CREATED column.
If the SELECT statement returns any rows, use Snowflake's SYSTEM$SEND_EMAIL stored procedure to send an email notification with the relevant details.
Save and enable the Task.
Below is an example code snippet that you can use as a starting point for your Task definition:
-- Define the Task
CREATE TASK check_new_tables
  WAREHOUSE = my_warehouse
  SCHEDULE = '60 MINUTE'
AS
BEGIN
  -- Check for new tables created in the last hour
  LET new_table_count INTEGER := (
    SELECT COUNT(*)
    FROM INFORMATION_SCHEMA.TABLES
    WHERE CREATED > DATEADD(HOUR, -1, CURRENT_TIMESTAMP())
      AND TABLE_TYPE = 'BASE TABLE'
  );
  -- Send email notification if there are new tables
  IF (new_table_count > 0) THEN
    CALL SYSTEM$SEND_EMAIL(
      'my_email_integration',  -- name of your email notification integration
      'you@example.com',       -- recipient address
      'New Tables Created in the Last Hour',
      'New tables were created in the last hour. Please check INFORMATION_SCHEMA.TABLES for details.');
  END IF;
END;
This is just a sample code snippet; you must modify it based on your specific requirements. You'll also need to set up an email notification integration in Snowflake before using the SYSTEM$SEND_EMAIL stored procedure.
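As a sketch, creating that email notification integration might look like the following (the integration name and recipient address are placeholders; only addresses listed in ALLOWED_RECIPIENTS can receive mail from SYSTEM$SEND_EMAIL):

```sql
-- Create an email notification integration for SYSTEM$SEND_EMAIL
CREATE NOTIFICATION INTEGRATION my_email_integration
  TYPE = EMAIL
  ENABLED = TRUE
  ALLOWED_RECIPIENTS = ('you@example.com');
```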