Description
Note: this might be a bit complicated. Maybe users should just use `executemany` and `copy_rows` and write their own INSERT statements to ensure that they get exactly what they want.
Summary
As an ETL Helper user I want to use `skip_conflicts` so that I can handle primary key violations at the database level without requiring ETL Helper's error handling.
Description
A common error in a big insert is that records already exist for a given primary key. Some databases have built-in syntax for handling this. For example, SQLite has the `ON CONFLICT` clause, which takes the form `INSERT OR IGNORE` in INSERT statements: https://www.sqlite.org/lang_conflict.html. PostgreSQL has `ON CONFLICT DO NOTHING`: https://www.postgresql.org/docs/current/sql-insert.html#:~:text=ON%20CONFLICT%20DO%20NOTHING%20simply,can%20perform%20unique%20index%20inference.
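To illustrate the database-level behaviour described above, here is a minimal sketch using only Python's standard `sqlite3` module. The `sites` table and its rows are made up for the example and it does not use ETL Helper at all. It shows a plain INSERT failing on a duplicate primary key, and `INSERT OR IGNORE` skipping the duplicate while still inserting the new row.

```python
import sqlite3

# Hypothetical example table, not part of ETL Helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO sites (id, name) VALUES (1, 'original')")
conn.commit()

# The first row clashes with the existing primary key, the second is new.
rows = [(1, "duplicate"), (2, "new")]

# A plain INSERT raises IntegrityError on the existing primary key.
plain_insert_failed = False
try:
    conn.executemany("INSERT INTO sites (id, name) VALUES (?, ?)", rows)
except sqlite3.IntegrityError:
    plain_insert_failed = True
    conn.rollback()

# INSERT OR IGNORE silently skips the conflicting row and inserts the rest.
conn.executemany("INSERT OR IGNORE INTO sites (id, name) VALUES (?, ?)", rows)
conn.commit()

result = conn.execute("SELECT id, name FROM sites ORDER BY id").fetchall()
print(result)  # [(1, 'original'), (2, 'new')]
```

On PostgreSQL the equivalent statement would keep the plain `INSERT INTO ...` form and append `ON CONFLICT DO NOTHING` after the VALUES clause.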
There should be an implementation for each supported database. The specific way in which each one is applied will need to be configured in the `DbHelper` classes. The SQL statement generated for the `load` function may need changes around the INSERT keyword, or an added suffix.
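One possible shape for this, as a rough sketch only: each database declares the INSERT keyword and suffix to use when conflicts should be skipped, and the statement builder picks them up. The names `CONFLICT_SYNTAX` and `build_insert_sql` are hypothetical and are not part of ETL Helper's existing API; in a real implementation these settings would presumably live on the `DbHelper` classes.

```python
# Hypothetical per-database settings; not ETL Helper's actual DbHelper API.
CONFLICT_SYNTAX = {
    # SQLite changes the INSERT keyword itself.
    "sqlite": {"keyword": "INSERT OR IGNORE", "suffix": "", "placeholder": "?"},
    # PostgreSQL keeps INSERT and appends a suffix.
    "postgresql": {
        "keyword": "INSERT",
        "suffix": " ON CONFLICT DO NOTHING",
        "placeholder": "%s",
    },
}


def build_insert_sql(table, columns, dialect, skip_conflicts=False):
    """Build a parameterised INSERT statement for the given dialect."""
    if dialect not in CONFLICT_SYNTAX:
        raise NotImplementedError(f"No insert syntax configured for {dialect}")
    syntax = CONFLICT_SYNTAX[dialect]

    # Only alter the statement when the caller asks to skip conflicts.
    keyword = syntax["keyword"] if skip_conflicts else "INSERT"
    suffix = syntax["suffix"] if skip_conflicts else ""

    column_list = ", ".join(columns)
    placeholders = ", ".join(syntax["placeholder"] for _ in columns)
    return (
        f"{keyword} INTO {table} ({column_list}) "
        f"VALUES ({placeholders}){suffix}"
    )


sqlite_sql = build_insert_sql("sites", ["id", "name"], "sqlite", skip_conflicts=True)
postgres_sql = build_insert_sql("sites", ["id", "name"], "postgresql", skip_conflicts=True)
print(sqlite_sql)
print(postgres_sql)
```

Keeping the keyword and the suffix as separate settings covers both styles mentioned above without special-casing any one database in the statement builder.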
Acceptance criteria
- Calling `load` with `skip_conflicts=True` means that primary key violations pass silently and rows are unchanged