Right now `bigquery.insertAll` does not use `RetryHelper`. Retrying a failed `insertAll` operation might insert duplicates into the table unless an ID is associated with each row.
We have 2 options here:
- Use `RetryHelper` and document that unless row IDs are used, duplicates might be inserted
- Use `RetryHelper` and, when retrying, add a `randomUUID()` to all rows that have no ID specified
It is worth noting that BigQuery duplicate detection based on row ID is best effort. Opinions/suggestions are welcome!
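To make the second option concrete, here is a minimal sketch of filling in missing row IDs. `Row`, `InsertIdSketch`, and `withInsertIds` are hypothetical names used only for illustration, not the library's actual types. The sketch assigns IDs once, before the first attempt, on the assumption that deduplication can only help if every attempt sends the same ID for a given row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a row to insert; not the library's real type.
final class Row {
    final String id;                   // null when the caller supplied no ID
    final Map<String, String> content;

    Row(String id, Map<String, String> content) {
        this.id = id;
        this.content = content;
    }
}

final class InsertIdSketch {
    // Returns a copy of rows in which every row without an ID gets a random UUID.
    // Rows that already carry an ID keep it unchanged.
    static List<Row> withInsertIds(List<Row> rows) {
        List<Row> result = new ArrayList<>(rows.size());
        for (Row row : rows) {
            String id = row.id != null ? row.id : UUID.randomUUID().toString();
            result.add(new Row(id, row.content));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("user-supplied", Collections.singletonMap("name", "a")),
            new Row(null, Collections.singletonMap("name", "b")));

        // Assumption: IDs are assigned once here, and the same prepared list
        // is resent on every retry, so each attempt carries identical IDs.
        List<Row> prepared = withInsertIds(rows);
        for (Row row : prepared) {
            System.out.println(row.id + " -> " + row.content);
        }
    }
}
```

With this shape, the retry logic would wrap only the send of `prepared`, never the ID generation. Since duplicate detection is best effort, this reduces duplicates rather than ruling them out.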
/cc @jtigani