Duplicates Recipe | DataCraft Analytics

Here is a quick recipe for those of our customers that use Analytique.

If you want to check that for duplicate data in your Analytique data stores, you can use the following SQL statement (edited for your specific view and fields).

/* This query illustrates the logic you can quickly use to identify if there are any duplicates
* in a data store for any facility. */

/* The unique key for this table is made up by the external, intra-facility primary key whic
* is the ACHOLDERID and WHENTRANS fields. This is defined in your query if you are unsure.
* We also have to add FacilityID to ensure that this key is unique across facilities. If
* we don't add into the select, then we'll find suprious duplicates where unique keys
* between facilities match. This is possible.
*
* By using the view instead of the raw table, we restrict records to only those marked as
* being the latest, but also conveniently get the joined log info which includes the
* FacilityID. Without the view, we'd have extra joins.
*
* We use the having clause to filter based on an aggregate function, in this case where
* count > 1 , e.g. 2 or above. */

select
FacilityID,
ACHOLDERID, /* <---- edit this to your needs */
WHENTRANS, /* <---- edit this to your needs */
count(*) as RecordCount
from
vw_MT32_TRANS /* <---- edit this to your needs */
group by
FacilityID,
ACHOLDERID, /* <---- edit this to your needs */
WHENTRANS /* <---- edit this to your needs */
having
count(*) > 1;

/* To modify this to find duplicates in a different table, you will need to change:
* 1. the view that is referenced in the from clause
* 2. the list of fields which make up the intra-facility primary key
* a. in the select clause; and
* b. in the group by clause
*
* You'll want to leave the FacilityID field there always.*/