GDPR compliance
CDQ supports customers in maintaining GDPR compliance by identifying business partner records that may contain personal data. This is crucial for the data sharing approach, since shared insights must remain privacy-compliant. Only non-personal, business-related data can be exchanged among participants.
Our algorithms analyze patterns such as national registration numbers, legal forms that indicate sole proprietorships, and typical name structures to estimate the probability that a record refers to a natural person. A neural network trained on large sets of company and forename data enhances this detection logic. By ensuring that personal data is excluded from shared datasets, CDQ guarantees that collaboration remains secure, ethical, and fully compliant with European data protection laws.
Approach to identify personal information
The algorithm to identify personal data distinguishes between:
- Contact information provided in a business partner's name, such as "CDQ AG attn: Simon Schlosser"
- Registered individuals, i.e. natural persons who are listed in an official register, for example as sole proprietors (e.g. "Simon Schlosser e.K.")
- Individuals, i.e. natural persons for whom there is no evidence of an actual registration (e.g. freelancers such as "Simon Schlosser")
| Example | Type | Description |
|---|---|---|
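| CDQ AG | Legal Entity | Typical example. The name contains no person name, and a legal form is available. |
| CDQ AG z.Hd. Simon Schlosser | Legal Entity with contact information | A legal entity whose name additionally contains contact information, i.e. a person name introduced by a keyword such as "z.Hd." or "attn:". |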
| Simon Schlosser | Individual | Typical example. There is a person name, no legal form and no VAT ID. The record is not to be stored in the CDL database. |
| Simon Schlosser e.K. | Registered individual | Individuals that are registered and have a legal form. There are different legal forms for natural persons in different countries. |
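The distinction in the table above can be sketched as a small decision function. This is only an illustration, not CDQ's implementation: the `Evidence` flags and the `classify` function are hypothetical names, and the sketch assumes the individual checks described in this section have already produced the flags. It also ignores the VAT ID criterion mentioned for individuals.

```python
from dataclasses import dataclass
from enum import Enum


class RecordType(Enum):
    LEGAL_ENTITY = "Legal Entity"
    LEGAL_ENTITY_WITH_CONTACT = "Legal Entity with contact information"
    REGISTERED_INDIVIDUAL = "Registered individual"
    INDIVIDUAL = "Individual"


@dataclass
class Evidence:
    """Hypothetical flags, assumed to be produced by the individual checks."""
    has_person_name: bool
    has_contact_keyword: bool
    has_legal_form: bool
    legal_form_is_for_natural_persons: bool


def classify(e: Evidence) -> RecordType:
    # Without a person name there is no hint of personal data in the name.
    if not e.has_person_name:
        return RecordType.LEGAL_ENTITY
    # A person name introduced by a keyword such as "attn:" is contact information.
    if e.has_contact_keyword:
        return RecordType.LEGAL_ENTITY_WITH_CONTACT
    # A legal form reserved for natural persons (e.g. "e.K.") marks a registered individual.
    if e.legal_form_is_for_natural_persons:
        return RecordType.REGISTERED_INDIVIDUAL
    # Any other legal form is treated as a company that carries a person name.
    if e.has_legal_form:
        return RecordType.LEGAL_ENTITY
    return RecordType.INDIVIDUAL
```

For example, a record with a person name, no contact keyword, and no legal form ("Simon Schlosser") ends up as `RecordType.INDIVIDUAL`.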
Identification strategies
To identify personal information, several strategies are applied to a given record. They are executed in the following order:
Name list check
CDQ maintains lists of typical forenames for different countries and trains a neural network with this data to identify such terms in a given record. Feedback from the Data Sharing Community is also used as training input to improve matching results.
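The core idea of a name list check can be illustrated with a plain lookup. Note that this sketch deliberately swaps the neural network for a simple set membership test, and that the forename lists, the `FORENAMES` mapping, and the `forename_hits` function are made up for illustration.

```python
import re
from typing import List

# Hypothetical, tiny forename lists per country (assumption, not CDQ data).
FORENAMES = {
    "DE": {"simon", "anna", "peter"},
    "PT": {"joão", "maria"},
}


def forename_hits(name: str, country: str) -> List[str]:
    """Return the tokens of `name` that appear in the country's forename list."""
    # Split into runs of letters (Unicode-aware), ignoring digits and punctuation.
    tokens = re.findall(r"[^\W\d_]+", name.casefold())
    known = FORENAMES.get(country, set())
    return [token for token in tokens if token in known]
```

A lookup like this cannot weigh context the way a trained model can (for instance, company names that contain a forename), so the number of hits should be read as a hint rather than a decision.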
Legal form check
If legal form information is provided in a given record, our services try to identify and enrich the related legal form. In some countries, certain legal forms indicate registered individuals or sole proprietors and thus provide evidence that the record contains the name of a natural person.
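A minimal sketch of the legal form check, assuming a per-country list of legal forms that are reserved for natural persons. Only "e.K." is taken from the example in this section; the mapping `NATURAL_PERSON_LEGAL_FORMS` and the function name are hypothetical, and the enrichment step is not modeled.

```python
# Hypothetical mapping of country code to legal forms that indicate
# registered individuals. Only "e.K." comes from the example above.
NATURAL_PERSON_LEGAL_FORMS = {
    "DE": {"e.k."},
}


def indicates_registered_individual(name: str, country: str) -> bool:
    """Check whether any token of `name` is a natural-person legal form."""
    forms = NATURAL_PERSON_LEGAL_FORMS.get(country, set())
    # Compare case-insensitively and ignore trailing separators such as commas.
    tokens = [token.strip(",;").casefold() for token in name.split()]
    return any(token in forms for token in tokens)
```

With this mapping, "Simon Schlosser e.K." in DE is flagged, while "CDQ AG" is not.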
Identifier check
Derivation
Certain identifiers make it possible to recognize natural persons. For example, in PT (Portugal), the first digit of the VAT number indicates whether the record represents an individual: 1 to 3 denote natural persons, 5 denotes companies.
| Identifier schema | Country | Identifier | Description |
|---|---|---|---|
| 000000000057H0CS5Z46ACS04Y | PT (Portugal) | VAT number | The first digit indicates whether the record represents an individual: 1-3 are natural persons, 5 are companies. For non-residents (only subject to final withholding at source), the ID starts with "45". |
| 00000000007CGYKN2FKHXN1JB9 | | | For natural persons, this number consists of 12 digits. |
| 0000000000BGWDBG0HD1V5FAWY | | DIC number | Specifies how to derive a sole proprietor from a DIC number. The DIC number may have different formats, and thus different patterns apply for identifying individuals. |
| 0000000000DCRZ8Y1PB8PCXV09 | | | 10 characters: 5 letters + 4 digits + 1 letter. The 4th character indicates the holder of the card: "P" stands for individuals ("Proprietor"). |
| 0000000000Q4F2WYGAGTRQMW7P | | | For entities such as companies or Associations of Persons (AOP), the TIN is designated as the National Tax Number (NTN). For individuals, the TIN / NTN has the format AAAAA-AAAAAAA-N (13 characters in total), where A is an alphanumeric character and N is a numeric digit. |
| 0000000000V72GY0NA3SG7YVPP | CZ (Czech Republic) | EU VAT | Specifies how to derive a sole proprietor from an EU VAT in the Czech Republic. The EU VAT is identical to the DIC number (Czech Republic), and thus the identical patterns apply. |
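Two of the derivation rules above are simple enough to sketch in code: the leading digits of the Portuguese VAT number and the ten-character identifier whose 4th character marks the holder. The function names and return labels are hypothetical, and the sketch only applies the rules as stated in the table. It does not validate lengths or check digits.

```python
import re


def classify_pt_vat(vat: str) -> str:
    """Classify a PT VAT number by its leading digits (rules as stated above)."""
    digits = re.sub(r"\D", "", vat)  # drop a country prefix such as "PT" and separators
    if not digits:
        return "unknown"
    if digits.startswith("45"):
        return "non-resident"
    if digits[0] in "123":
        return "natural person"
    if digits[0] == "5":
        return "company"
    return "unknown"


# 5 letters + 4 digits + 1 letter, as described in the table.
TEN_CHAR_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_individual_card_holder(identifier: str) -> bool:
    """True if the identifier matches the pattern and its 4th character is "P"."""
    value = identifier.strip().upper()
    return TEN_CHAR_PATTERN.fullmatch(value) is not None and value[3] == "P"
```

The "45" prefix is checked before the single-digit rules so that every rule stays explicit, even though the digit 4 is not covered by any other rule in this sketch.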
Personal identification numbers
Moreover, there are identifiers that are only assigned to natural persons such as the US_SEC_ID (US - Social security number) in the US (United States of America).
| Personal Identification Number | Country | Name |
|---|---|---|
| BR_CPF (BR - Natural Persons Register) | BR (Brazil) | Cadastro de Pessoas Físicas (pt) Natural Persons Register (en) |
| ES_NIE (ES - Tax Identification Number) | ES (Spain) | Numero de Identificacion de Extranjero (es) Tax Identification Number (en) |
| FO_FIN (FO - Faroese ID Num.) | FO (Faroe Islands) | Faroese Identification Number (en) |
| FO_FPN (FO - Faroese P Number) | FO (Faroe Islands) | Faroese P Number (en) |
| GL_CPR (GL - CPR number) | GL (Greenland) | CPR number (en) |
| JE_SSN (JE - Social Security Number) | JE (Jersey) | Social Security Number (en) |
| KE_PIN (KE - Personal ID) | KE (Kenya) | Personal Identification Number (en) |
| KR_RES_ID (KR - Resident ID) | KR (South Korea) | Resident Registration Number (en) |
| SM_SSI (SM - Social Security Number) | SM (San Marino) | Social Security Number (en) |
| UK_IN_ID (UK - NI number) | GB (United Kingdom of Great Britain and Northern Ireland) | NI number (en) |
| US_SEC_ID (US - Social security number) | US (United States of America) | Social security number (en) |
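Since these identifier types are only assigned to natural persons, their mere presence on a record is a strong signal. A minimal sketch, assuming a record exposes the technical keys of its identifier types as a list of strings (a hypothetical structure); the keys themselves are taken from the table above.

```python
from typing import Iterable

# Technical keys from the table above.
PERSONAL_ID_TYPES = {
    "BR_CPF", "ES_NIE", "FO_FIN", "FO_FPN", "GL_CPR", "JE_SSN",
    "KE_PIN", "KR_RES_ID", "SM_SSI", "UK_IN_ID", "US_SEC_ID",
}


def has_personal_identifier(identifier_types: Iterable[str]) -> bool:
    """True if at least one identifier type is reserved for natural persons."""
    return any(id_type in PERSONAL_ID_TYPES for id_type in identifier_types)
```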
Contact information parsing
To identify contact information, the name is searched for typical keywords such as "attn:", "z.Hd.", or "attention to". This parsing is provided via the data quality rule "Contact information misplaced".
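The keyword search can be sketched with a regular expression. This is an illustration under assumptions, not the logic of the data quality rule itself: the pattern covers only the three keywords named above, and `split_contact` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Only the keywords mentioned above; a real list would be longer.
CONTACT_PATTERN = re.compile(
    r"(?<!\w)(?:attn:|z\.\s?hd\.|attention to)\s*",
    re.IGNORECASE,
)


def split_contact(name: str) -> Tuple[str, Optional[str]]:
    """Split a name into (business partner name, contact part or None)."""
    match = CONTACT_PATTERN.search(name)
    if match is None:
        return name.strip(), None
    company = name[:match.start()].strip()
    contact = name[match.end():].strip()
    return company, contact or None
```

For the examples in this section, `split_contact("CDQ AG attn: Simon Schlosser")` and `split_contact("CDQ AG z.Hd. Simon Schlosser")` both separate "CDQ AG" from "Simon Schlosser", while "CDQ AG" alone yields no contact part.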