Data Sharing Community

Revision as of 22:50, 16 September 2024 by Simonschlosser (talk | contribs)

Welcome to the Portal of the CDQ Data Sharing Community

What's new? (RSS)

Updates on Data Quality Rules (19 September 2024)

In our recent data quality rules update, we changed 36 rules to RELEASED status and improved the post code checks for Vietnam and Iran. The data quality rules can now check both the old and the new post code format in Vietnam (the old format has 6 digits, the new one has 5 digits). We have also improved the format checks for post codes in Iran. Additionally, we have added and improved rules for checking Business Registration and Tax Identification Numbers in Nigeria. The affected rules are listed below:

Post code rules:


List of 36 rules whose status has been changed to RELEASED:

Europe:
Asia-Pacific:
United States of America:
Africa:


Nigeria-related rules:
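
As a minimal sketch of the Vietnam post code change described in this update, the following Python check accepts both the old 6-digit and the new 5-digit format. The function name and the regular expression are illustrative assumptions and not the actual CDQ rule implementation.

```python
import re

# Assumption: a Vietnamese post code is format-valid if it consists of
# exactly 5 digits (new format) or exactly 6 digits (old format).
VN_POST_CODE = re.compile(r"[0-9]{5,6}")


def is_valid_vn_post_code(post_code: str) -> bool:
    """Return True if post_code matches the old or the new format."""
    return VN_POST_CODE.fullmatch(post_code.strip()) is not None
```

A check like this only covers the length and character set; whether a given code is actually assigned would require reference data.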

Integration of CZ.VAT - Czech Republic VAT Register (Beta) (10 September 2024)

We are excited to announce the integration of a new data source into our system: CZ.VAT - Czech Republic VAT Register, provided by the Czech tax authority General Financial Directorate.
This service allows us to retrieve crucial information about the reliability of VAT payers and the bank accounts associated with them.

Key Features:

Business Partner Data:

Each business partner record includes:

  • Company Name
  • Legal Form
  • Address
  • VAT Identification Number (DIC)
  • National and International Bank Account Numbers
  • VAT Payer Status (Reliable/Unreliable)

Input Requirements:

To search for data in the CZ.VAT data source, users must provide a valid VAT Identification Number (CZ_DIC) as the input data.

Current Limitations (Beta Phase):

Daily Query Limit:

In this beta phase, the number of searches to the CZ.VAT data source is limited to 2000 queries per day. We are actively working on increasing this limit in the near future as we enhance the service capabilities.
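
A caller that wants to stay within the beta limit could track its own usage before sending requests. The sketch below is a client-side illustration only: the class and function names are hypothetical, the DIC pattern ("CZ" followed by 8 to 10 digits) is an assumption about the input format, and it does not call any CDQ API.

```python
import re
from datetime import date
from typing import Optional

# Assumption: a Czech DIC is "CZ" followed by 8 to 10 digits.
CZ_DIC = re.compile(r"CZ[0-9]{8,10}")
DAILY_LIMIT = 2000  # beta-phase limit stated above


class DailyQuota:
    """Counts lookups per calendar day and refuses calls beyond the limit."""

    def __init__(self, limit: int = DAILY_LIMIT) -> None:
        self.limit = limit
        self.day = date.today()
        self.used = 0

    def try_consume(self, today: Optional[date] = None) -> bool:
        today = today or date.today()
        if today != self.day:  # a new day has started: reset the counter
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def can_query(dic: str, quota: DailyQuota, today: Optional[date] = None) -> bool:
    """Pre-check the input format, then reserve one query from the quota."""
    if CZ_DIC.fullmatch(dic.strip().upper()) is None:
        return False  # malformed input does not consume quota
    return quota.try_consume(today)
```

Checking the format first means that obviously malformed identifiers never count against the daily allowance.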

Inaugural Release Note: Introducing the CDQ Email Domain Guard (30 August 2024)

We are pleased to introduce the Email Domain Guard Headless REST API, a comprehensive suite of email verification and domain analysis services.

Why We Launched Email Domain Guard

The need for secure and reliable email communication has never been more critical. Businesses are facing escalating threats from sophisticated phishing attacks, Business Email Compromise (BEC), and fraudulent invoicing scams. These issues not only undermine trust but also expose organizations to significant financial and legal risks. The growing shift towards remote work has further exacerbated these vulnerabilities, with decentralized operations increasing the likelihood of security breaches. Additionally, stringent data protection regulations, such as GDPR, place immense pressure on companies to secure their communications and ensure compliance. Recognizing these challenges, we developed Email Domain Guard to provide businesses with a comprehensive, cutting-edge solution to safeguard their email communications.


What is the CDQ Email Domain Guard?

Our Email Domain Guard is a state-of-the-art email verification and domain analysis solution designed to elevate the security, accuracy, and reliability of your business communications. By leveraging multi-factor analysis, Email Domain Guard assesses the risk associated with email addresses, providing clear, actionable insights that empower businesses to make informed decisions and proactively manage risks.


Key features include

  • Email Risk Score: Our tool analyzes multiple risk indicators, such as disposable and freemail status, domain age, DNS-based blacklists (DNSBL), email breach history, and presence in shared email databases. This comprehensive analysis results in a quantified risk score, helping you classify and manage email risks effectively.
  • Multi-Factor Email Verification: Email Domain Guard goes beyond basic email verification by checking the structure, domain existence, role categories, and whether the email is associated with disposable or freemail services. This ensures that your communication channels remain accurate and secure.
  • Data Breach Checker: By integrating with the "Have I Been Pwned" API, Email Domain Guard identifies whether email addresses or domains have been compromised in known data breaches, allowing you to take proactive measures to protect sensitive information.
  • Shared Email Checker (Experimental): This innovative feature allows for community-driven validation of email addresses, enabling businesses to verify contact information through anonymous data sharing, all while preserving privacy.
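
To illustrate how several indicators from the list above could be combined into one score, here is a small offline sketch. It is not the Email Domain Guard algorithm: the reference lists are tiny hypothetical samples, the weights are invented, and checks that need network access (domain existence, DNSBL, breach history) are left out.

```python
import re
from dataclasses import dataclass

# Hypothetical sample lists; a real service would use managed, larger lists.
FREEMAIL_DOMAINS = {"gmail.com", "outlook.com"}
DISPOSABLE_DOMAINS = {"mailinator.com"}
ROLE_LOCAL_PARTS = {"info", "sales", "admin", "support"}

# Deliberately simple structure check, not a full RFC 5322 parser.
EMAIL = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


@dataclass
class EmailCheck:
    well_formed: bool
    role_address: bool = False
    freemail: bool = False
    disposable: bool = False

    def risk_score(self) -> int:
        """Toy score from 0 (low risk) to 100; the weights are invented."""
        if not self.well_formed:
            return 100
        score = 50 * self.disposable + 20 * self.freemail + 10 * self.role_address
        return min(score, 100)


def check_email(address: str) -> EmailCheck:
    """Run the structure, role, freemail, and disposable checks."""
    match = EMAIL.fullmatch(address.strip())
    if match is None:
        return EmailCheck(well_formed=False)
    local, domain = match.group(1).lower(), match.group(2).lower()
    return EmailCheck(
        well_formed=True,
        role_address=local in ROLE_LOCAL_PARTS,
        freemail=domain in FREEMAIL_DOMAINS,
        disposable=domain in DISPOSABLE_DOMAINS,
    )
```

Keeping the individual flags alongside the score lets a caller see why an address was rated as risky instead of receiving only a number.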


How the CDQ Email Domain Guard Enhances Your Operations

  • Operational Efficiency: By ensuring that your email data is accurate and up-to-date, Email Domain Guard helps reduce bounce rates, optimize communication workflows, and enhance overall productivity.
  • Security and Fraud Prevention: Email Domain Guard provides a robust defense against BEC, phishing, and other email-based threats by verifying the legitimacy of email addresses and flagging potential risks before they become issues.
  • Regulatory Compliance: With built-in checks for data breaches and a focus on accurate email validation, our solution supports your compliance efforts, helping you avoid the hefty fines and reputational damage associated with data protection violations.
  • Brand Reputation Management: Protect your brand’s integrity by avoiding spam traps, engaging only with legitimate email addresses, and maintaining a strong sender reputation.

The launch of our Email Domain Guard comes at a time when businesses are increasingly vulnerable to cyber threats and regulatory pressures. Our solution addresses these needs head-on, providing the tools necessary to secure your communications in an ever-evolving digital landscape.


Get Started with Email Domain Guard

With our Email Domain Guard, you’re not just adopting an email verification tool—you’re investing in a comprehensive security solution that will protect your business from the growing threats of email fraud and ensure the integrity of your communications. Explore the capabilities of Email Domain Guard today and take the first step towards more secure, reliable, and compliant email operations. For more details on how to implement Email Domain Guard and integrate it into your existing systems, visit our Developer Portal.

The following guide walks you through using the Email Domain Guard API: How to verify e-mail address?

Please refer to the API documentation for detailed information: Email Analysis API

... further results

Data model

An important prerequisite for collaborative data management is a common understanding of the shared data. For the CDQ Data Sharing Community, this common understanding is specified by the CDQ Data Model. The concepts of this model are defined and documented in this wiki which can be used as a business vocabulary. Moreover, the wiki provides a machine-readable interface to reuse this metadata by using semantic annotations.


Data maintenance procedures

A procedure is a common standard or "how-to" for a specific data management task. Within the CDQ Data Sharing Community, companies agree on such procedures to ensure similar rules and guidelines for similar tasks. For several countries, the CDQ Wiki provides such information, e.g. data quality rules, trusted information sources, legal forms, or tax numbers. Try

or select another country from the list.

Data sources

Active data sources      Records
Data source BR.RF        61,336,075
Data source VIES         50,000,000
Data source FR.RC        39,535,714
Data source CDQ.INTEL    31,423,242
Data source GB-EAW.CR    8,253,221
Data source US-FL.BER    5,857,492
Data source JP.CR        5,525,075
Data source AU.BR        4,761,917
... further results
The CDQ Data Sharing Community uses a collaboratively managed reference data repository. This includes the integration of external data sources for enriching or validating business partner and address data. The repository currently covers 316 countries (e.g. WORLD (World), AT (Österreich, Austria, Autriche, 奥地利), BE (Belgien, Belgium, Belgique, België, 比利时)), 993 legal forms, and 72 active business partner data sources (e.g. Data source CDQ.POOL, Data source VIES, Data source CH.UIDR).

Metadata and Standards: Metadata-driven Data Quality

Data quality plays a pivotal role in ensuring compliance with legal, regulatory, and industry standards. One of the core challenges in achieving high data quality is adhering to dynamic data requirements that evolve due to changes in national regulations. These requirements vary by country, making it essential for businesses to track and update compliance criteria continuously.

In many countries, official company information is available as Open Data, but the lack of a standardized data model or provision method complicates the process of integrating this data. The Data Sharing Community actively collaborates to identify global data requirements and reference data sources, whether Open Data or commercial.

Short description
Managed reference data for administrative areas with language-specific terms and short names according to ISO 3166-2.
Managed reference data for bank accounts worldwide.
Managed reference data for types of identifiers per country.
Basic data concepts of CDQ Cloud Services.
Managed reference data about compliance lists considered in the sanction and watchlist screening services.
Managed reference data for countries with language-specific names and short names according to ISO 3166-1.
Documentation of data quality rules with explanation and technical constraints to validate business partner data records.
Data quality rule functions are methods implemented in a programming language for use in data quality rule implementations. They can be used, for example, in custom data quality rules, much like the functions that business users employ in popular spreadsheet applications such as Microsoft Excel.
Managed reference data for legal forms with official and commonly used abbreviations and corresponding country.
Managed reference data for localities, such as exonyms.
Managed reference data for post codes.
Managed reference data for postal delivery points, such as Post Office Boxes used for identification, extraction, harmonization and standardization.
Managed reference data for issuing bodies of identifiers.
Managed reference data for thoroughfares of type Street (CDQ.POOL) used for harmonization and standardization.

Data Quality Rules

Transforming human-documented data requirements into executable data quality rules is mostly a manual IT effort, and every change in requirements triggers that effort again. Some checks, e.g. tax number validity (not just format!), require external services. Other checks, e.g. validity of legal forms, require managed reference data (e.g. legal forms by country, plus abbreviations). Continuous data quality assurance (i.e. batch analyses) and real-time checks in workflows often use different rule sets.

Data requirements and related reference data are collected and updated collaboratively by the Data Sharing Community. Data quality rules are derived from these requirements automatically. All data quality rules are executed behind a single interface, in real time. Batch jobs and single-record checks use the same rule set and can be integrated via APIs.
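
The idea of one rule set serving both single-record checks and batch jobs can be sketched as follows. This is an illustration under assumptions, not the CDQ implementation: the record fields, the rule functions, and the small legal form table are hypothetical stand-ins for managed reference data.

```python
from typing import Callable, Iterable, Optional

Record = dict[str, str]
# A rule returns a violation message, or None if the record passes.
Rule = Callable[[Record], Optional[str]]

# Hypothetical excerpt of managed reference data: legal forms by country.
LEGAL_FORMS = {"DE": {"GmbH", "AG"}, "FR": {"SA", "SARL"}}


def name_present(record: Record) -> Optional[str]:
    return None if record.get("name", "").strip() else "Name missing"


def legal_form_known(record: Record) -> Optional[str]:
    forms = LEGAL_FORMS.get(record.get("country", ""), set())
    if record.get("legal_form") not in forms:
        return "Legal form not valid for country"
    return None


RULES: list[Rule] = [name_present, legal_form_known]


def check_record(record: Record) -> list[str]:
    """Single-record (real-time) check against the shared rule set."""
    return [msg for rule in RULES if (msg := rule(record)) is not None]


def check_batch(records: Iterable[Record]) -> list[list[str]]:
    """Batch analysis: reuses check_record, so both paths share one rule set."""
    return [check_record(record) for record in records]
```

Because the batch path simply delegates to the single-record path, a rule added to `RULES` takes effect in both without further changes.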

To show that a data quality rule is correct in content, we maintain one or more supporting documents per rule that record the rule's source. This could be:

  • a public authority source
  • any other trustworthy web page
  • a data standard of a specific community member

We manage the URL (if any), a screenshot of the relevant parts (if any), and the source's name (e.g. Community member data standard, European Commission, National ....). See Identifier format invalid (SIREN (France)) as an exemplary rule that was specified and implemented based on information provided by the OECD.
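
One possible way to keep a rule together with its supporting source is sketched below. The class and field names are hypothetical, the URL and screenshot are left empty because the text gives none, and the pattern assumes that the SIREN format check only requires exactly 9 digits.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuleSource:
    name: str                              # e.g. "European Commission"
    url: Optional[str] = None              # optional ("if any")
    screenshot_path: Optional[str] = None  # optional ("if any")


@dataclass(frozen=True)
class DataQualityRule:
    title: str
    pattern: str
    source: RuleSource

    def violated_by(self, value: str) -> bool:
        """Return True if value does not match the rule's format pattern."""
        return re.fullmatch(self.pattern, value) is None


# Assumption: the format check only requires exactly 9 digits.
siren_rule = DataQualityRule(
    title="Identifier format invalid (SIREN (France))",
    pattern=r"[0-9]{9}",
    source=RuleSource(name="OECD"),
)
```

Storing the source next to the pattern means that anyone questioning a violation can trace it back to the document the rule was derived from.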