Data Sharing Community

Welcome to the Portal of the CDQ Data Sharing Community

What's new?

Monitor Status for Business Partners (27 June 2025)

We’re excited to introduce a new Status feature in the Monitors module of the Data Clinic app. This enhancement provides visibility into the current state of Business Partners (BPs) within Augmentation and Data Quality Profiling monitors, helping users make informed decisions when downloading or reviewing BP data.

Key Benefits:

  • Avoid consuming incomplete data by easily identifying BPs that are mid-reprocessing.
  • Monitor BP readiness and take action only when data is up to date.
  • Streamline workflows by accessing real-time status data directly in the UI or via API.


Status Overview:

Each BP in a monitor is now assigned one of four statuses:

  • Ready – The BP has up-to-date results and is ready for download.
  • In Progress – The BP is currently undergoing processing or reprocessing.
  • Retry – Processing of the BP has failed and needs to be repeated.
  • Blocked – The BP cannot be processed due to an issue that needs to be resolved.


Where to See It:

  • Configuration Page: Get an overview of how many BPs are in each status category across your monitor.
  • Data Review (Single Mode): Check the individual status of a selected BP during manual review.
  • API Access: Status information is also available programmatically via our API responses.


This feature aims to improve data accuracy, transparency, and efficiency when working with monitored BPs.
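
As a rough illustration of how the status could be used downstream, the sketch below counts BPs per status and keeps only those that are Ready. The record layout (the `id` and `status` keys) and the upper-case status strings are assumptions made for this example; the actual API response format is not described on this page.

```python
from collections import Counter

# Hypothetical status values, modeled on the four statuses listed above.
READY, IN_PROGRESS, RETRY, BLOCKED = "READY", "IN_PROGRESS", "RETRY", "BLOCKED"


def count_by_status(business_partners):
    """Count BPs per status, similar to the overview on the Configuration Page."""
    return Counter(bp["status"] for bp in business_partners)


def ready_only(business_partners):
    """Keep only BPs whose results are up to date."""
    return [bp for bp in business_partners if bp["status"] == READY]


# Made-up sample data, not real API output.
sample = [
    {"id": "bp-1", "status": READY},
    {"id": "bp-2", "status": IN_PROGRESS},
    {"id": "bp-3", "status": RETRY},
    {"id": "bp-4", "status": READY},
]

print(count_by_status(sample))
print([bp["id"] for bp in ready_only(sample)])
```

Filtering on Ready before a download is one way to avoid picking up BPs that are still being reprocessed.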

Augmented Business Partner Report & Legal Entity Report v3 (26 June 2025)

We’re excited to announce two enhancements to our reporting suite: the Augmented Business Partner Report and an updated Legal Entity Report.

Augmented Business Partner Report

This new report delivers the full results of augmentation monitoring for your business partners. You’ll see every proposed value alongside its original “before” data, plus all Update Assessment details—action taken (added, modified, deleted), classification, data provenance, similarity scores, and modification timestamp. Just like the existing Update Report, you can filter by summary classification and creation date to zero in on exactly what matters.

Legal Entity Report (v3)

The revamped Legal Entity Report leverages our latest curation logic to surface official register data—legal status, VAT registration, and any local attributes—alongside the same augmentation monitoring insights (action, classification, provenance, scores, and modification date). Both the new and legacy versions remain available, giving you flexibility to evaluate the improvements and switch over at your own pace.

New data source integrated: Swedish Companies Registration Office (SE.BR) (25 June 2025)

Summary

We are pleased to announce the integration of a new data source: SE.BR, powered by the Swedish Companies Registration Office (Bolagsverket) — the official authority responsible for registering businesses and managing company data in Sweden.

Details

The dataset contains comprehensive information about legal entities registered in Sweden, including:

  • Organization identifier (SE_ORG_ID)
  • Company name
  • Legal form
  • Registered address
  • Status of the company


This dataset currently contains 2.9 million records and is available to all CDQ customers.
To access this data in CDQ services (e.g., Business Partner Lookup or Curation API), please activate the data source SE.BR in Global Settings under the Reference Data Source Management section.

Data model

An important prerequisite for collaborative data management is a common understanding of the shared data. For the CDQ Data Sharing Community, this common understanding is specified by the CDQ Data Model. The concepts of this model are defined and documented in this wiki, which can be used as a business vocabulary. The wiki also provides a machine-readable interface, based on semantic annotations, so that this metadata can be reused.

Data maintenance procedures

A procedure is a common standard or "how-to" for a specific data management task. Within the CDQ Data Sharing Community, companies agree on such procedures to ensure similar rules and guidelines for similar tasks. For several countries, the CDQ Wiki provides such information, e.g. data quality rules, trusted information sources, legal forms, or tax numbers. Select a country from the list to see its procedures.

Data sources

Active data sources         Records
Data source BR.RF           65,819,242
Data source CDQ.INTEL       53,424,836
Data source VIES            50,000,000
Data source FR.RC           41,793,664
Data source US-CA.BER        8,822,934
Data source GB-EAW.CR        8,798,094
Data source US-FL.BER        6,334,811
Data source JP.CR            5,641,854

The CDQ Data Sharing Community uses a collaboratively managed reference data repository. This includes the integration of external data sources for enriching or validating business partner and address data. The repository currently covers 316 countries (e.g. WORLD (World), AT (Österreich, Austria, Autriche, 奥地利), BE (Belgien, Belgium, Belgique, België, 比利时)), 993 legal forms, and 72 active business partner data sources (e.g. Data source CDQ.POOL, Data source VIES, Data source CH.UIDR).

Metadata and Standards: Metadata-driven Data Quality

Data quality plays a pivotal role in ensuring compliance with legal, regulatory, and industry standards. One of the core challenges in achieving high data quality is adhering to dynamic data requirements that evolve due to changes in national regulations. These requirements vary by country, making it essential for businesses to track and update compliance criteria continuously.

In many countries, official company information is available as Open Data, but the lack of a standardized data model or provision method complicates the process of integrating this data. The Data Sharing Community actively collaborates to identify global data requirements and reference data sources, whether Open Data or commercial.

Short description
  • Managed reference data for administrative areas with language-specific terms and short names according to ISO 3166-2.
  • Managed reference data for bank accounts worldwide.
  • Managed reference data for types of identifiers per country.
  • Basic data concepts of CDQ Cloud Services.
  • Managed reference data about compliance lists considered in the sanction and watchlist screening services.
  • Managed reference data for countries with language-specific names and short names according to ISO 3166-1.
  • Documentation of data quality rules with explanation and technical constraints to validate business partner data records.
  • Data quality rule functions: methods implemented in a programming language for use in data quality rule implementations. They can be used, for example, in custom data quality rules, much as business users employ functions in spreadsheet applications such as Microsoft Excel.
  • Managed reference data for legal forms with official and commonly used abbreviations and corresponding country.
  • Managed reference data for localities, such as exonyms.
  • Managed reference data for post codes.
  • Managed reference data for postal delivery points, such as Post Office Boxes, used for identification, extraction, harmonization and standardization.
  • Managed reference data for issuing bodies of identifiers.
  • Managed reference data for thoroughfares of type Street (CDQ.POOL) used for harmonization and standardization.

Data Quality Rules

Transforming human-documented data requirements into executable data quality rules is mostly a manual IT effort, and every change in requirements triggers that effort again. Some checks, e.g. tax number validity (not just format!), require external services. Other checks, e.g. validity of legal forms, require managed reference data (e.g. legal forms by country, plus abbreviations). In addition, continuous data quality assurance (i.e. batch analyses) and real-time checks in workflows often use different rule sets.

Data requirements and related reference data are collected and updated collaboratively by the Data Sharing Community. Data quality rules are derived from these requirements automatically. All data quality rules are executed behind a single interface, in real time. Batch jobs and single-record checks use the same rule set and can be integrated via APIs.
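
The idea of one rule set serving both batch jobs and single-record checks can be sketched as follows. This is a minimal illustration, not the CDQ implementation: the rule names, the record fields (`name`, `country`, `legal_form`) and the tiny legal form excerpt are all hypothetical.

```python
# Hypothetical reference data: legal form abbreviations per country (tiny excerpt).
LEGAL_FORMS = {"DE": {"GmbH", "AG"}, "FR": {"SA", "SARL"}}


def name_missing(record):
    """Violated when the record has no (non-blank) name."""
    return not record.get("name", "").strip()


def legal_form_unknown(record):
    """Violated when the legal form is not listed for the record's country.

    Countries absent from the reference data have no known forms, so any
    legal form is reported as unknown for them.
    """
    forms = LEGAL_FORMS.get(record.get("country"), set())
    return record.get("legal_form") not in forms


# One rule set, keyed by hypothetical rule names.
RULES = {
    "NAME_MISSING": name_missing,
    "LEGAL_FORM_UNKNOWN": legal_form_unknown,
}


def check_record(record):
    """Single-record (real-time) check: names of all violated rules."""
    return [name for name, violated in RULES.items() if violated(record)]


def check_batch(records):
    """Batch analysis: reuses check_record, so both paths share RULES."""
    return [check_record(record) for record in records]


records = [
    {"name": "Example GmbH", "country": "DE", "legal_form": "GmbH"},
    {"name": "", "country": "FR", "legal_form": "GmbH"},
]
print(check_batch(records))
```

Because `check_batch` only delegates to `check_record`, a rule added to `RULES` takes effect in both paths at once, which is the property described above.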

To demonstrate that a data quality rule is correct in content, we maintain one or more supporting documents per rule that record the rule's source. The source could be:

  • a public authority source
  • any other trustworthy web page
  • a data standard of a specific community member

We record the URL (if any), a screenshot of the relevant parts (if any), and the source's name (e.g. Community member data standard, European Commission, National ....). See Identifier format invalid (SIREN (France)) for an example of a rule that was specified and implemented based on information provided by the OECD.
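
For illustration, a format check for SIREN could look like the sketch below. A SIREN consists of nine digits, the last of which is a Luhn check digit. Whether the CDQ rule also verifies the checksum is not stated here, and the function name `is_valid_siren` is made up for this example.

```python
def is_valid_siren(value: str) -> bool:
    """Return True if value is nine digits that pass the Luhn checksum."""
    digits = value.replace(" ", "")  # tolerate the common "732 829 320" grouping
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(digits)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(is_valid_siren("732 829 320"))  # True
print(is_valid_siren("732829321"))    # False (checksum fails)
print(is_valid_siren("12345"))        # False (wrong length)
```

A check like this only covers the format; confirming that the identifier is actually registered would still need a reference data source or an external service.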