Data Sharing Community
New feature: Matching Candidate Review via Web Application enabled (26 February 2021)
The decision log app enables storing decisions from report reviews for dedicated services. Currently this functionality is available for the matching services (Duplicate & Linkage). One of the main goals of running a report is to get information about data defects and then to adjust the corresponding information in the source system. Services like duplicate matching, data profiling, and data curation detect data defects and propose suggestions for improvement. No such service can ever guarantee 100% correctness. For example, duplicate identification, which is mainly based on fuzzy search, will always produce some false-positive and false-negative results. The report, which is the basis for the review, now contains a field to mark such results. The reviewed report can then be uploaded to the decision log app and is considered in the next run, so that the reviews are respected.
Please check out our use case description here: https://meta.cdq.com/Use_case/Iterative_Duplicate_Check
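The review workflow above can be sketched in a few lines. This is a hypothetical model, not the actual CDQ decision log schema: all field and function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical sketch of reviewed matching decisions and how they could be
# applied in a subsequent run. Field names are illustrative only, not the
# actual CDQ decision log schema.

@dataclass(frozen=True)
class ReviewDecision:
    record_id: str       # record flagged as a potential duplicate
    matched_id: str      # the candidate it was matched against
    is_duplicate: bool   # reviewer's verdict (False = false positive)

def filter_matches(candidates, decisions):
    """Drop candidate pairs a reviewer already rejected as false positives."""
    rejected = {(d.record_id, d.matched_id) for d in decisions if not d.is_duplicate}
    return [c for c in candidates if c not in rejected]

decisions = [ReviewDecision("BP-1", "BP-2", is_duplicate=False),
             ReviewDecision("BP-3", "BP-4", is_duplicate=True)]
candidates = [("BP-1", "BP-2"), ("BP-3", "BP-4"), ("BP-5", "BP-6")]
print(filter_matches(candidates, decisions))
# [('BP-3', 'BP-4'), ('BP-5', 'BP-6')]
```

The rejected pair ("BP-1", "BP-2") is suppressed in the next run, which is the effect of "reviews are respected" described above.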
Minor metadata change in the duplicate and consolidation reports (19 February 2021)
The coversheet of the duplicate and consolidation reports has been extended with metadata about the matching configuration id that was used for creating the results. This allows for tracing the origin of results.
Open data from the Czech Register of Economic Entities connected (19 February 2021)
A new public data source has been connected to the CDQ Cloud Services. The Czech Register of Economic Entities can now be searched using the business partner lookup and employed for enrichments. More than 1 million records are available.
Business partner data management is heavily redundant: Many companies manage data for the same entities such as country names and codes, bill-to, ship-to, and ordering addresses, or legal hierarchies of customers and suppliers. The CDQ collaboration approach is based on a trusted network of user companies that share and collaboratively maintain this data.
An important prerequisite for collaborative data management is a common understanding of the shared data. For the CDQ Data Sharing Community, this common understanding is specified by the CDQ Data Model. The concepts of this model are defined and documented in this wiki which can be used as a business vocabulary. Moreover, the wiki provides a machine-readable interface to reuse this metadata by using semantic annotations.
A procedure is a common standard or "how-to" for a specific data management task. Within the CDQ Data Sharing Community, companies agree on such procedures to ensure similar rules and guidelines for similar tasks. For several countries, the CDQ Wiki provides such information, e.g. data quality rules, trusted information sources, legal forms, or tax numbers. Select a country from the list to explore it.
From an integration perspective, CDQ web services are the most important component of the CDQ infrastructure. They provide the technical link between your business applications and the CDQ cloud services. We follow the REST design principle for web services, which allows for lightweight interface design and easy integration. Of course, all web services are also available via WSDL interfaces.
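To illustrate the general shape of such a REST integration, the snippet below composes a lookup request URL. The host, path, and parameter names are placeholders invented for this sketch, not the documented CDQ API; consult the actual API reference for real endpoints.

```python
from urllib.parse import urlencode

# Illustrative only: the endpoint path and parameter names below are
# assumptions, not the documented CDQ API. The point is the lightweight
# shape of a REST lookup call.

BASE_URL = "https://api.example.com/businesspartners"  # placeholder host

def build_lookup_url(name: str, country: str) -> str:
    """Compose a GET request URL for a business partner lookup."""
    query = urlencode({"name": name, "countryCode": country})
    return f"{BASE_URL}?{query}"

print(build_lookup_url("Skoda Auto", "CZ"))
# https://api.example.com/businesspartners?name=Skoda+Auto&countryCode=CZ
```

A single URL with query parameters is all a client needs for such a lookup, which is what makes REST integration lightweight compared to a SOAP/WSDL stack.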
Transformation of human-documented data requirements into executable data quality rules is mostly a manual IT effort, and changing requirements cause IT effort again and again. Some checks, e.g. tax number validity (not just format!), require external services. Other checks, e.g. validity of legal forms, require managed reference data (e.g. legal forms by country, plus abbreviations). Continuous data quality assurance (i.e. batch analyses) and real-time checks in workflows often use different rule sets. In the CDQ approach, data requirements and related reference data are collected and updated collaboratively by the Data Sharing Community. Data quality rules are derived from these requirements automatically and are auditor-approved. All data quality rules are executed behind one interface, in real time: 1’000+ rules in less than one second. Batch jobs and single-record checks use the same rule set and can be integrated via APIs. If reference data (e.g. correct tax numbers) is available, fix proposals are provided for incorrect records.
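The idea of one rule set serving both real-time and batch checks can be sketched as follows. The two rules shown are simplified examples invented for this sketch, not actual CDQ quality rules, and the function names are assumptions.

```python
import re

# Minimal sketch of "one rule set behind one interface": the same rules
# run for a single real-time check and for a batch job. The rules below
# are simplified illustrations, not actual CDQ data quality rules.

RULES = {
    "country_code_iso2": lambda rec: bool(re.fullmatch(r"[A-Z]{2}", rec.get("countryCode", ""))),
    "name_not_empty": lambda rec: bool(rec.get("name", "").strip()),
}

def check_record(record):
    """Real-time check: return the names of violated rules."""
    return [name for name, rule in RULES.items() if not rule(record)]

def check_batch(records):
    """Batch check: reuses the exact same rule set per record."""
    return {i: check_record(r) for i, r in enumerate(records)}

print(check_record({"name": "ACME GmbH", "countryCode": "DE"}))  # []
print(check_record({"name": "", "countryCode": "Germany"}))
# ['country_code_iso2', 'name_not_empty']
```

Because `check_batch` delegates to `check_record`, the batch analysis and the workflow check can never drift apart, which is the consistency benefit the paragraph above describes.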
Companies are facing an ever-increasing number of digital fraud attacks, meanwhile carried out on a very professional level. Among other types, falsified invoices are causing significant financial damage, in some cases more than 1 million USD from just one attack. One critical challenge in uncovering such fraud attacks is to identify bank accounts (e.g. given on an invoice) which are not owned by the declared business partner (e.g. the supplier of an invoice) but by a third party, i.e. the attacker. The CDQ Data Sharing Community is addressing this challenge by sharing information on known fraud cases and on proven bank accounts. The Fraud Case Database comprises known fraud cases shared by community members. Other members can look up these cases by bank account data (e.g. IBAN) to automate screening for critical accounts. The Whitelist, on the other hand, comprises bank accounts which are declared "safe" by community members. You can look up shared Trust Scores to check a new bank account and to verify that this account is already used by another member.
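A screening step combining the two shared datasets described above might look like the sketch below. The data, scores, and response strings are invented for illustration; real lookups go through the CDQ services, and the IBANs are standard example values.

```python
# Simplified in-memory sketch of the two shared datasets described above:
# a fraud case database and a whitelist with trust scores. All data and
# messages here are invented for illustration; real lookups go through
# the CDQ services.

FRAUD_CASES = {"DE44500105175407324931"}      # IBANs from reported fraud cases
WHITELIST = {"CH9300762011623852957": 5}      # IBAN -> trust score

def screen_bank_account(iban: str) -> str:
    """Screen a new bank account against fraud cases first, then the whitelist."""
    if iban in FRAUD_CASES:
        return "blocked: known fraud case"
    score = WHITELIST.get(iban, 0)
    if score > 0:
        return f"trusted: used by {score} community member(s)"
    return "unknown: manual verification recommended"

print(screen_bank_account("CH9300762011623852957"))
# trusted: used by 5 community member(s)
```

Checking the fraud case database before the whitelist ensures that a reported account is always blocked, even if it was whitelisted before the fraud case became known.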