Address Standardization



Name: Address Standardization
Description: The Address Standardization profile standardizes a given input address according to the CDQ standards without considering reference addresses from address data sources. It extracts address components and places them in distinct fields (e.g. a PO Box maintained as street is moved into a separate PO Box concept), enriches address components based only on input already provided (e.g. the country name is enriched from a given country code), and harmonizes the given components (e.g. the post code is formatted according to the reference standard of the country).
Technical key: ADDRESS_STANDARDIZATION
API: API/Data Curation API

Activated features

 Name / Description
Detect industrial zone
  • Detects the industrial zone in the address fields and moves it to the last thoroughfare of type industrial zone. For SAP format records, this data is also added to District. The zone is detected during precuration from the administrative area, locality, thoroughfare and premise. This feature works only for a few countries that use specific terms for industrial zones.
  • The cleansing process does not enrich the industrial zone: if the information is not provided in the input data, the output will not contain it. Google or HERE curation can remove the industrial zone information, but special rules preserve it and write it back to the last thoroughfare after curation.
Enrich administrative area ISO
  • Enriches the administrative area shortname with the ISO value using the managed administrative areas.
  • Standardizes the administrative area into the target language.
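The lookup behind this feature can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `MANAGED_AREAS` table, the function name and the `short_name`/`value` keys are hypothetical and do not reflect the actual CDQ data model or API. Only the ISO 3166-2 codes themselves are real.

```python
# Hypothetical stand-in for the "managed administrative areas".
# Keys are ISO 3166-2 codes; values are display names per language.
MANAGED_AREAS = {
    "DE-BY": {"de": "Bayern", "en": "Bavaria"},
    "DE-HE": {"de": "Hessen", "en": "Hesse"},
}

# Reverse index: (country code, lower-cased name in any language) -> ISO code.
NAME_INDEX = {
    (code.split("-")[0], name.lower()): code
    for code, names in MANAGED_AREAS.items()
    for name in names.values()
}


def enrich_administrative_area(country_code, area_name, target_language):
    """Return the area with an ISO short name and a target-language value."""
    cleaned = area_name.strip()
    iso_code = NAME_INDEX.get((country_code.upper(), cleaned.lower()))
    if iso_code is None:
        # Unknown area: keep the input value and leave the short name unset.
        return {"short_name": None, "value": cleaned}
    names = MANAGED_AREAS[iso_code]
    # Fall back to the input value if the target language is not managed.
    return {"short_name": iso_code, "value": names.get(target_language, cleaned)}
```

Under these assumptions, `enrich_administrative_area("DE", "Bavaria", "de")` yields the short name `DE-BY` with the value `Bayern`, while an unknown area is returned unchanged without a short name.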
Extract address context
  • Identifies and enriches the address context.
  • Moves the name part after the legal form to the address context.
Extract care of
  • Extracts care-of information from the name and sets it in the address.
  • Moves care-of information from name local to careOf.
  • Removes care-of information from name international.
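As an illustration of the three steps above, the following sketch splits a care-of suffix off a name. The record keys (`name_local`, `name_international`, `care_of`), the function names and the recognized markers (`c/o`, `care of`) are assumptions chosen for the example; the actual CDQ rules are not documented here and are likely more extensive.

```python
import re

# Assumed markers; matches an optional separator, the marker, and the rest of the name.
CARE_OF = re.compile(r"[\s,;]*\b(?:c/o|care of)\s+(?P<care_of>.+)$", re.IGNORECASE)


def split_care_of(name):
    """Return (name without the care-of part, care-of value or None)."""
    match = CARE_OF.search(name)
    if match is None:
        return name.strip(), None
    return name[:match.start()].strip(), match.group("care_of").strip()


def extract_care_of(record):
    """Move care-of from the local name to care_of; strip it from the international name."""
    result = dict(record)
    if "name_local" in record:
        result["name_local"], care_of = split_care_of(record["name_local"])
        if care_of:
            result["care_of"] = care_of
    if "name_international" in record:
        # The care-of part is only removed here, not copied.
        result["name_international"], _ = split_care_of(record["name_international"])
    return result
```

With these assumptions, a local name of `ACME GmbH c/o John Doe` becomes `ACME GmbH`, and `John Doe` is placed in the hypothetical `care_of` field.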
Normalize address
  • Capitalizes locality, thoroughfare and premise.
  • Normalizes the first level of the locality to CDQ standards.
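The capitalization step can be sketched as follows. The field names and the exact casing rule (first letter of each word upper-cased, remaining letters lower-cased, letters directly after a digit left alone) are assumptions for illustration. The normalization of the locality's first level to CDQ standards is not covered, since those standards are not described here.

```python
import re

# Assumed field names for the three components named above.
CAPITALIZED_FIELDS = ("locality", "thoroughfare", "premise")

# A run of letters that does not directly follow another word character,
# so "12a" is left alone while "karl-marx" is handled part by part.
WORD = re.compile(r"(?<!\w)[^\W\d_]+")


def capitalize_words(value):
    """Capitalize each word, including the parts of hyphenated words."""
    return WORD.sub(lambda m: m.group(0).capitalize(), value)


def normalize_address(address):
    """Return a copy of the address with the selected fields capitalized."""
    result = dict(address)
    for field in CAPITALIZED_FIELDS:
        if isinstance(result.get(field), str):
            result[field] = capitalize_words(result[field])
    return result
```

This simple rule does not know about language-specific exceptions (for example particles that conventionally stay lower-case), which a production implementation would presumably need to handle.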
Parse address
  • Harmonizes address data by parsing thoroughfare numbers, etc.
  • Tries to detect the country if the data is invalid.
  • Ensures that the country shortname and value are set and exist.
  • Extracts the Postal Delivery Point value and number from PostCode.
  • Removes or changes local elements in the thoroughfare, such as esquina (Spanish), mieszkania (Polish), etc.
  • Extracts the building number from the thoroughfare.
  • Identifies and extracts premise information; see also Premise Enrichment and Harmonization for details.
  • Identifies special patterns and extracts them accordingly, such as:
    • Kilometre patterns are identified and extracted.
    • Industrial zones are detected and moved from the address fields to a premise of type industrial zone.
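Two of the parsing steps above, building number extraction and kilometre patterns, can be sketched with regular expressions. The patterns, the function name and the result keys are assumptions made for this example; they cover only simple cases (a number at the start or end of the thoroughfare, and a `km`/`kilometre`/`kilometer` marker followed by a number) and are not the rules CDQ actually applies.

```python
import re

# Assumed kilometre markers, followed by a number with optional decimals.
KM_PATTERN = re.compile(r"\b(?:km|kilometre|kilometer)\.?\s*(\d+(?:[.,]\d+)?)\b", re.IGNORECASE)
# Building number after the street name, e.g. "Hauptstrasse 12a".
NUMBER_LAST = re.compile(r"^(?P<street>.*?\D)\s+(?P<number>\d+[a-zA-Z]?)$")
# Building number before the street name, e.g. "221B Baker Street".
NUMBER_FIRST = re.compile(r"^(?P<number>\d+[a-zA-Z]?)\s+(?P<street>\D.*)$")


def parse_thoroughfare(value):
    """Split a thoroughfare into name, building number and kilometre value."""
    text = value.strip()
    result = {"thoroughfare": text, "number": None, "kilometre": None}

    km = KM_PATTERN.search(text)
    if km:
        # Kilometre addresses are treated as having no building number (assumption).
        result["kilometre"] = km.group(1).replace(",", ".")
        remaining = text[:km.start()] + text[km.end():]
        result["thoroughfare"] = " ".join(remaining.split()).strip(" ,")
        return result

    match = NUMBER_LAST.match(text) or NUMBER_FIRST.match(text)
    if match:
        result["thoroughfare"] = match.group("street").strip(" ,")
        result["number"] = match.group("number")
    return result
```

Checking the trailing-number pattern before the leading one is a design choice of this sketch; it keeps names such as "Strasse des 17. Juni 135" intact while still extracting the final number.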
Preprocess address
  • Trims each field of the address.
  • Removes prefixes such as D, D-, DE, DE- from the post code.
  • Removes hyphens between words and numbers in the thoroughfare value.
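The three preprocessing steps above can be sketched as one small function. The field names (`post_code`, `thoroughfare`) are hypothetical, the prefix list is limited to the four examples given, and replacing the hyphen with a single space (rather than deleting it outright) is an assumption of this sketch.

```python
import re

# Country prefixes D, D-, DE, DE- directly in front of the digits of a post code.
POST_CODE_PREFIX = re.compile(r"^(?:DE|D)-?\s*(?=\d)", re.IGNORECASE)
# A hyphen between a letter and a following digit, e.g. "Hauptstrasse-12".
WORD_NUMBER_HYPHEN = re.compile(r"(?<=[^\W\d_])\s*-\s*(?=\d)")


def preprocess_address(address):
    """Trim all string fields, then clean the post code and thoroughfare."""
    cleaned = {
        key: value.strip() if isinstance(value, str) else value
        for key, value in address.items()
    }
    if cleaned.get("post_code"):
        cleaned["post_code"] = POST_CODE_PREFIX.sub("", cleaned["post_code"])
    if cleaned.get("thoroughfare"):
        # Hyphens inside names ("Karl-Marx-Strasse") and number ranges ("10-12") are kept.
        cleaned["thoroughfare"] = WORD_NUMBER_HYPHEN.sub(" ", cleaned["thoroughfare"])
    return cleaned
```

For example, under these assumptions an input with post code `D-70173` and thoroughfare ` Hauptstrasse-12 ` is returned with `70173` and `Hauptstrasse 12`.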