DanTermBank


Automatic merging and validation of data

The aim of this subproject is to develop methods for converting and combining terminology data from various existing sources. This process raises two complex types of problems. The first pertains to form: the data are likely to have different structures and to be stored in different formats. The second pertains to content: the data may be of varying quality. Entries from different resources may describe the same concept but be associated with different sets of synonyms and slightly varying definitions or, conversely, have overlapping forms but be associated with different concepts.
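The two problem types described above can be illustrated with a minimal Python sketch. It is not the method developed in the project: the `Entry` format, the two source layouts (`from_source_a`, `from_source_b`), the string-similarity measure and the `threshold` value are all assumptions chosen for illustration. The form problem is handled by converting each source into one common structure; the content problem is handled by flagging pairs of entries that share a term and then comparing their definitions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Entry:
    """Hypothetical common format for one termbase entry."""
    source: str
    terms: frozenset
    definition: str


def from_source_a(row: dict) -> Entry:
    # Assumed layout A: one "term" plus a semicolon-separated "synonyms" string.
    synonyms = {s.strip() for s in row["synonyms"].split(";") if s.strip()}
    terms = {row["term"]} | synonyms
    return Entry("A", frozenset(t.lower() for t in terms), row["definition"])


def from_source_b(row: dict) -> Entry:
    # Assumed layout B: a list of "labels" and a "gloss" field.
    return Entry("B", frozenset(t.lower() for t in row["labels"]), row["gloss"])


def definition_similarity(a: str, b: str) -> float:
    # Crude character-based similarity in [0, 1]; a stand-in for a real measure.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(a: Entry, b: Entry, threshold: float = 0.6) -> str:
    # No shared term: nothing suggests the entries are related.
    if not (a.terms & b.terms):
        return "unrelated"
    # Shared term and similar definitions: probably the same concept.
    if definition_similarity(a.definition, b.definition) >= threshold:
        return "merge candidate"
    # Shared term but dissimilar definitions: probably different concepts.
    return "possible homonym"


bank_a = from_source_a({
    "term": "bank",
    "synonyms": "credit institution; ",
    "definition": "A financial institution that accepts deposits.",
})
bank_b = from_source_b({
    "labels": ["Bank", "banking house"],
    "gloss": "A financial institution accepting deposits.",
})
river_bank = from_source_b({
    "labels": ["bank", "riverbank"],
    "gloss": "The sloping land alongside a river.",
})

for other in (bank_b, river_bank):
    print(sorted(other.terms), classify(bank_a, other))
```

Any pair flagged this way would still need human validation, since a surface similarity score cannot decide whether two definitions describe equivalent concepts.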

We have developed a taxonomy of data types for termbases; see the publication Madsen et al. (2013) in eDITion. The taxonomy can be browsed at vip.iterm.dk (select the database "DanTermBank Data Categories" from the drop-down list; login and password: PUBLIC).

Furthermore, we have begun developing methods for merging entries that contain equivalent concepts; see the publication Madsen et al. (2012) from TKE. Further work on merging entries has been postponed to a later phase of the overall DanTermBank project.

Copyright © 2023 · Copenhagen Business School
