Populating a Linked Data Entity Name System: A Big Data Solution to Unsupervised Instance Matching

by
2017-01-15
190 pages

Book Description

Resource Description Framework (RDF) is a graph-based data model used to publish data as a Web of Linked Data. RDF is an emergent foundation for large-scale data integration, the problem of providing a unified view over multiple data sources. An Entity Name System (ENS) is a thesaurus for entities, and is a crucial component in a data integration architecture. Populating a Linked Data ENS is equivalent to solving an Artificial Intelligence problem called instance matching, which concerns identifying pairs of entities referring to the same underlying entity. This publication presents an instance matcher with four properties, namely automation, heterogeneity, scalability and domain independence. Automation is addressed by employing inexpensive but well-performing heuristics to automatically generate a training set, which is employed by other machine learning algorithms in the pipeline. Data-driven alignment algorithms are adapted to deal with structural heterogeneity in RDF graphs. Domain independence is established by actively avoiding prior assumptions about input domains, and through evaluations on 10 RDF test cases. The full system is scaled by implementing it on cloud infrastructure using MapReduce algorithms.

Table of Contents

Chapter 1: Introduction
Chapter 2: Background
Chapter 3: Related Work
Chapter 4: Type Alignment
Chapter 5: Training Set Generation
Chapter 6: Property Alignment
Chapter 7: Blocking and Classification
Chapter 8: Scalability
Chapter 9: Conclusion
Appendix A: MapReduce

Book Details

  • Title: Populating a Linked Data Entity Name System: A Big Data Solution to Unsupervised Instance Matching
  • Author:
  • Length: 190 pages
  • Edition: 1
  • Language: English
  • Publisher:
  • Publication Date: 2017-01-15
  • ISBN-10: 1614996911
  • ISBN-13: 9781614996910