A Comparison of Statistical and Machine Learning Algorithms on the Task of Link Completion
by Anna Goldenberg, Jeremy Kubica, Paul Komarek, Andrew Moore and Jeff Schneider
In: KDD Workshop on Link Analysis for Detecting Complex Behavior, 2003
Link data, consisting of a collection of subsets of entities, can be
an important source of information for a variety of fields including
the social sciences, biology, criminology, and business intelligence.
However, these links may be incomplete, containing one or more unknown
members. We consider the problem of link completion, identifying
which entities are the most likely missing members of a link given the
previously observed links. We concentrate on the case of one missing
entity. We compare a variety of algorithms, both recently developed
methods and standard machine learning and strawman algorithms, each
adjusted to suit the task. The algorithms were tested extensively on a
simulated data set and a range of real-world data sets.
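
To make the task concrete, the following is a minimal sketch of one possible strawman for completing a link with a single missing entity. It ranks candidates by how often they co-occurred with the known members in previously observed links. The function names, the scoring rule, and the toy data are all assumptions made for illustration; they are not taken from the paper and are not one of the algorithms it evaluates.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(links):
    """Count how often each unordered pair of entities shares a link."""
    counts = Counter()
    for link in links:
        for a, b in combinations(sorted(set(link)), 2):
            counts[(a, b)] += 1
    return counts


def rank_candidates(partial_link, links):
    """Rank entities outside partial_link by co-occurrence with its members.

    Hypothetical scoring rule: the sum of pairwise co-occurrence counts
    between the candidate and each known member of the link.
    """
    counts = cooccurrence_counts(links)
    known = set(partial_link)
    entities = {e for link in links for e in link}
    scores = {
        cand: sum(counts[tuple(sorted((cand, k)))] for k in known)
        for cand in entities - known
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda e: (-scores[e], e))


# Toy data (invented): each observed link is a set of entities.
observed = [
    {"alice", "bob", "carol"},
    {"alice", "bob"},
    {"bob", "carol", "dave"},
    {"dave", "erin"},
]

# A new link is known to contain alice and bob, with one member missing.
ranking = rank_candidates({"alice", "bob"}, observed)
print(ranking)
```

On this toy data the candidate that most often appeared alongside the known members is ranked first. A baseline of this kind ignores higher-order structure in the links, which is one reason to compare it against more elaborate methods.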