CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | Tara Safavi | Learning w the Machines
CoDEx is a set of knowledge graph COmpletion Datasets EXtracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty.
In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false.
CoDEx is characterized through thorough empirical analyses and benchmarking experiments.
First, each CoDEx dataset is analyzed in terms of logical relation patterns.
Next, baseline link prediction and triple classification results are reported on CoDEx for five extensively tuned embedding models.
Finally, CoDEx is differentiated from the popular FB15K-237 knowledge graph completion dataset by showing that it covers more diverse and interpretable content and is a more difficult link prediction benchmark.
Data, code, and pretrained models are available at https://bit.ly/2EPbrJs
--
Tara Safavi. Researcher, University of Michigan
Tara is a PhD candidate in computer science at the University of Michigan working with Danai Koutra. Her research, which is at the intersection of graph-based machine learning and natural language processing, focuses on relational knowledge representation and reasoning in machines.
--
Welcome to Connected Data London's #ThrowbackThursday
Every Thursday at 3pm GMT, we release gems from our vault on #YouTube.
Tune in and learn from leaders and innovators; subscribe to our channel and watch premieres as they are released!
#KnowledgeGraph #Research #OpenSource #Wikidata #Wikipedia #AI