Details of Grant 

EPSRC Reference: EP/Y017706/1
Title: OntoEm: Semantic Embedding for Ontologies
Principal Investigator: Chen, Dr J
Other Investigators:
Researcher Co-Investigators:
Project Partners:
SNOMED International, University of Oxford
Department: Computer Science
Organisation: University of Manchester, The
Scheme: New Investigator Award
Starts: 01 February 2024
Ends: 31 January 2027
Value (£): 449,494
EPSRC Research Topic Classifications:
Artificial Intelligence, Computational Linguistics
EPSRC Industrial Sector Classifications:
Information Technologies
Related Grants:
Panel History:
Panel Date: 25 Sep 2023
Panel Name: EPSRC ICT Prioritisation Panel Sept 2023
Outcome: Announced
Summary on Grant Application Form
An ontology is a formal, explicit and shared representation of conceptual knowledge. Modern ontologies in the Web Ontology Language (OWL) can represent different kinds of knowledge and support logical reasoning, with wide applications in domains such as Artificial Intelligence (AI), Knowledge Management, Bioinformatics and the Semantic Web. A few OWL ontologies, such as the food ontology FoodOn, the gene ontology GO and the disease ontology DOID, have become the standard for concept definition and knowledge exchange within their domains. OWL is also quite general and covers other popular forms of knowledge representation, including taxonomies, which can be regarded as an ontology's class hierarchy, and Google's Knowledge Graph (KG), which can be regarded as an ontology's assertional knowledge.

Meanwhile, Machine Learning (ML), especially deep neural networks, has become a critical technique in AI, achieving great success in applications such as image understanding and machine translation. ML is also applied to ontology and KG construction, especially for knowledge extraction and the prediction of missing knowledge. Because ML usually processes numeric data while an ontology is mainly composed of symbols, there is a large gap between the two, which makes it hard to apply ML to ontology processing or to combine ML with ontologies. One promising technology is Semantic Embedding, which represents symbols such as words in text and entities in a KG in a vector space with their semantics (relationships) preserved. For example, given the fact that London is the capital of the UK, the vectors (embeddings) of London, UK and "capitalOf" are expected to satisfy some condition, e.g., that "capitalOf" maps London to UK.
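The London/UK condition can be made concrete with a small sketch. It assumes one common choice, a translation-style condition in which a relation is a vector added to the head entity; the summary does not commit to this particular condition. The 2-D vectors are hand-picked for illustration rather than learned, and the function names are hypothetical.

```python
# Toy illustration only: hand-picked 2-D vectors, not learned embeddings.
# Assumption: a relation acts as a translation, so head + relation ≈ tail.
from math import dist

entity = {
    "London": (1.0, 2.0),
    "UK": (4.0, 3.0),
    "Paris": (2.0, 5.0),
    "France": (5.0, 6.0),
}
relation = {"capitalOf": (3.0, 1.0)}


def translate(head, rel):
    """Apply the relation as a vector translation: head + rel."""
    h, r = entity[head], relation[rel]
    return tuple(hi + ri for hi, ri in zip(h, r))


def score(head, rel, tail):
    """Distance between (head + rel) and tail; lower means more plausible."""
    return dist(translate(head, rel), entity[tail])


def predict_tail(head, rel):
    """Return the entity whose vector is nearest to head + rel."""
    return min(entity, key=lambda e: score(head, rel, e))
```

With these vectors, `predict_tail("London", "capitalOf")` selects `"UK"` because London's vector shifted by the relation vector lands exactly on the UK vector. In practice the vectors would be learned from many facts so that true triples score lower than false ones.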

Embedding for KGs has been widely studied for around ten years, with a few successful methods proposed, whereas embedding the more general and more complex OWL ontologies is much more challenging and its exploration is still preliminary. Some earlier embedding solutions, such as box lattice embeddings and hyperbolic embeddings, can be applied to an ontology but consider only its concept hierarchy. Current OWL ontology embedding methods can be divided into syntactic approaches, such as OPA2Vec and OWL2Vec*, and model-theoretic approaches, such as ELEm and BoxEL. The former aim to preserve syntactic regularities such as co-occurrence between entities, relying on literals such as entities' textual labels and definitions and paying limited attention to logic; the latter embed the logical structure but currently support only part of the logic and some features of OWL.
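To illustrate how a geometric embedding can capture a concept hierarchy, the sketch below represents each class as an axis-aligned box and reads subsumption as box containment. It is only loosely in the spirit of the box-based methods mentioned above and is not the formulation used by any of the cited systems. The class names, coordinates and helper names are all hypothetical, and the boxes are hand-placed rather than learned.

```python
# Toy illustration only: hand-placed 2-D boxes, not learned embeddings.
# Assumption: "C is a subclass of D" is read as box(C) lying inside box(D).
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    lower: tuple  # minimum corner, one coordinate per dimension
    upper: tuple  # maximum corner, one coordinate per dimension

    def contains(self, other):
        """True if `other` lies inside this box in every dimension."""
        return all(
            s_lo <= o_lo and o_hi <= s_hi
            for s_lo, o_lo, o_hi, s_hi in zip(
                self.lower, other.lower, other.upper, self.upper
            )
        )


# Hypothetical classes chosen for illustration.
boxes = {
    "Disease": Box((0.0, 0.0), (10.0, 10.0)),
    "InfectiousDisease": Box((1.0, 1.0), (6.0, 6.0)),
    "ViralInfection": Box((2.0, 2.0), (4.0, 4.0)),
    "GeneticDisease": Box((7.0, 1.0), (9.0, 5.0)),
}


def is_subclass(sub, sup):
    """Check subsumption by testing whether sup's box contains sub's box."""
    return boxes[sup].contains(boxes[sub])
```

Because containment is transitive, `is_subclass("ViralInfection", "Disease")` holds without being stated directly, which is the kind of hierarchical inference such embeddings are meant to preserve. Richer OWL constructs, such as existential restrictions, need more than containment, which is part of what makes full OWL embedding harder.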

More fundamental research on OWL ontology embedding and its integration with ML is very promising and urgently required. Embedding the logics and injecting them into ML can help address challenges such as sample shortage and model transfer, leading to new paradigms of neural-symbolic integration, while embedding more complete semantics (e.g., both logics and literals) will lead to more robust knowledge inference in ontology construction and curation, and extend the application of ontologies in many domains such as life science. The proposed project will study a range of OWL ontology embedding methods for (1) supporting more of the logical relationships that can be defined in OWL, and (2) jointly embedding multi-modal semantics, including textual literals, numerical literals and formal logics. These embeddings are expected to be integrated with ML such that (1) missing knowledge in an ontology can be predicted more accurately, with support for explanation and human interaction, (2) real-life problems in life science, such as protein function prediction and protein-protein interaction prediction, can be better addressed with OWL ontology-based knowledge representation, and (3) the sample shortage problem in ML gains more robust solutions through the injection of OWL ontology-based external knowledge, with better performance in typical tasks such as KG completion and image classification.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.man.ac.uk