WIT Press


An XML Based Semantic Protein Map

Price

Free (open access)

Volume

33

Pages

10

Published

2004

Size

299 kb

Paper DOI

10.2495/DATA040061

Copyright

WIT Press

Author(s)

A. S. Sidhu, T. S. Dillon & H. Setiawan

Abstract

From the nature of the algorithms for data mining we note that an XML framework can be represented using graph matching algorithms. Various techniques currently exist for graph matching of data structures, such as the Adjacency Matrix or the Algebraic Representation of Graphs. The Graph Representation can be easily converted to a String Representation. Both Graph and String Representations miss semantic relationships that exist in the data. These relationships can be captured by using semi-structured XML as a representation format. We already have an approach for integrating different data formats into a Unified Database. The technique has been successfully applied to diverse Protein Databases in the Bioinformatics domain. An XML representation of this comprehensive database, preserving order and semantic relationships, has already been generated. In this paper we propose an approach to a Semantic Protein Map (PMAP) by building a shared ontology on our structured database model. This ontology can be used by various Bioinformatics researchers from one single site. This site will host mirrors of Protein Databases along with BIODB and will provide tools for Similarity Searching.

Keywords: bioinformatics, protein structures, biomedical ontologies, data integration, data semantics, semantic web.

1 Introduction

Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned [1]. Data integration issues have stymied computer scientists and geneticists alike for the last 20 years, and yet successfully overcoming them is critical to the success of genomics research as it transitions from wet-lab activity to an electronic-based
