A Multimodal Approach for Protein Remote Homology Detection

Pietro Lovato; Alejandro Giorgetti; Manuele Bicego
2015-01-01

Abstract

Protein remote homology detection is a crucial and challenging task in bioinformatics: although effective methods have appeared in recent years, in several cases a proper characterization of remote evolutionary correlation cannot be derived. In such situations, information derived from other sources may help, provided that such (even partial) information can be properly integrated into existing models. In this paper, we provide some evidence that this route is feasible: inspired by the multimodal retrieval literature, we show how a simple multimodal approach can be exploited to improve a model learned from a set of sequences by using knowledge derived from a partial set of corresponding 3D structures. We investigate the suitability of the proposed multimodal scheme on the SCOP 1.53 benchmark, showing that a beneficial effect can be obtained even when only a very small number of structures is available. A further detailed analysis of a member of the GPCR superfamily confirms that this multimodal approach can extract information that cannot be obtained from sequence-based techniques.
Multimodal approach, Ngrams, FragBag, topic models, GPCR


Use this identifier to cite or link to this document: https://hdl.handle.net/11562/930174