A small tool written in Java for cluster analysis of sequences.




Homology is frequently used to justify transferring knowledge about function or structure from characterized proteins to uncharacterized ones. Although phylogenetic analysis is the method of choice for determining homology, the most frequently used indicator in practice is pairwise sequence similarity.

Finding groups of co-regulated genes across a number of microarray experiments is quite similar to finding groups of homologous proteins in a large dataset. In both cases we have huge amounts of data and are looking for the few entries that show significant similarity to one another.

Both putative co-regulation and putative homology are generally inferred from similarities in the feature set of a gene. In the first case, the feature set consists of the expression levels across the experiments; in the second, it consists of the amino-acid sequences of the proteins.

A certain similarity is also apparent when you look at the graphs generated for either protein sequence data or microarray data. Get CLANS and take it for a spin to see what it can actually do for you!
Last updated on January 18th, 2011

