Tracking down phylogenetic signals in non-coding cluster sequences of Hox clusters
Phylogenetic analyses usually rely on coding sequences. The reasons are historical: for a long time, geneticists focused on the evolution of coding sequences and collected large numbers of orthologous genes with high sequence similarity from distantly related species. Accordingly, knowledge about evolutionary mechanisms, as well as strategies for multiple alignment and phylogenetic reconstruction, was also derived from gene sequences.
As non-coding sequences accumulated in the databases and adequate alignment tools became available, it became possible to extract phylogenetic signals from orthologous non-coding sequences. These signals are more or less conserved non-coding nucleotides (CNCNs) that occur clustered in so-called phylogenetic footprints. Where the gene framework is highly conserved, large numbers of footprints can be extracted from the adjacent non-coding sequences.
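To make the notion of a footprint concrete, the following minimal sketch scans a gapped multiple alignment for runs of conserved columns. It is an illustration only, not the detection procedure used in this work: the function names, the toy alignment, and the thresholds `min_length` and `min_fraction` are all assumptions chosen for the example.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=1.0):
    """Flag each alignment column as conserved or not.

    A column counts as conserved when the most frequent non-gap base
    is present in at least `min_fraction` of all sequences
    (hypothetical criterion, chosen for illustration).
    """
    n_seqs = len(alignment)
    flags = []
    for column in zip(*alignment):
        counts = Counter(base for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        flags.append(top / n_seqs >= min_fraction)
    return flags


def find_footprints(alignment, min_length=6, min_fraction=1.0):
    """Return half-open (start, end) column ranges of conserved runs
    that are at least `min_length` columns long."""
    flags = conserved_columns(alignment, min_fraction)
    footprints, start = [], None
    # The trailing False closes a run that reaches the last column.
    for i, conserved in enumerate(flags + [False]):
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_length:
                footprints.append((start, i))
            start = None
    return footprints


# Toy alignment of three made-up sequences of equal length.
toy_alignment = [
    "ACGTTTGACCATGGA-TC",
    "ACGATTGACCATCGAATC",
    "TCGTTTGACCATAGA-TG",
]
print(find_footprints(toy_alignment, min_length=6))  # [(4, 12)]
```

Requiring a minimum run length reflects the idea that conserved nucleotides occur in clusters; isolated matching columns are ignored because they are expected by chance.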
We used phylogenetic signals within the non-coding sequences of Hox clusters to examine sequence phylogenies and to shed light on the order of cluster duplication events during vertebrate evolution. Furthermore, the large amount of data, combined with statistical methods, allowed us to study the loss and retention of conserved non-coding regions.