Figure 2. 

The absolute branch lengths differ between a) and b) because the two loci evolve at different rates, but the relative length of each terminal branch stays the same. Relative branch lengths still vary somewhat among loci, depending on how fast each taxon evolves at each locus (compare b) and c)). This "normal" variation leads to a normal distribution of relative terminal branch lengths for each taxon (d). If a sequence is affected by contamination, e.g. cross-contamination with DNA from a different organism or a chimeric assembly of sequences from different organisms, the contaminated taxon may be placed differently in a new phylogenetic inference (e.g. taxon B in e)). However, especially when many short loci are used, most single-locus phylogenetic inferences are poorly resolved and imprecise. It is therefore difficult to assess whether a deviating topology results from limited phylogenetic signal in the locus, from contamination, or from true gene tree discordance. If, however, we constrain the topology for the data set in e), the odd taxon will sit on an exceptionally long terminal branch (f), which can be identified as an outlier in the distribution of relative terminal branch lengths for taxon B. In the case of true gene tree discordance, we can expect the relative branch lengths of several taxa to be flagged as outliers. The approach should be independent of the constraint tree used: if a taxon is misplaced in the constraint tree, its relative terminal branch lengths will generally be too long across loci, but an outlier can still be detected within that distribution.
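The outlier screen described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that terminal branch lengths have already been estimated for each locus on the constrained topology, and it defines the relative length as a taxon's terminal branch length divided by the sum of all terminal branch lengths in that locus tree; the caption does not give the exact normalisation. The outlier rule (a one-sided modified z-score based on the median and the median absolute deviation, with a cutoff of 3.5) is a swapped-in robust statistic chosen for the example. The function names, the cutoff, and the example values are all hypothetical.

```python
from statistics import median


def relative_lengths(terminal_lengths):
    """Scale terminal branch lengths of one locus tree so they sum to 1.

    Assumption: 'relative' means terminal length / sum of terminal lengths.
    """
    total = sum(terminal_lengths.values())
    return {taxon: length / total for taxon, length in terminal_lengths.items()}


def flag_long_branches(loci, threshold=3.5):
    """Return (locus, taxon) pairs whose relative terminal branch is unusually long.

    Assumption: outliers are scored per taxon with a one-sided modified
    z-score, 0.6745 * (x - median) / MAD; the threshold is illustrative.
    """
    rel = {locus: relative_lengths(lengths) for locus, lengths in loci.items()}
    taxa = sorted({taxon for r in rel.values() for taxon in r})
    flagged = []
    for taxon in taxa:
        # Distribution of this taxon's relative branch length across loci.
        values = {locus: r[taxon] for locus, r in rel.items() if taxon in r}
        med = median(values.values())
        mad = median(abs(v - med) for v in values.values())
        if mad == 0:
            continue  # no spread across loci, so no score can be computed
        for locus, value in values.items():
            score = 0.6745 * (value - med) / mad
            if score > threshold:  # only exceptionally long branches
                flagged.append((locus, taxon))
    return flagged


# Hypothetical data: absolute rates differ among loci, relative lengths are
# similar, and taxon B sits on a long branch in locus6 only.
example_loci = {
    "locus1": {"A": 0.10, "B": 0.11, "C": 0.09, "D": 0.10},
    "locus2": {"A": 0.20, "B": 0.20, "C": 0.21, "D": 0.19},
    "locus3": {"A": 0.05, "B": 0.05, "C": 0.05, "D": 0.055},
    "locus4": {"A": 0.31, "B": 0.30, "C": 0.29, "D": 0.30},
    "locus5": {"A": 0.12, "B": 0.13, "C": 0.12, "D": 0.11},
    "locus6": {"A": 0.10, "B": 0.60, "C": 0.10, "D": 0.10},
}

flagged = flag_long_branches(example_loci)
print(flagged)
```

The test is one-sided because the caption is concerned with exceptionally long branches: a long branch in one taxon shrinks the relative lengths of the other taxa in that locus, and those should not be flagged. Following the caption's reasoning, a locus in which several taxa are flagged at once would point towards gene tree discordance rather than contamination of a single sequence.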

 
  Part of: Pauls SU, Graf W, Hjalmarsson AE, Lemmon A, Lemmon EM, Petersen M, Vitecek S, Frandsen PB (2023) Gill Structure Linked to Ecological and Species Diversification in a Clade of Caddisflies. Arthropod Systematics & Phylogeny 81: 917-929. https://doi.org/10.3897/asp.81.e110014