The haplotype inference (HI) problem is the problem
of inferring the 2n haplotypes (n pairs) that underlie
n observed genotype vectors. Inferring haplotype
information from genotype data, which is more readily
available, is very useful in research on genes that
affect health, disease, and responses to drugs and
environmental factors. The Perfect Phylogeny Haplotype
(PPH) model assumes that the haplotypes inferred from
a sample can be derived from a single tree, i.e. a
perfect phylogeny.
However, biological events such as recombination
violate this model. Stochastic methods, on the other
hand, can infer haplotypes despite recombination, but
they can be time-consuming, and their inferences often
depend on the initial state chosen at random during a
run. The research described in this monograph aimed to
analyse previous models and solutions to the haplotype
inference problem and to engineer algorithms that
infer haplotypes from genotypes in the presence of
recombination, using disjoint and overlapping regions
of perfect phylogeny, and that scale better in terms
of time complexity.
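
As a small illustration of the two notions above, the following Python
sketch enumerates the haplotype pairs that can explain one genotype and
checks whether a set of haplotypes is compatible with a perfect
phylogeny. It is not the monograph's algorithm. It assumes a common
encoding that the text does not specify (genotype sites are 0 or 1 when
homozygous and 2 when heterozygous; haplotypes are binary), it uses the
standard four-gamete test as the compatibility check, and all function
names are hypothetical.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """True if haplotypes h1 and h2 together produce genotype g.

    Assumed encoding: g[i] is 0 or 1 at homozygous sites (both
    haplotypes carry that allele) and 2 at heterozygous sites
    (the haplotypes differ).
    """
    return all(
        (a == b == s) if s in (0, 1) else (a != b)
        for a, b, s in zip(h1, h2, g)
    )


def resolutions(g):
    """Enumerate every unordered haplotype pair that explains g."""
    het = [i for i, s in enumerate(g) if s == 2]
    pairs = []
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(g), list(g)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        pair = tuple(sorted((tuple(h1), tuple(h2))))
        if pair not in pairs:  # skip the mirrored duplicate
            pairs.append(pair)
    return pairs


def passes_four_gamete_test(haplotypes):
    """True if no pair of sites shows all four gametes 00, 01, 10, 11."""
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        if len({(h[i], h[j]) for h in haplotypes}) == 4:
            return False
    return True
```

For a genotype with k heterozygous sites, `resolutions` returns
2^(k-1) candidate pairs when k is at least 1, which is why exhaustive
enumeration does not scale and why models such as PPH are used to
constrain the choice among them.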