kGAP
Our proprietary informatics engine for processing human NGS data
kGAP is our proprietary informatics engine, designed to align, call, annotate, and compare human genome, exome, and targeted next-generation sequencing (NGS) data. It is both used by our Services group and packaged as a major component of the knoSYS™100 interpretation system. Developed by a team of software engineers, geneticists, clinicians, and bioinformaticians, kGAP has been used in the interpretation of thousands of human genomes and exomes.
Aligns and calls NGS data

kGAP integrates widely used alignment and calling packages into its pipeline, including the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK).
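As a rough illustration of how an aligner and a caller can be chained, the sketch below assembles the command lines for an align-then-call run. It is not kGAP's actual pipeline, which is proprietary. The choice of `bwa mem`, `samtools`, and GATK4's `HaplotypeCaller`, the file names, and the helper `build_pipeline` are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    argv: List[str]          # command and arguments
    stdout: Optional[str]    # file that captures stdout, if any


def build_pipeline(ref: str, fq1: str, fq2: str, sample: str,
                   threads: int = 4) -> List[Step]:
    """Return the steps of a minimal align-then-call run (hypothetical layout)."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # GATK expects a read group, so one is attached at alignment time.
    read_group = rf"@RG\tID:{sample}\tSM:{sample}\tPL:ILLUMINA"
    return [
        # bwa mem writes SAM to stdout, so the output is captured to a file.
        Step(["bwa", "mem", "-t", str(threads), "-R", read_group,
              ref, fq1, fq2], stdout=sam),
        # Coordinate-sort and index the alignments.
        Step(["samtools", "sort", "-o", bam, sam], stdout=None),
        Step(["samtools", "index", bam], stdout=None),
        # Call variants against the same reference.
        Step(["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", vcf],
             stdout=None),
    ]


if __name__ == "__main__":
    for step in build_pipeline("ref.fa", "s1_R1.fq.gz", "s1_R2.fq.gz", "s1"):
        print(" ".join(step.argv), f"> {step.stdout}" if step.stdout else "")
```

The function only builds the commands. Running them (for example with `subprocess.run`) requires the tools to be installed and the reference to be indexed beforehand.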
Annotates known and novel variants

kGAP is based on a knowledge base of known human allele-phenotype associations. Our state-of-the-art curation integrates, reconciles, and refines reference data from diverse sources in order to yield accurate and detailed insights into sequence variants carried by each genome or exome under examination. In addition to analyzing previously annotated variants, kGAP offers crucial predictive insight into the potential functional effects of novel variants, prioritizing them for empirical follow-up based on a set of custom criteria that identifies leading candidates for plausible involvement in the phenotype of interest.
Reference data sources include:
• Reference genome: hg19 (chromosomes 1-22, X, Y) and GRCh37 (chromosome M)
• dbSNP
• Ensembl: gene definitions
• Ensembl/VEP: ConDel scores, GO terms, PubMed
• Gene Aliases
• HapMap III
• Exome Variant Server, with allelisms and allele frequencies
• 1000 Genomes, with allelisms and allele frequencies
• Human Protein Reference db (HPRD)
• Molecular Signatures db, including KEGG and Reactome (MSigDB)
• Human Gene Mutation db (HGMD)
• Phastcons 33-mammal track
Drawing on the resulting reference database (with over 130,000 phenotype associations), kGAP richly annotates each variant and provides links to relevant publications. Annotations include the affected genes, gene-associated phenotypes, frequency within populations, effect on protein function, and appropriate risk estimates for site-phenotype associations.
Enables multi-genome comparison

Investigation often requires comparing many genomes to each other in order to spot alleles that are distinctively shared by people with a given phenotype or by one tissue (e.g., a tumor) versus another. kGAP was built from the ground up for multi-genome comparison. It sifts through vast amounts of sequence data to identify alleles shared by particular subsets of genomes in a study, flagging variants most likely implicated in the phenotype of interest.
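The core idea of this comparison can be sketched with plain set operations: keep the variants carried by every genome in one group and by none in another. The example below is a minimal sketch on toy data, not kGAP's implementation. The `Variant` tuple layout, the function name `distinctive_variants`, and the sample names are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def distinctive_variants(calls: Dict[str, Set[Variant]],
                         cases: Iterable[str],
                         controls: Iterable[str]) -> List[Variant]:
    """Variants present in every case genome and absent from all controls."""
    case_sets = [calls[name] for name in cases]
    if not case_sets:
        return []
    shared_by_cases = set.intersection(*case_sets)
    seen_in_controls = set().union(*(calls[name] for name in controls))
    return sorted(shared_by_cases - seen_in_controls)


if __name__ == "__main__":
    # Toy variant calls; the coordinates are made up.
    calls = {
        "case1": {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")},
        "case2": {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"),
                  ("chr3", 3000, "G", "A")},
        "control1": {("chr2", 2000, "C", "T")},
    }
    print(distinctive_variants(calls, ["case1", "case2"], ["control1"]))
    # prints [('chr1', 1000, 'A', 'G')]
```

Requiring a variant in every case and in no control is the strictest possible filter. A real study would usually relax it to tolerate incomplete penetrance and missing calls.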
kGAP is designed to meet the big data challenges of genomics, having been used to simultaneously compare over 1,000 whole human genomes.
In addition to creating a VSD file for each genome, kGAP creates a single compact database (a Variable Site Comparison Database, or VSCD) that summarizes the distribution of variants among all studied genomes, enabling fast and flexible querying of multiple genomes.
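To illustrate why one shared database makes multi-genome queries fast, the sketch below loads toy per-genome calls into a single in-memory SQLite table and counts the carriers of each variant with one query. The actual VSCD format is not described here, so the schema, table name, and helper functions are assumptions for illustration only.

```python
import sqlite3
from typing import Dict, List, Set, Tuple

# (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def build_summary_db(calls: Dict[str, Set[Variant]]) -> sqlite3.Connection:
    """Load every genome's variants into one table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE carrier "
        "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, sample TEXT)"
    )
    rows = [(c, p, r, a, sample)
            for sample, variants in calls.items()
            for (c, p, r, a) in variants]
    conn.executemany("INSERT INTO carrier VALUES (?, ?, ?, ?, ?)", rows)
    # Index by site so per-variant lookups do not scan the whole table.
    conn.execute("CREATE INDEX idx_site ON carrier (chrom, pos, ref, alt)")
    conn.commit()
    return conn


def carrier_counts(conn: sqlite3.Connection,
                   min_carriers: int = 1) -> List[tuple]:
    """Return (chrom, pos, ref, alt, n_genomes) for sufficiently common variants."""
    query = (
        "SELECT chrom, pos, ref, alt, COUNT(DISTINCT sample) AS n "
        "FROM carrier GROUP BY chrom, pos, ref, alt "
        "HAVING COUNT(DISTINCT sample) >= ? "
        "ORDER BY n DESC, chrom, pos"
    )
    return conn.execute(query, (min_carriers,)).fetchall()


if __name__ == "__main__":
    # Toy variant calls; the coordinates are made up.
    calls = {
        "genome1": {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")},
        "genome2": {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"),
                    ("chr3", 3000, "G", "A")},
        "genome3": {("chr2", 2000, "C", "T")},
    }
    conn = build_summary_db(calls)
    print(carrier_counts(conn, min_carriers=2))
    # prints [('chr2', 2000, 'C', 'T', 3), ('chr1', 1000, 'A', 'G', 2)]
```

Once all genomes sit in one indexed table, questions such as "which variants occur in at least N genomes" become a single query rather than a pass over every per-genome file.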