
Overview
Since our first genome interpretation in 2008 (for the third named person ever to be sequenced), we have been developing cutting-edge technologies to speed and optimize the process of interpreting human whole genomes. In preparation for the next generation of large-scale whole genome studies, our technologies have been built from the ground up either to interpret a single whole genome or to compare and interpret thousands of whole genomes simultaneously.
kGAP™
Used for both knomeDISCOVERY and knomeBASE, kGAP is a genome informatics engine that automates the process of standardizing, distilling, annotating, and comparing sequence data. Designed to process many genomes at once, kGAP can complete in a day what would otherwise require months of effort and a team of specialists.

Annotates known and novel variants. kGAP is based on a database of known human allele-phenotype associations. Its state-of-the-art curation integrates, reconciles, and refines reference data from diverse sources (including public, proprietary, and project-specific) in order to yield accurate and detailed insights into sequence variants carried by each genome or exome under examination. In addition to analyzing previously annotated variants, kGAP offers crucial predictive insight into the potential functional effects of novel variants, prioritizing them for empirical follow-up based on a set of custom criteria that identifies leading candidates for plausible involvement in the phenotype of interest.
Enables multi-genome comparison. Investigation often requires comparing many genomes to each other in order to spot alleles that are distinctively shared by people with a given phenotype or by one tissue (e.g., a tumor) versus another. kGAP was built from the ground up for multi-genome comparison. It sifts through vast amounts of sequence data to identify alleles shared by particular subsets of genomes in a study, flagging variants most likely implicated in the phenotype of interest.
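As a rough illustration of this kind of comparison (a sketch, not kGAP's actual implementation), the Python below treats each genome as a set of variant identifiers and keeps the alleles carried by every case and by no control. The function name, the sample names, and the "chrom:pos:ref>alt" key format are assumptions made for the example.

```python
def shared_case_only_variants(genomes, cases, controls):
    """Return variants present in every case genome and absent from every control.

    genomes: dict mapping a sample name to a set of variant keys
             (hypothetical "chrom:pos:ref>alt" strings).
    """
    if not cases:
        return set()
    # Variants carried by all case genomes.
    shared = set.intersection(*(genomes[s] for s in cases))
    # Drop anything seen in at least one control genome.
    for sample in controls:
        shared -= genomes[sample]
    return shared


# Toy data, invented for illustration only.
genomes = {
    "case1": {"1:1000:A>G", "2:500:C>T", "7:42:G>A"},
    "case2": {"1:1000:A>G", "7:42:G>A"},
    "ctrl1": {"7:42:G>A", "3:9:T>C"},
}
candidates = shared_case_only_variants(genomes, ["case1", "case2"], ["ctrl1"])
print(sorted(candidates))
```

A real study would relax the "every case, no control" rule to allow for incomplete penetrance and missing calls; the strict version is used here only because it is the simplest to read.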
Distills data. For every genome, kGAP produces a compact, fully annotated, easily queryable database of variation. These databases are accessible through a robust API that enables the rapid development of new interpretation applications.
Once processed through kGAP, genomes can be quickly and flexibly interrogated by our software tools, such as knomeVARIANTS and knomePATHWAYS, in order to identify the genetic variants, genes, and pathways most likely to govern the disease or drug response of concern.
knomeVARIANTS™

knomeVARIANTS is a query kit that lets users search for candidate causal variants in studied genomes. It includes a query interface (see above), scripting libraries, and data conversion utilities. It is used by Knome’s geneticists and is also available to clients as part of knomeDISCOVERY and knomeBASE.
Users select cases and controls, input a putative inheritance mode, and add sensible filter criteria (variant functional class, rarity/novelty, location in prior candidate regions, etc.) to automatically generate a sorted short-list of leading candidates. The application includes a SQL query interface to let users query the database as they wish, including by complex or novel sets of criteria.
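To make the workflow above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, genotype labels, and the 1% frequency cutoff are all assumptions invented for the example; they are not the actual knomeVARIANTS schema. The query models a recessive inheritance mode: it keeps rare, potentially damaging sites where every case is homozygous for the variant and no control is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (sample, site) variant call.
conn.execute(
    """CREATE TABLE variants (
           sample TEXT, site TEXT, gene TEXT,
           functional_class TEXT, allele_freq REAL, genotype TEXT)"""
)
rows = [  # toy data, invented for illustration
    ("case1", "1:1000", "GENE_A", "nonsense", 0.001, "hom_alt"),
    ("case2", "1:1000", "GENE_A", "nonsense", 0.001, "hom_alt"),
    ("ctrl1", "1:1000", "GENE_A", "nonsense", 0.001, "het"),
    ("case1", "2:500", "GENE_B", "synonymous", 0.200, "hom_alt"),
    ("case2", "2:500", "GENE_B", "synonymous", 0.200, "hom_alt"),
    ("case1", "7:42", "GENE_C", "missense", 0.004, "hom_alt"),
]
conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?)", rows)

cases = ["case1", "case2"]
controls = ["ctrl1"]
case_marks = ", ".join("?" * len(cases))
ctrl_marks = ", ".join("?" * len(controls))

# Recessive model: all cases homozygous for the variant, no control homozygous.
query = """
    SELECT site, gene, functional_class, allele_freq
    FROM variants
    WHERE functional_class IN ('nonsense', 'frameshift', 'splice-site', 'missense')
      AND allele_freq < 0.01
    GROUP BY site, gene, functional_class, allele_freq
    HAVING SUM(sample IN ({cases}) AND genotype = 'hom_alt') = ?
       AND SUM(sample IN ({controls}) AND genotype = 'hom_alt') = 0
    ORDER BY allele_freq ASC
""".format(cases=case_marks, controls=ctrl_marks)

params = cases + [len(cases)] + controls
shortlist = conn.execute(query, params).fetchall()
for row in shortlist:
    print(row)
```

Sample names are passed as bound parameters rather than pasted into the SQL string, which keeps the query safe to reuse with user-supplied case and control lists.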
In addition to querying, the application lets users export subsets of the database for viewing in MS Excel. Subsets can be output that target common research foci, including the following:
• Sites implicated in phenotypes, regardless of subject genotypes
• Sites where at least one studied genome mismatches the reference
• Sites where a particular set of one or more genomes, but no other genomes, show a novel variant
• Sites in phenotype-implicated genes
• Sites with nonsense, frameshift, splice-site, or read-through variants, relative to reference
• Sites where some but not all subject genomes were called
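Two of the subsets listed above can be sketched in a few lines of Python. This is only an illustration under assumed record fields ("site", "functional_class", "genomes_called"), not the application's real export code; it writes CSV, which MS Excel can open directly.

```python
import csv
import io

# Functional classes treated as disruptive, taken from the list above.
DISRUPTIVE = {"nonsense", "frameshift", "splice-site", "read-through"}


def disruptive_sites(records):
    """Sites with nonsense, frameshift, splice-site, or read-through variants."""
    return [r for r in records if r["functional_class"] in DISRUPTIVE]


def partially_called_sites(records, n_genomes):
    """Sites where some but not all subject genomes were called."""
    return [r for r in records if 0 < r["genomes_called"] < n_genomes]


def to_csv(records, fieldnames):
    """Serialize records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Toy records for a hypothetical three-genome study.
records = [
    {"site": "1:1000", "functional_class": "nonsense", "genomes_called": 3},
    {"site": "2:500", "functional_class": "synonymous", "genomes_called": 2},
    {"site": "7:42", "functional_class": "missense", "genomes_called": 3},
]
fields = ["site", "functional_class", "genomes_called"]
csv_text = to_csv(disruptive_sites(records), fields)
partial = partially_called_sites(records, n_genomes=3)
print(csv_text)
```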
knomePATHWAYS™

knomePATHWAYS is a visualization tool that overlays variants found in each sample genome onto known gene interaction networks. It helps spot functional interactions between variants in distinct genes, as well as pathways enriched for variants in one group versus another (for example, cases versus controls, or differing drug responder groups).
knomePATHWAYS is used by Knome’s interpretation teams and is also available to clients as part of knomeDISCOVERY and knomeBASE.
knomePATHWAYS integrates reference data from many sources, including GO, HPRD, and MSigDB (which includes KEGG and Reactome data). The application is particularly helpful in addressing higher-order questions, such as finding candidate genes and protein pathways, that are not readily addressed from tabular annotation data alone.
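The pathway-level view can be approximated, very roughly, by counting how many cases and controls carry a variant in any gene of each pathway. The Python below is a sketch under assumed inputs: the pathway names, gene names, and function name are hypothetical, and the gene sets are toy stand-ins for what a source such as GO or MSigDB would supply.

```python
def pathway_burden(pathways, variant_genes, cases, controls):
    """Count, per pathway, the cases and controls with a variant in any member gene.

    pathways:      dict mapping a pathway name to a set of gene names.
    variant_genes: dict mapping a sample name to the set of genes in which
                   that sample carries a variant of interest.
    Returns a dict mapping pathway name to (case_hits, control_hits).
    """
    summary = {}
    for name, genes in pathways.items():
        case_hits = sum(1 for s in cases if variant_genes[s] & genes)
        ctrl_hits = sum(1 for s in controls if variant_genes[s] & genes)
        summary[name] = (case_hits, ctrl_hits)
    return summary


# Toy inputs, invented for illustration only.
pathways = {
    "pathway_X": {"GENE_A", "GENE_B"},
    "pathway_Y": {"GENE_C"},
}
variant_genes = {
    "case1": {"GENE_A"},
    "case2": {"GENE_B"},
    "ctrl1": {"GENE_C"},
    "ctrl2": set(),
}
summary = pathway_burden(pathways, variant_genes, ["case1", "case2"], ["ctrl1", "ctrl2"])
for name, (case_hits, ctrl_hits) in sorted(summary.items()):
    print(name, case_hits, ctrl_hits)
```

Here the two cases carry variants in different genes of the same pathway, a pattern that a gene-by-gene table would miss. Raw counts like these are only a starting point; judging whether a pathway is truly enriched would need a proper statistical test and correction for multiple comparisons.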
