PanHunter Glossary page

This page provides definitions of key terms used throughout the PanHunter platform.
  • Alignment The process of matching sequencing reads to a reference genome to determine their origin.

  • Annotation Information about genomic features (e.g., genes, exons) used to interpret sequencing data.

  • Barcode (sequencing) A short DNA sequence used to label and identify reads from individual samples or cells.

  • Biotype A classification of genes based on their function or characteristics (e.g., protein-coding, rRNA, pseudogene).

  • CDS (Coding Sequence) The portion of a gene that is translated into a protein.

  • Coverage The number of sequencing reads that overlap a given position or region of a gene or genome.

  • Deduplication The process of removing duplicate reads that originate from the same molecule, often using UMIs.

  • Gene body The full length of a gene from the start (5′ end) to the end (3′ end).

  • Genomic features Functional regions of the genome such as exons, introns, and regulatory regions.

  • Introns Non-coding regions within genes that are removed by splicing during RNA processing.

  • Library preparation The experimental process of preparing RNA or DNA samples for sequencing.

  • Mapping rate The percentage of reads that successfully align to the reference genome.

  • Mitochondrial genes Genes located in the mitochondrial genome; the fraction of reads mapping to them is often used as an indicator of cell quality or stress.

  • Normalization Adjusting data to account for technical differences between samples (e.g., sequencing depth) so that samples can be compared fairly.

  • Outlier A sample or value that differs significantly from others in the dataset.

  • Pseudogene A DNA sequence similar to a gene but typically non-functional.

  • Read (sequencing read) A short DNA or RNA sequence obtained from sequencing.

  • RNA degradation Breakdown of RNA molecules, which can affect data quality and lead to biased results.

  • Spike-in transcripts Artificial RNA sequences added to samples as controls to monitor experimental performance.

  • Transcript An RNA copy of a gene produced during gene expression.

  • UMI (Unique Molecular Identifier) A short random sequence attached to each molecule before amplification, so that reads derived from the same original molecule can be identified and duplicates removed.

  • UTR (Untranslated Region) Regions at the 5′ and 3′ ends of a transcript that are not translated into protein but play regulatory roles.
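
Several of the terms above (mapping rate, deduplication, normalization) describe simple calculations. The following is a minimal Python sketch of how they relate, using toy data. It is not PanHunter's implementation: the function names, the (gene, UMI) read representation, and the choice of counts per million (CPM) as the normalization are assumptions made for illustration only. Real deduplication tools typically also take the mapping position into account.

```python
from collections import Counter


def mapping_rate(mapped_reads: int, total_reads: int) -> float:
    """Percentage of reads that aligned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


def deduplicate(reads):
    """Keep one read per (gene, UMI) pair.

    Assumption for this sketch: each read is a (gene, umi) tuple, and two
    reads with the same gene and UMI come from the same original molecule.
    """
    return sorted(set(reads))


def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million.

    CPM is one common way to correct for sequencing depth; it is used here
    only as an example of normalization.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize a sample with zero counts")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


# Toy data: four reads, two of which share the same gene and UMI.
reads = [
    ("GeneA", "AACG"),
    ("GeneA", "AACG"),  # duplicate of the read above
    ("GeneA", "TTGC"),
    ("GeneB", "AACG"),
]

unique = deduplicate(reads)                     # three molecules remain
counts = Counter(gene for gene, _ in unique)    # GeneA: 2, GeneB: 1
cpm = counts_per_million(counts)

print(f"Mapping rate: {mapping_rate(90, 100):.1f}%")
print(f"Unique molecules: {len(unique)}")
for gene, value in sorted(cpm.items()):
    print(f"{gene}: {value:.0f} CPM")
```

In this toy example the duplicated GeneA read is counted once, so GeneA contributes two molecules and GeneB one before the counts are scaled.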