
This function reads data from a single GenBank file or from a directory of GenBank files. It allows selective extraction of information by specifying sections and features.

Usage

read_gbk(path, sections = NULL, features = NULL, origin = TRUE)

Arguments

path

A string giving the path to the target GenBank (.gbk) file, or to a directory containing GenBank files.

sections

An optional vector of strings representing the names of specific sections within the GenBank file to extract (e.g., "LOCUS", "DEFINITION", "ACCESSION", "VERSION"). If `NULL` (the default), the function extracts all available sections.

features

An optional vector of strings indicating specific feature types to extract from the FEATURES section of the GenBank file (e.g., "CDS", "gene", "mRNA"). If `NULL` (the default), the function extracts all feature types present in the FEATURES section.

origin

A logical flag; when set to `TRUE` (the default), the sequence data from the ORIGIN section is included in the output.

Value

A list containing the contents of the specified sections and features of the GenBank file. Each section and feature is returned as a separate list element.
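Because the exact shape of the returned list depends on the file and on the `sections` and `features` requested, it can help to inspect the top level before indexing into it. The sketch below uses a small helper, `describe_elements()`, which is hypothetical and not part of geneviewer; it relies only on base R and makes no assumption about element names. The call to `read_gbk()` is guarded so the block also runs where the package is not installed.

```r
# Hypothetical helper (not part of geneviewer): summarise the top-level
# elements of any list as a data frame of name, class, and length.
describe_elements <- function(x) {
  element_names <- names(x)
  if (is.null(element_names)) {
    element_names <- rep(NA_character_, length(x))
  }
  data.frame(
    element = element_names,
    class = unname(vapply(x, function(el) class(el)[1], character(1))),
    length = unname(lengths(x)),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

# Only runs when geneviewer is available; uses the example file shipped
# with the package, as in the Examples section.
if (requireNamespace("geneviewer", quietly = TRUE)) {
  genbank_file <- system.file(
    "extdata",
    "BGC0000001.gbk",
    package = "geneviewer"
  )
  gbk_data <- geneviewer::read_gbk(genbank_file, origin = FALSE)
  print(describe_elements(gbk_data))
}
```

Base R's `str(gbk_data, max.level = 2)` gives a similar overview without a helper; the data frame form is simply easier to scan when the list has many elements.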

Examples

# \donttest{
# Path to example GenBank file in the package
genbank_file <- system.file(
  "extdata",
  "BGC0000001.gbk",
  package = "geneviewer"
)

# Read all data from the example GenBank file
gbk_data <- read_gbk(genbank_file)

# Read only specific sections from the example GenBank file
gbk_data <- read_gbk(genbank_file, sections = c("LOCUS", "DEFINITION"))

# Read specific features from the FEATURES section of the example GenBank file
gbk_data <- read_gbk(genbank_file, features = c("gene", "CDS"))

# Read data without the origin sequence
gbk_data <- read_gbk(genbank_file, origin = FALSE)
# }