This function reads data from a single GenBank file or from a directory of GenBank files. It allows selective extraction of information by specifying sections and features.
Arguments
- path
A string giving the path to a GenBank (.gbk) file or to a directory containing GenBank files.
- sections
An optional vector of strings representing the names of specific sections within the GenBank file to extract (e.g., "LOCUS", "DEFINITION", "ACCESSION", "VERSION"). If `NULL` (the default), the function extracts all available sections.
- features
An optional vector of strings indicating specific feature types to extract from the FEATURES section of the GenBank file (e.g., "CDS", "gene", "mRNA"). If `NULL` (the default), the function extracts all feature types present in the FEATURES section.
- origin
A logical flag; when `TRUE` (the default), the origin sequence data is included in the output. Set to `FALSE` to omit it.
Value
A list containing the contents of the specified sections and features of the GenBank file. Each section and feature is returned as a separate list element.
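As a rough sketch of working with such a list, the snippet below builds a small stand-in object by hand and inspects it with base R. The object `gbk_mock`, its element names, and its values are assumptions made for illustration only; they are not output from `read_gbk()`, and the real nesting may differ (for example when reading a directory), so check an actual result with `str()` first.

```r
# Hypothetical stand-in for a read_gbk() result. The element names and values
# are illustrative assumptions, not real output from the function.
gbk_mock <- list(
  LOCUS = "EXAMPLE_1 120 bp DNA linear",
  DEFINITION = "Illustrative record.",
  gene = list(list(location = "1..60", gene = "geneA")),
  CDS = list(list(location = "1..60", product = "hypothetical protein"))
)

# See which elements are present
element_names <- names(gbk_mock)

# Get a compact overview of the nesting
str(gbk_mock, max.level = 1)

# Pull a single section by name
locus <- gbk_mock[["LOCUS"]]

# Collect one qualifier across all entries of a feature type
cds_products <- vapply(
  gbk_mock[["CDS"]],
  function(entry) entry[["product"]],
  character(1)
)
```

Accessing elements by name with `[[` rather than by position keeps the code stable when `sections` or `features` is used to change which elements are returned.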
Examples
if (FALSE) {
  # Read all data from a GenBank file
  gbk_data <- read_gbk("path/to/genbank_file.gbk")

  # Read all data from a directory of GenBank files
  gbk_data <- read_gbk("path/to/genbank/directory")

  # Read only specific sections from a GenBank file
  gbk_data <- read_gbk(
    "path/to/genbank_file.gbk",
    sections = c("LOCUS", "DEFINITION")
  )

  # Read specific features from the FEATURES section of a GenBank file
  gbk_data <- read_gbk("path/to/genbank_file.gbk", features = c("gene", "CDS"))

  # Read data without the origin sequence
  gbk_data <- read_gbk("path/to/genbank_file.gbk", origin = FALSE)
}