Loads, validates and combines multiple aspects of metagenome data into one dataframe for use with all mmgenome2 functions, including scaffold assembly sequences, scaffold coverage, essential genes, taxonomy, and more.

    mmload(
      assembly,
      coverage = NULL,
      essential_genes = NULL,
      taxonomy = NULL,
      additional = NULL,
      kmer_pca = FALSE,
      kmer_BH_tSNE = FALSE,
      kmer_size = 4L,
      verbose = TRUE,
      ...
    )

Argument | Description |
---|---|
assembly | (required) A character string with the path to the assembly FASTA file, or the assembly as already loaded with |
coverage | (required) A path to a folder to scan for coverage files, or otherwise a named |
essential_genes | Either a path to a CSV file (comma-delimited ",") containing the essential genes, or a 2-column dataframe with scaffold names in the first column and gene IDs in the second. Can contain duplicates. (Default: `NULL`) |
taxonomy | A dataframe containing taxonomy assigned to the scaffolds. The first column must contain the scaffold names. (Default: `NULL`) |
additional | A dataframe containing any additional data. The first column must contain the scaffold names. (Default: `NULL`) |
kmer_pca | (Logical) Perform Principal Components Analysis of kmer nucleotide frequencies (kmer size defined by `kmer_size`). (Default: `FALSE`) |
kmer_BH_tSNE | (Logical) Calculate Barnes-Hut t-Distributed Stochastic Neighbor Embedding (B-H t-SNE) representations of kmer nucleotide frequencies (kmer size defined by `kmer_size`). (Default: `FALSE`) |
kmer_size | The kmer frequency size (k) used when `kmer_pca` or `kmer_BH_tSNE` is `TRUE`. (Default: `4L`) |
verbose | (Logical) Whether to print status messages during the loading process. (Default: `TRUE`) |
... | Additional arguments are passed on to |
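
The input shapes described in the table above can be sketched in base R before calling `mmload()`. This is a minimal illustration, not part of mmgenome2: the scaffold names, sample names, gene IDs and the helper `first_col_in_assembly()` are hypothetical, and the column layout of the coverage data frames (scaffold names first, coverage second) is an assumption. The only rules taken from the table are that the first column of each dataframe holds the scaffold names and that the essential genes table may contain duplicates.

```r
# Hypothetical scaffold names; real names must match those in the assembly.
scaffolds <- c("scaffold_1", "scaffold_2", "scaffold_3")

# Coverage as a named list of data frames, one per sample.
# Assumption: first column = scaffold names, second column = coverage.
coverage <- list(
  sampleA = data.frame(scaffold = scaffolds, coverage = c(12.5, 80.1, 3.4)),
  sampleB = data.frame(scaffold = scaffolds, coverage = c(10.2, 75.6, 4.0))
)

# Essential genes: 2 columns, scaffold names first, gene IDs second.
# Duplicated scaffold names are allowed (placeholder gene IDs).
essential_genes <- data.frame(
  scaffold = c("scaffold_1", "scaffold_1", "scaffold_2"),
  geneid = c("gene_A", "gene_B", "gene_A"),
  stringsAsFactors = FALSE
)

# Taxonomy: first column must contain the scaffold names.
taxonomy <- data.frame(
  scaffold = scaffolds,
  phylum = c("Proteobacteria", "Proteobacteria", "Firmicutes"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: are all names in the first column known scaffolds?
first_col_in_assembly <- function(df, scaffold_names) {
  all(as.character(df[[1]]) %in% scaffold_names)
}

inputs_ok <- c(
  essential_genes = first_col_in_assembly(essential_genes, scaffolds),
  taxonomy = first_col_in_assembly(taxonomy, scaffolds),
  vapply(coverage, first_col_in_assembly, logical(1), scaffold_names = scaffolds)
)
print(inputs_ok)

# Only call mmload() when the package and the (placeholder) FASTA path exist.
if (requireNamespace("mmgenome2", quietly = TRUE) &&
    file.exists("path/to/assembly.fa")) {
  mm <- mmgenome2::mmload(
    assembly = "path/to/assembly.fa",
    coverage = coverage,
    essential_genes = essential_genes,
    taxonomy = taxonomy
  )
}
```

Checking the scaffold names up front is a design choice of this sketch; `mmload()` performs its own validation when loading.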
A dataframe (tibble) compatible with other mmgenome2 functions.
Kasper Skytte Andersen ksa@bio.aau.dk

    if (FALSE) {
      library(mmgenome2)
      mm <- mmload(
        assembly = "path/to/assembly.fa",
        # header = TRUE reads the first row of each CSV as column names
        coverage = list(
          nameofcoverage1 = read.csv("path/to/coveragetable1.csv", header = TRUE),
          nameofcoverage2 = read.csv("path/to/coveragetable2.csv", header = TRUE)
        ),
        essential_genes = "path/to/ess_genes.txt",
        verbose = TRUE
      )
      mm
    }

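
To make the `kmer_pca` and `kmer_size` arguments more concrete, the following base R sketch counts overlapping k-mers (k = 4) in a few simulated sequences and runs a PCA on the resulting frequencies. It is a conceptual illustration only and is not the code mmgenome2 uses internally: the helpers `kmer_counts()` and `random_dna()`, the simulated sequences, and the choice to keep the first three components are all assumptions made for this example.

```r
# Count overlapping k-mers in a single DNA string (base R only).
kmer_counts <- function(seq, k = 4L) {
  stopifnot(nchar(seq) >= k)
  bases <- c("A", "C", "G", "T")
  # Enumerate all 4^k possible k-mers so every sequence gets the same columns.
  grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)
  all_kmers <- apply(grid, 1, paste0, collapse = "")
  starts <- seq_len(nchar(seq) - k + 1L)
  observed <- substring(seq, starts, starts + k - 1L)
  table(factor(observed, levels = all_kmers))
}

# Hypothetical helper: simulate a sequence with a given base composition
# (probabilities in the order A, C, G, T).
random_dna <- function(len, prob) {
  paste(sample(c("A", "C", "G", "T"), len, replace = TRUE, prob = prob),
        collapse = "")
}

set.seed(42)
at_rich <- c(0.35, 0.15, 0.15, 0.35)
gc_rich <- c(0.15, 0.35, 0.35, 0.15)
seqs <- c(
  at_1 = random_dna(2000, at_rich),
  at_2 = random_dna(2000, at_rich),
  at_3 = random_dna(2000, at_rich),
  gc_1 = random_dna(2000, gc_rich),
  gc_2 = random_dna(2000, gc_rich),
  gc_3 = random_dna(2000, gc_rich)
)

# One row per sequence, one column per tetranucleotide (4^4 = 256 columns).
counts <- t(vapply(seqs, function(s) as.numeric(kmer_counts(s, k = 4L)),
                   numeric(256)))
freqs <- counts / rowSums(counts)  # convert counts to relative frequencies

# PCA on the frequency matrix; keep the first three components.
pca <- prcomp(freqs, center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:3]
print(round(scores, 4))
```

Sequences with similar nucleotide composition tend to lie close together in such a projection, which is why k-mer based coordinates are useful as plotting axes when binning scaffolds.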