Finds all scaffolds that fall into a selection polygon highlighted in a plot generated by mmplot.

mmextract(mm, selection, min_length = 0, inverse = FALSE)

Arguments

mm

(required) A dataframe loaded with mmload.

selection

(required) A 2-column dataframe with the x and y coordinates of a selection of points in an mmplot. The column names of the provided dataframe must match column names in mm.

min_length

Filter all scaffolds with a length at or below this value before the extraction. (Default: 0)

inverse

(Logical) If TRUE, then the scaffolds within the selection are instead removed. (Default: FALSE)

Value

A dataframe (tibble) compatible with other mmgenome2 functions.

Author

Kasper Skytte Andersen ksa@bio.aau.dk

Soren M. Karst smk@bio.aau.dk

Mads Albertsen MadsAlbertsen85@gmail.com

Examples

library(mmgenome2) data(mmgenome2) mmgenome2
#> # A tibble: 97,285 x 13 #> scaffold length gc cov_C13.11.14 cov_C13.11.25 cov_C13.12.03 cov_C14.01.09 #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 1 8264 57.8 1.44 53.6 0 0.066 #> 2 2 1027 57.0 0.625 24.2 0 0 #> 3 3 1665 55.9 13.5 434. 0.166 0.177 #> 4 4 9056 35.9 0.01 23.4 0 0 #> 5 5 3343 64.0 3.20 16.4 0 0 #> 6 6 98207 39.1 0.00966 24.5 3.29 9.85 #> 7 7 6480 63.0 2.61 19.2 1.46 12.3 #> 8 8 15790 61.7 2.78 21.2 1.62 10.3 #> 9 9 1403 70.4 85.1 192. 0 0 #> 10 10 2018 70.2 50.3 101. 0 0 #> # … with 97,275 more rows, and 6 more variables: PC1 <dbl>, PC2 <dbl>, #> # PC3 <dbl>, geneID <chr>, taxonomy <fct>, rRNA16S <fct>
selection <- data.frame( cov_C13.11.25 = c(7.2, 16.2, 25.2, 23.3, 10.1), cov_C14.01.09 = c(47, 77, 52.8, 29.5, 22.1) ) mmgenome2_extraction <- mmextract(mmgenome2, min_length = 3000, selection = selection, inverse = FALSE )
#> 47 scaffolds (or 1.33% of the scaffolds in mm, weighted by length) remain after 97238 of 97285 scaffolds have been filtered.
mmgenome2_extraction
#> # A tibble: 47 x 13 #> scaffold length gc cov_C13.11.14 cov_C13.11.25 cov_C13.12.03 cov_C14.01.09 #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 180 22093 74.0 0.298 12.0 7.34 37.9 #> 2 227 190826 65.4 0.348 13.0 6.91 38.5 #> 3 253 296715 70.2 1.06 15.2 7.00 41.1 #> 4 315 218862 71.7 0.366 12.6 7.17 38.8 #> 5 343 27203 57.6 0.235 10.7 5.07 32.0 #> 6 381 6128 67.2 8.11 22.6 5.89 48.7 #> 7 386 164942 68.2 0.359 12.3 6.73 37.2 #> 8 396 7448 72.6 0.329 13.0 7.59 41.0 #> 9 423 100199 68.7 1.40 15.3 6.71 40.0 #> 10 431 446220 72.4 0.409 12.6 6.80 37.9 #> # … with 37 more rows, and 6 more variables: PC1 <dbl>, PC2 <dbl>, PC3 <dbl>, #> # geneID <chr>, taxonomy <fct>, rRNA16S <fct>