Merges 2 or more GRanges objects by combining all of their ranges and
associated signal (e.g. readcounts). If multiplex = TRUE
, the input
datasets are reversibly combined into a multiplexed GRanges containing a
field for each input dataset.
mergeGRangesData(
...,
field = "score",
multiplex = FALSE,
makeBRG = TRUE,
exact_overlaps = FALSE,
ncores = getOption("mc.cores", 2L)
)
Any number of GRanges objects in which signal (e.g. readcounts)
are contained within metadata. Lists of GRanges can also be passed, but
they must be named lists if multiplex = TRUE
. Multiple lists can be
passed, but if any inputs are lists, then all inputs must be lists.
One or more input metadata fields to be combined,
typically the "score" field. Fields typically contain coverage information.
If only a single field is given (i.e. all input GRanges use the same
field), that same field will be used for the output. Otherwise, the
"score"
metadata field will be used by default. The output metadata
fields are different if multiplex
is enabled.
When set to FALSE
(the default), input GRanges are
merged irreversibly into a single new GRange, effectively combining the
reads from different experiments. When multiplex = TRUE
, the input
GRanges data are reversibly combined into a multiplexed GRanges object,
such that each input GRanges object has its own metadata field in the
output.
If TRUE
(the default), the output GRanges will be made
"basepair-resolution" via
makeGRangesBRG
. This is always set to codeFALSE if
exact_overlaps = TRUE
.
By default (FALSE
), any ranges that cover more
than a single base are treated as "coverage" signal (see
getCountsByRegions
). If
exact_overlaps = TRUE
, all input ranges are conserved exactly as
they are; ranges are only combined if they overlap exactly, and the signal
of any combined range is the sum of the input ranges that were merged.
Number of cores to use for computations.
A disjoint, basepair-resolution (single-width) GRanges object comprised of all ranges found in the input GRanges objects.
If multiplex = FALSE
, single fields from each input are combined
into a single field in the output, the total signal of which is the sum of
all input GRanges.
If multiplex = TRUE
, each field of the output corresponds to an
input GRanges object.
If multiplex = TRUE
,
the datasets are only combined into a single object, but the data
themselves are not combined. To subset field_i
, corresponding to
input dataset_i
:
multi.gr <- mergeGRangesData(gr1, gr2, multiplex = TRUE)
subset(multi.gr, gr1 != 0, select = gr1)
# select gr1
data("PROseq") # load included PROseq data
#--------------------------------------------------#
# divide & recombine PROseq (no overlapping positions)
#--------------------------------------------------#
thirds <- floor( (1:3)/3 * length(PROseq) )
ps_1 <- PROseq[1:thirds[1]]
ps_2 <- PROseq[(thirds[1]+1):thirds[2]]
ps_3 <- PROseq[(thirds[2]+1):thirds[3]]
# re-merge
length(PROseq)
#> [1] 47380
length(ps_1)
#> [1] 15793
length(mergeGRangesData(ps_1, ps_2, ncores = 1))
#> [1] 31586
length(mergeGRangesData(ps_1, ps_2, ps_3, ncores = 1))
#> [1] 47380
#--------------------------------------------------#
# combine PRO-seq with overlapping positions
#--------------------------------------------------#
gr1 <- PROseq[10:13]
gr2 <- PROseq[12:15]
PROseq[10:15]
#> GRanges object with 6 ranges and 1 metadata column:
#> seqnames ranges strand | score
#> <Rle> <IRanges> <Rle> | <integer>
#> [1] chr4 42618 + | 1
#> [2] chr4 42619 + | 2
#> [3] chr4 42621 + | 1
#> [4] chr4 42622 + | 2
#> [5] chr4 42652 + | 3
#> [6] chr4 42657 + | 1
#> -------
#> seqinfo: 7 sequences from an unspecified genome
mergeGRangesData(gr1, gr2, ncores = 1)
#> GRanges object with 6 ranges and 1 metadata column:
#> seqnames ranges strand | score
#> <Rle> <IRanges> <Rle> | <integer>
#> [1] chr4 42618 + | 1
#> [2] chr4 42619 + | 2
#> [3] chr4 42621 + | 2
#> [4] chr4 42622 + | 4
#> [5] chr4 42652 + | 3
#> [6] chr4 42657 + | 1
#> -------
#> seqinfo: 7 sequences from an unspecified genome
#--------------------------------------------------#
# multiplex separate PRO-seq experiments
#--------------------------------------------------#
multi.gr <- mergeGRangesData(gr1, gr2, multiplex = TRUE, ncores = 1)
multi.gr
#> GRanges object with 6 ranges and 2 metadata columns:
#> seqnames ranges strand | gr1 gr2
#> <Rle> <IRanges> <Rle> | <integer> <integer>
#> [1] chr4 42618 + | 1 0
#> [2] chr4 42619 + | 2 0
#> [3] chr4 42621 + | 1 1
#> [4] chr4 42622 + | 2 2
#> [5] chr4 42652 + | 0 3
#> [6] chr4 42657 + | 0 1
#> -------
#> seqinfo: 7 sequences from an unspecified genome
#--------------------------------------------------#
# subset a multiplexed GRanges object
#--------------------------------------------------#
subset(multi.gr, gr1 > 0)
#> GRanges object with 4 ranges and 2 metadata columns:
#> seqnames ranges strand | gr1 gr2
#> <Rle> <IRanges> <Rle> | <integer> <integer>
#> [1] chr4 42618 + | 1 0
#> [2] chr4 42619 + | 2 0
#> [3] chr4 42621 + | 1 1
#> [4] chr4 42622 + | 2 2
#> -------
#> seqinfo: 7 sequences from an unspecified genome
subset(multi.gr, gr1 > 0, select = gr1)
#> GRanges object with 4 ranges and 1 metadata column:
#> seqnames ranges strand | gr1
#> <Rle> <IRanges> <Rle> | <integer>
#> [1] chr4 42618 + | 1
#> [2] chr4 42619 + | 2
#> [3] chr4 42621 + | 1
#> [4] chr4 42622 + | 2
#> -------
#> seqinfo: 7 sequences from an unspecified genome