R/roi_functions.R
getMaxPositionsBySignal.Rd
For each signal-containing region of interest, find the single site with the most signal. Sites can be found at base-pair resolution, or defined for larger bins.
getMaxPositionsBySignal(
  dataset.gr,
  regions.gr,
  binsize = 1L,
  bin.centers = FALSE,
  field = "score",
  keep.signal = FALSE,
  expand_ranges = FALSE
)
dataset.gr: A GRanges object in which signal is contained in metadata (typically in the "score" field).
regions.gr: A GRanges object containing regions of interest.
binsize: The size of the bins in which to calculate signal scores.
bin.centers: Logical indicating if the centers of bins are returned, as opposed to the entire bin. By default, entire bins are returned.
field: The metadata field of dataset.gr to be counted.
keep.signal: Logical indicating if the signal value at the max site should be reported. If set to TRUE, the values are kept as a new MaxSiteSignal metadata column in the output GRanges.
expand_ranges: Logical indicating if ranges in dataset.gr should be treated as descriptions of single molecules (FALSE), or if ranges should be treated as representing multiple adjacent positions with the same signal (TRUE). See getCountsByRegions.
Output is a GRanges object with regions.gr metadata, but each range only contains the site within each regions.gr range that had the most signal. If binsize > 1, the entire bin is returned, unless bin.centers = TRUE, in which case a single-base site is returned. The site is set to the center of the bin, and if the binsize is even, the site is rounded to be closer to the beginning of the range.
The output may not be the same length as regions.gr, as regions without signal are not returned. If no regions have signal (e.g. as could happen if running this function on single regions), the function will return an empty GRanges object with intact metadata columns.
If keep.signal = TRUE, the output will also contain metadata for the signal at the max site, named MaxSiteSignal.
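The binning and bin-center rules described above can be illustrated with a minimal base-R sketch for a single region. This is not the package's implementation: max_bin_sketch and its arguments are hypothetical names, the toy signal vector is made up, and the sketch assumes plus-strand coordinates, a region width that is a multiple of binsize, non-negative signal, and that ties go to the first bin (tie handling is not described in this page).

```r
# Illustrative sketch only; NOT how getMaxPositionsBySignal is implemented.
# `max_bin_sketch` is a hypothetical helper for one plus-strand region.
max_bin_sketch <- function(signal, region_start, binsize = 1L,
                           bin.centers = FALSE) {
  # Assign each position to a consecutive, non-overlapping bin
  bin_idx <- (seq_along(signal) - 1L) %/% binsize + 1L
  bin_signal <- tapply(signal, bin_idx, sum)

  # Regions without signal are not returned
  if (all(bin_signal == 0)) return(NULL)

  # Assumption: ties resolve to the first bin
  best <- unname(which.max(bin_signal))
  bin_start <- region_start + (best - 1L) * binsize
  bin_end <- bin_start + binsize - 1L

  if (bin.centers) {
    # Center of the bin; for an even binsize, round toward the bin start
    bin_start <- bin_end <- bin_start + ceiling(binsize / 2) - 1
  }
  list(start = bin_start, end = bin_end,
       MaxSiteSignal = unname(bin_signal[best]))
}

# Toy signal over a 20-base region starting at coordinate 101
signal <- integer(20)
signal[c(6, 8, 13)] <- c(2L, 3L, 4L)

single_base <- max_bin_sketch(signal, 101L)                # site 113, signal 4
binned <- max_bin_sketch(signal, 101L, binsize = 5L)       # bin 106-110, signal 5
centered <- max_bin_sketch(signal, 101L, binsize = 4L,
                           bin.centers = TRUE)             # site 106, signal 5
empty <- max_bin_sketch(integer(20), 101L)                 # NULL (no signal)
```

Note that the bin with the most signal need not contain the single base with the most signal: here the 5 bp bin 106-110 wins even though the highest single-base value is at 113.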
data("PROseq") # load included PROseq data
data("txs_dm6_chr4") # load included transcripts
#--------------------------------------------------#
# first 50 bases of transcripts
#--------------------------------------------------#
pr <- promoters(txs_dm6_chr4, 0, 50)
pr[1:3]
#> GRanges object with 3 ranges and 2 metadata columns:
#> seqnames ranges strand | tx_name gene_id
#> <Rle> <IRanges> <Rle> | <character> <character>
#> [1] chr4 879-928 + | FBtr0346692 FBgn0267363
#> [2] chr4 42774-42823 + | FBtr0344900 FBgn0266617
#> [3] chr4 44774-44823 + | FBtr0340499 FBgn0265633
#> -------
#> seqinfo: 7 sequences from dm6 genome
#--------------------------------------------------#
# max sites
#--------------------------------------------------#
getMaxPositionsBySignal(PROseq, pr[1:3], keep.signal = TRUE)
#> GRanges object with 2 ranges and 3 metadata columns:
#> seqnames ranges strand | tx_name gene_id MaxSiteSignal
#> <Rle> <IRanges> <Rle> | <character> <character> <integer>
#> [1] chr4 42798 + | FBtr0344900 FBgn0266617 3
#> [2] chr4 44800 + | FBtr0340499 FBgn0265633 1
#> -------
#> seqinfo: 7 sequences from dm6 genome
#--------------------------------------------------#
# max sites in 5 bp bins
#--------------------------------------------------#
getMaxPositionsBySignal(PROseq, pr[1:3], binsize = 5, keep.signal = TRUE)
#> GRanges object with 2 ranges and 3 metadata columns:
#> seqnames ranges strand | tx_name gene_id MaxSiteSignal
#> <Rle> <IRanges> <Rle> | <character> <character> <integer>
#> [1] chr4 42794-42798 + | FBtr0344900 FBgn0266617 7
#> [2] chr4 44799-44803 + | FBtr0340499 FBgn0265633 1
#> -------
#> seqinfo: 7 sequences from dm6 genome