For each signal-containing region of interest, find the single site with the most signal. Sites can be found at base-pair resolution, or defined for larger bins.

getMaxPositionsBySignal(
  dataset.gr,
  regions.gr,
  binsize = 1L,
  bin.centers = FALSE,
  field = "score",
  keep.signal = FALSE,
  expand_ranges = FALSE
)
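
The idea described above can be sketched in base R for a single region. This is only an illustration under stated assumptions, not the package's implementation: `max_site` is a hypothetical helper, the signal is supplied as a plain per-base vector, the region width is assumed to be a multiple of `binsize`, and ties go to the first bin, which this page does not specify.

```r
# Hypothetical helper (not part of the package): locate the max-signal bin
# in one region, given per-base signal and the region's start coordinate.
max_site <- function(signal, region_start, binsize = 1L) {
  stopifnot(length(signal) %% binsize == 0)  # assumption: no partial bins
  if (sum(signal) == 0) return(NULL)         # regions without signal are dropped
  # Sum the signal within consecutive bins of `binsize` bases.
  bin_id <- (seq_along(signal) - 1L) %/% binsize
  bin_signal <- tapply(signal, bin_id, sum)
  best <- which.max(bin_signal)              # first bin wins ties (assumption)
  bin_start <- unname(region_start + (best - 1L) * binsize)
  list(start = bin_start,
       end = bin_start + binsize - 1L,
       signal = unname(bin_signal[best]))
}

# Toy 10-base region starting at position 101.
signal <- c(0, 0, 1, 0, 0, 0, 4, 2, 0, 0)
max_site(signal, 101L)                # single base: position 107, signal 4
max_site(signal, 101L, binsize = 5L)  # bin 106-110, signal 6
```

Note that the best bin (106-110) is chosen by its summed signal, so its reported signal can exceed that of the best single base.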

Arguments

dataset.gr

A GRanges object in which signal is contained in metadata (typically in the "score" field).

regions.gr

A GRanges object containing regions of interest.

binsize

The bin size, in base pairs, over which signal is calculated.

bin.centers

Logical indicating if the centers of bins are returned, as opposed to the entire bin. By default, entire bins are returned.

field

The metadata field of dataset.gr to be counted.

keep.signal

Logical indicating if the signal value at the max site should be reported. If set to TRUE, the values are kept as a new MaxSiteSignal metadata column in the output GRanges.

expand_ranges

Logical indicating if ranges in dataset.gr should be treated as descriptions of single molecules (FALSE), or if ranges should be treated as representing multiple adjacent positions with the same signal (TRUE). See getCountsByRegions.
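
The two interpretations of a multi-base range can be contrasted with a toy base R calculation. This is a sketch of how the description above reads, not package code. The range and its score are made up for illustration, and the totals assume a region that fully contains the range.

```r
# One made-up range covering positions 10-12 with score 2.
range_start <- 10L
range_end <- 12L
score <- 2L
range_width <- range_end - range_start + 1L

# expand_ranges = FALSE: the range describes molecules as a unit,
# so it contributes its score once.
signal_unexpanded <- score

# expand_ranges = TRUE: the range stands for 3 adjacent positions that
# each carry the score, as in run-length-compressed coverage.
per_base_signal <- rep(score, range_width)
signal_expanded <- sum(per_base_signal)

c(unexpanded = signal_unexpanded, expanded = signal_expanded)
```

When every range in dataset.gr has a width of 1, the two readings coincide.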

Value

A GRanges object carrying the metadata of regions.gr, in which each range is the site of maximum signal within the corresponding regions.gr range. If binsize > 1, the entire bin is returned unless bin.centers = TRUE, in which case a single-base site at the center of the bin is returned. If binsize is even, the center is rounded toward the beginning of the range.
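
The rounding rule for bin centers can be illustrated with a small base R sketch. `bin_center` is a hypothetical helper, not a package function, and it assumes that "the beginning of the range" means the bin's start coordinate. How minus-strand bins are handled is not stated here.

```r
# Hypothetical helper: center of a bin, rounded toward the bin start
# when the bin size is even.
bin_center <- function(bin_start, binsize) {
  bin_start + (binsize - 1L) %/% 2L
}

bin_center(42794L, 5L)  # odd bin size: exact middle of 42794-42798, i.e. 42796
bin_center(1L, 4L)      # even bin size: the lower of the two middle bases, i.e. 2
```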

The output may be shorter than regions.gr, because regions without signal are not returned. If no regions have signal (as can happen when the function is run on a single region), the function returns an empty GRanges object with intact metadata columns.

If keep.signal = TRUE, the output will also contain metadata for the signal at the max site, named MaxSiteSignal.
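
Because regions without signal are dropped, results usually have to be matched back to the input by an identifier rather than by position. The base R sketch below shows one way to do that, assuming an identifier column such as tx_name is unique across regions.gr. The vectors are toy stand-ins that reuse the values shown in the Examples section, and `signal_by_region` is a name chosen here for illustration.

```r
# Toy stand-ins for regions.gr$tx_name and for the tx_name and
# MaxSiteSignal columns of the output (values taken from the Examples).
region_ids <- c("FBtr0346692", "FBtr0344900", "FBtr0340499")
maxsite_ids <- c("FBtr0344900", "FBtr0340499")
maxsite_signal <- c(3L, 1L)

# One slot per input region; regions without signal stay NA.
signal_by_region <- rep(NA_integer_, length(region_ids))
signal_by_region[match(maxsite_ids, region_ids)] <- maxsite_signal
names(signal_by_region) <- region_ids

signal_by_region
```

The first region keeps NA because it had no signal and was therefore absent from the output.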

Author

Mike DeBerardine

Examples

data("PROseq") # load included PROseq data
data("txs_dm6_chr4") # load included transcripts

#--------------------------------------------------#
# first 50 bases of transcripts
#--------------------------------------------------#

pr <- promoters(txs_dm6_chr4, 0, 50)
pr[1:3]
#> GRanges object with 3 ranges and 2 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id
#>          <Rle>   <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4     879-928      + | FBtr0346692 FBgn0267363
#>   [2]     chr4 42774-42823      + | FBtr0344900 FBgn0266617
#>   [3]     chr4 44774-44823      + | FBtr0340499 FBgn0265633
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

#--------------------------------------------------#
# max sites
#--------------------------------------------------#

getMaxPositionsBySignal(PROseq, pr[1:3], keep.signal = TRUE)
#> GRanges object with 2 ranges and 3 metadata columns:
#>       seqnames    ranges strand |     tx_name     gene_id MaxSiteSignal
#>          <Rle> <IRanges>  <Rle> | <character> <character>     <integer>
#>   [1]     chr4     42798      + | FBtr0344900 FBgn0266617             3
#>   [2]     chr4     44800      + | FBtr0340499 FBgn0265633             1
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

#--------------------------------------------------#
# max sites in 5 bp bins
#--------------------------------------------------#

getMaxPositionsBySignal(PROseq, pr[1:3], binsize = 5, keep.signal = TRUE)
#> GRanges object with 2 ranges and 3 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id MaxSiteSignal
#>          <Rle>   <IRanges>  <Rle> | <character> <character>     <integer>
#>   [1]     chr4 42794-42798      + | FBtr0344900 FBgn0266617             7
#>   [2]     chr4 44799-44803      + | FBtr0340499 FBgn0265633             1
#>   -------
#>   seqinfo: 7 sequences from dm6 genome