This function returns ranges that are defined relative to the strand-specific start and end sites of regions of interest (usually genes).

genebodies(
  genelist,
  start = 300L,
  end = -300L,
  fix.start = "start",
  fix.end = "end",
  min.window = 0L
)

Arguments

genelist

A GRanges object containing genes of interest.

start

The distance from the reference position selected by fix.start (either the strand-specific start or end site) at which the returned ranges begin. Positive values begin the returned ranges downstream of the reference position; negative values begin them upstream. Set start = 0 to begin the returned ranges at the reference position itself.

end

Identical to the start argument, but defines the strand-specific end position of returned ranges. end must be downstream of start.

fix.start

The reference point to use for defining the strand-specific start positions of returned ranges, either "start" or "end".

fix.end

The reference point to use for defining the strand-specific end positions of returned ranges, either "start" or "end". Cannot be set to "start" if fix.start = "end".

min.window

When fix.start = "start" and fix.end = "end", min.window defines the minimum size (width) of a returned range. When fix.end = fix.start, however, all returned ranges have the same width, so min.window instead acts as a size filter on the input ranges.
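
The way start, end, fix.start and fix.end combine can be sketched with plain coordinate arithmetic. The helper below, sketch_genebody, is hypothetical and is not part of BRGenomics; it handles a single range, ignores min.window and any filtering, and only assumes that "downstream" means increasing coordinates on the plus strand and decreasing coordinates on the minus strand.

```r
# Hypothetical helper for illustration only (not the BRGenomics implementation).
# gstart/gend are the genomic (left/right) coordinates of one input range.
sketch_genebody <- function(gstart, gend, strand,
                            start = 300L, end = -300L,
                            fix.start = "start", fix.end = "end") {
  # Strand-specific start and end sites of the input range
  site <- if (strand == "-") {
    c(start = gend, end = gstart)
  } else {
    c(start = gstart, end = gend)
  }
  # Moving downstream adds on the plus strand and subtracts on the minus strand
  dir <- if (strand == "-") -1L else 1L
  new_start <- site[[fix.start]] + dir * start
  new_end   <- site[[fix.end]]   + dir * end
  # Report as genomic (left, right) coordinates
  if (strand == "-") c(new_end, new_start) else c(new_start, new_end)
}

sketch_genebody(879, 5039, "+")     # 1179 4739
sketch_genebody(5829, 11765, "-")   # 6129 11465
sketch_genebody(5829, 11765, "-", 500, 1000, fix.start = "end")  # 4829 5329
```

These values agree with the corresponding rows of the output shown in the Examples section.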

Value

A GRanges object that may be shorter than genelist due to filtering of short ranges. For example, using the default arguments, genes shorter than 600 bp would be removed.

Details

Unlike GenomicRanges::promoters, distances can be defined to be upstream or downstream by changing the sign of the argument, and both the start and end of the returned regions can be defined in terms of the strand-specific start or end site of the input ranges. For example, genebodies(txs, -50, 150, fix.end = "start") is equivalent to promoters(txs, 50, 151). The downstream values differ by 1 because promoters treats the downstream edge of the interval as open (the position at TSS + downstream is excluded), whereas genebodies includes the position at the given distance.

The default arguments return ranges that begin 300 bases downstream of the original start positions, and end 300 bases upstream of the original end positions.
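
The off-by-one relationship with promoters can be checked with a small arithmetic sketch for a plus-strand TSS. Both helpers below are hypothetical names introduced for illustration; they are not functions from GenomicRanges or BRGenomics, and they only assume the interval conventions described above.

```r
# Hypothetical helpers for illustration only.

# promoters()-style window: left edge included, right edge excluded,
# i.e. [tss - upstream, tss + downstream)
promoter_bounds <- function(tss, upstream, downstream) {
  c(tss - upstream, tss + downstream - 1)  # -1 because the right edge is open
}

# genebodies()-style window with both edges fixed on the TSS:
# both positions are included
tss_window_bounds <- function(tss, start, end) {
  c(tss + start, tss + end)
}

tss <- 879  # TSS of the first transcript used in the Examples section
promoter_bounds(tss, 50, 151)     # 829 1029
tss_window_bounds(tss, -50, 150)  # 829 1029
```

On the minus strand the same reasoning applies with the directions mirrored.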

Author

Mike DeBerardine

Examples

data("txs_dm6_chr4") # load included transcript data
txs <- txs_dm6_chr4[c(1, 2, 167, 168)]

txs
#> GRanges object with 4 ranges and 2 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id
#>          <Rle>   <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4    879-5039      + | FBtr0346692 FBgn0267363
#>   [2]     chr4 42774-43374      + | FBtr0344900 FBgn0266617
#>   [3]     chr4  5829-11765      - | FBtr0333684 FBgn0052011
#>   [4]     chr4  6374-11765      - | FBtr0089182 FBgn0052011
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

#--------------------------------------------------#
# genebody regions from 300 bp after the TSS to
# 300 bp before the polyA site
#--------------------------------------------------#

genebodies(txs, 300, -300)
#> GRanges object with 4 ranges and 2 metadata columns:
#>       seqnames     ranges strand |     tx_name     gene_id
#>          <Rle>  <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4  1179-4739      + | FBtr0346692 FBgn0267363
#>   [2]     chr4      43074      + | FBtr0344900 FBgn0266617
#>   [3]     chr4 6129-11465      - | FBtr0333684 FBgn0052011
#>   [4]     chr4 6674-11465      - | FBtr0089182 FBgn0052011
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

#--------------------------------------------------#
# promoter-proximal region from 50 bp upstream of
# the TSS to 100 bp downstream of the TSS
#--------------------------------------------------#

promoters(txs, 50, 101)
#> GRanges object with 4 ranges and 2 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id
#>          <Rle>   <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4     829-979      + | FBtr0346692 FBgn0267363
#>   [2]     chr4 42724-42874      + | FBtr0344900 FBgn0266617
#>   [3]     chr4 11665-11815      - | FBtr0333684 FBgn0052011
#>   [4]     chr4 11665-11815      - | FBtr0089182 FBgn0052011
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

genebodies(txs, -50, 100, fix.end = "start")
#> GRanges object with 4 ranges and 2 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id
#>          <Rle>   <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4     829-979      + | FBtr0346692 FBgn0267363
#>   [2]     chr4 42724-42874      + | FBtr0344900 FBgn0266617
#>   [3]     chr4 11665-11815      - | FBtr0333684 FBgn0052011
#>   [4]     chr4 11665-11815      - | FBtr0089182 FBgn0052011
#>   -------
#>   seqinfo: 7 sequences from dm6 genome

#--------------------------------------------------#
# region from 500 to 1000 bp after the polyA site
#--------------------------------------------------#

genebodies(txs, 500, 1000, fix.start = "end")
#> GRanges object with 4 ranges and 2 metadata columns:
#>       seqnames      ranges strand |     tx_name     gene_id
#>          <Rle>   <IRanges>  <Rle> | <character> <character>
#>   [1]     chr4   5539-6039      + | FBtr0346692 FBgn0267363
#>   [2]     chr4 43874-44374      + | FBtr0344900 FBgn0266617
#>   [3]     chr4   4829-5329      - | FBtr0333684 FBgn0052011
#>   [4]     chr4   5374-5874      - | FBtr0089182 FBgn0052011
#>   -------
#>   seqinfo: 7 sequences from dm6 genome