RRBS BC

Epigenomic classification of breast cancer based on genome-wide bisulfite DNA sequencing (XmaI-RRBS) data.

Sample data processing scripts:

GSE122799.R

    • loads data from GEO into a two-layered rrbsMatrix object

    • stores *.cov files in local folders for caching

    • filters data with methylKit package

    • stores the objects from intermediate QC and filtering steps in a local folder as .RData files

    • stores the objects m_he and m_flt in a local folder as .RData files
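
The loading and filtering steps above can be illustrated with a minimal sketch. It is not the code in GSE122799.R, which is written in R and relies on the methylKit package; it is a standalone Python illustration that assumes the .cov files follow the usual Bismark coverage layout (tab-separated chromosome, start, end, methylation percentage, methylated count, unmethylated count). The names `CovRecord`, `read_cov`, and `filter_by_coverage`, and the threshold of 10 reads, are hypothetical and chosen only for this example.

```python
import io
from typing import Iterable, Iterator, List, NamedTuple


class CovRecord(NamedTuple):
    """One line of a Bismark-style coverage file (assumed layout)."""
    chrom: str
    start: int
    end: int
    meth_pct: float
    count_meth: int
    count_unmeth: int

    @property
    def coverage(self) -> int:
        # Total reads covering this cytosine
        return self.count_meth + self.count_unmeth


def read_cov(handle: Iterable[str]) -> Iterator[CovRecord]:
    """Parse tab-separated coverage lines, skipping empty ones."""
    for line in handle:
        line = line.strip()
        if not line:
            continue
        chrom, start, end, pct, n_meth, n_unmeth = line.split("\t")
        yield CovRecord(chrom, int(start), int(end), float(pct),
                        int(n_meth), int(n_unmeth))


def filter_by_coverage(records: Iterable[CovRecord],
                       min_coverage: int = 10) -> List[CovRecord]:
    """Keep positions covered by at least min_coverage reads (assumed cutoff)."""
    return [r for r in records if r.coverage >= min_coverage]


# Two made-up positions: the first has 12 reads, the second only 4
example = "chr1\t10469\t10469\t75.0\t9\t3\nchr1\t10471\t10471\t50.0\t2\t2\n"
kept = filter_by_coverage(read_cov(io.StringIO(example)), min_coverage=10)
```

With the made-up input above, only the first position passes the filter. The thresholds actually used by the script may differ.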

ftrs_view_list-m_var.R

    • builds a list of 'featuresData' objects containing group comparisons

    • stores the resulting variable ftrs_view_list in a local folder as an .RData file

heatplot_ftrs_all.R

    • draws parts of the large composite heatplot (Fig. 1) from the stored ftrs_view_list-m_he.RData

Notes:

The rrbsData R package and its dependencies are required (see below).

Data (GEO)

The project data are available in Gene Expression Omnibus under accession GSE122799.

Notes:

The URL for downloading the methylation data (.cov file) is stored in the "supplementary_file_1" property.
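
One way to combine such a URL with a local cache is sketched below. This is a Python illustration only, not the R code used in this project. The function names `cached_path` and `fetch_cached`, the default folder name `cov_cache`, and the URL in the usage note are all hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def cached_path(url: str, cache_dir: str) -> str:
    """Map a download URL to a file path inside cache_dir."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(cache_dir, filename)


def fetch_cached(url: str, cache_dir: str = "cov_cache") -> str:
    """Download url into cache_dir unless a local copy already exists."""
    target = cached_path(url, cache_dir)
    if not os.path.exists(target):
        os.makedirs(cache_dir, exist_ok=True)
        urllib.request.urlretrieve(url, target)
    return target
```

A call such as `fetch_cached("ftp://example.org/geo/sample1.cov.gz")` (a placeholder URL, not a real GEO address) would download the file once and return the same local path on later calls without contacting the server again.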

rrbsTools (Git repository)

https://github.com/tanas80/rrbsTools.git

Contains C++ tools for manipulating NGS reads in the Bismark pipeline according to the method described in PMID: 28488887.

barcode_splitter is a command-line tool that splits reads from the sequencing machine according to a custom barcode set.
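
The general idea of barcode-based splitting can be sketched as follows. This is not the implementation of barcode_splitter, which is a C++ tool whose matching rules are not described here. The sketch assumes that the barcode sits at the start of each read, that matching is exact, that barcodes have equal length, and that the barcode is trimmed after assignment. The function name `split_by_barcode`, the barcodes, and the reads are made up for illustration.

```python
from collections import defaultdict


def split_by_barcode(reads, barcodes):
    """Group reads by the barcode found at the start of each read.

    reads:    iterable of (read_id, sequence) pairs
    barcodes: dict mapping barcode sequence -> sample name
    Returns a dict mapping sample name -> list of (read_id, trimmed sequence).
    Reads that match no barcode are collected under "unassigned", untrimmed.
    """
    bins = defaultdict(list)
    for read_id, seq in reads:
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                # Exact prefix match: assign the read and trim the barcode
                bins[sample].append((read_id, seq[len(barcode):]))
                break
        else:
            # No barcode matched this read
            bins["unassigned"].append((read_id, seq))
    return dict(bins)


# Hypothetical barcode set and reads
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = [("r1", "ACGTCCGGAT"), ("r2", "TGCACCGGTT"), ("r3", "GGGGCCGGAA")]
result = split_by_barcode(reads, barcodes)
```

Here `r1` and `r2` are assigned to `sample_A` and `sample_B` with their barcodes removed, and `r3` ends up under `unassigned`. A production tool would typically also handle quality strings and sequencing errors in the barcode, which this sketch leaves out.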

bismark_cleanup is a command-line tool that removes artificial methylated cytosines from sequencing reads after partial fill-in of the 5'-overhang sticky ends of DNA fragments.

rrbsData (R package)

https://github.com/tanas80/rrbsData.git