Breast cancer epigenomic classification based on genome-wide bisulfite DNA sequencing (XmaI-RRBS) data.
Sample data processing scripts:
GSE122799.R
loads data from GEO into a two-layered rrbsMatrix object
caches the *.cov files in local folders
filters the data with the methylKit package
saves the objects from the intermediate QC and filtering steps to a local folder as .RData files
saves the m_he and m_flt objects to a local folder as .RData files
ftrs_view_list-m_var.R
builds a list of 'featuresData' objects holding the group comparisons
saves the resulting ftrs_view_list variable to a local folder as an .RData file
heatplot_ftrs_all.R
draws the parts of the large composite heatplot (Fig. 1) from the stored ftrs_view_list-m_he.RData
Notes:
Requires the rrbsData R package and its dependencies (see below).
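The filtering step in GSE122799.R can be pictured with a minimal sketch. The scripts themselves do this in R with methylKit; the Python below is only an illustration of the Bismark coverage (.cov) layout, which is tab-separated with the columns chromosome, start, end, methylation percentage, methylated read count and unmethylated read count. The names CovSite, read_cov and filter_by_coverage, the threshold of 10 reads and the sample coordinates are all assumptions made for the example, not values taken from the project.

```python
import csv
import io
from typing import Iterator, NamedTuple


class CovSite(NamedTuple):
    chrom: str
    start: int
    end: int
    pct_meth: float  # methylation percentage reported by Bismark
    n_meth: int      # reads supporting a methylated cytosine
    n_unmeth: int    # reads supporting an unmethylated cytosine


def read_cov(handle) -> Iterator[CovSite]:
    """Yield one CovSite per line of a tab-separated Bismark coverage file."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        chrom, start, end, pct, n_meth, n_unmeth = row[:6]
        yield CovSite(chrom, int(start), int(end), float(pct),
                      int(n_meth), int(n_unmeth))


def filter_by_coverage(sites, min_cov=10):
    """Keep sites covered by at least min_cov reads (illustrative threshold)."""
    return [s for s in sites if s.n_meth + s.n_unmeth >= min_cov]


# Two made-up sites: the first has 10 reads, the second only 3.
example = (
    "chr1\t10497\t10497\t80.0\t8\t2\n"
    "chr1\t10525\t10525\t100.0\t3\t0\n"
)
kept = filter_by_coverage(read_cov(io.StringIO(example)), min_cov=10)
print(kept)
```

Only the first site survives here because the filter is applied to total read depth (methylated plus unmethylated), not to the methylation percentage.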
Project data are available on Gene Expression Omnibus (GEO) under accession GSE122799.
Notes:
The URL for downloading the methylation data (.cov file) is stored in the "supplementary_file_1" property.
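Downloading a .cov file once and reusing the local copy could look roughly like the sketch below. It is written in Python for illustration only and is not the project's R code. The function cached_download, the fetch parameter and the example URL are hypothetical. In practice the URL would be the value read from "supplementary_file_1".

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def cached_download(url, cache_dir, fetch=urllib.request.urlretrieve):
    """Download url into cache_dir unless a file with that name already exists."""
    os.makedirs(cache_dir, exist_ok=True)
    filename = os.path.basename(urlparse(url).path)
    local_path = os.path.join(cache_dir, filename)
    if not os.path.exists(local_path):
        fetch(url, local_path)  # only reached on a cache miss
    return local_path


# Demonstration with a stand-in fetcher so that no network access is needed.
calls = []


def fake_fetch(url, path):
    calls.append(url)
    with open(path, "wb") as out:
        out.write(b"")  # empty placeholder instead of real methylation data


cache = tempfile.mkdtemp()
url = "https://example.org/geo/sample_1.cov.gz"  # placeholder, not a real GEO URL
first = cached_download(url, cache, fetch=fake_fetch)
second = cached_download(url, cache, fetch=fake_fetch)
print(first == second, len(calls))
```

The second call returns the cached path without invoking the fetcher again, which is the behaviour a local cache folder is meant to provide.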
rrbsTools (Git repository)
https://github.com/tanas80/rrbsTools.git
Contains C++ tools for manipulating NGS reads in a Bismark pipeline, following the method described in PMID: 28488887.
barcode_splitter is a command-line tool that splits reads from the sequencing machine according to a custom barcode set.
bismark_cleanup is a command-line tool that cleans up artificial methylated cytosines left in sequencing reads after partial fill-in of the 5'-overhang sticky ends of DNA fragments.
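The general idea behind splitting reads by barcode can be sketched as follows. This is a simplified Python illustration, not the C++ implementation of barcode_splitter. It assumes that the barcode sits at the 5' end of each read, that matching is exact, and that the barcode is trimmed off after assignment. The barcode sequences, sample names and reads are invented for the example.

```python
import io
from collections import defaultdict


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def split_by_barcode(records, barcodes):
    """Group reads by the barcode found at the start of the sequence.

    barcodes maps barcode sequence -> sample name. Matching is an exact
    prefix match (an assumption; no mismatches are tolerated). The first
    matching barcode wins, and unmatched reads go to 'unassigned'.
    """
    bins = defaultdict(list)
    for header, seq, qual in records:
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                n = len(barcode)
                # Trim the barcode from both the bases and the qualities.
                bins[sample].append((header, seq[n:], qual[n:]))
                break
        else:
            bins["unassigned"].append((header, seq, qual))
    return dict(bins)


fastq = (
    "@r1\nACGTCCGGGTTA\n+\nIIIIIIIIIIII\n"
    "@r2\nTTAGCCGGAAAC\n+\nIIIIIIIIIIII\n"
    "@r3\nGGGGCCGGAAAC\n+\nIIIIIIIIIIII\n"
)
barcodes = {"ACGT": "sample_A", "TTAG": "sample_B"}  # hypothetical barcode set
bins = split_by_barcode(read_fastq(io.StringIO(fastq)), barcodes)
for sample, reads in bins.items():
    print(sample, [r[0] for r in reads])
```

Reads whose prefix matches no barcode are kept in a separate bin so they can be inspected later. A production tool would typically also write each bin to its own FASTQ file.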
rrbsData (R package)