Title: Transcription factor binding dynamics and spatial co-localization in human genome
Author: Ma, Xiaoyan
ISNI:       0000 0004 7224 9454
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2017
Transcription factor (TF) binding has been studied extensively in relation to binding site affinity and chromosome modifications; however, the relationship between genome spatial organisation and transcription factor binding is not well studied. Using the recently available high-resolution Hi-C contact map of human GM12878 lymphoblastoid cells, we computationally investigated the genome-wide spatial co-localization of transcription factor binding sites, both among sites of the same TF and between sites of different TFs. First, we observed a strong positive correlation between site occupancy and homotypic TF co-localization based on Hi-C contacts, consistent with our predictions from biophysical simulations of TF target search. This trend is more prominent in binding sites with weak binding sequences and within enhancers, suggesting that genome spatial organisation plays an essential role in determining binding site occupancy, especially for weak regulatory elements. Furthermore, when investigating spatial co-localization between different TFs, we discovered two distinct co-localization networks of TFs in lymphoblastoid cells, one of which is enriched in lymphocyte-specific pathways and distal enhancer binding. These two TF networks have strong biases for either the A1 or A2 chromosome subcompartment, but nonetheless are still preserved within each, indicating a potential causal link between cell-type-specific transcription factor binding and chromosome subcompartment segregation. We called 40 pairs of significantly co-localized TFs according to the genome-wide Hi-C contact map; these pairs are enriched in previously reported physical interactions, thus linking the TF spatial network to co-functioning. In addition to the above main project, I also worked on a side project to find computationally efficient ways of scaling binding site strength across different TFs based on Position Weight Matrices (PWMs).
While common bioinformatics tools produce scores that reflect the binding strength between a specific TF and the DNA, these scores are not directly comparable between different TFs. We provided two approaches for estimating a TF-specific scaling parameter $\lambda$ for the PWM score. The first approach takes a PWM and background genomic sequence as input to estimate $\lambda$ for a specific TF; applying it, we showed that the $\lambda$ distributions of different TF families correspond to their DNA binding properties. The second approach reliably converts $\lambda$ between different PWMs of the same TF, which allows direct comparison of PWMs that were generated by different methods.
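To make the comparability problem concrete, the following minimal Python sketch scores DNA sites against a log-odds PWM and then applies a TF-specific scaling parameter. It is an illustration only and not the method developed in the thesis: the function names, the toy count matrix, the uniform background, the pseudocount, and the choice to divide the raw score by $\lambda$ are all assumptions made for this example, and the sketch does not attempt to estimate $\lambda$.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, background, pseudocount=1.0):
    """Turn per-position base counts into a log-odds PWM (natural log)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def score_site(pwm, site):
    """Sum the log-odds entries of a site with the same width as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_window_score(pwm, sequence):
    """Best PWM score over all windows of a longer sequence (one strand)."""
    width = len(pwm)
    return max(
        score_site(pwm, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )


def scaled_score(raw_score, lam):
    """ASSUMPTION: the scaled score is the raw PWM score divided by a
    TF-specific lambda. The thesis may define the scaling differently."""
    return raw_score / lam


# Toy 3-bp motif with consensus ACG (hypothetical counts, not real data).
counts = [
    {"A": 18, "C": 1, "G": 1, "T": 0},
    {"A": 1, "C": 17, "G": 1, "T": 1},
    {"A": 0, "C": 2, "G": 16, "T": 2},
]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

pwm = log_odds_pwm(counts, background)
consensus_score = score_site(pwm, "ACG")
weak_score = score_site(pwm, "TTT")
```

Under this assumed form, raw scores from two different TFs would be placed on a shared scale by dividing each by its own $\lambda$ before comparison.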
Supervisor: Adryan, Boris
Sponsor: Chinese Scholarship Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: transcription factor ; Hi-C contact map ; chromatin organisation ; Position Weight Matrix