(A) GC content variation around CO breakpoints (blue dots and line). Window 0 on the x-axis is the GC content at the breakpoints, and negative and positive values represent distance from the breakpoints. Each window is a 2 kb sequence, and GC content is calculated for each window. The red dots and line show one random GC content sample, simulated with the same number of sites as the CO breakpoints (blue dots and line). After 10,000 repeats, not one of the random samples is as extreme as the observed (blue line) (P < 0.0001). (B) Relationship between recombination and GC content. When the chromosomes are dissected into 10 kb non-overlapping windows, recombination rate (cM/Mb) and GC content can be obtained for each of them. After the windows are sorted by GC content, they are divided into 31 groups based on GC content (approximately 20% to 51%, 1% interval), and the average (and s.e.m.) recombination rate is reported for each group.
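The resampling logic behind the P value in panel (A) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `empirical_p`, the use of mean GC content as the test statistic, and the one-sided "at least as high" comparison are all assumptions made for the example.

```python
import random

def empirical_p(observed_mean, genome_gc, n_sites, repeats=10_000, seed=0):
    """Count random samples whose mean GC is at least as high as the observed mean.

    observed_mean: mean GC% of the windows at the real breakpoints
    genome_gc:     GC% of every candidate window in the genome
    n_sites:       number of breakpoints (each random sample has this size)
    All names and the one-sided comparison are illustrative assumptions.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = rng.sample(genome_gc, n_sites)  # draw without replacement
        if sum(sample) / n_sites >= observed_mean:
            hits += 1
    # With zero hits the result is reported as P < 1/repeats, not P = 0.
    return hits, hits / repeats
```

With 10,000 repeats and no random sample as extreme as the observation, the smallest bound that can be stated is P < 1/10,000 = 0.0001, which matches the value quoted in the caption.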
In both, we dissect the genome into 10 kb non-overlapping windows, of which there are 19,297. First, we ask about the raw correlation between GC% and cM/Mb for these windows, which as expected is positive and significant (Spearman's rho = 0.192; P < 10^-15). Second, we wish to know the average effect of a one-unit increase in either parameter on the other. Given the noise in the data (and given that the current recombination rate need not reflect the ancestral recombination rate), we address this issue using a smoothing approach. We start by rank ordering all windows by GC content and then dividing them into blocks of 1% GC range, after excluding windows with more than 10% 'N'. The resulting plot is highly skewed by bins with very high GC (55% to 58%), as these have very few data points (Additional file 1: Figure S10E); the same outliers likely affect the raw correlation too. Removing these three bins results in a more consistent trend (Additional file 1: Figure S10F). This also suggests that below circa 20% GC the recombination rate is zero (Additional file 1: Figure S10F). Removing the bins with GC < 20% and, more generally, any bins with fewer than 100 windows (all bins with GC < 20% have fewer than 100 windows) leaves 18,680 (96.8%) of the windows, with GC content between approximately 20% and 51%.
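The smoothing procedure described above (exclude N-rich windows, group windows into 1% GC bins, drop sparse bins, then report the mean and s.e.m. per bin) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the function name `bin_by_gc`, the tuple layout of the input, and the use of `floor` to assign a window to its 1% bin are all hypothetical choices.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def bin_by_gc(windows, min_windows=100, max_n_fraction=0.10):
    """Summarise recombination rate per 1% GC bin.

    windows: iterable of (gc_percent, n_fraction, cm_per_mb) tuples,
             one per 10 kb window (hypothetical input layout).
    Returns {gc_bin: (window_count, mean_rate, sem_rate)}.
    """
    bins = defaultdict(list)
    for gc, n_frac, rate in windows:
        if n_frac > max_n_fraction:
            continue  # exclude windows with more than 10% 'N'
        bins[math.floor(gc)].append(rate)  # e.g. 20.0-20.99% GC -> bin 20

    summary = {}
    for gc_bin in sorted(bins):
        rates = bins[gc_bin]
        if len(rates) < min_windows:
            continue  # drop sparse bins, as done for bins with < 100 windows
        sem = stdev(rates) / math.sqrt(len(rates)) if len(rates) > 1 else 0.0
        summary[gc_bin] = (len(rates), mean(rates), sem)
    return summary
```

Filtering on a minimum window count rather than on a fixed GC range removes both the sparse low-GC bins and the sparse high-GC bins with a single rule, consistent with the observation that all bins below 20% GC contain fewer than 100 windows.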
Relationships between recombination and GC content
From these data, we estimate that typically a 1 cM/Mb increase in recombination rate is associated with an increase in GC content of approximately 0.5%. Conversely, a 1% increase in GC content corresponds to an approximately 2 cM/Mb increase in recombination rate. We conclude that, given the apparent rarity of NCO gene conversion, at least in the bee genome, extrapolation from GC content to average crossing-over rate appears to be justifiable, at least for GC content over 20%. We note too that at extreme GC content the recombination rate may be over- or underestimated. This could reflect a discordance between current and past recombination rates.
These are used to construct Figure 4B, which presents a relatively noise-free (after smoothing) monotonic relationship between the two variables.
Crossing-over rate is also associated with nucleotide diversity, gene density, and copy number variation regions (Additional file 1: Figures S11-S13). Given the removal of hetSNPs from the analysis, the latter result is not trivially a CNV-associated artifact. Our fine-scale analyses show a positive correlation between nucleotide diversity and recombination rate at all scales examined, namely 10, 100, 200, and 500 kb sequence windows (Additional file 1: Figure S11). This bolsters previous analyses, one of which reported the trend but found it to be non-significant, while another reported a trend between population genetic estimates of recombination and genetic diversity. The trend accords with the view that recombination reduces Hill-Robertson interference, thus permitting reduced rates of hitchhiking and background selection and so enabling higher diversity. We also find a strong negative correlation between recombination and gene density (Additional file 1: Figure S12) and a strong positive correlation between recombination and the number of multi-copy regions at various window sizes (Additional file 1: Figure S13). The correlation with CNVs is consistent with a role for non-allelic recombination generating duplications and deletions via unequal crossing-over.