...arrays are found in ChrUn, which contains mostly pericentromeric regions (indicated by the PC suffix in the TRPC-21A name). All TRPC-21A arrays were divided into ten groups according to their similarity to specific loci in the reference genome (Table 4). The longest array, 30 kb (N35 in Additional file 1, Table S3), probably belongs to chromosome 17 because of its high sequence and length similarity to the array at the end of this chromosome (Table 3). Most arrays show similarity to band 3A, which has a large TRPC-21A field at the end of the chromosome (Tables 3 and 4). Arrays of TRPC-21A are organized by multiplication of the basic 21 bp unit, although TRPC-21A arrays are more homogeneous than MaSat arrays (Figure 5). All TRPC-21A arrays show a HOR structure on the dot plot; in this case even 60-mer units appeared (Figure 5A). PCR with specific primers on a template of total M. musculus DNA gave a ladder for TRPC-21A as well as for MaSat, a characteristic feature of satDNA that is likewise caused by variable monomers organized in HORs (data not shown). All the features of TRPC-21A are those of a “big classical” satDNA such as human satellites 1-4 [33]. These satellites are known to be chromosome-specific. For example, the bulk of human satellite 3 (HS3) is located on chromosome 1, but it can be distinguished from HS3 on chromosome 9 [34]. To design a FISH probe for TRPC-21A, we selected the array with high similarity to band 3A2.

Multi locus, single locus and unplaced families

The Heterogeneous TR superfamily (Table 2) is classified into families according to their presence (ML, SL) or absence (UnP) in the reference genome (Tables 5, 6, and 7). The most abundant ML subfamily, TR-22A, was found at four loci in the reference genome; three are associated with the centromeric gap (Table 3 and 4A2, 6A2, 18B2 in Table 5) and one is located farther from the centromeric gap (7A2, Table 5).
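The dot-plot comparison used above to reveal repeat structure in tandem arrays can be illustrated with a minimal sketch. This is not the software used in the study. The 21 bp unit below is a hypothetical sequence, not the TRPC-21A consensus, and the function names are ours. The 13 bp window follows the value given in the Figure 4 caption for the MaSat dot plot; using it here is an assumption. The sketch builds a self-comparison matrix of window identities and averages it along each diagonal, so that the basic repeat period appears as the smallest non-zero offset with perfect identity.

```python
def window_identity(seq, i, j, window):
    """Fraction of matching positions between the windows starting at i and j."""
    matches = sum(1 for k in range(window) if seq[i + k] == seq[j + k])
    return matches / window


def self_dot_plot(seq, window=13):
    """Square matrix of window identities for a sequence compared with itself."""
    n = len(seq) - window + 1
    return [[window_identity(seq, i, j, window) for j in range(n)]
            for i in range(n)]


def diagonal_profile(matrix):
    """Mean identity along each diagonal, indexed by offset d = j - i >= 0."""
    n = len(matrix)
    return [sum(matrix[i][i + d] for i in range(n - d)) / (n - d)
            for d in range(n)]


# Hypothetical 21 bp unit (illustration only), repeated six times
# to mimic a perfectly homogeneous tandem array.
unit = "ACGTTGCAAGCTTACGGATCA"
array = unit * 6

profile = diagonal_profile(self_dot_plot(array, window=13))

# Smallest non-zero offset at which every window matches exactly.
period = min(d for d in range(1, len(profile)) if profile[d] == 1.0)
print(period)  # prints 21
```

In a real array the monomers diverge, so diagonals at the monomer offset show reduced identity while diagonals at the HOR offset stay close to 1; a threshold on the profile, rather than exact equality, would then be needed.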
ML TR-4A consists of a very short AT-rich unit. About half of the ML subfamilies are present on the sex chromosomes (Table 5). This could be explained by the more accurate assembly of heterochromatic regions on the sex chromosomes relative to the autosomes. On the other hand, the sex chromosomes are known to have unique DNA repeats [35-37], and ML TR-4A may be one of them. Despite minimal sequence similarity, several ML and SL subfamilies have similar GC content, unit size, and array variability, forming three visually distinct groups (clouds) on the graph: GC-rich, AT-rich, and GC-neutral (Figure 2). The TR-22A subfamily is the core of the GC-rich cloud in the area of 55-60% GC, while TR-6A, TR-57A, TR-16A and TR-31B closely adjoin it. At least one subfamily

Komissarov et al. BMC Genomics 2011, 12:531 http://www.biomedcentral.com/1471-2164/12/

Figure 4 MaSat HOR structure. A: The dot plot of the MaSat array N707 (Additional file 2); the window size is 13 bp, and sequence similarity is shown in gray scale. HOR units are shown as arrows with the length indicated; smaller arrows indicate HOR subunits; different colors and letters indicate subunit variants. B: Structure of the 2154 bp HOR unit; the color code for the different units is shown. C: The structure of the conventional MaSat 234 bp heterotetramer. D: The 58 bp unit is built of 28 bp and 30 bp subunits consisting of 7-11 bp subunits; letters indicate subunit variants.

Table 4 TRPC-21A-MM family

N    Unit (bp)    Chromo Bands
1    42*          3A2
2    21           3A2, 4A2
3    63           3A2, 17A2
4    42           16A2, 17A2
5    21           7D1, 16A2, 17A2
6    21           3A2, 4A2, 17A2
7    21           3A2, 16A2, 17A2
8    21           3A2,.