problems include the periodicity transform [22] and the exactly periodic signal decomposition [23], which are linear in period. Another example is maximum likelihood period estimation [24], which has been shown to perform well for eroded sequences and accounts for several key types of periodicity. Dyadic wavelet methods, notably those using the Haar basis, are of interest as an orthogonal decomposition [25,26]; however, these are applicable only to exponential period scales, e.g. periods 2^r for integer r. In general, exploratory period estimation methods suffer from the lack of an orthogonal integer-periodic signal decomposition [22]. A key desirable property of confirmatory period estimation is a measure of statistical significance.

Epps et al. Biology Direct 2011, 6:21 http://www.biology-direct.com/content/6/1/

[Figure 1 diagram. Inputs: synthetic data with eroded or approximate period p; genomic DNA from yeast or mouse. Sequence: convert to 1 if (poly)nucleotide(s) of interest present, 0 otherwise. Exploratory estimation (Autocorrelation, DFT, IPDFT, Hybrid) yields the dominant period. Confirmatory estimation (g-statistic, χ², embedded BWB) takes a period of interest and yields a confidence measure.]

Figure 1 Overview of methods for estimating periodic signals from sequence data. In this work, both synthetic and real data are employed after a symbolic-to-numeric conversion. The smaller arrow connecting Synthetic data with Sequence represents a possible connection, but in this study we directly synthesized the numeric data. The methods applied to exploratory or confirmatory period estimation are indicated above these elements. The embedded blockwise bootstrap (BWB) methods are embedded Autocorrelation, embedded Hybrid and embedded integer period discrete Fourier transform (IPDFT).

Table 1 Significant period-10 sequences in WP nucleosomes for embedded Autocorrelation, embedded IPDFT and embedded Hybrid

Period   Autocorr. (%)   Total   IPDFT (%)   Total   Hybrid (%)   Total
2        0.03            2895    3.12        448     1.91         471
3        0.00            2910    1.62        802     0.40         995
4        0.00            2414    1.32        531     0.38         792
5        0.00            2352    2.29        830     1.71         1172
6        0.00            2389    2.20        954     0.29         1375
7        0.00            1629    1.21        988     0.08         1209
8        0.07            1493    1.62        1048    0.23         1298
9        0.12            1707    1.17        1026    0.88         1371
10       3.78            1428    69.0        1304    44.07        1586
11       0.00            1240    1.05        1234    1.05         1426
12       0.07            1385    1.37        1244    0.77         1434
13       0.00            1030    1.50        1336    0.43         1406
14       0.00            939     1.66        1147    0.00         1230
15       0.00            836     1.34        1042    0.29         1036
16       0.00            625     1.04        1058    0.20         995
17       0.00            591     2.09        910     0.00         870
18       0.00            605     1.68        951     0.32         943
19       0.00            550     1.81        720     0.27         748
20       0.00            419     2.10        1002    0.94

A sequence was counted as significant if P_BWB(period-10) < 0.05. Period: the dominant period for the sequence; Total: the number of sequences identified with that dominant period; (%): the percentage of Total sequences with that dominant period for which P_BWB(period-10) < 0.05.

Biological sequences are rarely exactly periodic, and the estimated period returned by exploratory analysis may be anywhere from very weakly to very strongly dominant with respect to the remaining sequence components (which may contain other periods or be essentially nonperiodic). A measure of significance facilitates comparison with, for example, other candidate periods of interest or the strength of the same periodic component in other sequence data. In practice, period estimation techniques have been widely applied in the genomics literature with little or no consideration of the statistical significance of the period estimate. Examples of confirmatory period estimation include quantifying the significance of Fourier-bas.
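As a rough illustration of the pipeline in Figure 1, the following Python sketch converts a sequence to a 0/1 indicator for a motif of interest, picks a dominant period by autocorrelation (exploratory), and attaches a resampling p-value to a period of interest (confirmatory). The function names, the block length, the number of resamples and the toy sequence are assumptions made for illustration. The block resampling shown is a simplified stand-in, not the embedded BWB procedure evaluated in this work.

```python
import random


def to_indicator(seq, motif):
    """Symbolic-to-numeric conversion: 1 where `motif` starts, 0 otherwise."""
    k = len(motif)
    return [1 if seq[i:i + k] == motif else 0 for i in range(len(seq) - k + 1)]


def autocorrelation(x, lag):
    """Number of positions i at which x[i] and x[i + lag] are both 1."""
    return sum(a * b for a, b in zip(x, x[lag:]))


def dominant_period(x, min_period=2, max_period=20):
    """Exploratory step: the lag in the given range with the largest autocorrelation."""
    return max(range(min_period, max_period + 1),
               key=lambda p: autocorrelation(x, p))


def block_bootstrap_pvalue(x, period, block_len=3, n_resamples=500, seed=0):
    """Confirmatory step (simplified assumption, not the paper's embedded BWB).

    Blocks shorter than the period are resampled with replacement, which keeps
    local composition but breaks longer-range periodicity. The p-value is the
    fraction of resamples whose statistic is at least the observed one.
    """
    rng = random.Random(seed)
    observed = autocorrelation(x, period)
    blocks = [x[i:i + block_len] for i in range(0, len(x), block_len)]
    hits = 0
    for _ in range(n_resamples):
        resampled = [v for b in rng.choices(blocks, k=len(blocks)) for v in b]
        if autocorrelation(resampled, period) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Toy input (hypothetical): the dinucleotide AA recurring every 10 bases.
toy_seq = ("AA" + "CGTCGTCG") * 30
x = to_indicator(toy_seq, "AA")
print(dominant_period(x))               # dominant period of the toy signal
print(block_bootstrap_pvalue(x, 10))    # p-value for period 10
```

On this exactly periodic toy input the dominant period is 10 and the p-value falls well below 0.05. Real sequences with eroded or approximate periodicity would give weaker autocorrelation peaks and larger p-values, which is why a significance measure matters.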