Title: Algorithms and combinatorics of repetitions in strings
Author: Toopsuwan, Chalita
ISNI: 0000 0004 6497 6264
Awarding Body: King's College London
Current Institution: King's College London (University of London)
Date of Award: 2017
Repetitions in strings are one of the most fundamental topics in string combinatorics, with essential applications to text algorithms, data compression, and the analysis of biological sequences. They are closely related to periodicities, regularities, and compression: higher compression rates can be obtained from the repetitive behaviour of strings and, conversely, some compression techniques are at the core of fast algorithms for detecting repetitions. Repetitions are highly periodic factors (or substrings) of strings. They come in various types, such as repeats, repetitions, squares, cubes, palindromes, and maximal periodicities, which are also called runs. This thesis studies repetitions in strings from both the algorithmic and the combinatorial point of view, since they are intricate and many interesting problems remain open. The thesis first addresses maximal periodicities, or runs, under the heading of algorithmics of repetitions, local periods, and critical factorisations. An algorithm is designed to compute all runs of a string drawn from an infinite alphabet. On a string of length n, the algorithm runs in optimal time O(n log n), even though the number of runs is linear. In the model of computation used, letters are compared with the equality operator only. Under the same assumption, a second time-optimal algorithm computes local periods and all critical factorisations within the same running time. The prefix table of the input string is the main tool of these algorithms. We also design a simple algorithm based on the Dictionary of Basic Factors of the input string. The thesis then turns to gapped palindromes and their anti-exponents. A palindrome is a string x = a1 ··· an that is equal to its reversal x̃ = an ··· a1.
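The prefix table named above as the main tool can be illustrated with a short sketch. The code below is the standard linear-time construction, not necessarily the one used in the thesis. It compares letters with the equality operator only, which matches the model of computation described in the abstract. The function names prefix_table and smallest_period are chosen here for illustration.

```python
def prefix_table(s):
    """Return pref, where pref[i] is the length of the longest common
    prefix of s and s[i:]. Letters are compared with == only."""
    n = len(s)
    pref = [0] * n
    if n == 0:
        return pref
    pref[0] = n
    # s[left:right] is the rightmost factor found so far that matches a prefix of s.
    left, right = 0, 0
    for i in range(1, n):
        if i < right:
            # Reuse the value already computed inside the matched window.
            pref[i] = min(right - i, pref[i - left])
        # Extend the match letter by letter.
        while i + pref[i] < n and s[pref[i]] == s[i + pref[i]]:
            pref[i] += 1
        if i + pref[i] > right:
            left, right = i, i + pref[i]
    return pref


def smallest_period(s):
    """Return the smallest period of s, using the fact that p is a period
    of s exactly when the suffix s[p:] is a prefix of s."""
    n = len(s)
    pref = prefix_table(s)
    for p in range(1, n):
        if pref[p] == n - p:
            return p
    return n


print(prefix_table("aabaab"))     # [6, 1, 0, 3, 1, 0]
print(smallest_period("aabaab"))  # 3
```

The second function shows why the table is useful for periodicity questions: every period of the string can be read off directly from the table entries.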
A gapped palindrome is a string of the form uvũ, where u and v are strings, |v| ≥ 2, and ũ is the reversal of u. Replicating the standard notion of string exponent, we define the anti-exponent of a gapped palindrome uvũ as the quotient of |uvũ| by |uv|. An algorithm is described to compute the maximal anti-exponent of the gapped palindromes occurring in a string that contains no ordinary palindromes. To compute the maximal anti-exponent of factors of a palindrome-free string efficiently, we apply techniques based on the suffix automaton and the reversed Lempel-Ziv factorisation. The complexity analysis shows that the algorithm runs in linear time on a fixed-size alphabet. Repeats are also of central concern in text compression and pattern matching, so the last part of the thesis studies repeats and their exponents. We give a linear-time algorithm to compute the maximal exponent of repeats occurring in an overlap-free string. The two main tools of the algorithm are a factorisation of the string and the suffix automaton of some of its factors. As a direct consequence, we obtain an elegant result: the number of occurrences of repeats whose exponent is maximal in an overlap-free string is linear. Beyond previous research and the further viewpoints offered in this thesis, many interesting questions about repetitions in strings remain open.
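To make the definition of the anti-exponent concrete, here is a brute-force sketch that enumerates every factor of the form uvũ and keeps the largest quotient |uvũ| / |uv|. It is only an illustration of the definition, not the linear-time method of the thesis, which relies on the suffix automaton and the reversed Lempel-Ziv factorisation. The function name max_anti_exponent and the parameter min_gap are chosen here for illustration, and the sketch assumes that u is non-empty and does not check that the input is palindrome-free.

```python
from fractions import Fraction


def max_anti_exponent(s, min_gap=2):
    """Return the maximal anti-exponent |u v ũ| / |u v| over all factors
    u v ũ of s with |u| >= 1 and |v| >= min_gap, or None if there is none.
    Brute force over all positions, arm lengths and gap lengths."""
    n = len(s)
    best = None
    for i in range(n):
        # arm = |u|; the factor must fit: i + 2 * arm + min_gap <= n.
        for arm in range(1, (n - i - min_gap) // 2 + 1):
            u_reversed = s[i:i + arm][::-1]
            # gap = |v|; the factor must fit: i + 2 * arm + gap <= n.
            for gap in range(min_gap, n - i - 2 * arm + 1):
                j = i + arm + gap  # start of the candidate right arm
                if s[j:j + arm] == u_reversed:
                    value = Fraction(2 * arm + gap, arm + gap)
                    if best is None or value > best:
                        best = value
    return best


print(max_anti_exponent("abcdba"))  # 3/2, from u = "ab", v = "cd"
print(max_anti_exponent("abcabc"))  # 4/3, for example from u = "a", v = "bc"
```

Exact fractions are used so that anti-exponents can be compared without rounding errors.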
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available