Title:

Channel coding algorithms for Ultra-Reliable Low-Latency Communication

The Ultra-Reliable Low-Latency Communication (URLLC) concept has been conceived for the emerging Fifth Generation (5G) systems, targeting a round-trip end-to-end latency of less than 1 ms in conjunction with ultra-high reliability. To this end, this thesis proposes several novel channel coding schemes that meet the latency requirements of the URLLC mobile communication standard. First, an Arbitrarily Parallel Turbo Decoder (APTD) is proposed to support an arbitrarily high degree of parallel processing, facilitating significantly higher processing throughputs and substantially lower processing latencies than the State-Of-The-Art (SOTA) Long Term Evolution (LTE) turbo decoder. As in conventional turbo decoding algorithms, the proposed APTD decomposes each block of information bits into a sequence of windows, where the different windows are processed simultaneously, while the bits within each window are processed serially using forward and backward recursions. However, in contrast to conventional turbo decoding algorithms, the APTD does not require the different windows to be composed of an identical number of bits. This allows the use of an arbitrary number of windows and hence an arbitrary degree of parallelism, when decoding information bits of an arbitrary block length. Furthermore, conventional turbo decoding algorithms alternate between simultaneously processing the windows in the upper decoder and those in the lower decoder. By contrast, the APTD processes the odd-indexed windows in the upper decoder at the same time as the even-indexed windows in the lower decoder and alternates between this and the reversed arrangement, hence further improving the decoding throughput and latency. Finally, the APTD achieves a reduced hardware resource requirement by calculating the extrinsic information based only on the outputs of the forward recursions, rather than on the outputs of both the forward and backward recursions, as in conventional turbo decoding algorithms.
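The window decomposition and the odd/even schedule described above can be illustrated with a minimal sketch. The function names are hypothetical, and the rule that window lengths differ by at most one bit is an assumption made here for illustration; the text above only states that the windows need not be of identical length.

```python
def split_into_windows(block_length, num_windows):
    """Split bit indices 0..block_length-1 into contiguous windows.

    Assumption (not from the thesis): window lengths differ by at most one,
    so any number of windows can be used for any block length.
    """
    if not 1 <= num_windows <= block_length:
        raise ValueError("need 1 <= num_windows <= block_length")
    base, extra = divmod(block_length, num_windows)
    windows = []
    start = 0
    for w in range(num_windows):
        length = base + (1 if w < extra else 0)  # first `extra` windows get one more bit
        windows.append(range(start, start + length))
        start += length
    return windows


def active_windows(num_windows, step):
    """Return (upper, lower): the 1-based window indices processed in the
    upper and lower decoder during a given step.

    Even steps: odd-indexed windows in the upper decoder, even-indexed in
    the lower decoder. Odd steps: the reversed arrangement.
    """
    odd = [w for w in range(1, num_windows + 1) if w % 2 == 1]
    even = [w for w in range(1, num_windows + 1) if w % 2 == 0]
    return (odd, even) if step % 2 == 0 else (even, odd)


if __name__ == "__main__":
    lengths = [len(w) for w in split_into_windows(504, 10)]
    print(lengths, sum(lengths))
    print(active_windows(5, 0), active_windows(5, 1))
```

In this sketch every window is active in exactly one of the two decoders at each step, which is the property that distinguishes the schedule from alternating between the complete upper and lower decoders.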
We demonstrate that the proposed APTD achieves superior latency, throughput and computational efficiency compared to the SOTA LTE turbo decoder at all block lengths, but particularly at the short block lengths that are typically used in URLLC approaches. For example, at a block length of N = 504 bits, the proposed APTD achieves a Block Error Rate (BLER) of 10^-5 at the same Eb/N0 as I = 8 iterations of a conventional turbo decoder, but with a computational efficiency that is 6 times higher than that of the SOTA turbo decoder, while achieving a latency and throughput that are 0.7 and 1.4 times those of the SOTA decoder, respectively. Additionally, the URLLC service requires order-of-magnitude improvements in all layers of the wireless communication stack. This is a particular challenge for the physical layer, where typically a processing time on the order of microseconds is required for the computationally intensive demodulation and error correction processing, among other operations. Conventionally, the reception of signals, the demodulation processing and the error correction processing are performed consecutively at the receiver. However, this approach is associated with processing times on the order of hundreds of microseconds, preventing URLLC. Therefore, this thesis proposes a novel processing architecture, which is capable of performing reception, Orthogonal Frequency-Division Multiplexing (OFDM) demodulation and turbo decoding concurrently, rather than consecutively, hence significantly reducing the processing time. In order to achieve concurrent operation, the OFDM demodulation is performed using a novel cumulative Fast Fourier Transform (FFT), which produces successively more reliable estimates of all transmitted symbols in each successive clock cycle. At the same time, a Fully-Parallel Turbo Decoder (FPTD) is used to recover successively more reliable estimates of all bits in each successive clock cycle.
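The idea of producing estimates of all frequency-domain symbols while the time-domain samples are still arriving can be illustrated with a plain running Discrete Fourier Transform (DFT). This is not the cumulative FFT of the thesis, whose structure is not described here; it is a simple stand-in, with the hypothetical name `cumulative_dft`, that shows how a partial estimate of every bin becomes available after each received sample and converges to the full DFT once the last sample has arrived.

```python
import cmath


def cumulative_dft(samples):
    """Yield the running DFT estimate of all bins after each new sample.

    After sample n arrives, bin k holds sum_{m<=n} x[m] * exp(-2j*pi*k*m/N).
    The final yielded list equals the full N-point DFT of `samples`.
    Illustrative stand-in only; O(N) work per sample, O(N^2) overall.
    """
    n_total = len(samples)
    bins = [0j] * n_total
    for n, x in enumerate(samples):
        for k in range(n_total):
            bins[k] += x * cmath.exp(-2j * cmath.pi * k * n / n_total)
        yield list(bins)  # copy, so earlier estimates are not mutated


if __name__ == "__main__":
    for step, estimate in enumerate(cumulative_dft([1, 2, 3, 4])):
        print(step, [complex(round(b.real, 6), round(b.imag, 6)) for b in estimate])
```

A downstream decoder could, in principle, start consuming these intermediate estimates instead of waiting for the complete transform, which is the kind of concurrency the architecture above aims for.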
Then, a detailed tutorial on the Cyclic Redundancy Check (CRC)-aided Logarithmic Successive Cancellation Stack (LogSCS) algorithm conceived for polar codes is provided, followed by a pair of refinements for improving the error correction performance. We also apply these algorithms to the ultra-reliable decoding of polar codes, which is relevant to the control channels of the URLLC version of the 3rd Generation Partnership Project (3GPP) New Radio (NR). In contrast to all previous work on SCS polar decoding, which operates on the basis of bit probabilities, the LogSCS algorithm operates on the basis of Logarithmic-Likelihood Ratios (LLRs), which facilitates low-complexity fixed-point implementation and reduced storage requirements. Furthermore, we extend the computation to consider frozen bits in stack decoding when determining the most likely sequence of information bits, which improves the error correction performance while also reducing the decoding complexity. When exploiting the CRC code to improve the error correction performance, we propose a novel technique that limits the number of CRC checks performed, in order to maintain a consistent error detection performance. Additionally, a pair of techniques for further improving the performance of the LogSCS polar decoder are proposed and we demonstrate that the proposed S = 128 Improved LogSCS decoder achieves a similar error correction capability to that of a Logarithmic Successive Cancellation List (LogSCL) decoder having a list size of L = 128 across the full range of block lengths supported by the 3GPP NR Physical Uplink Control Channel (PUCCH). This is achieved without increasing its memory requirement, while dramatically reducing its complexity, which becomes up to seven times lower than that of an L = 8 LogSCL decoder. Following the Improved LogSCS algorithm, a novel fast LogSCS polar decoder is proposed, which employs several techniques that were previously considered for fast SCL decoders.
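To make the role of LLRs and frozen bits concrete, the following sketch shows the LLR-based path-metric update that is widely used in the list decoding literature, where a lower metric indicates a more likely path. Whether the LogSCS algorithm of the thesis uses exactly this form is an assumption; the function name and the `exact` flag are hypothetical.

```python
import math


def update_path_metric(metric, llr, bit, exact=False):
    """Extend a candidate path by one bit and return its new metric.

    llr > 0 favours bit 0, llr < 0 favours bit 1. Lower metrics are better.
    The same update applies to frozen bits (bit fixed to 0), so a path whose
    LLR disagrees with a frozen bit is penalised as well.
    """
    if exact:
        # metric + ln(1 + exp(-(1 - 2*bit) * llr)), evaluated stably
        x = (1 - 2 * bit) * llr
        return metric + max(0.0, -x) + math.log1p(math.exp(-abs(x)))
    # Common approximation: penalise by |llr| only when the chosen bit
    # disagrees with the hard decision suggested by the LLR.
    hard_decision = 0 if llr >= 0 else 1
    return metric if bit == hard_decision else metric + abs(llr)


if __name__ == "__main__":
    print(update_path_metric(0.0, 2.5, 0))   # agrees with LLR, no penalty
    print(update_path_metric(0.0, 2.5, 1))   # disagrees, penalty |llr|
    print(update_path_metric(0.0, -4.0, 0))  # frozen bit 0 contradicted by LLR
```

The approximate branch needs only a comparison and an addition, which is one reason LLR-based metrics suit fixed-point implementation better than probability-based ones.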
This fast LogSCS polar decoder is capable of attaining a decoding latency that is lower than that of the SOTA fast SCL polar decoders, without any loss of error correction performance. First, a 32-bit fixed-point LogSCS polar decoder is developed, which maintains the same BLER as the floating-point LogSCS polar decoder, allowing software implementation on x86 processors. Second, the simplified path-metric computation of the rate-0, rate-1 and repetition subgraphs is applied in the proposed fast LogSCS decoder, which reduces the decoding complexity by 50% on average. Finally, for the first time, a software implementation of the fast LogSCS polar decoder is realized on x86 processors that support Single Instruction Multiple Data (SIMD) instructions with 512-bit Advanced Vector Extensions (AVX-512), satisfying the low-latency requirements of Software-Defined Radio (SDR) systems. By implementing the 32-bit fast LogSCS polar decoder on x86 processors in conjunction with AVX-512 SIMD instructions, a maximum parallelization degree of 16 may be attained, and an 80% latency reduction may be achieved.
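The simplified path-metric computations for the special subgraphs can be sketched as follows. The formulas follow the formulation commonly used for fast list decoders under the approximate LLR-based path metric; that the thesis uses exactly these expressions is an assumption, and all function names are hypothetical. Integer LLRs stand in for 32-bit fixed-point values, of which 512 / 32 = 16 fit in one AVX-512 register, matching the parallelization degree quoted above.

```python
SIMD_LANES = 512 // 32  # 16 32-bit values per 512-bit register


def rate0_penalty(llrs):
    """All bits frozen to 0: penalise every LLR that favours bit 1."""
    return sum(-llr for llr in llrs if llr < 0)


def repetition_penalties(llrs):
    """Repetition node: the codeword is all-zero or all-one.

    Returns (penalty_all_zero, penalty_all_one); each is the sum of |LLR|
    over the positions that disagree with that candidate.
    """
    penalty_zero = sum(-llr for llr in llrs if llr < 0)
    penalty_one = sum(llr for llr in llrs if llr > 0)
    return penalty_zero, penalty_one


def rate1_hard_decisions(llrs):
    """All bits are information bits: the most likely candidate is the
    hard decision on each LLR, which incurs no path-metric penalty."""
    return [0 if llr >= 0 else 1 for llr in llrs]


if __name__ == "__main__":
    llrs = [5, -3, 2, -1]
    print(rate0_penalty(llrs))
    print(repetition_penalties(llrs))
    print(rate1_hard_decisions(llrs))
```

Each function handles a whole subgraph with a single pass over its LLRs instead of traversing it bit by bit, and such element-wise passes map naturally onto SIMD lanes.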
