Title:

A parallel implementation of the Newton's method in solving steady state Navier-Stokes equations for hypersonic viscous flows, αGMRES: a new parallelisable iterative solver for large sparse nonsymmetric linear systems

The motivation for this thesis is to develop a parallelizable, fully implicit numerical Navier-Stokes solver for hypersonic viscous flows. The presence of strong shock waves, thin shear layers and strong flow interactions in hypersonic viscous flows requires a high-order, high-resolution scheme for the discretisation of the Navier-Stokes equations if the numerical simulation is to be accurate. However, high-order high-resolution schemes usually involve a more complicated formulation, and thus a longer computation time, than the simpler central differencing scheme. Accelerating the convergence of high-order high-resolution schemes therefore becomes an increasingly important issue. For steady state solutions of the Navier-Stokes equations, a time-dependent approach is usually followed using the unsteady governing equations, which can be discretised in time by an explicit or an implicit method. With an implicit method, unconditional stability can be achieved, and as the time step approaches infinity the method approaches Newton's method applied directly to the N-dimensional nonlinear algebraic system that arises from the spatial discretisation of the steady governing equations over the global flowfield. Newton's method can achieve quadratic convergence. Its main drawback is that it is memory intensive, since the Jacobian matrix of the nonlinear algebraic system generally needs to be stored. A parallel computing environment is therefore necessary in order to tackle substantial problems. In the thesis, hypersonic laminar flow over a sharp cone at high angle of attack provides the test cases. The flow is adequately modelled by the steady state locally conical Navier-Stokes (LCNS) equations. A structured grid is used, since otherwise there are difficulties in generating the unstructured Jacobian matrix.
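The relation between the implicit time-dependent approach and Newton's method can be illustrated with a minimal sketch. It is not the solver of the thesis: the two-equation system in `residual`, the forward-difference Jacobian, the dense elimination routine and all function names are assumptions chosen for illustration. Each step solves (I/dt + dR/dq) dq = -R(q), and the 1/dt term vanishes as the time step goes to infinity, which leaves a plain Newton step.

```python
import math


def residual(q):
    # Hypothetical nonlinear system R(q) = 0 with a root at (1, 2);
    # it stands in for the spatially discretised steady equations.
    x, y = q
    return [x * x + y - 3.0, x + y * y - 5.0]


def numerical_jacobian(func, q, eps=1e-7):
    # Forward-difference approximation, one column per perturbed unknown.
    n = len(q)
    r0 = func(q)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        h = eps * max(1.0, abs(q[j]))
        qp = list(q)
        qp[j] += h
        rj = func(qp)
        for i in range(n):
            jac[i][j] = (rj[i] - r0[i]) / h
    return jac


def solve_dense(a, b):
    # Gaussian elimination with partial pivoting; a stand-in for the
    # iterative linear solver used on large sparse systems.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def implicit_step(q, dt):
    # Solve (I/dt + dR/dq) dq = -R(q); dt = inf gives a Newton step.
    jac = numerical_jacobian(residual, q)
    inv_dt = 0.0 if math.isinf(dt) else 1.0 / dt
    for i in range(len(q)):
        jac[i][i] += inv_dt
    dq = solve_dense(jac, [-r for r in residual(q)])
    return [qi + di for qi, di in zip(q, dq)]


def iterate(q, dt, steps):
    # Repeat the implicit step a fixed number of times.
    for _ in range(steps):
        q = implicit_step(q, dt)
    return q
```

Calling `iterate([1.5, 1.5], math.inf, 6)` and `iterate([1.5, 1.5], 1.0, 6)` allows the Newton limit to be compared with a finite time step on this toy problem; the finite step converges only linearly.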
A conservative cell-centred finite volume formulation is used for the spatial discretisation. The fluxes on the cell boundaries are evaluated using Osher's flux difference splitting scheme, which has continuous first partial derivatives, together with the third-order MUSCL (Monotone Upwind Schemes for Conservation Laws) scheme for the convective fluxes, and the second-order central difference scheme for the diffusive fluxes. In developing Newton's method, a simplified approximate procedure is proposed for generating the numerically approximated Jacobian matrix; it speeds up the computation and reduces the extent of cells whose discretised physical state variables are needed to generate each matrix element. For solving the large sparse nonsymmetric linear system in each Newton iteration, the αGMRES linear solver has been developed, which is a robust and efficient scheme in sequential computation. Since the linear solver is designed for generality, it is hoped that the method can be applied to similar large sparse nonsymmetric linear systems that may occur in other research areas. The linear solver is also found to be easy to code. The parallel computation assigns the computational task of the global domain to multiple processors. It is based on a new decomposition method for the Nth-order Jacobian matrix, in which each processor stores the nonzero elements in a certain number of columns of the matrix. The data are stored without overlap, and this provides the main storage of the present algorithm. Any N-dimensional vector can be decomposed in a manner corresponding to the matrix decomposition. From the parallel computation point of view, the new procedure for generating the numerically approximated Jacobian matrix decreases the memory required in each processor. The αGMRES linear solver is also parallelizable without any sequential bottleneck, and has a high parallel efficiency.
This linear solver plays a key role in the parallelization of an implicit numerical algorithm. The overall numerical algorithm has been implemented on both sequential and parallel computers, using the sequential version of the algorithm and its parallel counterpart respectively. Since the parallel numerical algorithm operates on the global domain and does not change any solution procedure compared with its sequential counterpart, the convergence and accuracy are the same as those of the implementation on a single sequential computer. The computers used are an IBM RISC System/6000 320H workstation and a Meiko Computing Surface composed of T800 transputers.
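The column-wise matrix decomposition described above can be sketched as follows. The abstract does not give the details of the thesis's scheme, so this is only an illustration under stated assumptions: the contiguous column ranges, the dictionary storage of nonzeros, the summation that stands in for inter-processor communication, and all function names are hypothetical, and the "processors" are simulated sequentially. Each processor holds the nonzeros of its own columns and the matching slice of the vector, and a matrix-vector product is formed by summing the partial products.

```python
def partition_columns(n, nprocs):
    # Contiguous, non-overlapping column ranges, as even as possible.
    base, extra = divmod(n, nprocs)
    ranges, start = [], 0
    for p in range(nprocs):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def distribute(entries, ranges):
    # entries maps (row, col) to a nonzero value; each nonzero is stored
    # on exactly one processor, chosen by its column index.
    local = [dict() for _ in ranges]
    for (i, j), v in entries.items():
        for p, (lo, hi) in enumerate(ranges):
            if lo <= j < hi:
                local[p][(i, j)] = v
                break
    return local


def local_matvec(local_entries, x_local, lo, n):
    # Partial product using only the locally stored columns and the
    # local slice of the vector (x_local[0] corresponds to column lo).
    y = [0.0] * n
    for (i, j), v in local_entries.items():
        y[i] += v * x_local[j - lo]
    return y


def parallel_matvec(entries, x, nprocs):
    # Simulate the processors one after another.
    n = len(x)
    ranges = partition_columns(n, nprocs)
    local = distribute(entries, ranges)
    partials = [
        local_matvec(local[p], x[lo:hi], lo, n)
        for p, (lo, hi) in enumerate(ranges)
    ]
    # The element-wise sum stands in for a global reduction.
    return [sum(col) for col in zip(*partials)]
```

Because every nonzero belongs to exactly one column range, the matrix is stored without overlap, and the result does not depend on the number of processors up to floating-point summation order.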
