An Implementation of the Finite Element Method for the Velocity-Current Magnetohydrodynamics Equations

by

K. Daniel Brauss

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
August 2, 2014

Keywords: magnetohydrodynamics, finite element method, numerical solution

Copyright 2014 by K. Daniel Brauss

Approved by

A. J. Meir, Chair, Professor of Mathematics and Statistics
Paul G. Schmidt, Professor of Mathematics and Statistics
Yanzhao Cao, Professor of Mathematics and Statistics
Dmitry Glotov, Professor of Mathematics and Statistics

Abstract

The aim of this research is to extend the numerical results in [78] for the velocity-current magnetohydrodynamics formulation proposed by A. J. Meir and Paul G. Schmidt in [74], which models the stationary flow of a conducting fluid in a bounded domain. A parallel finite element algorithm was successfully implemented on a high-performance, distributed-memory computing architecture at the Alabama Supercomputer Center (ASC) using the freely available, open-source academic and government libraries deal.II, p4est, and Trilinos. Extending the work of Elman, Silvester, and Wathen [37] for the Navier-Stokes equations, a Schur complement preconditioner was developed for the current saddle-point problem so that the iterative Krylov subspace solver GMRES (generalized minimal residual method) could be used to solve the large linear systems of equations arising from mesh refinement. To simplify the preconditioner and lower the cost of forming it, spectral equivalence was established between the Schur complement and a mass matrix. The resulting C++ code was tested successfully on problems from [78], with similar results.

Acknowledgments

I have to thank God for the opportunity to continue my studies here at Auburn University. I have to thank my wife, Minvera Rosario Brauss Blanco, for her unwavering support. Without my wife, I would not have made it this far. Thank you, Rose. I thank Eliana Catherine Brauss, my daughter, for being here and filling our lives. I thank all of my family for their support with all that we do. I thank all of my teachers for the passion that they have given to me. I thank Dr. A. J. Meir for his much appreciated support, patience, and guidance as my Ph.D. advisor. Dr. Meir is a positive influence and a well-respected role model. I have learned a deeper independence from him that has helped me to grow. I thank Dr. Paul G. Schmidt for his support and his courses in functional analysis. Thanks to my other committee members, Dr. Yanzhao Cao and Dr. Dmitry Glotov. Lastly, thanks to Dr. Narendra Govil and Dr. Geraldo DeSouza.

Table of Contents

Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
  1.1 Overview
  1.2 Solver
  1.3 Preconditioner
  1.4 Implementation
2 The Problem
  2.1 Problem Formulation
  2.2 Weak Formulation
  2.3 Linearization
3 Matrix System and Preconditioner
  3.1 Matrix Formulation
  3.2 Finite Element Spaces
  3.3 Preconditioner
4 Computational Experiments and Results
  4.1 Introduction to the Computational Environment
    4.1.1 Choice of the Libraries
    4.1.2 Working with the Libraries
  4.2 Example 1 - A Problem with an Exact Solution
  4.3 Example 2 - An Applied Problem
    4.3.1 Preconditioning Results
5 Conclusion and Future Work
  5.1 Summary
  5.2 Torus Domain
  5.3 Code Speedup
Appendices
A Weak Formulation and Identities
  A.1 Navier-Stokes Identities
    A.1.1 Pressure Term
    A.1.2 Diffusion Term
    A.1.3 Advection Term
  A.2 Ohm's Law
B Newton's Method Linearization
C Iterative Solvers
  C.1 Conjugate Gradient (CG) Method
  C.2 Generalized Minimal Residual (GMRES) Method
D Exact Solution
Bibliography

List of Figures

2.1 Newton's Method
3.1 2-D Example of Q1 Basis Function for Pressure
3.2 2-D Example of Q2 Basis Function for Velocity
4.1 Schafer-Turek Benchmark
4.2 Example 1 for Cube Domain
4.3 Example 1 for L-shaped Domain
4.4 Example 2 for Cube Domain
4.5 Example 2 for L-shaped Domain
4.6 Eigenvalues for System Matrix
4.7 Eigenvalues for Preconditioned Stokes-Ohms System
4.8 Eigenvalues for Preconditioned Stokes-Ohms System with Inertial Term
4.9 Eigenvalues for Preconditioned System Matrix
5.1 Example 1 for L-shaped Domain
5.2 Example 2 for Torus Domain
5.3 Example 2 for Full Torus Domain
5.4 CPU Time Versus Number of Processors
5.5 CPU Time Versus Number of Processors

List of Tables

4.1 Errors and Convergence Rates for the Unit Cube Domain
4.2 Errors and Convergence Rates for the L-shaped Domain
5.1 Errors and Convergence Rates for the Quarter Torus Domain

Chapter 1
Introduction

1.1 Overview

Scientific theories often have a quantitative, mathematical nature, such as a well-founded set of governing equations developed through experimental tests and results. The equations usually describe a variable or quantities of scientific interest that may be a direct measure of the success of a process or system (e.g., velocity in mass transfer) or an indirect measure of a system's state (e.g., pressure or concentration profiles indicating compression or swelling). Models whose governing equations involve complex multiphysics and systems of partial differential equations often arise in physics and engineering. Advances in scientific computing make it possible to numerically approximate solutions for the quantities of interest in such equations with increased accuracy and fewer simplifying assumptions. Numerically approximating and visualizing solutions to complex multiphysics models can then allow a scientist to investigate and analyze a process, assess its conditions, and address possible optimizations. With the growth in computing, the area of computational science and applied mathematics has established itself as the third pillar of scientific investigation [31]. By way of visualization, scientific computing can offer a deep understanding of the phenomena being scientifically investigated, as well as of the limits of our theory and understanding of natural phenomena [23].

In this research, our interest is in the prediction of scientific quantities according to governing equations for electrically conducting fluid flow. We are interested in the dynamics of electrically conducting fluids in the presence of electromagnetic fields, placing our research in the area of magnetohydrodynamics. We study viscous, incompressible, homogeneous, conducting fluids, whose motion is therefore governed by the incompressible Navier-Stokes equations [69]

\[
\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = F + J \times B,
\qquad
\nabla \cdot u = 0.
\]

We consider the Lorentz force $J \times B$ to act on the fluid by way of electric and magnetic fields governed by the Maxwell equations [51, 38] and Ohm's law:

\[
\begin{aligned}
\nabla \cdot E &= \frac{q}{\epsilon_0} && \text{(Gauss's law for the electric field)},\\
\nabla \cdot B &= 0 && \text{(Gauss's law for the magnetic field)},\\
\nabla \times E &= -\frac{\partial B}{\partial t} && \text{(Faraday's law)},\\
\nabla \times B &= \mu_0 J + \mu_0 \epsilon_0 \frac{\partial E}{\partial t} && \text{(Ampere-Maxwell law)},\\
J &= \sigma\,(E + u \times B) && \text{(Ohm's law)}.
\end{aligned}
\]

The quantities of interest are the current $J$, velocity $u$, pressure $p$, electric potential $\phi$, and magnetic field $B$.
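To give a rough idea of how the current and the electric potential enter the velocity-current formulation studied here, the following is a sketch of the idea only; the precise stationary equations of [74] are reviewed in Chapter 2, and the relation below is stated under the assumption of a steady state. When all time derivatives vanish, Faraday's law gives $\nabla \times E = 0$, so the electric field may be written as $E = -\nabla \phi$, and Ohm's law expresses the current directly in terms of the velocity, the potential, and the magnetic field:

\[
J = \sigma\left(-\nabla \phi + u \times B\right),
\qquad
\nabla \cdot J = 0,
\]

with the Lorentz force $J \times B$ coupling this relation back into the momentum equation.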
Magnetohydrodynamics (MHD) is the study of the dynamics of electrically conducting fluids under electromagnetic fields. Our focus is on the interaction of electrically conducting fluids with electromagnetic fields at the macroscopic level [29]. Applications modeled by such equations arise in liquid metal solidification processes [77, 83, 84, 41, 42, 70, 90, 49, 79, 32, 18] and in the silicon crystal growth processes used in the semiconductor and microelectronics industry [50, 73, 56, 55] to hold, liquefy, and stir melts, control undesired convection, and filter out impurities. Other example applications are electromagnetic pumping, nuclear reactor cooling, and plasmas in nuclear fusion and propulsion devices [66, 54, 76, 52, 62, 57, 3, 72, 71, 88]. We note that these studies address grand challenges identified in the National Academy of Engineering's (NAE) 21st Century Grand Challenges for Engineering [82] and the five Grand Challenges from the Basic Energy Sciences Advisory Committee (BESAC) [39, 58] of the U.S. Department of Energy (DOE). One particular grand challenge from the NAE is providing energy from thermonuclear fusion, the harnessing of the sun's energy. The above research also addresses the main theme of the BESAC grand challenges: designing and controlling material processes.

The velocity-current formulation of the MHD equations that we study was proposed by A. J. Meir and Paul G. Schmidt [74]. They established existence and uniqueness of a solution to the equations, as well as a convergent finite element iterative numerical approximation method [78]. The equations are a stationary (steady-state) form of the equations above and are reviewed in Chapter 2.

The main goal of this research was to extend the work conducted by A. J. Meir and Paul G. Schmidt on the numerical solution of the velocity-current MHD equations on three-dimensional domains by implementing a finite element method using parallel computing on a high-performance, distributed-memory cluster. To provide some motivation for this effort, consider the flow of a fluid contained in a simple domain such as the unit cube. If we partition the unit cube into 100 segments in each coordinate direction (101 grid points per direction) to create a 100-by-100-by-100 mesh, and we apply the finite element method on this mesh, then the resulting linear system of equations has a number of unknowns well into the millions. We are interested in applying our method to more complicated and larger domains, as well as extending our work to include further multiphysics; the number of unknowns could then reach into the billions. We note that we are solving a nonlinear system of partial differential equations using the finite element method and a Picard iteration scheme, and therefore must solve a system of this size repeatedly until convergence. As might be expected, a large portion of the computational time is spent solving the linear system, which is common in scientific computing [99, 94].

1.2 Solver

We therefore wish to set up and efficiently solve a linear system $Ax = b$. Two general choices for solving a linear system are a direct solver and an iterative solver. A direct solver can be viewed as a form of Gaussian elimination performed on the system to decompose the matrix into lower and upper triangular factors, $A = LU$. Many iterative solvers can be viewed as projection methods [85, 95], in which the iterative solution is obtained by projecting the exact solution of the system onto a particular subspace using a technique such as least squares.
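To make the direct-solver viewpoint concrete, the following is a minimal sketch of an in-place LU factorization without pivoting for a small dense matrix, followed by the forward and back substitutions used to solve $LUx = b$. This is an illustration only, not the production code of this work (which relies on the distributed solvers available through deal.II and Trilinos); the function names lu_factor and lu_solve are ad hoc. The triply nested loop in the factorization is the source of the $O(n^3)$ operation count derived below.

#include <vector>
#include <cstddef>

// In-place Doolittle LU factorization without pivoting: after the call,
// the strict lower triangle of A holds L (unit diagonal implied) and the
// upper triangle holds U.  A is stored row-major as an n-by-n matrix.
// Illustrative sketch only; a production code would pivot and exploit
// sparsity or bandedness.
void lu_factor(std::vector<double>& A, std::size_t n)
{
    for (std::size_t k = 0; k < n; ++k)              // pivot column
        for (std::size_t i = k + 1; i < n; ++i)      // rows below the pivot
        {
            const double m = A[i*n + k] / A[k*n + k];   // multiplier L(i,k)
            A[i*n + k] = m;
            for (std::size_t j = k + 1; j < n; ++j)     // row update
                A[i*n + j] -= m * A[k*n + j];
        }
}

// Solve LUx = b by forward substitution (Ly = b) and back substitution (Ux = y).
std::vector<double> lu_solve(const std::vector<double>& A,
                             std::vector<double> b, std::size_t n)
{
    for (std::size_t i = 1; i < n; ++i)              // forward substitution
        for (std::size_t j = 0; j < i; ++j)
            b[i] -= A[i*n + j] * b[j];
    for (std::size_t i = n; i-- > 0; )               // back substitution
    {
        for (std::size_t j = i + 1; j < n; ++j)
            b[i] -= A[i*n + j] * b[j];
        b[i] /= A[i*n + i];
    }
    return b;
}

For the banded matrices that arise from finite element discretizations, the same elimination touches only entries within the band, which is how the $O(nb^2)$ count quoted below comes about.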
As our linear system is large, our interest is in the operations and memory required by such solvers. We summarize from a serial point of view.

The operation count for Gaussian elimination on an $n \times n$ matrix $A$, excluding row interchanges, is $O(n^3)$. Indeed, the pivot on the main diagonal element at the (1,1) position results in $n$ multiplications across the first row. The elimination (zeroing) of the element below the pivot in the first column requires a multiplication of the first row ($n$ flops) and a row addition ($n$ flops). Performing elimination for each of the $n-1$ rows below the pivot results in a total of $n + (n+n)(n-1) = n + 2n(n-1)$ flops (floating point operations). The total number of operations across all columns can be written as a series:

\[
\sum_{i=0}^{n-1} \Big[ (n-i) + 2(n-i)(n-1-i) \Big]
= \sum_{i=1}^{n} \big[ i + 2i(i-1) \big]
= 2\sum_{i=1}^{n} i^2 - \sum_{i=1}^{n} i
= \frac{n(n+1)(2n+1)}{3} - \frac{n(n+1)}{2}
= O(n^3).
\]

To solve the system $Ax = b$, the same operations are applied to the system's right-hand side $b$, creating a column vector we call $c$ and requiring

\[
n + \sum_{i=1}^{n-1} 2i = n(n-1) + n = O(n^2)
\]

floating point operations. Upon completion, the system of equations has been transformed to $Ux = c$. To obtain the solution $x$, a backsolve is performed, starting from the lowest row of $U$ and working upward. The number of operations to solve for the $i$th element of $x$ is one multiplication and $i-1$ additions. The complexity for solving all the elements of $x$, the entire backsolve, is

\[
\sum_{i=1}^{n} 1 + \sum_{i=1}^{n} (i-1) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} = O(n^2).
\]

The order of operations for the entire solution process is therefore $O(n^3)$. We note that bands in the system matrix are common for numerical approximation techniques for solutions of partial differential equations, such as the finite element method, and direct solvers take advantage of this matrix structure. For a banded matrix with $b$ nonzero diagonals below the main diagonal, the complexity of solving with Gaussian elimination is lower and depends on the band size, $O(nb^2)$ [44].

To compare the number of operations between a direct and an iterative solver, we take as an example a second-order PDE discretized using the finite element method over an irregular three-dimensional grid to obtain the system of equations $Ax = b$. A typical bandwidth for this problem is $b = O(n^{2/3})$ [97]. The number of operations using the banded direct solver would therefore be $O(nb^2) = O(n^{7/3})$. For an iterative solver, the symmetry and positive definiteness of the system matrix make the conjugate gradient (CG) method a natural choice. Per iteration, the number of floating point operations is $O(n)$; see Appendix C. To estimate the number of operations for the complete iterative solve, we use an error bound after $l$ iterations of the method, given in the following theorem [44].

Theorem 1.1 Suppose that $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite and $b \in \mathbb{R}^n$. The error after $l$ iterations of the CG algorithm in solving the system $Ax = b$ can be bounded as follows:

\[
\| x - x_l \|_A \le 2 \left( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^{l} \| x - x_0 \|_A,
\]

where $\kappa(A)$ is the condition number of $A$.

For discretized second-order PDEs, it can be shown that $\kappa(A) = O(n^{2/3})$ [97]. Setting the iterative solver error tolerance to $\epsilon$, we estimate the required number of iterations $l$ to achieve the tolerance by using the error estimate from the theorem above. Here we have assumed that $n$ is large and used $\kappa(A) = O(n^{2/3})$ to simplify:

\[
\left( \frac{1 - 1/\sqrt{\kappa(A)}}{1 + 1/\sqrt{\kappa(A)}} \right)^{l}
\approx \left( 1 - \frac{2}{\sqrt{\kappa(A)}} \right)^{l}
\]
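The conjugate gradient iteration referred to above (and described in Appendix C) is simple to sketch. The following is a minimal, unpreconditioned implementation for illustration only; it is not the production solver of this work, which uses the Krylov solvers provided by Trilinos, and the function name conjugate_gradient is ad hoc. The matrix enters only through a matrix-vector product, which is why the cost per iteration is $O(n)$ for a sparse matrix with $O(n)$ nonzero entries.

#include <vector>
#include <cmath>
#include <functional>

// Unpreconditioned conjugate gradient method for a symmetric positive
// definite system Ax = b.  The matrix is supplied only through a
// matrix-vector product, so a sparse matrix with O(n) nonzeros costs
// O(n) per iteration.  Illustrative sketch only.
std::vector<double> conjugate_gradient(
    const std::function<std::vector<double>(const std::vector<double>&)>& A_times,
    const std::vector<double>& b,
    double tol = 1e-10,
    unsigned max_iter = 1000)
{
    const std::size_t n = b.size();
    std::vector<double> x(n, 0.0);        // initial guess x0 = 0
    std::vector<double> r = b;            // residual r0 = b - A x0 = b
    std::vector<double> p = r;            // initial search direction
    auto dot = [](const std::vector<double>& u, const std::vector<double>& v) {
        double s = 0.0;
        for (std::size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
        return s;
    };
    double rr = dot(r, r);
    for (unsigned k = 0; k < max_iter && std::sqrt(rr) > tol; ++k)
    {
        const std::vector<double> Ap = A_times(p);
        const double alpha = rr / dot(p, Ap);        // step length
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];                    // update solution
            r[i] -= alpha * Ap[i];                   // update residual
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;             // conjugacy coefficient
        for (std::size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];               // new search direction
        rr = rr_new;
    }
    return x;
}

Under the assumptions above, $\kappa(A) = O(n^{2/3})$ gives an iteration count $l$ on the order of $\sqrt{\kappa(A)} = O(n^{1/3})$ (up to a factor depending on the tolerance $\epsilon$), so the total work of the iterative solve is roughly $O(n \cdot n^{1/3}) = O(n^{4/3})$, which compares favorably with the $O(n^{7/3})$ banded direct solve.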