An Implementation of the Finite Element Method for the Velocity-Current
Magnetohydrodynamics Equations
by
K. Daniel Brauss
A dissertation submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
Auburn, Alabama
August 2, 2014
Keywords: magnetohydrodynamics, finite element method, numerical solution
Copyright 2014 by K. Daniel Brauss
Approved by
A. J. Meir, Chair, Professor of Mathematics and Statistics
Paul G. Schmidt, Professor of Mathematics and Statistics
Yanzhao Cao, Professor of Mathematics and Statistics
Dmitry Glotov, Professor of Mathematics and Statistics
Abstract
The aim of this research is to extend the numerical results in [78] for the velocity-current
magnetohydrodynamics formulation proposed by A. J. Meir and Paul G. Schmidt in [74],
which models a stationary flow of a conducting fluid in a bounded domain. A parallel finite
element algorithm was successfully implemented on a high-performance, distributed-memory
computing architecture at the Alabama Supercomputer Center (ASC) using the freely
available, open-source academic and government libraries deal.II, p4est, and Trilinos.
Extending the work of Elman, Silvester, and Wathen [37] for the Navier-Stokes equations,
a Schur complement preconditioner was developed for the current saddle-point problem so
that the iterative Krylov subspace solver GMRES (generalized minimal residual method)
could be used to solve the large linear systems of equations arising from mesh refinement.
To simplify and lower the operation cost of forming the preconditioner, spectral equivalence
was established between the Schur complement and a mass matrix. The resulting C++ code
was tested successfully on problems from [78], with similar results.
Acknowledgments
I have to thank God for the opportunity to continue my studies here at Auburn University. I
have to thank my wife Minvera Rosario Brauss Blanco for her unwavering support. Without
my wife, I would not have made it this far. Thank you Rose. I thank Eliana Catherine
Brauss, my daughter, for being here and filling our lives. I thank all of my family for their
support with all that we do. I thank all of my teachers for the passion that they have given
to me. I thank Dr. A. J. Meir for his much appreciated support, patience and guidance as
my Ph.D. advisor. Dr. Meir is a positive influence and a well-respected role model. I have
learned a deeper independence from him that has helped me to grow. I thank Dr. Paul G.
Schmidt for his support and his courses in functional analysis. Thanks to my other committee
members Dr. Yanzhao Cao and Dr. Dmitry Glotov. Lastly, thanks to Dr. Narendra Govil
and Dr. Geraldo DeSouza.
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Preconditioner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Weak Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3 Matrix System and Preconditioner . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 Matrix Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Finite Element Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Preconditioner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Computational Experiments and Results . . . . . . . . . . . . . . . . . . . . . . 47
4.1 Introduction to the Computational Environment . . . . . . . . . . . . . . . . 47
4.1.1 Choice of the Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.1.2 Working with the Libraries . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 Example 1 - A Problem with an Exact Solution . . . . . . . . . . . . . . . . 51
4.3 Example 2 - An Applied Problem . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3.1 Preconditioning Results . . . . . . . . . . . . . . . . . . . . . . . . . 58
5 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Torus Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3 Code Speedup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
A Weak Formulation and Identities . . . . . . . . . . . . . . . . . . . . . . . . . . 72
A.1 Navier-Stokes Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
A.1.1 Pressure Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
A.1.2 Diffusion Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
A.1.3 Advection Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
A.2 Ohm's Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
B Newton's Method Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
C Iterative Solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
C.1 Conjugate Gradient (CG) Method . . . . . . . . . . . . . . . . . . . . . . . . 89
C.2 Generalized Minimal Residual (GMRES) Method . . . . . . . . . . . . . . . 91
D Exact Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
List of Figures
2.1 Newton's Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1 2-D Example of Q1 Basis Function for Pressure . . . . . . . . . . . . . . . . . . 30
3.2 2-D Example of Q2 Basis Function for Velocity . . . . . . . . . . . . . . . . . . 30
4.1 Schafer Turek Benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Example 1 for Cube Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3 Example 1 for L-shaped Domain . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.4 Example 2 for Cube Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.5 Example 2 for L-shaped Domain . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.6 Eigenvalues for System Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.7 Eigenvalues for Preconditioned Stokes-Ohms System . . . . . . . . . . . . . . . 60
4.8 Eigenvalues for Preconditioned Stokes-Ohms System with Inertial Term . . . . . 61
4.9 Eigenvalues for Preconditioned System Matrix . . . . . . . . . . . . . . . . . . . 61
5.1 Example 1 for L-shaped Domain . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2 Example 2 for Torus Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.3 Example 2 for Full Torus Domain . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.4 CPU Time Versus Number of Processors . . . . . . . . . . . . . . . . . . . . . . 68
5.5 CPU Time Versus Number of Processors . . . . . . . . . . . . . . . . . . . . . . 70
List of Tables
4.1 Errors and Convergence Rates for the Unit Cube Domain . . . . . . . . . . . . 53
4.2 Errors and Convergence Rates for the L-shaped Domain . . . . . . . . . . . . 53
5.1 Errors and Convergence Rates for the Quarter Torus Domain . . . . . . . . . . 63
Chapter 1
Introduction
1.1 Overview
Scientific theories often have a quantitative, mathematical nature to them, such as a well-
founded set of governing equations developed through experimental tests and results. The
equations usually depict a variable, quantity, or quantities that are of scientific interest and
may be a direct measure of the success of a process or system (e.g., velocity in mass transfer)
or an indirect measure of a system's state (e.g., pressure or concentration profiles indicating
compression or swelling). Models whose governing equations involve complex multiphysics
and systems of partial differential equations often arise in physics and engineering. Advances
in scientific computing make it possible to numerically approximate solutions for the quantities
of interest in such equations with increased accuracy and fewer simplifying assumptions.
Numerically approximating and visualizing solutions to complex multiphysics models can
then allow a scientist to investigate and analyze a process, assess its conditions, and address
possible optimizations. With growth in computing, the area of computational science and
applied mathematics has established itself as the third pillar of scientific investigation [31].
By way of visualization, scientific computing can offer a deep understanding of the phenomena
being scientifically investigated, as well as of the limits of our theory and understanding of
natural phenomena [23].

In this research, our interests are in the predictions of scientific quantities according to
governing equations for electrically conducting fluid flow. We are interested in the dynamics
of electrically conductive fluids in the presence of electromagnetic fields, placing our research
in the area of magnetohydrodynamics. We study viscous, incompressible, homogeneous,
conductive fluids, and therefore the governing incompressible Navier-Stokes equations [69]
\[
\rho\,\frac{\partial u}{\partial t} - \eta\,\Delta u + \rho\,(u\cdot\nabla)u + \nabla p = F + J\times B,
\qquad
\nabla\cdot u = 0.
\]
We consider the Lorentz force $J\times B$ to act on the fluid by way of electric and magnetic
fields governed by the Maxwell equations [51, 38] and Ohm's law:
\[
\nabla\cdot E = \frac{q}{\epsilon_0} \quad \text{(Gauss's law for the electric field)},
\]
\[
\nabla\cdot B = 0 \quad \text{(Gauss's law for the magnetic field)},
\]
\[
\nabla\times E = -\frac{\partial B}{\partial t} \quad \text{(Faraday's law)},
\]
\[
\nabla\times B = \mu_0\Bigl(J + \epsilon_0\,\frac{\partial E}{\partial t}\Bigr) \quad \text{(Amp\`ere--Maxwell law)},
\]
and
\[
J = \sigma\,(E + u\times B) \quad \text{(Ohm's law)}.
\]
The quantities of interest are the current $J$, velocity $u$, pressure $p$, electric potential $\phi$, and
magnetic field $B$.
Magnetohydrodynamics (MHD) is the study of the dynamics of electrically conducting fluids
under electromagnetic fields. Our focus is on the interaction of electrically conducting fluids
with electromagnetic fields at the macroscopic level [29]. Applications modeled by such
equations arise in liquid metal solidification processes [77, 83, 84, 41, 42, 70, 90, 49, 79,
32, 18] and in the silicon crystal growth processes used in the semiconductor and microelectronics
industry [50, 73, 56, 55] to hold, liquefy, and stir melts, control undesired convection, and filter
out impurities. Other example applications are electromagnetic pumping, nuclear reactor
cooling, and plasmas in nuclear fusion and propulsion devices [66, 54, 76, 52, 62, 57, 3, 72, 71,
88]. We note that these studies address grand challenges identified in the National Academy of
Engineering's (NAE) 21st Century Grand Challenges for Engineering [82] and the five Grand
Challenges from the Basic Energy Sciences Advisory Committee (BESAC) [39, 58] of the
U.S. Department of Energy (DOE). One particular grand challenge from the NAE is providing
energy from thermonuclear fusion, the harnessing of the sun's energy. The above research
also addresses the main theme of the BESAC grand challenges: designing and controlling
material processes.
The velocity-current formulation of the MHD equations that we study was proposed by A. J.
Meir and Paul G. Schmidt [74]. They established the existence and uniqueness of a solution
to the equations, as well as a convergent finite element iterative numerical approximation
method [78]. The equations are a stationary (steady-state) form of the equations above and
will be reviewed in Chapter 2.
The main goal of this research was to extend the work conducted by A. J. Meir and Paul
G. Schmidt on the numerical solution of the velocity-current MHD equations on three-
dimensional domains by implementing a finite element method using parallel computing
and a high-performance distributed-memory cluster. To provide some motivation for this
effort, consider the flow of a fluid contained in a simple domain such as the unit cube.
If we partition the unit cube in each coordinate direction into 100 segments to create a
100-by-100-by-100 mesh, and we perform the finite element method on this mesh,
then the resulting linear system of equations will have unknowns numbering well into the
millions. We are interested in applying our method to more complicated and larger domains,
as well as extending our work to include further multiphysics; the number of unknowns could
then reach into the billions. We note that we are solving a nonlinear system of partial
differential equations using the finite element method and a Picard iteration scheme, and
therefore must solve a system of this size repeatedly until convergence. As might be
expected, a large portion of the computational time is spent solving the linear system, as is
common in scientific computing [99, 94].
1.2 Solver
We therefore wish to set up and efficiently solve a linear system $Ax = b$. Two general
choices for solving a linear system are a direct solver and an iterative solver. A direct solver
can be viewed as a form of Gaussian elimination performed on the system to decompose
the matrix into lower $L$ and upper $U$ triangular factors, $A = LU$. Many iterative solvers can be
viewed as projection methods [85, 95], where the iterative solution to the system is obtained
by projecting the exact solution onto a particular subspace using a technique
such as least squares. As our linear system is large, our interest is in the operations and
memory required by such solvers. We summarize from a serial point of view.
The operation count for Gaussian elimination on an $n\times n$ matrix $A$, excluding row interchanges,
is $O(n^3)$. Indeed, the pivot on the main diagonal element at the $(1,1)$ position
results in $n$ multiplications across the first row. The elimination (zeroing) of the element
below the pivot in the first column requires a multiplication of the first row ($n$ flops) and
a row addition ($n$ flops). Performing elimination for each of the $n-1$ rows below the pivot
results in a total of $n + (n+n)(n-1) = n + 2n(n-1)$ flops (floating point operations). The
total number of operations across all columns can be written as a series:
\[
\sum_{i=0}^{n-1}\Bigl[(n-i) + 2(n-i)(n-1-i)\Bigr]
= \sum_{i=1}^{n}\bigl[i + 2i(i-1)\bigr]
= 2\sum_{i=1}^{n} i^2 - \sum_{i=1}^{n} i
= \frac{n(n+1)(2n+1)}{3} - \frac{n(n+1)}{2} = O(n^3).
\]
To solve the system $Ax = b$, the same operations are applied to the system's right-hand side
$b$, creating a column vector we call $c$ and requiring
\[
\sum_{i=1}^{n-1} 2(n-i) = 2\sum_{i=1}^{n-1} i = n(n-1) = O(n^2)
\]
floating point operations. Upon completion, the system of equations has been transformed
to $Ux = c$. To obtain the solution $x$, a backsolve is performed, starting from the last row
of $U$ and working upward. The number of operations to solve for the $i$th element of $x$
(counting from the bottom) is one multiplication and $i-1$ additions. The complexity of
solving for all the elements of $x$, the entire backsolve, is
\[
\sum_{i=1}^{n} 1 + \sum_{i=1}^{n} (i-1) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} = O(n^2).
\]
The order of operations for the entire solution process is therefore $O(n^3)$. We note that banded
system matrices are common in numerical approximation techniques for solutions of
partial differential equations, such as the finite element method, and direct solvers take
advantage of this matrix structure. For a banded matrix with $b$ nonzero diagonals below the
main diagonal, the complexity of Gaussian elimination is lower and depends
on the band size: $O(nb^2)$ [44].
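As an illustrative check of these counts, the sketch below (pure Python; the one-flop-per-multiply-or-add accounting is an assumption of the demo, not a statement about any particular solver) counts forward-elimination flops for dense and banded matrices and reproduces the $O(n^3)$ versus $O(nb^2)$ behavior.

```python
# Sketch: count the floating point operations performed by naive Gaussian
# elimination on a dense n x n system versus a banded one, illustrating the
# O(n^3) and O(n*b^2) estimates from the text. One flop per multiply or add
# is an assumed accounting convention for this demo.

def elimination_flops(n, band=None):
    """Count multiply/add flops for forward elimination; 'band' limits the
    nonzero diagonals below (and columns beyond) the main diagonal."""
    flops = 0
    for pivot in range(n - 1):
        # rows with a nonzero entry below this pivot
        last_row = n if band is None else min(n, pivot + 1 + band)
        # columns to the right of the pivot that elimination touches
        last_col = n if band is None else min(n, pivot + 1 + band)
        for row in range(pivot + 1, last_row):
            flops += 1                           # compute the multiplier
            flops += 2 * (last_col - pivot - 1)  # multiply and add per column
    return flops

n = 200
dense = elimination_flops(n)
banded = elimination_flops(n, band=10)
print(dense, banded)   # the dense count grows like n^3, the banded like n*b^2
```

Doubling $n$ roughly multiplies the dense count by $2^3 = 8$, while the banded count grows only linearly in $n$ for a fixed band.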
To compare the number of operations between a direct and an iterative solver, we take as an
example a second-order PDE discretized using the finite element method over an irregular
three-dimensional grid to obtain the system of equations $Ax = b$. A typical bandwidth for
this problem is $b = O(n^{2/3})$ [97]. The number of operations using the banded direct solver
would therefore be $O(nb^2) = O(n^{7/3})$. For an iterative solver, the symmetry and positive
definiteness of the system matrix make the conjugate gradient (CG) method a natural
choice. Per iteration, the number of floating point operations is $O(n)$; see Appendix C. To
estimate the number of operations for the complete iterative solve, we use an error bound
after l iterations of the method given in the following theorem [44].
Theorem 1.1 Suppose that $A\in\mathbb{R}^{n\times n}$ is symmetric positive definite and $b\in\mathbb{R}^n$. The error
after $l$ iterations of the CG algorithm in solving the system $Ax = b$ can be bounded as follows:
\[
\|x - x_l\|_A \le 2\left(\frac{\sqrt{\kappa(A)}-1}{\sqrt{\kappa(A)}+1}\right)^{\!l}\|x - x_0\|_A,
\]
where $\kappa(A)$ is the condition number of $A$.
For discretized second-order PDEs, it can be shown that $\kappa(A) = O(n^{2/3})$ [97]. Setting the
iterative solver error tolerance to $\epsilon$, we estimate the required number of iterations $l$ to achieve
the tolerance by using the error estimate from the theorem above. Here we assume
that $n$ is large and use $\kappa(A) = O(n^{2/3})$ to simplify:
\[
\left(\frac{\sqrt{\kappa(A)}-1}{\sqrt{\kappa(A)}+1}\right)^{\!l}
= \left(\frac{1 - 1/\sqrt{\kappa(A)}}{1 + 1/\sqrt{\kappa(A)}}\right)^{\!l}
\approx \left(1 - \frac{2}{\sqrt{\kappa(A)}}\right)^{\!l} < \epsilon.
\]
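As a rough numerical illustration of this bound, the following sketch computes the number of CG iterations the theorem guarantees for a given tolerance; the model $\kappa(A) = n^{2/3}$ (with constant 1) is an assumption of the demo.

```python
import math

# Sketch: smallest iteration count l guaranteed by the CG error bound
#   ||x - x_l||_A <= 2 * ((sqrt(k)-1)/(sqrt(k)+1))**l * ||x - x_0||_A
# to reduce the A-norm error by a factor eps, assuming kappa(A) = n**(2/3)
# (a model assumption for discretized second-order PDEs, per the text).

def cg_iterations(kappa, eps):
    """Smallest l with 2*((sqrt(kappa)-1)/(sqrt(kappa)+1))**l <= eps."""
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return math.ceil(math.log(eps / 2.0) / math.log(rho))

for n in (10**3, 10**6, 10**9):
    kappa = n ** (2.0 / 3.0)
    print(n, cg_iterations(kappa, 1e-8))   # grows roughly like n**(1/3)
```

The guaranteed iteration count grows like $\sqrt{\kappa(A)} = O(n^{1/3})$, which combined with $O(n)$ flops per iteration gives the $O(n^{4/3})$-type totals that favor iterative solvers for large $n$.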
Problem P: Given parameters $\eta, \rho, \sigma, \mu > 0$ and data
\[
F \in H^{-1}(\Omega), \qquad E \in L^2(\Omega),
\]
\[
g \in H^{1/2}(\Gamma) \ \text{with} \ \int_\Gamma g\cdot n = 0,
\]
\[
J_{ext} \in L^2(\mathbb{R}^3\setminus\overline\Omega) \ \text{with} \ \nabla\cdot J_{ext} = 0 \ \text{in} \ \mathbb{R}^3\setminus\overline\Omega \ \text{and} \ \int_\Gamma J_{ext}\cdot n = 0,
\]
and
\[
B_{ext} \in W^1(\mathbb{R}^3) \ \text{with} \ \nabla\cdot B_{ext} = 0 \ \text{in} \ \mathbb{R}^3 \ \text{and} \ \nabla\times B_{ext} = 0 \ \text{in} \ \Omega,
\]
find functions $u\in H^1(\Omega)$, $J\in L^2(\Omega)$, $p\in L^2(\Omega)/\mathbb{R}$, and $\phi\in H^1(\Omega)/\mathbb{R}$ such that
\[
-\eta\,\Delta u + \rho\,(u\cdot\nabla)u + \nabla p - J\times B = F \quad\text{in }\Omega,
\]
\[
\sigma^{-1}J + \nabla\phi - u\times B = E \quad\text{in }\Omega,
\]
\[
\nabla\cdot u = 0 \quad\text{and}\quad \nabla\cdot J = 0 \quad\text{in }\Omega,
\]
\[
u = g \quad\text{and}\quad J\cdot n = J_{ext}\cdot n \quad\text{on }\Gamma,
\]
and
\[
B = B_0 + B(J),
\]
\[
B(J)(x) = -\frac{\mu}{4\pi}\int_\Omega \frac{x-y}{|x-y|^3}\times J(y)\,dy,
\]
\[
B_0(x) = B_{ext}(x) - \frac{\mu}{4\pi}\int_{\mathbb{R}^3\setminus\overline\Omega} \frac{x-y}{|x-y|^3}\times J_{ext}(y)\,dy
\]
are satisfied.

Letting $\tilde J\in L^2(\mathbb{R}^3)$ be defined as $J$ in $\Omega$ and $J_{ext}$ in $\mathbb{R}^3\setminus\overline\Omega$, we have $\nabla\cdot\tilde J = 0$ (in the sense of
distributions on $\mathbb{R}^3$) and $B = B_{ext} + \tilde B$, with $\tilde B$ given by
\[
\tilde B(x) = -\frac{\mu}{4\pi}\int_{\mathbb{R}^3} \frac{x-y}{|x-y|^3}\times \tilde J(y)\,dy
\]
for $x\in\mathbb{R}^3$. $\tilde B\in W^1(\mathbb{R}^3)$ is the unique solution of Maxwell's equations
\[
\nabla\cdot\tilde B = 0 \quad\text{and}\quad \nabla\times\tilde B = \mu\,\tilde J \tag{2.9}
\]
in $W^1(\mathbb{R}^3)$. We require $B_{ext}$ to belong to $W^1(\mathbb{R}^3)$ as well [78].
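To make the Biot-Savart-type operator $B$ concrete, the following sketch approximates $B(J)(x)$ for a constant current density on the unit cube by a midpoint-rule Riemann sum. The uniform grid, the choice $\mu = 1$, and the constant $J$ are assumptions of the demo; this is not the quadrature used in the thesis.

```python
import math

# Sketch (illustrative only): approximate
#   B(J)(x) = -(mu/(4*pi)) * integral_Omega (x-y)/|x-y|^3 x J(y) dy
# by a midpoint-rule Riemann sum over the unit cube, for a constant J.
# The uniform grid and mu = 1 are assumptions for the demo.

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def biot_savart(x, J, n=20, mu=1.0):
    """Midpoint-rule approximation of B(J)(x) for x outside the unit cube."""
    h = 1.0 / n
    B = [0.0, 0.0, 0.0]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                y = ((i + 0.5)*h, (j + 0.5)*h, (k + 0.5)*h)
                r = (x[0]-y[0], x[1]-y[1], x[2]-y[2])
                d = math.sqrt(r[0]**2 + r[1]**2 + r[2]**2)
                c = cross(r, J)
                w = h**3 / d**3
                for m in range(3):
                    B[m] -= mu / (4.0*math.pi) * w * c[m]
    return B

# field at a point outside the cube, for a current in the z-direction
B = biot_savart((2.0, 0.5, 0.5), (0.0, 0.0, 1.0))
print(B)
```

By symmetry the field at this point has no $x$ or $z$ component, and the $y$ component is positive, as expected for a current sheet flowing in $+z$ viewed from the $+x$ side.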
2.2 Weak Formulation
A mixed weak formulation of the original problem is obtained by multiplying the system
equations (2.1)-(2.3) by test functions $v\in H_0^1(\Omega)$, $K\in L^2(\Omega)$, $q\in L^2(\Omega)/\mathbb{R}$, and
$\psi\in H^1(\Omega)/\mathbb{R}$, respectively, and integrating over the domain $\Omega$:
\[
-\eta\int_\Omega \Delta u\cdot v + \rho\int_\Omega \bigl((u\cdot\nabla)u\bigr)\cdot v + \int_\Omega \nabla p\cdot v - \int_\Omega (J\times B)\cdot v = \int_\Omega F\cdot v,
\]
\[
\sigma^{-1}\int_\Omega J\cdot K + \int_\Omega \nabla\phi\cdot K - \int_\Omega (u\times B)\cdot K = \int_\Omega E\cdot K,
\]
\[
\int_\Omega (\nabla\cdot u)\,q = 0 \quad\text{and}\quad \int_\Omega (\nabla\cdot J)\,\psi = 0.
\]
We add the first two equations, and likewise the third and fourth. Using integration by parts, the
divergence theorem, and some identities (see Appendix A), we obtain
\[
\eta\int_\Omega \nabla u : \nabla v
+ \frac{\rho}{2}\int_\Omega \Bigl[\bigl((u\cdot\nabla)u\bigr)\cdot v - \bigl((u\cdot\nabla)v\bigr)\cdot u\Bigr]
- \int_\Omega p\,\nabla\cdot v - \int_\Omega (J\times B)\cdot v
+ \sigma^{-1}\int_\Omega J\cdot K + \int_\Omega \nabla\phi\cdot K + \int_\Omega (K\times B)\cdot u
= \int_\Omega F\cdot v + \int_\Omega E\cdot K,
\]
\[
-\int_\Omega (\nabla\cdot u)\,q + \int_\Omega J\cdot\nabla\psi = \int_\Gamma (J\cdot n)\,\psi.
\]
Combining the Navier-Stokes and Ohm's law equations, and then the two continuity equations, we
can write the system in terms of the multilinear forms
\[
a_0((u,J),(v,K)) + a_1((u,J),(u,J),(v,K)) + b((v,K),(p,\phi)) = \int_\Omega F\cdot v + \int_\Omega E\cdot K,
\]
\[
b((u,J),(q,\psi)) = \int_\Gamma (J_{ext}\cdot n)\,\psi,
\]
where $a_0 : (H^1(\Omega)\times L^2(\Omega)) \times (H^1(\Omega)\times L^2(\Omega)) \to \mathbb{R}$ is defined as
\[
a_0((u,J),(v,K)) = \eta\int_\Omega \nabla u : \nabla v + \sigma^{-1}\int_\Omega J\cdot K
+ \int_\Omega \bigl[(K\times B_0)\cdot u - (J\times B_0)\cdot v\bigr],
\]
$a_1 : (H^1(\Omega)\times L^2(\Omega)) \times (H^1(\Omega)\times L^2(\Omega)) \times (H^1(\Omega)\times L^2(\Omega)) \to \mathbb{R}$ as
\[
a_1((u,J),(u,J),(v,K)) = \frac{\rho}{2}\int_\Omega \Bigl[\bigl((u\cdot\nabla)u\bigr)\cdot v - \bigl((u\cdot\nabla)v\bigr)\cdot u\Bigr]
+ \int_\Omega \bigl[(K\times B(J))\cdot u - (J\times B(J))\cdot v\bigr],
\]
and $b : (H^1(\Omega)\times L^2(\Omega)) \times (L^2(\Omega)/\mathbb{R} \times H^1(\Omega)/\mathbb{R}) \to \mathbb{R}$ as
\[
b((v,K),(p,\phi)) = -\int_\Omega (\nabla\cdot v)\,P(p) + \int_\Omega K\cdot\nabla\phi.
\]
The pressure solution $p$ of the system is unique only up to a constant; the $L^2$
orthogonal projection $P$ has therefore been introduced for uniqueness and is given by
\[
P(f) := f - \frac{1}{|\Omega|}\int_\Omega f.
\]
Hence,
\[
P(p+c) = (p+c) - \frac{1}{|\Omega|}\int_\Omega (p+c)
= p + c - \frac{1}{|\Omega|}\int_\Omega p - \frac{1}{|\Omega|}\int_\Omega c
= p - \frac{1}{|\Omega|}\int_\Omega p
= P(p).
\]
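A quick discrete analogue: on a uniform grid, $P$ subtracts the mean, and adding a constant to $f$ does not change $P(f)$. The sketch below (pure Python, illustrative only) checks both properties.

```python
# Sketch: discrete analogue of the projection P(f) = f - (1/|Omega|) int f
# on a uniform grid, where the integral becomes a mean. Illustrative only.

def project_zero_mean(values):
    """Subtract the mean -- the discrete analogue of P."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

f = [3.0, 1.0, 4.0, 1.0, 5.0]
pf = project_zero_mean(f)
shifted = project_zero_mean([v + 7.5 for v in f])  # P(f + c) == P(f)
print(pf)
```

The projected values sum to zero, and the shifted input produces exactly the same projection, mirroring the $P(p+c) = P(p)$ calculation above.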
To shorten the notation above, define $Y := H^1(\Omega)\times L^2(\Omega)$, $M := L^2(\Omega)/\mathbb{R} \times H^1(\Omega)/\mathbb{R}$,
and $X := H_0^1(\Omega)\times L^2(\Omega)$.

To help establish existence and uniqueness, the inertial term of the Navier-Stokes
equations has been "skew-symmetrized", making the form $a_1((v_0,K_0),(\cdot,\cdot),(\cdot,\cdot))$ skew-
symmetric on $Y\times Y$. The skew-symmetrization rests on the identity
\[
\frac{1}{2}\int_\Omega \Bigl[\bigl((v_1\cdot\nabla)v_2\bigr)\cdot v_3 - \bigl((v_1\cdot\nabla)v_3\bigr)\cdot v_2\Bigr]
= \int_\Omega \bigl((v_1\cdot\nabla)v_2\bigr)\cdot v_3,
\]
which holds whenever $\nabla\cdot v_1 = 0$ and $v_3 = 0$ on $\Gamma$. Given below is a second version of our problem.
Problem P1: Given parameters $\eta, \rho, \sigma, \mu > 0$ and data
\[
F\in H^{-1}(\Omega), \qquad E\in L^2(\Omega),
\]
\[
g\in H^{1/2}(\Gamma) \ \text{with} \ \int_\Gamma g\cdot n = 0,
\]
\[
J_{ext}\in L^2(\mathbb{R}^3\setminus\overline\Omega) \ \text{with} \ \nabla\cdot J_{ext} = 0 \ \text{in} \ \mathbb{R}^3\setminus\overline\Omega \ \text{and} \ \int_\Gamma J_{ext}\cdot n = 0, \ \text{and}
\]
\[
B_{ext}\in W^1(\mathbb{R}^3) \ \text{with} \ \nabla\cdot B_{ext} = 0 \ \text{in} \ \mathbb{R}^3 \ \text{and} \ \nabla\times B_{ext} = 0 \ \text{in} \ \Omega,
\]
find functions $(u,J)\in Y$ with $u = g$ on $\Gamma$ and $(p,\phi)\in M$ such that
\[
a_0((u,J),(v,K)) + a_1((u,J),(u,J),(v,K)) + b((v,K),(p,\phi)) = \int_\Omega F\cdot v + \int_\Omega E\cdot K,
\]
\[
b((u,J),(q,\psi)) = \int_\Gamma (J_{ext}\cdot n)\,\psi
\]
for all $(q,\psi)\in M$ and $(v,K)\in X$. Here $B : L^2(\Omega) \to W^1(\mathbb{R}^3)$, defined by
\[
B(f)(x) := -\frac{\mu}{4\pi}\int_\Omega \frac{x-y}{|x-y|^3}\times f(y)\,dy \qquad \forall\,x\in\mathbb{R}^3,
\]
is a bounded linear operator from $L^2(\Omega)$ to $W^1(\mathbb{R}^3)$ [75], and $B_0$ is the component of the
magnetic field generated by outside sources, as previously defined.
The properties of $a_0$, $a_1$, and $b$ needed to establish existence and uniqueness are collected in
the lemma below.

Lemma 1

1. The forms $a_0$, $a_1$, and $b$ are bounded on $Y\times Y$, $Y\times Y\times Y$, and $Y\times M$, respectively,
with norms
\[
\|a_0\| \le c\,\max\{1,\eta,\sigma^{-1}\}\,\bigl(1 + \|J_{ext}\|_{L^2(\mathbb{R}^3\setminus\overline\Omega)} + \|B_{ext}\|_{L^3(\Omega)}\bigr),
\qquad
\|a_1\| \le c\,\max\{\rho,\mu\}, \qquad \|b\| \le \sqrt{3},
\]
where $c$ depends only on $\Omega$ ($c = c(\Omega)$).

2. The form $a_0$ is positive definite on $X\times X$: there is a number $\alpha \ge c^{-1}\min\{\eta,\sigma^{-1}\}$ such
that
\[
a_0((v,K),(v,K)) \ge \alpha\,\|(v,K)\|_Y^2 \qquad \forall\,(v,K)\in X
\]
(where $c = c(\Omega)$).

3. The form $b$ satisfies the Ladyzhenskaya-Babuska-Brezzi condition (LBB condition) on
$X\times M$: there is a number $\beta = \beta(\Omega) > 0$ such that
\[
\inf_{(q,\psi)\in M}\ \sup_{(v,K)\in X}\ \frac{b((v,K),(q,\psi))}{\|(v,K)\|_Y\,\|(q,\psi)\|_M} \ge \beta.
\]

We note that the two conditions
\[
\inf_{q\in L^2(\Omega)/\mathbb{R}}\ \sup_{v\in H^1(\Omega)}\ \frac{\int_\Omega (\nabla\cdot v)\,q}{\|v\|_{H^1(\Omega)}\,\|q\|_{L^2(\Omega)/\mathbb{R}}} \ge \beta_1
\quad\text{and}\quad
\inf_{\psi\in H^1(\Omega)/\mathbb{R}}\ \sup_{K\in L^2(\Omega)}\ \frac{\int_\Omega K\cdot\nabla\psi}{\|K\|_{L^2(\Omega)}\,\|\psi\|_{H^1(\Omega)/\mathbb{R}}} \ge \beta_2
\]
are sufficient to establish the last part of the lemma [21].
Thanks to the LBB condition, there exist $u_0\in H^1(\Omega)$ and $J_0\in L^2(\Omega)$ [78] such that
\[
\nabla\cdot u_0 = 0 \ \text{in }\Omega, \quad u_0 = g \ \text{on }\Gamma,
\qquad\text{and}\qquad
\nabla\cdot J_0 = 0 \ \text{in }\Omega, \quad J_0\cdot n = j \ \text{on }\Gamma.
\]
Therefore, the problem can be reduced by writing
\[
u = u_0 + \tilde u \qquad\text{and}\qquad J = J_0 + \tilde J,
\]
and by defining the form $a : X\times X\times X \to \mathbb{R}$ and the functional $l\in X^*$ by
\[
a((v_1,K_1),(v_2,K_2),(v_3,K_3)) := a_0((v_2,K_2),(v_3,K_3)) + a_1((v_1,K_1),(v_2,K_2),(v_3,K_3))
+ a_1((v_2,K_2),(u_0,J_0),(v_3,K_3)) + a_1((u_0,J_0),(v_2,K_2),(v_3,K_3))
\]
and
\[
l(v,K) := \int_\Omega F\cdot v + \int_\Omega E\cdot K
- a_0((u_0,J_0),(v,K)) - a_1((u_0,J_0),(u_0,J_0),(v,K)).
\]
With the space $V := \{(v,K)\in X : b((v,K),(q,\psi)) = 0 \ \forall\,(q,\psi)\in M\}$, we have a final
version of the problem.

Problem P2: Find $(\tilde u,\tilde J)\in V$ such that
\[
a\bigl((\tilde u,\tilde J),(\tilde u,\tilde J),(v,K)\bigr) = l(v,K)
\]
for all $(v,K)\in V$.
We refer to [78] for the equivalence of the problem versions. To establish existence and uniqueness
we have the lemma and theorem below.

Lemma 2

1. The mapping $(v,K) \mapsto a((v,K),(v,K),(v_0,K_0))$, for any $(v_0,K_0)\in V$, is weakly
sequentially continuous on $V$.

2. For every $(v_0,K_0)\in V$ and all $(v,K)\in V$, we have
\[
a((v_0,K_0),(v,K),(v,K)) \ge \bigl(\alpha - \gamma\,\|a_1\|\,\|(g,j)\|\bigr)\,\|(v,K)\|_Y^2,
\]
where $\alpha$ and $\gamma$ are constants established in [78] and $\|(g,j)\|$ denotes the norm of $(g,j)\in
H^{1/2}(\Gamma)\times H^{-1/2}(\Gamma)$.

3. The mapping $(v_0,K_0) \mapsto a((v_0,K_0),(\cdot,\cdot),(\cdot,\cdot))$ is uniformly Lipschitz continuous,
with Lipschitz constant $\|a_1\|$, from $V$ into the space $L(V,V^*)$ of bounded linear operators
from $V$ into $V^*$.

Theorem 2.1 Let $N = N(F,E,g,J_{ext},B_{ext})$ denote the norm of the functional $l\in V^*$. Let
$\|(g,j)\|$ denote the norm of $(g,j)$ in $H^{1/2}(\Gamma)\times H^{-1/2}(\Gamma)$, where $j = J_{ext}\cdot n$, and let
$\alpha$ and $\gamma$ be the constants mentioned above. For Problem P2 we have the following:

1. If $\|(g,j)\| < \alpha/(\gamma\,\|a_1\|)$, then there exists at least one solution $(\tilde u,\tilde J)$, and it satisfies
\[
\|(\tilde u,\tilde J)\|_Y \le \frac{N}{\alpha - \gamma\,\|a_1\|\,\|(g,j)\|}.
\]

2. If, in addition, all the data are sufficiently small, then the solution is unique.

The theorem asserts the existence of a solution for all versions of the problem if the boundary
data $g$ and $j = J_{ext}\cdot n$ are sufficiently small, and uniqueness is guaranteed if all the data
$F, E, g, J_{ext}$, and $B_{ext}$ are sufficiently small, with the constants $\alpha$, $\gamma$, and $\|a_1\|$ independent
of the data [78].
In steps similar to those for the infinite-dimensional problem, existence and uniqueness were
established for a finite-dimensional approximation of the problem [78]. In that direction, we
introduce the notation $B$ for a Banach space and $(B^h)_{h\in I}$ for a family of finite-dimensional
subspaces of $B$, where $I$ is a subset of the interval $(0,1)$ such that $0$ is its only limit point.
$(B^h)_{h\in I}$ is said to be a finite-dimensional approximation of $B$ if, for every $f\in B$, we have
$\inf_{f^h\in B^h} \|f - f^h\|_B \to 0$ as $h\to 0$.

To continue, we assume that $(Y_1^h)_{h\in I}$, $(Y_2^h)_{h\in I}$, $(M_1^h)_{h\in I}$, and $(M_2^h)_{h\in I}$ are finite-dimensional
approximations of $Y_1 = H^1(\Omega)$, $Y_2 = L^2(\Omega)$, $M_1 = L^2(\Omega)/\mathbb{R}$, and $M_2 = H^1(\Omega)/\mathbb{R}$,
respectively. Therefore, $Y^h := Y_1^h\times Y_2^h$ and $M^h := M_1^h\times M_2^h$ approximate $Y$ and $M$,
respectively. With $X_1 := H_0^1(\Omega)$, $X_2 := Y_2$, and $X := X_1\times X_2$, we define $X_1^h := Y_1^h\cap X_1$,
$X_2^h := Y_2^h$, $X^h := X_1^h\times X_2^h$, and let $Y_{1,\Gamma}^h$ denote the trace space of $Y_1^h$,
\[
Y_{1,\Gamma}^h = \bigl\{\,v^h|_\Gamma : v^h\in Y_1^h\,\bigr\},
\]
a subspace of $Y_{1,\Gamma} := H^{1/2}(\Gamma)$. For further details, please see [78].

With the set of parameters $\eta, \rho, \sigma, \mu$ and data $F, E, g, J_{ext}$, and $B_{ext}$ given, and $j := J_{ext}\cdot n$,
we choose a family $(g^h)_{h\in I}$ of approximate boundary values $g^h\in Y_{1,\Gamma}^h$ such that $g^h\to g$ in
$Y_{1,\Gamma}$ as $h\to 0$. Finally, we consider a family $P_1^h$ ($h\in I$) of finite-dimensional approximations
of Problem P1, as follows.

Problem $P_1^h$: Find $(u^h,J^h)\in Y^h$ with $u^h = g^h$ on $\Gamma$ and $(p^h,\phi^h)\in M^h$ such that
\[
a_0\bigl((u^h,J^h),(v^h,K^h)\bigr) + a_1\bigl((u^h,J^h),(u^h,J^h),(v^h,K^h)\bigr)
+ b\bigl((v^h,K^h),(p^h,\phi^h)\bigr) = \int_\Omega F\cdot v^h + \int_\Omega E\cdot K^h
\]
and
\[
b\bigl((u^h,J^h),(q^h,\psi^h)\bigr) = \int_\Gamma (J_{ext}\cdot n)\,\psi^h
\]
for all $(q^h,\psi^h)\in M^h$ and $(v^h,K^h)\in X^h$. With existence and uniqueness established for the
finite-dimensional problem, we consider a convergent finite element iteration scheme [78] to
numerically approximate the infinite-dimensional solution of the problem.
2.3 Linearization

We use Newton's method
\[
X^{(k+1)} = X^{(k)} - \bigl[H'(X^{(k)})\bigr]^{-1} H\bigl(X^{(k)}\bigr) \tag{2.10}
\]
Figure 2.1: Newton's Method.
as a heuristic tool to develop the convergent iterative scheme introduced in [78]. The
vector function $H(X)$,
\[
H(X) =
\begin{bmatrix}
-\eta\,\Delta u_1 + \rho\,(u\cdot\nabla)u_1 + \partial p/\partial x - (J_2 B_3 - J_3 B_2) - F_1\\
-\eta\,\Delta u_2 + \rho\,(u\cdot\nabla)u_2 + \partial p/\partial y + (J_1 B_3 - J_3 B_1) - F_2\\
-\eta\,\Delta u_3 + \rho\,(u\cdot\nabla)u_3 + \partial p/\partial z - (J_1 B_2 - J_2 B_1) - F_3\\
\sigma^{-1} J_1 + \partial\phi/\partial x - (u_2 B_3 - u_3 B_2) - E_1\\
\sigma^{-1} J_2 + \partial\phi/\partial y + (u_1 B_3 - u_3 B_1) - E_2\\
\sigma^{-1} J_3 + \partial\phi/\partial z - (u_1 B_2 - u_2 B_1) - E_3\\
\partial u_1/\partial x + \partial u_2/\partial y + \partial u_3/\partial z\\
\partial J_1/\partial x + \partial J_2/\partial y + \partial J_3/\partial z
\end{bmatrix},
\]
and the variable $X = (u_1,u_2,u_3,p,J_1,J_2,J_3,\phi)$ were obtained from the equations
\[
-\eta\,\Delta u + \rho\,(u\cdot\nabla)u + \nabla p - J\times B = F,
\]
\[
\sigma^{-1} J + \nabla\phi - u\times B = E,
\]
\[
\nabla\cdot u = 0 \quad\text{and}\quad \nabla\cdot J = 0.
\]
Since
\[
B = B_0 + B(J)
= B_0 - \frac{\mu}{4\pi}\int_\Omega \frac{x-y}{|x-y|^3}\times J(y)\,dy
= \begin{bmatrix}
(B_0)_1 - \dfrac{\mu}{4\pi}\displaystyle\int_\Omega \dfrac{(x_2-y_2)J_3(y)-(x_3-y_3)J_2(y)}{|x-y|^3}\,dy\\[8pt]
(B_0)_2 + \dfrac{\mu}{4\pi}\displaystyle\int_\Omega \dfrac{(x_1-y_1)J_3(y)-(x_3-y_3)J_1(y)}{|x-y|^3}\,dy\\[8pt]
(B_0)_3 - \dfrac{\mu}{4\pi}\displaystyle\int_\Omega \dfrac{(x_1-y_1)J_2(y)-(x_2-y_2)J_1(y)}{|x-y|^3}\,dy
\end{bmatrix},
\]
we have, componentwise,
\begin{align*}
H_1(X) &= -\eta\,\Delta u_1 + \rho\,(u\cdot\nabla)u_1 + \frac{\partial p}{\partial x}
- J_2\Bigl[(B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2-(x_2-y_2)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad + J_3\Bigl[(B_0)_2 + \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_3-(x_3-y_3)J_1}{|x-y|^3}\,dy\Bigr] - F_1,\\
H_2(X) &= -\eta\,\Delta u_2 + \rho\,(u\cdot\nabla)u_2 + \frac{\partial p}{\partial y}
+ J_1\Bigl[(B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2-(x_2-y_2)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad - J_3\Bigl[(B_0)_1 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_2-y_2)J_3-(x_3-y_3)J_2}{|x-y|^3}\,dy\Bigr] - F_2,\\
H_3(X) &= -\eta\,\Delta u_3 + \rho\,(u\cdot\nabla)u_3 + \frac{\partial p}{\partial z}
- J_1\Bigl[(B_0)_2 + \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_3-(x_3-y_3)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad + J_2\Bigl[(B_0)_1 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_2-y_2)J_3-(x_3-y_3)J_2}{|x-y|^3}\,dy\Bigr] - F_3,\\
H_4(X) &= \sigma^{-1}J_1 + \frac{\partial\phi}{\partial x}
- u_2\Bigl[(B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2-(x_2-y_2)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad + u_3\Bigl[(B_0)_2 + \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_3-(x_3-y_3)J_1}{|x-y|^3}\,dy\Bigr] - E_1,\\
H_5(X) &= \sigma^{-1}J_2 + \frac{\partial\phi}{\partial y}
+ u_1\Bigl[(B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2-(x_2-y_2)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad - u_3\Bigl[(B_0)_1 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_2-y_2)J_3-(x_3-y_3)J_2}{|x-y|^3}\,dy\Bigr] - E_2,\\
H_6(X) &= \sigma^{-1}J_3 + \frac{\partial\phi}{\partial z}
- u_1\Bigl[(B_0)_2 + \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_3-(x_3-y_3)J_1}{|x-y|^3}\,dy\Bigr]\\
&\qquad + u_2\Bigl[(B_0)_1 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_2-y_2)J_3-(x_3-y_3)J_2}{|x-y|^3}\,dy\Bigr] - E_3,\\
H_7(X) &= \frac{\partial u_1}{\partial x} + \frac{\partial u_2}{\partial y} + \frac{\partial u_3}{\partial z},\\
H_8(X) &= \frac{\partial J_1}{\partial x} + \frac{\partial J_2}{\partial y} + \frac{\partial J_3}{\partial z}.
\end{align*}
Differentiating $H(X)$ using the Gateaux derivative
\[
H_G'\bigl(X^{(k)}\bigr)(\delta X) = \lim_{\epsilon\to 0}\frac{H\bigl(X^{(k)}+\epsilon\,\delta X\bigr) - H\bigl(X^{(k)}\bigr)}{\epsilon}, \tag{2.11}
\]
where $\delta X = X^{(k+1)} - X^{(k)}$ and $\delta X = (\delta u_1, \delta u_2, \delta u_3, \delta p, \delta J_1, \delta J_2, \delta J_3, \delta\phi)$, we obtain
(see Appendix B) the Newton iteration
\[
\begin{bmatrix}
-\eta\,\Delta u^{(k+1)} + \rho\bigl[(u^{(k+1)}\cdot\nabla)u^{(k)} - (u^{(k)}\cdot\nabla)u^{(k)} + (u^{(k)}\cdot\nabla)u^{(k+1)}\bigr]\\
\quad{}+ \nabla p^{(k+1)} - J^{(k)}\times B\bigl(J^{(k+1)}\bigr) + J^{(k)}\times B\bigl(J^{(k)}\bigr) - J^{(k+1)}\times B\bigl(J^{(k)}\bigr)\\[4pt]
\sigma^{-1}J^{(k+1)} + \nabla\phi^{(k+1)} - u^{(k)}\times B\bigl(J^{(k+1)}\bigr) + u^{(k)}\times B\bigl(J^{(k)}\bigr) - u^{(k+1)}\times B\bigl(J^{(k)}\bigr)\\[4pt]
\nabla\cdot u^{(k+1)}\\[4pt]
\nabla\cdot J^{(k+1)}
\end{bmatrix}
=
\begin{bmatrix}
F\\ E\\ 0\\ 0
\end{bmatrix}. \tag{2.12}
\]
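As a sanity check of the update (2.10), the scalar sketch below (the toy problem $f(x) = x^3 - 2$ is an assumption of the demo, not a problem from the thesis) shows the fast local convergence that motivates using Newton's method as the starting point for the linearization.

```python
# Sketch: scalar Newton iteration x_{k+1} = x_k - f(x_k)/f'(x_k) on the toy
# problem f(x) = x**3 - 2 (an assumption for the demo), illustrating the
# update (2.10) and its fast local convergence.

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        x = x - fx / fprime(x)   # the Newton update
    return x, max_iter

root, iters = newton(lambda x: x**3 - 2.0, lambda x: 3.0 * x**2, x0=1.0)
print(root, iters)   # converges to 2**(1/3) in a handful of iterations
```

Each step solves a linearized problem built from the Jacobian at the current iterate, exactly the role played by the linearized system (2.12) for the MHD equations.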
Dropping the first two terms from the linearization of each of the three nonlinear terms
of the equations, we obtain a Picard iteration [53]:
\[
\begin{bmatrix}
-\eta\,\Delta u^{(k+1)} + \rho\,(u^{(k)}\cdot\nabla)u^{(k+1)} + \nabla p^{(k+1)} - J^{(k+1)}\times B\bigl(J^{(k)}\bigr)\\[4pt]
\sigma^{-1}J^{(k+1)} + \nabla\phi^{(k+1)} - u^{(k+1)}\times B\bigl(J^{(k)}\bigr)\\[4pt]
\nabla\cdot u^{(k+1)}\\[4pt]
\nabla\cdot J^{(k+1)}
\end{bmatrix}
=
\begin{bmatrix}
F\\ E\\ 0\\ 0
\end{bmatrix}.
\]
We can see that these are the original equations with one argument of each nonlinear term
lagged to create the linearization. The corresponding weak formulation is then
\[
\eta\int_\Omega \nabla u^{(k+1)} : \nabla v
+ \rho\int_\Omega \bigl((u^{(k)}\cdot\nabla)u^{(k+1)}\bigr)\cdot v
- \int_\Omega p^{(k+1)}\,(\nabla\cdot v)
- \int_\Omega \bigl(J^{(k+1)}\times B(J^{(k)})\bigr)\cdot v
- \int_\Omega \bigl(J^{(k+1)}\times B_0\bigr)\cdot v
+ \sigma^{-1}\int_\Omega J^{(k+1)}\cdot K
+ \int_\Omega \nabla\phi^{(k+1)}\cdot K
- \int_\Omega \bigl(u^{(k+1)}\times B(J^{(k)})\bigr)\cdot K
- \int_\Omega \bigl(u^{(k+1)}\times B_0\bigr)\cdot K
= \int_\Omega F\cdot v + \int_\Omega E\cdot K,
\]
\[
-\int_\Omega \bigl(\nabla\cdot u^{(k+1)}\bigr)\,q + \int_\Omega J^{(k+1)}\cdot\nabla\psi = \int_\Gamma (J_{ext}\cdot n)\,\psi.
\]
Recalling the forms
\[
a_0((v_1,K_1),(v_2,K_2)) := \eta\int_\Omega \nabla v_1 : \nabla v_2 + \sigma^{-1}\int_\Omega K_1\cdot K_2
+ \int_\Omega \bigl[(K_2\times B_0)\cdot v_1 - (K_1\times B_0)\cdot v_2\bigr], \tag{2.13}
\]
\[
a_1((v_1,K_1),(v_2,K_2),(v_3,K_3)) := \frac{\rho}{2}\int_\Omega \Bigl[\bigl((v_1\cdot\nabla)v_2\bigr)\cdot v_3 - \bigl((v_1\cdot\nabla)v_3\bigr)\cdot v_2\Bigr]
+ \int_\Omega \bigl[(K_3\times B(K_1))\cdot v_2 - (K_2\times B(K_1))\cdot v_3\bigr], \tag{2.14}
\]
and
\[
b((v,K),(q,\psi)) := -\int_\Omega (\nabla\cdot v)\,P(q) + \int_\Omega K\cdot\nabla\psi, \tag{2.15}
\]
we are now ready to state the convergent Picard iteration scheme proposed by Meir and
Schmidt for the numerical solution of Problem $P_1^h$.

Iteration Scheme. Given $(u^h_{(0)},J^h_{(0)})\in Y^h$ with $u^h_{(0)} = g^h$ on $\Gamma$, for $k = 0, 1, 2, \dots$ find
$(u^h_{(k+1)},J^h_{(k+1)})\in Y^h$ with $u^h_{(k+1)} = g^h$ on $\Gamma$ and $(p^h_{(k+1)},\phi^h_{(k+1)})\in M^h$ such that
\[
a_0\bigl((u^h_{(k+1)},J^h_{(k+1)}),(v^h,K^h)\bigr) + a_1\bigl((u^h_{(k)},J^h_{(k)}),(u^h_{(k+1)},J^h_{(k+1)}),(v^h,K^h)\bigr)
+ b\bigl((v^h,K^h),(p^h_{(k+1)},\phi^h_{(k+1)})\bigr) = \int_\Omega F\cdot v^h + \int_\Omega E\cdot K^h
\]
and
\[
b\bigl((u^h_{(k+1)},J^h_{(k+1)}),(q^h,\psi^h)\bigr) = \int_\Gamma (J_{ext}\cdot n)\,\psi^h
\]
for all $(v^h,K^h)\in X^h$ and $(q^h,\psi^h)\in M^h$. For further details, we refer to [78].
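The structure of the scheme, with each nonlinear form evaluated at the lagged iterate in its first argument, is that of a fixed-point (Picard) iteration: the nonlinear coefficient is frozen at the previous iterate, so each step only requires a linear solve. The toy sketch below (an assumed scalar model problem, not the MHD system) illustrates the lagging idea.

```python
# Sketch: Picard iteration for the scalar model problem (an assumption for
# this demo)  a(x)*x = b  with  a(x) = 2 + x**2,  by lagging the nonlinear
# coefficient:  a(x_k) * x_{k+1} = b, so each step solves a LINEAR problem.

def picard(b, x0=0.0, tol=1e-12, max_iter=200):
    x = x0
    for k in range(max_iter):
        x_new = b / (2.0 + x * x)   # coefficient frozen at the old iterate
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

x, iters = picard(b=3.0)
print(x, iters)   # x solves (2 + x**2) * x = 3, i.e. x = 1
```

Picard iterations typically converge linearly (here with rate about $2/3$ per step), more slowly than Newton's method but with a larger basin of convergence, which is the usual trade-off behind such schemes.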
Chapter 3
Matrix System and Preconditioner
3.1 Matrix Formulation
Restating the convergent Picard iteration scheme with $n$ in place of $k+1$:
\[
a_0\bigl((u^h_{(n)},J^h_{(n)}),(v^h,K^h)\bigr) + a_1\bigl((u^h_{(n-1)},J^h_{(n-1)}),(u^h_{(n)},J^h_{(n)}),(v^h,K^h)\bigr)
+ b\bigl((v^h,K^h),(p^h_{(n)},\phi^h_{(n)})\bigr) = \int_\Omega F\cdot v^h + \int_\Omega E\cdot K^h
\]
and
\[
b\bigl((u^h_{(n)},J^h_{(n)}),(q^h,\psi^h)\bigr) = \int_\Gamma (J_{ext}\cdot n)\,\psi^h
\]
for all $(v^h,K^h)\in X^h$ and $(q^h,\psi^h)\in M^h$. The scheme is expanded using the definitions of
$a_0$, $a_1$, and $b$:
\[
\eta\int_\Omega \nabla u^h_{(n)} : \nabla v^h + \sigma^{-1}\int_\Omega J^h_{(n)}\cdot K^h
+ \int_\Omega \bigl(K^h\times B_0\bigr)\cdot u^h_{(n)}
- \int_\Omega \bigl(J^h_{(n)}\times B_0\bigr)\cdot v^h
+ \rho\int_\Omega \bigl((u^h_{(n-1)}\cdot\nabla)u^h_{(n)}\bigr)\cdot v^h
+ \int_\Omega \bigl(K^h\times B(J^h_{(n-1)})\bigr)\cdot u^h_{(n)}
- \int_\Omega \bigl(J^h_{(n)}\times B(J^h_{(n-1)})\bigr)\cdot v^h
- \int_\Omega p^h_{(n)}\,\nabla\cdot v^h
+ \int_\Omega \nabla\phi^h_{(n)}\cdot K^h
= \int_\Omega F\cdot v^h + \int_\Omega E\cdot K^h,
\]
\[
-\int_\Omega \bigl(\nabla\cdot u^h_{(n)}\bigr)\,q^h + \int_\Omega J^h_{(n)}\cdot\nabla\psi^h = \int_\Gamma (J_{ext}\cdot n)\,\psi^h.
\]
Choosing a basis for each of the finite-dimensional subspaces,
\[
\bigl\{\vec\varphi^{(u)}_k\bigr\}_{k=1}^{n_u^h}\subset Y_1^h, \qquad
\bigl\{\vec\varphi^{(J)}_k\bigr\}_{k=1}^{n_J^h}\subset Y_2^h, \qquad
\bigl\{\varphi^{(p)}_k\bigr\}_{k=1}^{n_p^h}\subset M_1^h, \qquad
\bigl\{\varphi^{(\phi)}_k\bigr\}_{k=1}^{n_\phi^h}\subset M_2^h,
\]
we can write the unknowns of Problem $P_1^h$ as linear combinations of the respective basis
functions:
\[
u^h_{(n)} = \sum_{k=1}^{n_u^h} c^{(u,n)}_k\,\vec\varphi^{(u)}_k, \qquad
p^h_{(n)} = \sum_{k=1}^{n_p^h} c^{(p,n)}_k\,\varphi^{(p)}_k, \qquad
J^h_{(n)} = \sum_{k=1}^{n_J^h} c^{(J,n)}_k\,\vec\varphi^{(J)}_k, \qquad
\phi^h_{(n)} = \sum_{k=1}^{n_\phi^h} c^{(\phi,n)}_k\,\varphi^{(\phi)}_k.
\]
Substituting this representation into the iteration scheme and using the distributive properties
of the gradient, divergence, cross product, and dot product, we have
\[
\sum_{i=1}^{n_u^h} c^{(u,n)}_i\,\eta\int_\Omega \nabla\vec\varphi^{(u)}_i : \nabla\vec\varphi^{(u)}_j
+ \sum_{i=1}^{n_J^h} c^{(J,n)}_i\,\sigma^{-1}\int_\Omega \vec\varphi^{(J)}_i\cdot\vec\varphi^{(J)}_k
+ \sum_{i=1}^{n_u^h} c^{(u,n)}_i\int_\Omega \bigl(\vec\varphi^{(J)}_k\times\vec B_0\bigr)\cdot\vec\varphi^{(u)}_i
- \sum_{i=1}^{n_J^h} c^{(J,n)}_i\int_\Omega \bigl(\vec\varphi^{(J)}_i\times\vec B_0\bigr)\cdot\vec\varphi^{(u)}_j
+ \sum_{i=1}^{n_u^h} c^{(u,n)}_i\,\rho\int_\Omega \bigl((\vec u^h_{(n-1)}\cdot\nabla)\vec\varphi^{(u)}_i\bigr)\cdot\vec\varphi^{(u)}_j
+ \sum_{i=1}^{n_u^h} c^{(u,n)}_i\int_\Omega \bigl(\vec\varphi^{(J)}_k\times\vec B(\vec J^h_{(n-1)})\bigr)\cdot\vec\varphi^{(u)}_i
- \sum_{i=1}^{n_J^h} c^{(J,n)}_i\int_\Omega \bigl(\vec\varphi^{(J)}_i\times\vec B(\vec J^h_{(n-1)})\bigr)\cdot\vec\varphi^{(u)}_j
- \sum_{i=1}^{n_p^h} c^{(p,n)}_i\int_\Omega \varphi^{(p)}_i\,\nabla\cdot\vec\varphi^{(u)}_j
+ \sum_{i=1}^{n_\phi^h} c^{(\phi,n)}_i\int_\Omega \nabla\varphi^{(\phi)}_i\cdot\vec\varphi^{(J)}_k
= \int_\Omega F\cdot\vec\varphi^{(u)}_j + \int_\Omega E\cdot\vec\varphi^{(J)}_k
\]
and
\[
-\sum_{i=1}^{n_u^h} c^{(u,n)}_i\int_\Omega \bigl(\nabla\cdot\vec\varphi^{(u)}_i\bigr)\,\varphi^{(p)}_l
+ \sum_{i=1}^{n_J^h} c^{(J,n)}_i\int_\Omega \vec\varphi^{(J)}_i\cdot\nabla\varphi^{(\phi)}_m
= \int_\Gamma \bigl(\vec J_{ext}\cdot\vec n\bigr)\,\varphi^{(\phi)}_m.
\]
Letting
\[
A^{(u)} = \bigl[a^{(u)}_{(i,j)}\bigr] := \Bigl[\eta\int_\Omega \nabla\vec\varphi^{(u)}_i : \nabla\vec\varphi^{(u)}_j\Bigr],
\qquad 1\le i\le n_u,\ 1\le j\le n_u,
\]
\[
A^{(J)} = \bigl[a^{(J)}_{(i,k)}\bigr] := \Bigl[\sigma^{-1}\int_\Omega \vec\varphi^{(J)}_i\cdot\vec\varphi^{(J)}_k\Bigr],
\qquad 1\le i\le n_J,\ 1\le k\le n_J,
\]
\[
B^{(p)} = \bigl[b^{(p)}_{(i,l)}\bigr] := \Bigl[-\int_\Omega \varphi^{(p)}_l\,\nabla\cdot\vec\varphi^{(u)}_i\Bigr],
\qquad 1\le i\le n_u,\ 1\le l\le n_p,
\]
\[
B^{(\phi)} = \bigl[b^{(\phi)}_{(i,m)}\bigr] := \Bigl[\int_\Omega \vec\varphi^{(J)}_i\cdot\nabla\varphi^{(\phi)}_m\Bigr],
\qquad 1\le i\le n_J,\ 1\le m\le n_\phi,
\]
\[
C = \bigl[c_{(i,k)}\bigr] := \Bigl[\int_\Omega \bigl(\vec\varphi^{(J)}_k\times\vec B_0\bigr)\cdot\vec\varphi^{(u)}_i\Bigr],
\qquad 1\le i\le n_u,\ 1\le k\le n_J,
\]
\[
N^{(u)} = \bigl[n^{(u)}_{(i,j)}\bigr] := \Bigl[\rho\int_\Omega \bigl((\vec u^h_{(n-1)}\cdot\nabla)\vec\varphi^{(u)}_i\bigr)\cdot\vec\varphi^{(u)}_j\Bigr],
\qquad 1\le i\le n_u,\ 1\le j\le n_u,
\]
and
\[
N^{(J)} = \bigl[n^{(J)}_{(i,k)}\bigr] := \Bigl[\int_\Omega \bigl(\vec\varphi^{(J)}_k\times\vec B(\vec J^h_{(n-1)})\bigr)\cdot\vec\varphi^{(u)}_i\Bigr],
\qquad 1\le i\le n_u,\ 1\le k\le n_J,
\]
denote the matrices formed by the respective terms of the iteration scheme, given on the
right-hand side of each definition. We group the coordinates of each linear combination of
the unknowns of Problem $P_1^h$ into a vector. We do the same for the respective right-hand sides
of each equation of the iteration scheme and, for the sake of simplicity, reuse the notations
$F$ and $E$:
\[
c^{(u,n)} := \bigl[c^{(u,n)}_1 \dots c^{(u,n)}_i \dots c^{(u,n)}_{n_u}\bigr]^t, \qquad
c^{(p,n)} := \bigl[c^{(p,n)}_1 \dots c^{(p,n)}_i \dots c^{(p,n)}_{n_p}\bigr]^t,
\]
\[
c^{(J,n)} := \bigl[c^{(J,n)}_1 \dots c^{(J,n)}_i \dots c^{(J,n)}_{n_J}\bigr]^t, \qquad
c^{(\phi,n)} := \bigl[c^{(\phi,n)}_1 \dots c^{(\phi,n)}_i \dots c^{(\phi,n)}_{n_\phi}\bigr]^t,
\]
\[
F = \Bigl[\int_\Omega F\cdot\vec\varphi^{(u)}_1 \dots \int_\Omega F\cdot\vec\varphi^{(u)}_j \dots \int_\Omega F\cdot\vec\varphi^{(u)}_{n_u}\Bigr]^t,
\qquad
E = \Bigl[\int_\Omega E\cdot\vec\varphi^{(J)}_1 \dots \int_\Omega E\cdot\vec\varphi^{(J)}_k \dots \int_\Omega E\cdot\vec\varphi^{(J)}_{n_J}\Bigr]^t,
\]
and
\[
G = \Bigl[\int_\Gamma \bigl(\vec J_{ext}\cdot\vec n\bigr)\,\varphi^{(\phi)}_1 \dots \int_\Gamma \bigl(\vec J_{ext}\cdot\vec n\bigr)\,\varphi^{(\phi)}_l \dots \int_\Gamma \bigl(\vec J_{ext}\cdot\vec n\bigr)\,\varphi^{(\phi)}_{n_\phi}\Bigr]^t.
\]
With these definitions we can write the iterative scheme in matrix form:
\[
\begin{bmatrix}
A^{(u)} + N^{(u)} & B^{(p)\,t} & -\bigl(N^{(J)} + C\bigr)^t & 0\\
B^{(p)} & 0 & 0 & 0\\
N^{(J)} + C & 0 & A^{(J)} & B^{(\phi)\,t}\\
0 & 0 & B^{(\phi)} & 0
\end{bmatrix}
\begin{bmatrix}
c^{(u,n)}\\ c^{(p,n)}\\ c^{(J,n)}\\ c^{(\phi,n)}
\end{bmatrix}
=
\begin{bmatrix}
F\\ 0\\ E\\ G
\end{bmatrix}.
\]
3.2 Finite Element Spaces
We choose finite element spaces for our finite-dimensional spaces. Recalling from [78] that the spaces must satisfy the LBB conditions
\[\inf_{q_h\in M_h^1}\ \sup_{v_h\in Y_h^1} \frac{\int_\Omega (\nabla\cdot v_h)\, q_h}{\|v_h\|_{Y_h^1}\,\|q_h\|_{M_h^1}} \ \ge\ \beta_1 > 0\]
and
\[\inf_{\psi_h\in M_h^2}\ \sup_{K_h\in Y_h^2} \frac{\int_\Omega K_h\cdot\nabla\psi_h}{\|K_h\|_{Y_h^2}\,\|\psi_h\|_{M_h^2}} \ \ge\ \beta_2 > 0,\]
we determine the velocity-pressure pair $(X_h^1, M_h^1)$ and the current-potential pair $(X_h^2, M_h^2)$. The LBB condition for the velocity-pressure pair appears in the theory of the Stokes and Navier-Stokes equations. Taylor-Hood elements are examples of velocity-pressure finite-element pairs that satisfy the condition. For Taylor-Hood elements we choose triquadratics for the velocity and trilinears for the pressure.
(a) Q1 Trilinear Nodes. (b) Q1 Basis Function.
Figure 3.1: 2-D Example of Q1 Basis Function for Pressure
(a) Q2 Triquadratic Nodes.
(b) 2D Q2 Basis Function.
Figure 3.2: 2-D Example of Q2 Basis Function for Velocity
For each component $k$ of the velocity vector basis function $\vec{\varphi}^{(u)}_i$ we use triquadratic elements of the form
\[\bigl(\varphi^{(u)}_i(x)\bigr)_k = \sum_{0\le i_1,i_2,i_3\le 2} \alpha_{i_1,i_2,i_3}\, x_1^{i_1} x_2^{i_2} x_3^{i_3} \ \in\ \prod_{1\le i\le 3}\bigl\{\alpha_{i0} + \alpha_{i1}x_i + \alpha_{i2}x_i^2 : \alpha_{ij}\in\mathbb{R}\bigr\}.\]
For the pressure basis functions $\varphi^{(p)}_i$, we use trilinears
\[\varphi^{(p)}_i(x) = \sum_{0\le i_1,i_2,i_3\le 1} \alpha_{i_1,i_2,i_3}\, x_1^{i_1} x_2^{i_2} x_3^{i_3} \ \in\ \prod_{1\le i\le 3}\bigl\{\alpha_{i0} + \alpha_{i1}x_i : \alpha_{ij}\in\mathbb{R}\bigr\}.\]
To satisfy the LBB condition for the current-density and electric-potential pair, we chose triquadratic elements for the electric potential and Nedelec elements of the first kind and second degree for the current density [81, 92, 17],
\[\vec{\varphi}^{(J)}_i(x) \in P_k = \left\{ u = \begin{pmatrix} u_1\\ u_2\\ u_3 \end{pmatrix} : u_1\in Q_{k-1,k,k},\ u_2\in Q_{k,k-1,k},\ u_3\in Q_{k,k,k-1} \right\},\]
where $k = 2$. For example, with $k=2$ the first component $u_1$ lies in the tensor-product space that is linear in $x$ and quadratic in $y$ and $z$; $u_1$ will be linear and discontinuous in $x$ and quadratic and continuous in $y$ and $z$. These properties of the Nedelec element are due to the degrees of freedom having the form [17]
\[\int_e u\cdot\tau\, q\, ds \quad \forall q\in P_{k-1}(e),\]
\[\frac{1}{|f|}\int_f (u\times n)\cdot q\, dA \quad \forall q\in \bigl(P_{k-2}(f)\bigr)^3,\]
and
\[\int_K u\cdot q\, dV \quad \forall q\in \bigl(P_{k-3}(K)\bigr)^3,\]
where $K$ is a reference element, $e$ is an edge of $K$, $f$ is a face of $K$, $\tau$ is the unit vector along the edge $e$, and $P_k$ is the linear space of polynomials of degree $k$. Note $p\in P_k$ if and only if
\[p(x) = \sum_{\substack{i_1,i_2,i_3\ge 0\\ i_1+i_2+i_3\le k}} \alpha_{i_1,i_2,i_3}\, x_1^{i_1} x_2^{i_2} x_3^{i_3}.\]
3.3 Preconditioner
We are now in a position to numerically solve the linear system. Having chosen GMRES to solve the system, we consider the preconditioner. To solve the system efficiently, we wish to minimize the number of iterations to convergence of the iterative solver. Hence we consider bounds on the size of the residual $r_n = b - Ax_n$, since the tolerance on this value indicates when the solver has converged. In particular, we have the theorem [95]
Theorem 3.1 At step $n$ of the GMRES iteration for the system $Ax=b$, the residual $r_n = b - Ax_n$ satisfies
\[\|r_n\| = \|p_n(A)b\| \ \le\ \inf_{p_n\in P_n}\|p_n(A)\|\,\|b\| \ \le\ \kappa(V)\,\inf_{p_n\in P_n}\|p_n\|_{\sigma(A)}\,\|b\|,\]
where $\sigma(A)$ is the set of eigenvalues of $A$, $V$ is a nonsingular matrix of eigenvectors (assuming $A$ is diagonalizable), $P_n = \{p \text{ a polynomial} : \deg(p)\le n,\ p(0)=1\}$, and $\|p_n\|_{\sigma(A)}$ is defined by
\[\|p_n\|_{\sigma(A)} = \sup_{z\in\sigma(A)} |p_n(z)|.\]
From the theorem, we conclude that if the system matrix $A$ is diagonalizable, then a small condition number for the matrix of eigenvectors $V$ will decrease the estimate on $\|r_n\|$ and help to reduce the number of iterations needed to reach convergence. Perhaps more importantly, the existence of polynomials in $P_n$ that have small norm when applied to $A$ will also decrease the bound on the residual. More can be said if the matrix $A$ is normal: then $V$ can be taken orthogonal, and the bound is governed by the values of the polynomial over the spectrum $\sigma(A)$. If $n$ is small, then the polynomial has only a few zeros. For the bound to be small with small $n$, the eigenvalues of $A$ must therefore be few in number or grouped together in a small number $m$ of clusters. With this in mind, we consider our matrix system
\[
\begin{bmatrix}
A_{(u)}+N_{(u)} & B^t_{(p)} & -\bigl(N_{(J)}+C\bigr)^t & 0\\
B_{(p)} & 0 & 0 & 0\\
N_{(J)}+C & 0 & A_{(J)} & B^t_{(\phi)}\\
0 & 0 & B_{(\phi)} & 0
\end{bmatrix}
\begin{bmatrix} c^{(u,n)}\\ c^{(p,n)}\\ c^{(J,n)}\\ c^{(\phi,n)}\end{bmatrix}
=
\begin{bmatrix} F\\ 0\\ E\\ G\end{bmatrix}.
\]
To construct a preconditioner, we start by considering the work done for the Stokes and Navier-Stokes equations. These equations correspond to the upper-left 2-by-2 block of our system matrix. Successful constructions of preconditioners for this system have been carried out by Elman, Silvester and Wathen [37]. The lower-right 2-by-2 block of our system is similar in form to the Navier-Stokes system. We therefore plan to extend the work of Elman, Silvester and Wathen to this portion of our system. To the best of our knowledge, this extension to our system cannot be found in the literature. As a first preconditioner we construct one that is block-diagonal and based on the system after removal of the terms due to the nonlinearities $u\times B$ (magnetic induction), $J\times B$ (the Lorentz force), and $u\cdot\nabla u$ (the inertial term). We therefore plan to construct a preconditioner for the system
\[
\begin{bmatrix}
A_{(u)} & B^t_{(p)} & 0 & 0\\
B_{(p)} & 0 & 0 & 0\\
0 & 0 & A_{(J)} & B^t_{(\phi)}\\
0 & 0 & B_{(\phi)} & 0
\end{bmatrix}
\qquad\text{and}\qquad
\begin{bmatrix} A & B^t\\ B & 0 \end{bmatrix} =
\left[\begin{array}{cc|cc}
A_{(u)} & 0 & B^t_{(p)} & 0\\
0 & A_{(J)} & 0 & B^t_{(\phi)}\\ \hline
B_{(p)} & 0 & 0 & 0\\
0 & B_{(\phi)} & 0 & 0
\end{array}\right].
\]
For the Stokes equations, Elman, Silvester and Wathen developed a block-diagonal preconditioner using the pressure Schur complement $S_{(p)} = B_{(p)} A_{(u)}^{-1} B^t_{(p)}$. Their preconditioner was
\[P_1 = \begin{bmatrix} A_{(u)}^{-1} & 0\\ 0 & \bigl(B_{(p)} A_{(u)}^{-1} B^t_{(p)}\bigr)^{-1} \end{bmatrix}.\]
To extend their work, we consider the potential Schur complement $S_{(\phi)} = B_{(\phi)} A_{(J)}^{-1} B^t_{(\phi)}$ and plan to use it to precondition the lower-right 2-by-2 block of our system in a similar manner. Following their work, we recall the definition of the pressure mass matrix $Q_{(p)} = \bigl[q_{(p)ij}\bigr] = \bigl[\int_\Omega \varphi^{(p)}_i \varphi^{(p)}_j\bigr]$, along with the following theorem about its spectral equivalence with the pressure Schur complement matrix $S_{(p)}$ [37].
Theorem 3.2 For any flow problem with a Dirichlet boundary condition, discretized using a uniformly stable mixed approximation on a shape-regular, quasi-uniform subdivision of $\mathbb{R}^3$, the pressure Schur complement matrix $B_{(p)}A_{(u)}^{-1}B^t_{(p)}$ is spectrally equivalent to the pressure mass matrix $Q_{(p)}$:
\[\beta_1^2 \ \le\ \frac{\bigl\langle B_{(p)}A_{(u)}^{-1}B^t_{(p)}q,\, q\bigr\rangle}{\bigl\langle Q_{(p)}q,\, q\bigr\rangle} \ \le\ 1 \quad \forall q\in\mathbb{R}^{n_p} \text{ such that } q\ne \mathbf{1},\]
and further
\[\beta_1^2 \ \le\ \frac{\bigl\langle B^t_{(p)}Q_{(p)}^{-1}B_{(p)}v,\, v\bigr\rangle}{\bigl\langle A_{(u)}v,\, v\bigr\rangle} \ \le\ 1 \quad \text{for all } v\in\mathbb{R}^{n_u} \text{ with } v\notin \operatorname{null}\bigl(B_{(p)}\bigr).\]
A definition of spectral equivalence is given below [96].

Definition 1 Consider two sequences of Hermitian positive definite matrices $A_h$ and $C_h$, and assume that all the eigenvalues of $C_h^{-1}A_h$ satisfy $c_1 \le \lambda \le c_2$ with positive $c_1$ and $c_2$ independent of $h$. Then $A_h$ and $C_h$ are said to be spectrally equivalent.

From the definition, we can see that $C_h$ is a good preconditioner for the system matrix $A_h$ in terms of GMRES, since the preconditioned system's eigenvalues cluster between $c_1$ and $c_2$. The definition holds for $Q_{(p)}$ and $B_{(p)}A_{(u)}^{-1}B^t_{(p)}$ upon considering the eigenvalue problem $Q_{(p)}^{-1}B_{(p)}A_{(u)}^{-1}B^t_{(p)}x = \lambda x$ and the resultant inner product
\[\bigl\langle B_{(p)}A_{(u)}^{-1}B^t_{(p)}x,\, x\bigr\rangle = \lambda\,\bigl\langle Q_{(p)}x,\, x\bigr\rangle.\]
The bounds from the theorem give the desired result. Hence, the pressure mass matrix $Q_{(p)}$ is a good preconditioner for the pressure Schur complement $B_{(p)}A_{(u)}^{-1}B^t_{(p)}$.
To construct a preconditioner for the lower-right 2-by-2 block, we define the Laplace potential mass matrix $Q_{(\nabla\phi)} = \bigl[q_{(\nabla\phi)ij}\bigr] = \bigl[\int_\Omega \nabla\varphi^{(\phi)}_i \cdot \nabla\varphi^{(\phi)}_j\bigr]$ and state and prove a corresponding theorem.
Theorem 3.3 For any uniformly stable mixed approximation of the electric potential and current density on a shape-regular, quasi-uniform subdivision of $\mathbb{R}^3$, the potential Schur complement matrix $S_{(\phi)} = B_{(\phi)}A_{(J)}^{-1}B^t_{(\phi)}$ is spectrally equivalent to the Laplace potential mass matrix $Q_{(\nabla\phi)}$:
\[\beta_2^2 \ \le\ \frac{\bigl\langle B_{(\phi)}A_{(J)}^{-1}B^t_{(\phi)}q,\, q\bigr\rangle}{\bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle} \ \le\ 1 \quad \forall q\in\mathbb{R}^{n_\phi} \text{ such that } q\ne \mathbf{1}.\]
The condition number of the potential Schur complement satisfies
\[\kappa\bigl(S_{(\phi)}\bigr) = \frac{\lambda_{\max}\bigl(S_{(\phi)}\bigr)}{\lambda_{\min}\bigl(S_{(\phi)}\bigr)} \ <\ \frac{C}{\beta_2^2\, c}.\]
To clarify some of the statements in the theorem, we give definitions of shape regular and quasi-uniform [37] and prove a lemma. Let $T_h$ be a decomposition of a convex polyhedral domain of interest into hexahedral "brick" elements $\Box_k$, and let $h_x^{(k)}$, $h_y^{(k)}$, and $h_z^{(k)}$ be the lengths of the brick element $\Box_k$ in the $x$, $y$, and $z$ directions, respectively. We call a sequence of hexahedral meshes $\{T_h\}$ shape regular [37] if there exists a maximum brick edge ratio $\rho^*$ such that every brick element $\Box_k\in T_h$ satisfies $1\le \rho_k \le \rho^*$, where
\[\rho_k = \max\left\{ \frac{h_x^{(k)}}{h_y^{(k)}},\ \frac{h_x^{(k)}}{h_z^{(k)}},\ \frac{h_y^{(k)}}{h_x^{(k)}},\ \frac{h_y^{(k)}}{h_z^{(k)}},\ \frac{h_z^{(k)}}{h_x^{(k)}},\ \frac{h_z^{(k)}}{h_y^{(k)}} \right\}.\]
From this definition, we see that the ratio of the longest edge to the shortest edge of each element $\Box_k$ is bounded from above and below, and the uniform boundedness guarantees that the elements do not degenerate with refinement. Letting $h_k$ denote the length of the longest edge of $\Box_k$, we call the sequence of meshes $\{T_h\}$ quasi-uniform [37] if there exists a constant $\gamma > 0$ such that
\[\min_{\Box_k\in T_h} h_k \ \ge\ \gamma\,\max_{\Box_k\in T_h} h_k\]
for every mesh in the sequence. The definition implies that the local mesh size $h_k$ is roughly constant across each mesh, with the ratio of the largest local mesh size to the smallest bounded between $1$ and $1/\gamma$, and these bounds carry over with each refinement.
Lemma 3 Using a $Q_2$ triquadratic element approximation of the electric potential on a subdivision of $\mathbb{R}^3$ for which the shape-regularity and quasi-uniformity conditions hold, the Laplace potential mass matrix $Q_{(\nabla\phi)}$ approximates a scaled identity matrix in the sense that
\[c\, h^3 \ \le\ \frac{\bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle}{\langle q,\, q\rangle} \ \le\ C\, h^3 \quad \forall q\in\mathbb{R}^{n_\phi},\]
where $h = \max_{\Box_k\in T_h} h_k$ and the constants $c$ and $C$ are independent of $h$.
Proof:
First, note that $Q_{(\nabla\phi)} = \bigl[\int_\Omega \nabla\varphi^{(\phi)}_i\cdot\nabla\varphi^{(\phi)}_j\bigr]$ is symmetric and positive definite. In particular, for $\phi_h = \sum_{i=1}^{n_\phi} c^{(\phi)}_i\varphi^{(\phi)}_i$,
\[
0 \ \le\ \bigl\|\nabla\phi_h\bigr\|^2_{L^2(\Omega)} = \int_\Omega \nabla\phi_h\cdot\nabla\phi_h
= \sum_{i=1}^{n_\phi} c^{(\phi)}_i \int_\Omega \nabla\phi_h\cdot\nabla\varphi^{(\phi)}_i
= \sum_{i=1}^{n_\phi}\sum_{j=1}^{n_\phi} c^{(\phi)}_i\, c^{(\phi)}_j \int_\Omega \nabla\varphi^{(\phi)}_j\cdot\nabla\varphi^{(\phi)}_i
= \bigl\langle Q_{(\nabla\phi)}c^{(\phi)},\, c^{(\phi)}\bigr\rangle.
\]
We can therefore form a basis of $\mathbb{R}^{n_\phi}$ from the eigenvectors of $Q_{(\nabla\phi)}$ and denote the minimum and maximum eigenvalues by $\lambda_{\min}$ and $\lambda_{\max}$, respectively, with
\[\lambda_{\min}\langle q,\, q\rangle \ \le\ \bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle \ \le\ \lambda_{\max}\langle q,\, q\rangle \quad \forall q\in\mathbb{R}^{n_\phi}.\]
Consider an arbitrary element $\Box_k$ of the grid $T_h$. Letting $Q^{(k)}_{(\nabla\phi)} = \bigl[\int_{\Box_k} \nabla\varphi^{(\phi)}_i\cdot\nabla\varphi^{(\phi)}_j\bigr]$, we have $Q_{(\nabla\phi)} = \sum_{\Box_k\in T_h} Q^{(k)}_{(\nabla\phi)}$ and
\[0 \ \le\ \|\nabla\phi_h\|^2_{L^2(\Box_k)} = \bigl\langle Q^{(k)}_{(\nabla\phi)}c^{(\phi)},\, c^{(\phi)}\bigr\rangle.\]
With $Q^{(k)}_{(\nabla\phi)}$ symmetric and positive semidefinite, the same is true for $\frac{1}{h_x^{(k)}h_y^{(k)}h_z^{(k)}}\,Q^{(k)}_{(\nabla\phi)}$. As for $Q_{(\nabla\phi)}$, we have
\[\lambda^{(k)}_{\min}\langle q,\, q\rangle \ \le\ \Bigl\langle \tfrac{1}{h_x^{(k)}h_y^{(k)}h_z^{(k)}}\,Q^{(k)}_{(\nabla\phi)}q,\, q\Bigr\rangle \ \le\ \lambda^{(k)}_{\max}\langle q,\, q\rangle \quad \forall q\in\mathbb{R}^{n_\phi}.\]
Shape regularity of the triangulations $\{T_h\}$ implies that there exists a maximum brick edge ratio $\rho^*$ such that every element $\Box_k\in T_h$ satisfies $1\le\rho_k\le\rho^*$. Without loss of generality, let $h_x^{(k)} \le h_y^{(k)} \le h_z^{(k)} = h_k$. Then
\[\frac{h_x^{(k)}}{h_z^{(k)}} \ \le\ \frac{h_y^{(k)}}{h_z^{(k)}} \ \le\ 1 \qquad\text{and}\qquad \rho_k = \frac{h_z^{(k)}}{h_x^{(k)}} \ \ge\ \frac{h_y^{(k)}}{h_x^{(k)}} \ \ge\ 1.\]
With
\[h_x^{(k)}h_y^{(k)}h_z^{(k)} = \frac{h_x^{(k)}}{h_z^{(k)}}\,\frac{h_y^{(k)}}{h_z^{(k)}}\,\bigl(h_z^{(k)}\bigr)^3 = \frac{1}{\rho_k}\,\frac{h_y^{(k)}}{h_z^{(k)}}\, h_k^3 \ \ge\ \frac{1}{\rho_k}\,\frac{h_x^{(k)}}{h_z^{(k)}}\, h_k^3 \ \ge\ \frac{1}{\rho_k^2}\, h_k^3 \ \ge\ \frac{1}{(\rho^*)^2}\, h_k^3\]
and
\[h_x^{(k)}h_y^{(k)}h_z^{(k)} = \frac{h_y^{(k)}}{h_x^{(k)}}\,\frac{h_z^{(k)}}{h_x^{(k)}}\,\bigl(h_x^{(k)}\bigr)^3 \ \le\ \frac{h_z^{(k)}}{h_x^{(k)}}\,\frac{h_z^{(k)}}{h_x^{(k)}}\,\bigl(h_x^{(k)}\bigr)^3 = \rho_k^2\,\bigl(h_x^{(k)}\bigr)^3 \ \le\ \rho_k^2\, h_k^3 \ \le\ (\rho^*)^2\, h_k^3,\]
we have
\[\frac{h_k^3}{(\rho^*)^2}\,\lambda^{(k)}_{\min}\langle q,\, q\rangle \ \le\ h_x^{(k)}h_y^{(k)}h_z^{(k)}\,\lambda^{(k)}_{\min}\langle q,\, q\rangle \ \le\ \bigl\langle Q^{(k)}_{(\nabla\phi)}q,\, q\bigr\rangle \ \le\ h_x^{(k)}h_y^{(k)}h_z^{(k)}\,\lambda^{(k)}_{\max}\langle q,\, q\rangle \ \le\ h_k^3\,(\rho^*)^2\,\lambda^{(k)}_{\max}\langle q,\, q\rangle.\]
The sequence of triangulations $\{T_h\}$ is quasi-uniform, so there exists a constant $\gamma>0$ such that $\min_{\Box_k\in T_h} h_k \ge \gamma\,\max_{\Box_k\in T_h} h_k$. With $h = \max_{\Box_k\in T_h} h_k$ we have
\[\frac{\gamma^3 h^3}{(\rho^*)^2}\,\lambda^{(k)}_{\min}\langle q,\, q\rangle \ \le\ \bigl\langle Q^{(k)}_{(\nabla\phi)}q,\, q\bigr\rangle \ \le\ h^3 (\rho^*)^2\,\lambda^{(k)}_{\max}\langle q,\, q\rangle,\]
and since $\bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle = \sum_{\Box_k\in T_h}\bigl\langle Q^{(k)}_{(\nabla\phi)}q,\, q\bigr\rangle$ we have
\[\frac{\gamma^3 h^3}{(\rho^*)^2}\,\langle q,\, q\rangle \sum_{k=1}^{n_k}\lambda^{(k)}_{\min} \ \le\ \bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle \ \le\ h^3 (\rho^*)^2\,\langle q,\, q\rangle \sum_{k=1}^{n_k}\lambda^{(k)}_{\max}.\]
Letting $c = \frac{\gamma^3}{(\rho^*)^2}\sum_{k=1}^{n_k}\lambda^{(k)}_{\min}$ and $C = (\rho^*)^2\sum_{k=1}^{n_k}\lambda^{(k)}_{\max}$ gives
\[c\, h^3 \ \le\ \frac{\bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle}{\langle q,\, q\rangle} \ \le\ C\, h^3 \quad \forall q\in\mathbb{R}^{n_\phi}.\]
We now prove Theorem 3.3.
Proof:
The lower bound of the inequality that we wish to establish is a consequence of the LBB condition
\[\beta_2 \ \le\ \min_{\phi_h\ne \text{const}}\ \max_{\vec{K}_h\ne\vec{0}} \frac{\bigl\langle \vec{K}_h,\, \nabla\phi_h\bigr\rangle}{\|\phi_h\|_{1,\Omega}\,\|\vec{K}_h\|_{0,\Omega}}.\]
We start first with the upper bound. Writing the components in the numerator of the LBB condition as linear combinations of basis functions,
\[\vec{K}_h = \sum_{i=1}^{n_J} k_i\,\vec{\varphi}^{(J)}_i = \varphi^t_{(J)}k \qquad\text{and}\qquad \phi_h = \sum_{i=1}^{n_\phi} q_i\,\varphi^{(\phi)}_i = \varphi^t_{(\phi)}q,\]
the integral in the numerator of the LBB condition can then be written as
\[\int_\Omega \vec{K}_h\cdot\nabla\phi_h = \sum_{i=1}^{n_J} k_i \sum_{j=1}^{n_\phi} q_j \int_\Omega \vec{\varphi}^{(J)}_i\cdot\nabla\varphi^{(\phi)}_j = \bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle.\]
For the denominator we have
\[\bigl\|\vec{K}_h\bigr\|^2_{L^2(\Omega)} = \int_\Omega \vec{K}_h\cdot\vec{K}_h = \sum_{i=1}^{n_J}\sum_{j=1}^{n_J} k_i\, k_j \int_\Omega \vec{\varphi}^{(J)}_i\cdot\vec{\varphi}^{(J)}_j = \bigl\langle A_{(J)}k,\, k\bigr\rangle\]
and
\[\|\phi_h\|^2_{H^1(\Omega)} = \int_\Omega \phi_h\,\phi_h + \nabla\phi_h\cdot\nabla\phi_h
= \sum_{i=1}^{n_\phi}\sum_{j=1}^{n_\phi} q_i\, q_j \int_\Omega \varphi^{(\phi)}_i\varphi^{(\phi)}_j + \sum_{i=1}^{n_\phi}\sum_{j=1}^{n_\phi} q_i\, q_j \int_\Omega \nabla\varphi^{(\phi)}_i\cdot\nabla\varphi^{(\phi)}_j\]
\[= \bigl\langle q,\, Q_{(\phi)}q\bigr\rangle + \bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle = \bigl\langle q,\, \bigl(Q_{(\phi)}+Q_{(\nabla\phi)}\bigr)q\bigr\rangle \ \ge\ \bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle,\]
since $Q_{(\phi)}$ and $Q_{(\nabla\phi)}$ are symmetric positive definite. With $A_{(J)}$ symmetric positive definite we can diagonalize it with a matrix $P$ and define its square root:
\[A_{(J)} = PDP^{-1} \qquad\text{and}\qquad A^{1/2}_{(J)} = PD^{1/2}P^{-1}.\]
Thus,
\[\bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle = \bigl\langle \vec{K}_h,\, \nabla\phi_h\bigr\rangle \ \le\ \bigl\|\vec{K}_h\bigr\|_{(L^2(\Omega))^3}\,\bigl\|\nabla\phi_h\bigr\|_{L^2(\Omega)} = \sqrt{\bigl\langle k,\, A_{(J)}k\bigr\rangle}\,\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}\]
and we obtain the upper bound
\[\frac{\bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle}{\sqrt{\bigl\langle k,\, A_{(J)}k\bigr\rangle}\,\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}} \ \le\ 1 \quad \forall k\in\mathbb{R}^{n_J},\ \forall q\in\mathbb{R}^{n_\phi}.\]
In particular, for $k = A^{-1}_{(J)}B^t_{(\phi)}q$ we have
\[\frac{\sqrt{\bigl\langle B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}q,\, q\bigr\rangle}}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}} \ \le\ 1 \quad \forall q\in\mathbb{R}^{n_\phi}.\]
For the lower bound, we start with the LBB condition and substitute $w = A^{1/2}_{(J)}k$:
\[\beta_2 \ \le\ \min_{\phi_h\ne \text{const}}\ \max_{\vec{K}_h\ne\vec{0}} \frac{\int_\Omega \vec{K}_h\cdot\nabla\phi_h}{\bigl\|\vec{K}_h\bigr\|_{(L^2(\Omega))^3}\,\|\phi_h\|_{H^1(\Omega)}} = \min_{q\ne\mathbf{1}}\ \max_{k\ne 0} \frac{\bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle}{\sqrt{\bigl\langle k,\, A_{(J)}k\bigr\rangle}\,\sqrt{\bigl\langle q,\, \bigl(Q_{(\phi)}+Q_{(\nabla\phi)}\bigr)q\bigr\rangle}}\]
\[\le\ \min_{q\ne\mathbf{1}}\ \max_{k\ne 0} \frac{\bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle}{\sqrt{\bigl\langle k,\, A_{(J)}k\bigr\rangle}\,\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}} = \min_{q\ne\mathbf{1}} \frac{1}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}}\ \max_{k\ne 0} \frac{\bigl\langle k,\, B^t_{(\phi)}q\bigr\rangle}{\sqrt{\bigl\langle k,\, A_{(J)}k\bigr\rangle}}\]
\[= \min_{q\ne\mathbf{1}} \frac{1}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}}\ \max_{w\ne 0} \frac{\bigl\langle A^{-1/2}_{(J)}w,\, B^t_{(\phi)}q\bigr\rangle}{\sqrt{\langle w,\, w\rangle}} = \min_{q\ne\mathbf{1}} \frac{1}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}}\ \max_{w\ne 0} \frac{\bigl\langle w,\, A^{-1/2}_{(J)}B^t_{(\phi)}q\bigr\rangle}{\sqrt{\langle w,\, w\rangle}}\]
\[= \min_{q\ne\mathbf{1}} \frac{1}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}}\ \frac{\bigl\langle A^{-1/2}_{(J)}B^t_{(\phi)}q,\ A^{-1/2}_{(J)}B^t_{(\phi)}q\bigr\rangle}{\sqrt{\bigl\langle A^{-1/2}_{(J)}B^t_{(\phi)}q,\ A^{-1/2}_{(J)}B^t_{(\phi)}q\bigr\rangle}} = \min_{q\ne\mathbf{1}} \frac{\sqrt{\bigl\langle B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}q,\, q\bigr\rangle}}{\sqrt{\bigl\langle q,\, Q_{(\nabla\phi)}q\bigr\rangle}}.\]
The maximum is attained at $w = A^{-1/2}_{(J)}B^t_{(\phi)}q$. This can be seen by recalling the scalar product formula $|a|\,|b|\cos(\theta) = \langle a,\, b\rangle$: the largest value occurs when the unit vector $w/\sqrt{\langle w,\, w\rangle}$ points in the same direction as the other vector. The two inequalities yield the result
\[\beta_2^2 \ \le\ \frac{\bigl\langle B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}q,\, q\bigr\rangle}{\bigl\langle Q_{(\nabla\phi)}q,\, q\bigr\rangle} \ \le\ 1 \quad \forall q\in\mathbb{R}^{n_\phi} \text{ such that } q\ne\mathbf{1}.\]
Using the previous lemma, we have the bounds
\[\beta_2^2\, c\, h^3 \ \le\ \lambda_{\min}\bigl(B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}\bigr) \ \le\ \frac{\bigl\langle B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}q,\, q\bigr\rangle}{\langle q,\, q\rangle} \ \le\ \lambda_{\max}\bigl(B_{(\phi)}A^{-1}_{(J)}B^t_{(\phi)}\bigr) \ \le\ C\, h^3,\]
from which the condition-number bound of Theorem 3.3 follows.
$\langle Ax,\, x\rangle > 0$ and therefore $\lambda = 1$.
For $y\ne 0$, we relate the first and second rows of the system. From the second row of the system, we have $Bx = \lambda Qy = \lambda BA^{-1}B^t y$ and therefore
\[(\lambda-1)\,Bx = (\lambda-1)\,\lambda\, BA^{-1}B^t y.\]
Solving the first row $Ax + B^t y = \lambda Ax$ for $B^t y$ and multiplying by $BA^{-1}$, we have
\[B^t y = (\lambda-1)\,Ax \ \Longrightarrow\ BA^{-1}B^t y = (\lambda-1)\,Bx.\]
Therefore, $BA^{-1}B^t y = (\lambda-1)\,\lambda\, BA^{-1}B^t y$, and with $BA^{-1}B^t$ positive definite (from the last theorem) we have $\bigl\langle BA^{-1}B^t y,\, y\bigr\rangle > 0$ and
\[1 = (\lambda-1)\,\lambda, \qquad\text{with}\qquad \lambda = \frac{1}{2} \pm \frac{\sqrt{5}}{2},\]
using the quadratic formula.
To determine the intervals in the first part of the theorem, we consider the two cases $\lambda<0$ and $\lambda>0$. Suppose that $\lambda<0$ is an eigenvalue of the system. We claim this implies that $y\ne 0$. By way of contradiction, suppose that $y=0$. Then $x\ne 0$, to have a meaningful eigenvalue problem. Looking at the first row of the system, we have $Ax + B^t y = \lambda Ax$. Since $y=0$, $Ax = \lambda Ax$. With $A$ positive definite and $x\ne 0$, it must be that $\lambda = 1 > 0$, a contradiction.
The first row of the system, $Ax + B^t y = \lambda Ax$, also implies that $x = \frac{1}{\lambda-1}A^{-1}B^t y$. We can substitute this result for $x$ into the second row of the system, $Bx = \lambda Qy$, and obtain
\[\frac{1}{\lambda-1}\,BA^{-1}B^t y = \lambda\, Qy \qquad\text{and}\qquad BA^{-1}B^t y = \lambda(\lambda-1)\,Qy = -\lambda(1-\lambda)\,Qy.\]
The positive definiteness and spectral equivalence of the Schur complements with their corresponding mass matrices, along with $\lambda<0$, imply
\[-\lambda\,\langle Qy,\, y\rangle \ \le\ -\lambda(1-\lambda)\,\langle Qy,\, y\rangle = \bigl\langle BA^{-1}B^t y,\, y\bigr\rangle \ \le\ \langle Qy,\, y\rangle.\]
With $y\ne 0$, we have $\lambda \ge -1$. From the lower bounds of the spectral equivalences,
\[-\lambda(1-\lambda)\,\langle Qy,\, y\rangle = \bigl\langle BA^{-1}B^t y,\, y\bigr\rangle \ \ge\ \min\bigl(\beta_1^2,\, \beta_2^2\bigr)\,\langle Qy,\, y\rangle.\]
Hence $\lambda^2 - \lambda - \min(\beta_1^2,\beta_2^2) \ge 0$. Factoring, we have $(\lambda-\mu_+)(\lambda-\mu_-) \ge 0$, where
\[\mu_{\pm} = \frac{1}{2} \pm \frac{\sqrt{1 + 4\min(\beta_1^2,\beta_2^2)}}{2}.\]
Since $\lambda < 0 < \mu_+$, it must be that $\lambda \le \mu_-$, and we obtain the upper bound $\lambda \le \frac{1}{2} - \frac{\sqrt{1+4\min(\beta_1^2,\beta_2^2)}}{2}$. With the bounds $-1 \le \lambda \le \frac{1}{2} - \frac{\sqrt{1+4\min(\beta_1^2,\beta_2^2)}}{2}$ established for $\lambda<0$, we are ready to look at the last case, $\lambda>0$.
For $\lambda>0$ we have that $x\ne 0$. Suppose by way of contradiction that $x=0$. Then, to have an eigenvalue problem with $\lambda>0$, it must be that $y\ne 0$. Taking the inner product of the second row of the system, $Bx = 0 = \lambda Qy$, with $y$ implies that $y=0$, since $Q_{(\nabla\phi)}$ is positive definite and $\lambda>0$, a contradiction.
Equating the first and second rows of the system by way of the inner product $\langle B^t y,\, x\rangle = \langle Bx,\, y\rangle$, we have
\[(\lambda-1)\,\langle Ax,\, x\rangle = \lambda\,\langle Qy,\, y\rangle \ \ge\ 0.\]
With $x\ne 0$, we have $\lambda \ge 1$. For the upper bound, the second row provides $y = \frac{1}{\lambda}\,Q^{-1}Bx$, and we substitute this into the first row:
\[(\lambda-1)\,Ax = B^t y = \frac{1}{\lambda}\,B^t Q^{-1}Bx.\]
From the previous corollary, we have
\[(\lambda-1)\,\langle Ax,\, x\rangle = \frac{1}{\lambda}\,\bigl\langle B^t Q^{-1}Bx,\, x\bigr\rangle \ \le\ \frac{1}{\lambda}\,\langle Ax,\, x\rangle,\]
and $x\ne 0$ implies that $\lambda^2 - \lambda - 1 \le 0$. Completing the square, we have $\bigl(\lambda - \frac{1}{2}\bigr)^2 - \frac{5}{4} \le 0$, and the difference of squares gives the upper bound $\lambda \le \frac{1}{2} + \frac{\sqrt{5}}{2}$.
The theorem on the block-diagonal preconditioner can be made more general, as in [37]. Rather than using $A$ in the preconditioner, spectral equivalence of $A$ with a simpler matrix is used in establishing eigenvalue bounds. However, the theorem above will allow us to make comparisons in the next chapter. In particular, we will see the effect of the preconditioner from the theorem on the eigenvalues of the simplified system for a particular example.
Chapter 4
Computational Experiments and Results
4.1 Introduction to the Computational Environment
4.1.1 Choice of the Libraries
The deal.ii library was used to implement the finite element method, p4est was used to partition the domain across the nodes of the distributed-memory cluster, and a GMRES implementation from Trilinos was used to approximate the solution of the linear system in parallel, as well as to invoke part of the preconditioner. Freely available libraries were also used for mesh generation and solution visualization. The open-source library Gmsh [43] was used to generate several meshes, and the open-source library VisIt [22, 23, 100] was used to visualize the approximated solution.
Workable meshes were generated using Gmsh with particular coding techniques applied to the Gmsh script file and alterations to the output .msh files (removing all but volume elements). These techniques were necessary because Gmsh generates meshes using triangles in two dimensions and tetrahedra in three dimensions (n-simplices), whereas deal.ii works with meshes of quadrilaterals in two dimensions and hexahedra in three dimensions. Therefore, using Gmsh required unconventional coding in its script language in order to obtain hexahedral three-dimensional meshes.
The need for care with Gmsh was first noticed with the Navier-Stokes benchmark tests for flow around a cylinder discussed by Turek and Schaefer [87]. During these tests it became evident
(a) Velocity Field with Pressure Contours
(b) Velocity Vector Field
Figure 4.1: Schaefer-Turek Benchmark
that boundary conditions were not being properly enforced for the velocity. Nonzero velocities were occurring on the boundary where zero velocities were prescribed. After troubleshooting the code, it became clear that the grid was causing problems. Some web research
concerning Gmsh and hexahedra led to editing of the Gmsh script files and the output .msh files. The problem with the boundary was fixed by removing elements in the output .msh file that were not volume elements (see the Gmsh manual, http://geuz.org/gmsh/). Results comparable to those of Schaefer and Turek were obtained, as seen in Figure 4.1. With the increasing library of meshes offered by newer versions of deal.ii and continued obstacles using Gmsh for generation, Gmsh was replaced by the deal.ii library for mesh generation toward the end of this research. In particular, code was added to the deal.ii library to create a toroidal mesh. A factor that influenced this decision was the activity of the deal.ii mailing list; the depth of resources and documentation became important with increased investment in a library.
All of the visualization of data conducted during this research was performed using the library VisIt. VisIt was initially developed and is supported by Lawrence Livermore National Laboratory. VisIt has decent documentation. However, the lack of documentation on producing movies with crisp resolution has prompted consideration of other libraries, such as ParaView, for comparison of capabilities and documentation.
4.1.2 Working with the Libraries
It should be mentioned that problems running code with the libraries can, do, and have occurred during development. It is difficult to recall all the bugs and problems encountered; they arise from a variety of sources. Problems may occur in the installation process of the open-source libraries. This can be due to compiling a library with a faulty or unsupported compiler, attempting to link different libraries that have not been compiled with the same compiler, trying to link to versions of libraries that a particular library does not support, trying to link shared and static libraries, or not being able to properly locate and link to a compiler that has been installed by a system administrator. Bugs may also occur in the code being developed, such as a segmentation fault for an incorrectly set vector size. Bugs
may be found in the code of the libraries being used, such as a bug encountered with deal.ii PETSc vector values associated with ghost cells not being properly zeroed, or a bug encountered with deal.ii function calls that work properly for serial code but fail for parallel code, as in the mesh generation problems for an L-shaped domain. Bugs may occur that allow the code to compile and run but give unexpected results. An example is the approximation to the continuous pressure solution of Example 1 that gave discontinuous pressure contours. Rather than setting the pressure at a node, the mean pressure value was calculated and then subtracted from the solution. This change in determining the unique solution eliminated the discontinuity appearing in the pressure contours. As the code was developed, this problem eventually disappeared when both methods were retested. Attempts to recreate the bug have been unsuccessful. Most problems can be traced to a source; however, some may disappear for unknown reasons as the code develops.
Bugs can occur outside of code run time, such as during visualization or mesh creation. An example of this type of problem was mentioned above for the open-source software Gmsh when creating the flow-around-a-cylinder mesh to test the Navier-Stokes code while developing toward the MHD code. The problem appeared to be creating hexahedral elements using software designed to create tetrahedral elements. The problem was successfully overcome for a mesh used to model the flow around a cylinder in a rectangular duct. However, the problem was encountered again with the torus mesh, with no apparent fix. Fortunately, deal.ii's mesh generation capabilities were developed enough to add code to the library to create a torus from hexahedral elements.
The use of multiple libraries can be foreboding. The cause of an error may appear (and can be) difficult to determine or understand. In troubleshooting all the bugs encountered, the location of the bug was found first. Then, to understand the bug's behavior, adjustments to the code were made; examples are changing the sample problem being tested or the size of a vector. The most common bugs occurred at interfaces between libraries. Important tips
for minimizing the work of tracking bugs are to develop, program, compile, and test code frequently and in increments.
4.2 Example 1 - A Problem with an Exact Solution
The parallel finite element code was tested for expected convergence rates on two different domains: the cube $[0,1]^3$ and an L-shaped domain
\[
[0,1]^3 \ \cup\ \bigl\{x+(0,1,0) : x\in[0,1]^3\bigr\} \ \cup\ \bigl\{x+(1,1,0) : x\in[0,1]^3\bigr\},
\]
using the example problem from [78]:
\[
u(x,y,z) := \begin{bmatrix} 0\\ \pi\cos(\pi x)\,e^{-s}\bigl(\tfrac12 - z\bigr)\\ \pi\cos(\pi x)\,e^{-s}\bigl(y - \tfrac12\bigr) \end{bmatrix}, \qquad
J(x,y,z) := \begin{bmatrix} -2\sin(\pi x)\,(1-s)\,e^{-s}\\ \pi\cos(\pi x)\,e^{-s}\bigl(y - \tfrac12\bigr)\\ \pi\cos(\pi x)\,e^{-s}\bigl(z - \tfrac12\bigr) \end{bmatrix},
\]
\[
p(x,y,z) := -\tfrac12\sin^2(\pi x)\,s\,e^{-2s}, \qquad
\phi(x,y,z) := \tfrac{2}{\pi}\cos(\pi x), \qquad
B(x,y,z) := \begin{bmatrix} 1\\ -\sin(\pi x)\,e^{-s}\bigl(\tfrac12 - z\bigr)\\ -\sin(\pi x)\,e^{-s}\bigl(y - \tfrac12\bigr) \end{bmatrix}
\]
with $r = \bigl(0,\ y-\tfrac12,\ z-\tfrac12\bigr)$ and $s = |r|^2 = \bigl(y-\tfrac12\bigr)^2 + \bigl(z-\tfrac12\bigr)^2$. Note that $u$ and $B$ are divergence free and $\nabla\times B = J$ (see Appendix D). The above set of functions is therefore a smooth solution of the velocity-current magnetohydrodynamics equations with all physical parameters equal to unity and the data $F$, $E$, $g$, $J_{\mathrm{ext}}$, and $B_{\mathrm{ext}}$ defined by
\[
F := -\Delta u + (u\cdot\nabla)u + \nabla p - J\times B
= \begin{bmatrix} -\Delta u_1\\ -\Delta u_2\\ -\Delta u_3 \end{bmatrix}
+ \begin{bmatrix} u\cdot\nabla u_1\\ u\cdot\nabla u_2\\ u\cdot\nabla u_3 \end{bmatrix}
+ \begin{bmatrix} p_x\\ p_y\\ p_z \end{bmatrix}
- \begin{bmatrix} J_2B_3 - J_3B_2\\ J_3B_1 - J_1B_3\\ J_1B_2 - J_2B_1 \end{bmatrix}
\]
\[
= \begin{bmatrix} 0\\[2pt]
\pi\cos(\pi x)\,e^{-s}\bigl(\tfrac12 - z\bigr)\bigl(\pi^2 + 9 - 4s\bigr) + \sin^2(\pi x)\,e^{-2s}\bigl(y-\tfrac12\bigr) - \pi^2\cos^2(\pi x)\,e^{-2s}\bigl(y-\tfrac12\bigr)\\[2pt]
\pi\cos(\pi x)\,e^{-s}\bigl(y - \tfrac12\bigr)\bigl(\pi^2 + 9 - 4s\bigr) + \sin^2(\pi x)\,e^{-2s}\bigl(z-\tfrac12\bigr) - \pi^2\cos^2(\pi x)\,e^{-2s}\bigl(z-\tfrac12\bigr)
\end{bmatrix},
\]
\[
E := J + \nabla\phi - u\times B = \begin{bmatrix} -2\sin(\pi x)\bigl((1-s)\,e^{-s} + 1\bigr)\\ 0\\ 0 \end{bmatrix},
\]
and
\[
g := u\big|_{\Gamma}, \qquad J_{\mathrm{ext}} := J\big|_{\mathbb{R}^3\setminus\overline{\Omega}}, \qquad B_{\mathrm{ext}} := \vec{\imath}.
\]
Further, instead of integrating the Biot-Savart formula over $\mathbb{R}^3\setminus\overline{\Omega}$, the magnetic field induced by $J_{\mathrm{ext}}$ can be obtained from $B - B_{\mathrm{ext}} - B(J)$. For uniqueness of the pressure and the electric potential, the exact solutions above for the pressure and potential were adjusted by a constant:
\[
p(x,y,z) := -\tfrac12\sin^2(\pi x)\,s\,e^{-2s} - \bar{p}, \qquad
\phi(x,y,z) := \tfrac{2}{\pi}\cos(\pi x) - \tfrac{2}{\pi},
\]
where $\bar{p} = \frac{1}{|\Omega|}\int_\Omega -\tfrac12\sin^2(\pi x)\,s\,e^{-2s}\,dx$ was the mean value of the exact pressure over the domain $\Omega$. For the electric potential, one of the degrees of freedom on the $x=0$ boundary of the domain $[0,1]^3$ was set to the value of the electric potential at $x=0$ ($\phi(0)=0$) by using the deal.ii constraint matrix.
The rates of convergence on the cube and the L-shaped domains are given in Tables 4.1 and 4.2.
Table 4.1: Errors and Convergence Rates for the Unit-Cube Domain

  # Cells (grid)     # DOFs     h        u: H1-error (rate)   J: L2-error (rate)   p: L2-error (rate)   $\phi$: H1-error (rate)
  512 (8x8x8)        34,253     0.125    0.015004             0.000573             0.008111             0.006400
  4096 (16x16x16)    253,205    0.0625   0.003751 (2.000000)  0.000136 (2.019216)  0.002031 (1.999429)  0.001602 (1.999674)
  32,768 (32x32x32)  1,945,637  0.03125  0.000938 (1.999615)  0.000034 (2.000000)  0.000508 (1.999290)  0.000401 (1.998200)
Table 4.2: Errors and Convergence Rates for the L-shaped Domain

  # Cells (grid)  # DOFs  h      u: H1-error (rate)     J: L2-error (rate)     p: L2-error (rate)      $\phi$: H1-error (rate)
  24 (2x2x2)      2183    0.5    0.404921               0.146763               0.0154336               0.217988
  192 (4x4x4)     13,969  0.25   0.102145 (2.000000)    0.037215 (2.076676)    0.00490111 (1.997737)   0.00558378 (1.998621)
  1536 (8x8x8)    44,508  0.125  0.02557343 (2.000001)  0.00933088 (2.019216)  0.000997906 (1.999429)  0.0140478 (1.999674)
Visualizations of the numerical approximations to the solution of this example (see Figure 4.2) show symmetry about the line at the intersection of the planes $x=\tfrac12$ and $y=\tfrac12$. Note that to obtain illustrative perspectives in the plots for each variable, the point of view for each variable's plot is slightly different, as indicated by the axes in the lower-left corner. In particular, to compare the velocity vector field with the pressure contour plot, the pressure
(a) Current in Cube Domain - Example 1 (b) Potential in Cube Domain - Example 1
(c) Velocity in Cube Domain - Example 1 (d) Pressure in Cube Domain - Example 1
Figure 4.2: Example 1 for Cube Domain
plot would have to be rotated 90 degrees counterclockwise around the z-axis. The symmetry in each plot would then match.
Approximate solutions to the problem on an L-shaped domain were also obtained, with graphs of the results shown in Figure 4.3. From the figure, we see that the L-shaped domain extends the profiles of each approximated function occurring on the cube domain.
4.3 Example 2 - An Applied Problem
Approximate solutions to a second problem with more realistic conditions were also obtained
for the unit cube and L-shaped domains. The problem prescribed a current entering at one
(a) Current in L-shaped Domain - Ex 1 (b) Potential in L-shaped Domain - Ex 1
(c) Velocity in L-shaped Domain - Ex 1 (d) Pressure in L-shaped Domain - Ex 1
Figure 4.3: Example 1 for L-shaped Domain
end (part of a domain face parallel to the z-axis) of the domain and exiting at an end opposite to the entrance. The Dirichlet boundary conditions for the velocity were zero, and the external magnetic field was zero for the problem as well. The magnitude of the current entering and exiting the domain through the boundary was fixed at $|J_{\mathrm{ext}}| = 100$. In order to obtain a unique solution for the electric potential and pressure, the problem was handled similarly to the previous problem with an exact solution. In particular, the pressure mean value was calculated in a post-process and subtracted from the solved pressure solution. The results can be seen in Figures 4.4, 4.5 and 5.2.
(a) Current in Cube Domain - Example 2 (b) Potential in Cube Domain - Example 2
(c) Velocity in Cube Domain - Example 2 (d) Pressure in Cube Domain - Example 2
Figure 4.4: Example 2 for Cube Domain
The current-density vector field plots exhibit the current entering and exiting the domain, as indicated above. This can be seen for both the cube and L-shaped domains, where the current
(a) Current in L-shaped Domain - Ex 2 (b) Potential in L-shaped Domain - Ex 2
(c) Velocity in L-shaped Domain - Ex 2 (d) Pressure in L-shaped Domain - Ex 2
Figure 4.5: Example 2 for L-shaped Domain
travels into and out of the domain in the same direction. Since the external magnetic field is zero, the induced magnetic field described by the Biot-Savart formula is the only contributor to the forces experienced by the fluid, namely the Lorentz force $J\times B$ in the momentum equation. In the vector field plot for the velocity, we can see that the maximum fluid speeds occur in the center of the domain, with an upward direction. Again, the only driving force for this flow is the Lorentz force $J\times B$ due to the induced magnetic field. In consideration of the Biot-Savart formula and the symmetry of the current density in the domain, we see that the upward maximum velocity occurring in the center may be expected. The Biot-Savart
magnitudes J (y) of the current are large as well as where the distancesjx yjare small. We
see that the current magnitudes are largest at the boundary. In calculating the Biot-Savart
integrand at the center of the domain x = (1=2;1=2;1=2) and incorporating the negative
sign from outside the integral, vectors pointing in the direction of the negative x-axis for
both exiting and entering currents on the boundary are obtained. Likewise, close to the
center of the domain y where the distance jx yj to the center x is small, the negative
cross-product of the integrand again gives a direction toward the negative x-axis. With the
induced magnetic eld at the center having this direction, we can determine the body force
acting on the uid at this location by taking the cross product of this eld with the current
moving in the direction of the positive y-axis. This cross product yields an upward-pointing
Lorentz force in the positive z-direction. This upward body force acting on the uid does
indeed correspond to the upward velocity pro le seen at the center of the domain. Further,
the contributions noted on both sides of the plane y = 1=2 can only occur near the center
of the cube and imply a maximum magnitude there. The lack of domain symmetry with
respect to the entering and exiting current density of the L-shaped domain makes it harder
to argue the velocity pro les seen for this case.
4.3.1 Preconditioning Results
Returning to the second problem with the more realistic conditions and the cube domain, we compare the expected theoretical results with the computed eigenvalues for this problem. Recall the form of our preconditioner
\[
P = \begin{bmatrix}
A_{(u)}^{-1} & 0 & 0 & 0\\
0 & Q_{(p)}^{-1} & 0 & 0\\
0 & 0 & A_{(J)}^{-1} & 0\\
0 & 0 & 0 & Q_{(\nabla\phi)}^{-1}
\end{bmatrix},
\]
where $Q_{(p)}$ is the pressure mass matrix and $Q_{(\nabla\phi)}$ is the Laplace potential mass matrix.
For each eigenvalue plot, we have considered the second iteration in the Picard iteration sequence. We plot the eigenvalues on the complex plane for the unit cube refined once in each direction to yield 8 cells. We start with the plot of the eigenvalues of the system matrix before preconditioning, shown in Figure 4.6. From this plot, we can see that the eigenvalues are not clustered in groups, especially near the origin. We may expect the number of GMRES iterations to convergence to be high, and indeed without a preconditioner GMRES fails to converge (for the default tolerance and maximum number of iterations).
Figure 4.6: Eigenvalues for System Matrix
Next, we examine the eigenvalue plot in Figure 4.7 for the simplified system of Theorem 3.4
after applying our preconditioner. From the plot, we can see clustering in three locations.
Looking closely, we see the locations are exactly the eigenvalues stated in the last part of
the theorem when constructing the preconditioner from the Schur complements instead of
the mass matrices. It appears that constructing it from the mass matrices gives the same
result. Roughly twenty eigenvalues lie away from the three clusters. These eigenvalues are
believed to be due to the velocity boundary conditions enforced by deal.ii by way of the
system matrix [12, 1]. Further, a zero eigenvalue can be seen in the graph. This was
determined to be due to the null space of the pressure having dimension one. We note that
GMRES did not have a problem with convergence for such a system.
Figure 4.7: Eigenvalues for Preconditioned Stokes-Ohms System
Finally, we look at the effects of preconditioning on the eigenvalues for the simplified system
with the inertial term, Figure 4.8, and for the simplified system with both the inertial term
and the Lorentz force terms (the entire system matrix), Figure 4.9. From these plots, we see
that clustering of the eigenvalues still occurs, though not as tightly as for the simplified
system alone. We note that GMRES still converged in fewer than 100 iterations.
Figure 4.8: Eigenvalues for Preconditioned Stokes-Ohms System with Inertial Term
Figure 4.9: Eigenvalues for Preconditioned System Matrix
Chapter 5
Conclusion and Future Work
5.1 Summary
We have implemented a parallel, object-oriented, finite element code that utilizes open-
source, academic and government software libraries. The code is capable of approximating
the solution to the velocity-current MHD equations on high-performance computer clusters.
We have established a preconditioner that permits using GMRES to approximate the solution
to the system of equations iteratively. We have successfully tested the code on the unit cube
and an L-shaped domain for two problem types. For the problem with an exact solution we
have obtained the expected orders of convergence [78]. We have also successfully run the
code with millions of unknowns on a 64-node distributed-memory architecture at the
Alabama Supercomputer Center DMC. From these results and opportunities to discuss our
work, we have a couple of different directions that we would like to develop in the future,
outlined in the upcoming sections.
5.2 Torus Domain
Discussions with physicists have indicated a potential for the velocity-current equations
to model plasma in fusion reactors. Two general types of fusion reactors are stellarators
(named after the device's purpose of harnessing the power of a stellar object, the sun) and
tokamaks (a Russian acronym, reflecting the device's origin). The main difference between
a tokamak and a stellarator is the method of generating the magnetic fields that confine the
plasma. The stellarator uses two magnetic fields generated by currents running through
coils that surround the plasma. The tokamak generates one field by running current
through a set of coils encircling the plasma, but creates the second field by sending current
through the plasma. Our system of equations appears to apply naturally to the tokamak
reactor design, with a current being applied to the domain to create the second magnetic
field. We note that research on fusion energy is active, with fusion energy being an
engineering grand challenge problem for the 21st century and grants being available
through the NSF for fusion energy research. In fact, the ITER (International Thermonuclear
Experimental Reactor) project located in the south of France is the largest tokamak reactor
in the world and has been designed to be the first full-scale fusion power plant. With the
domain of a tokamak being essentially a torus, we are interested in approximating solutions
to our equations on a toroidal domain. We have started testing our equations on this
domain. In particular, we are testing the first problem from Chapter 3, which has an exact
solution, on a quarter torus. Results at this time are shown in Figure 5.1 and Table 5.1.
Table 5.1: Errors and Convergence Rates for the Quarter-Torus Domain

# Cells   # DOFs    Cell size   u: H1-error (rate)     J: L2-error (rate)      p: L2-error (rate)     φ: H1-error (rate)
80        5553      h1          0.132016               0.0270524               0.0257992              0.0828241
640       40,261    h1/2        0.0349727 (1.916412)   0.00647049 (2.063810)   0.0134858 (0.935884)   0.0214727 (1.947547)
5120      306,597   h1/4        0.00894921 (1.966397)  0.00164881 (1.972450)   0.0129277 (0.060975)   0.00552349 (1.958852)
From the convergence table for the quarter-torus we see the expected rates of approximation,
except for the pressure. Work continues on this problem to understand the problematic rate
of convergence for the pressure. We have seen a similar situation with the L-shaped domain
and will be testing our code with this in mind.
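The rates in Table 5.1 follow the usual mesh-halving formula, rate = log2(e_h / e_{h/2}). A minimal sketch (the sample values below are taken from the table):

```cpp
#include <cassert>
#include <cmath>

// Estimated order of convergence when the mesh size is halved:
// rate = log2(e_h / e_{h/2}), where e_h is the error on the coarser mesh.
double convergence_rate(double coarse_error, double fine_error) {
  return std::log2(coarse_error / fine_error);
}
```

For example, `convergence_rate(0.132016, 0.0349727)` reproduces the u-velocity H1 rate of roughly 1.92, while `convergence_rate(0.0134858, 0.0129277)` gives roughly 0.06, flagging the stalled pressure convergence discussed above.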
(a) Current in Torus Domain - Ex 1 (b) Potential in Torus Domain - Ex 1
(c) Velocity in Torus Domain - Ex 1 (d) Pressure in Torus Domain - Ex 1
Figure 5.1: Example 1 for the Quarter-Torus Domain
(a) Current in Torus Domain - Ex 2 (b) Potential in Torus Domain - Ex 2
(c) Velocity in Torus Domain - Ex 2 (d) Pressure in Torus Domain - Ex 2
Figure 5.2: Example 2 for the Quarter-Torus Domain
We have also obtained results on the quarter-torus domain for the second example of
Chapter 3. In this case the current can be seen entering and exiting through the cross
sections of the torus (see Figure 5.2), with the directions of entry and departure
perpendicular to each other. The predictions for the velocity and the pressure are
interesting. According to the velocity vector field, the fluid does not appear to be
circulating through the torus in the xy-plane; rather, the movement is vertical and along
the torus's radial cross sections. At the viewing angle of the velocity graph, this may be
difficult to see. The layers of the pressure contours also correspond to the movement of the
fluid into and out of the torus's radial axis, with high pressures located along that axis.
Finally, we have started numerical approximations of the equations on a full torus. For a
problem related to the second example of Chapter 3, with the velocity at the boundary
being zero and the current entering and exiting a particular part of the domain, we have
the results seen in Figure 5.3.
(a) Current in Full Torus Domain - Ex 2 (b) Potential in Full Torus Domain - Ex 2
(c) Velocity in Full Torus Domain - Ex 2 (d) Pressure in Full Torus Domain - Ex 2
Figure 5.3: Example 2 for Full Torus Domain
5.3 Code Speedup
Another research direction is the consideration of code bottlenecks. We have started work
in this direction by comparing the three slowest components of the code: the assembly of
the equation system, the solution of the system, and the calculation of the Biot-Savart
integral at each quadrature point during the assembly. From the plots in Figure 5.4, we
quickly see that the slowest component of the code during runtime is the calculation of the
Biot-Savart integral. Figure 5.4 considers strong scaling, where the problem size (the
number of unknowns, or the domain refinement) has been fixed and the number of
processors is varied. We recall that the speedup for a parallel program under a fixed
problem size is the ratio of the sequential program's time over the parallel program's time.
Ideally, if two processors are used instead of one, then the time to complete a portion of
code should decrease by a factor of two, and the speedup would be 2. In our observations,
as we double the number of processors used to numerically approximate the solution of
the equations at a fixed mesh size, we see (in the plot on the left in Figure 5.4) that the
time to completion is roughly halved for each of the components of the code being
considered. Recall that parallel efficiency is the ratio of the speedup over the number of
processors, and that ideal efficiency has a value of 1. We conclude from the graphs that the
libraries being used in the assembly (deal.ii) and the solution (Trilinos) of the system of
equations have good parallel efficiency in this range of processors for this particular
problem size. From the graphs as well, the in-house MPI code for the Biot-Savart integral
exhibits the same level of efficiency in this processor range.
To see the speedup and efficiency more clearly, we use a log-log plot with base two.
Doubling the number of processors corresponds to incrementing by one unit on the x-axis,
and halving the CPU time corresponds to a decrease of one unit on the y-axis. In the ideal
case, an increment of one unit along the x-axis would result in a decrease of one unit along
the y-axis, and we would see a linear trend in the log-log plot with a slope of negative one.
The plot on the right in Figure 5.4 does indeed show trends close to ideal.
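The strong-scaling quantities above can be computed directly from wall-clock timings; a minimal sketch (any timing values are hypothetical):

```cpp
#include <cassert>

// Strong-scaling metrics for a fixed problem size:
//   speedup    S(p) = T(1) / T(p)
//   efficiency E(p) = S(p) / p   (ideal value 1)
double speedup(double t_serial, double t_parallel) {
  return t_serial / t_parallel;
}

double efficiency(double t_serial, double t_parallel, int num_procs) {
  return speedup(t_serial, t_parallel) / num_procs;
}
```

For instance, if doubling from one to two processors reduced a hypothetical 100 s run to 52 s, the speedup would be about 1.92 and the efficiency about 0.96, close to the ideal slope of negative one in the base-two log-log plot.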
With the three slowest components working near optimal efficiency with respect to the
strong scaling observations, we would like to address the slowest component of the code,
the Biot-Savart integral. The Biot-Savart integral calculates the magnetic field induced by
the current density in the domain. The code does this using quadrature. Each processor
contains the information (previous current densities and quadrature points) needed to
perform the quadrature calculation over the domain for a particular location x in
\[
B\left(J^{n-1}\right)(x) = -\frac{\mu}{4\pi}\int_\Omega \frac{x-y}{|x-y|^3}\times J^{n-1}(y)\,dy
\]
Figure 5.4: CPU Time Versus Number of Processors
to determine the value of the Biot-Savart integral at that particular location. With all
the information readily available on each processor (after a numerical solve), the major
component of the calculation is the MPI communication between processors. To see if
we can shorten the time needed to approximate the induced magnetic field at these
locations, we consider approximating the magnetic field by solving the Maxwell equations
\[
\nabla\cdot B = 0 \quad\text{and}\quad \nabla\times B = \mu J^{(n-1)} \quad\text{in } \Omega,
\]
with the boundary condition
\[
B\cdot n = B\left(J^{n-1}\right)\cdot n \quad\text{on } \partial\Omega,
\]
by a parallel finite element method, using deal.ii. We approximate the solution to this
system of equations using the least-squares finite element method [67, 19]. Just as with the
solution method using only the Biot-Savart integral, the obtained magnetic field values are
used in the next solve for the four unknowns J, u, p, and φ. For the first example problem
from Chapter 3, with exact solutions, we look at strong scaling using this method and
compare it to the Biot-Savart method. Note that the method using only the Biot-Savart
integral was optimized to make the comparison as fair as possible, leading to the
considerable reductions that can be seen when comparing Figure 5.5 to Figure 5.4.
However, even with optimizations to the communication occurring during the Biot-Savart
method, we can see that the solution of the div-curl system is almost an order of
magnitude faster and appears to parallelize well.
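A serial sketch of the Biot-Savart quadrature described above: the field at an evaluation point x is accumulated as a weighted sum of (x - y)/|x - y|^3 crossed with J(y) over quadrature points y. The point data, weights, and the folded constant mu/(4 pi) are hypothetical placeholders, and the distributed version would additionally exchange the remotely owned quadrature data over MPI before this loop.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;

Vec3 cross(const Vec3& a, const Vec3& b) {
  return {a[1] * b[2] - a[2] * b[1],
          a[2] * b[0] - a[0] * b[2],
          a[0] * b[1] - a[1] * b[0]};
}

// Quadrature approximation of the Biot-Savart integral at a point x:
//   B(x) = -(mu/4pi) * sum_q  w_q * (x - y_q)/|x - y_q|^3  x  J(y_q).
Vec3 biot_savart(const Vec3& x,
                 const std::vector<Vec3>& quad_points,   // y_q
                 const std::vector<Vec3>& currents,      // J(y_q)
                 const std::vector<double>& weights,     // w_q
                 double mu_over_4pi) {
  Vec3 B = {0.0, 0.0, 0.0};
  for (std::size_t q = 0; q < quad_points.size(); ++q) {
    Vec3 r = {x[0] - quad_points[q][0],
              x[1] - quad_points[q][1],
              x[2] - quad_points[q][2]};
    const double d = std::sqrt(r[0] * r[0] + r[1] * r[1] + r[2] * r[2]);
    const Vec3 c = cross(r, currents[q]);
    for (int i = 0; i < 3; ++i)
      B[i] -= mu_over_4pi * weights[q] * c[i] / (d * d * d);
  }
  return B;
}
```

The kernel's 1/|x - y|^3 decay is also what the fast-multipole-style truncation mentioned below exploits: distant quadrature points contribute negligibly.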
As can be seen in Figure 5.5, we also tested an approximation to the Biot-Savart integral
method, where we used the fact that the denominator in the integrand of the Biot-Savart
integral causes the magnetic field contributions to be small when the distance between x
and y is large. This method was a first step toward a fast multipole method for the integral
[24, 47, 13, 46, 93, 64]. However, the least-squares finite element method for calculating
the induced magnetic field in the domain outperformed this simplifying approximation to
the quadrature method for the Biot-Savart integral. This div-curl finite element solver for
the induced magnetic field seems to be a natural direction for future work. We would like
to do some analysis of this code and extend it to domains that are nonconvex. A major
issue is the possible occurrence of singularities in such domains [26]. Hence, the geometry
of the domain must be taken into consideration [27, 14, 63, 28], with the possible use of
the discontinuous finite element method. It may also be of interest to address the problem
from the point of view of finite element exterior calculus [61, 6, 4, 5].
Figure 5.5: CPU Time Versus Number of Processors
Appendices
Appendix A
Weak Formulation and Identities
A.1 Navier-Stokes Identities
Here, we go over the identities
\[
\int_\Omega v\cdot\nabla p = -\int_\Omega p\,\nabla\cdot v + \int_{\partial\Omega} p\,n\cdot v,
\]
\[
\int_\Omega v\cdot\nabla^2 u = -\int_\Omega \nabla u:\nabla v + \int_{\partial\Omega} (n\cdot\nabla u)\cdot v,
\]
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v = \frac{1}{2}\int_\Omega ((u\cdot\nabla)u)\cdot v - \frac{1}{2}\int_\Omega ((u\cdot\nabla)v)\cdot u,
\]
used to write the Navier-Stokes equations in the weak form below (the boundary terms vanish for test functions v that vanish on \(\partial\Omega\)).
\[
\int_\Omega F\cdot v = -\nu\int_\Omega \nabla^2 u\cdot v + \int_\Omega ((u\cdot\nabla)u)\cdot v + \int_\Omega \nabla p\cdot v
\]
\[
= \nu\int_\Omega \nabla u:\nabla v + \frac{1}{2}\int_\Omega ((u\cdot\nabla)u)\cdot v - \frac{1}{2}\int_\Omega ((u\cdot\nabla)v)\cdot u - \int_\Omega p\,\nabla\cdot v.
\]
A.1.1 Pressure Term
We weaken the pressure in the term \(\int_\Omega \nabla p\cdot v\) by using
\[
\frac{\partial}{\partial x}(v\cdot a) = \frac{\partial}{\partial x}(v_1a_1+v_2a_2+v_3a_3)
= v_1\frac{\partial a_1}{\partial x}+\frac{\partial v_1}{\partial x}a_1
+ v_2\frac{\partial a_2}{\partial x}+\frac{\partial v_2}{\partial x}a_2
+ v_3\frac{\partial a_3}{\partial x}+\frac{\partial v_3}{\partial x}a_3
= v\cdot\frac{\partial a}{\partial x} + \frac{\partial v}{\partial x}\cdot a
\]
to form the identity
\[
\nabla\cdot(pv) = \frac{\partial(pv_1)}{\partial x}+\frac{\partial(pv_2)}{\partial y}+\frac{\partial(pv_3)}{\partial z}
= p\frac{\partial v_1}{\partial x}+v_1\frac{\partial p}{\partial x}
+ p\frac{\partial v_2}{\partial y}+v_2\frac{\partial p}{\partial y}
+ p\frac{\partial v_3}{\partial z}+v_3\frac{\partial p}{\partial z}
= v\cdot\nabla p + p\,\nabla\cdot v.
\]
Using this result with the divergence theorem \(\int_\Omega \nabla\cdot v = \int_{\partial\Omega} v\cdot n\), we have
\[
\int_\Omega v\cdot\nabla p = -\int_\Omega p\,\nabla\cdot v + \int_\Omega \nabla\cdot(pv)
= -\int_\Omega p\,\nabla\cdot v + \int_{\partial\Omega} p\,n\cdot v.
\]
A.1.2 Diffusion Term
The diffusion term \(\int_\Omega \nabla^2 u\cdot v\) is weakened using integration by parts and the identity
\[
\nabla\cdot(\nabla u\, v) = \nabla u:\nabla v + v\cdot\nabla^2 u.
\]
Writing the gradient of the velocity vector in matrix form,
\[
\nabla u =
\begin{bmatrix}\frac{\partial}{\partial x}\\[2pt]\frac{\partial}{\partial y}\\[2pt]\frac{\partial}{\partial z}\end{bmatrix}
\begin{bmatrix}u_1 & u_2 & u_3\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial u_1}{\partial x} & \frac{\partial u_2}{\partial x} & \frac{\partial u_3}{\partial x}\\[2pt]
\frac{\partial u_1}{\partial y} & \frac{\partial u_2}{\partial y} & \frac{\partial u_3}{\partial y}\\[2pt]
\frac{\partial u_1}{\partial z} & \frac{\partial u_2}{\partial z} & \frac{\partial u_3}{\partial z}
\end{bmatrix}.
\tag{A.1}
\]
The diffusion term can then be written as
\[
\Delta u = \nabla^2 u = \nabla\cdot(\nabla u)
= \begin{bmatrix}\frac{\partial}{\partial x_1} & \frac{\partial}{\partial x_2} & \frac{\partial}{\partial x_3}\end{bmatrix}
\begin{bmatrix}
u_{1,x} & u_{2,x} & u_{3,x}\\
u_{1,y} & u_{2,y} & u_{3,y}\\
u_{1,z} & u_{2,z} & u_{3,z}
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial^2 u_1}{\partial x^2}+\frac{\partial^2 u_1}{\partial y^2}+\frac{\partial^2 u_1}{\partial z^2}\\[4pt]
\frac{\partial^2 u_2}{\partial x^2}+\frac{\partial^2 u_2}{\partial y^2}+\frac{\partial^2 u_2}{\partial z^2}\\[4pt]
\frac{\partial^2 u_3}{\partial x^2}+\frac{\partial^2 u_3}{\partial y^2}+\frac{\partial^2 u_3}{\partial z^2}
\end{bmatrix}
=
\begin{bmatrix}\nabla\cdot\nabla u_1\\ \nabla\cdot\nabla u_2\\ \nabla\cdot\nabla u_3\end{bmatrix}
= \left(\nabla^2 u_1,\ \nabla^2 u_2,\ \nabla^2 u_3\right).
\]
Here we write the tensor-tensor product \(\nabla u:\nabla v\) as
\[
\nabla u:\nabla v =
\begin{bmatrix}
\frac{\partial u_1}{\partial x} & \frac{\partial u_2}{\partial x} & \frac{\partial u_3}{\partial x}\\[2pt]
\frac{\partial u_1}{\partial y} & \frac{\partial u_2}{\partial y} & \frac{\partial u_3}{\partial y}\\[2pt]
\frac{\partial u_1}{\partial z} & \frac{\partial u_2}{\partial z} & \frac{\partial u_3}{\partial z}
\end{bmatrix}
:
\begin{bmatrix}
\frac{\partial v_1}{\partial x} & \frac{\partial v_2}{\partial x} & \frac{\partial v_3}{\partial x}\\[2pt]
\frac{\partial v_1}{\partial y} & \frac{\partial v_2}{\partial y} & \frac{\partial v_3}{\partial y}\\[2pt]
\frac{\partial v_1}{\partial z} & \frac{\partial v_2}{\partial z} & \frac{\partial v_3}{\partial z}
\end{bmatrix}
\]
\[
= u_{1,x}v_{1,x}+u_{1,y}v_{1,y}+u_{1,z}v_{1,z}
+ u_{2,x}v_{2,x}+u_{2,y}v_{2,y}+u_{2,z}v_{2,z}
+ u_{3,x}v_{3,x}+u_{3,y}v_{3,y}+u_{3,z}v_{3,z}
= \nabla u_1\cdot\nabla v_1 + \nabla u_2\cdot\nabla v_2 + \nabla u_3\cdot\nabla v_3.
\tag{A.2}
\]
Consider the divergence of the tensor-vector product \(\nabla u\, v\):
\[
\nabla\cdot(\nabla u\, v)
= \nabla\cdot\left(\frac{\partial u}{\partial x}\cdot v,\ \frac{\partial u}{\partial y}\cdot v,\ \frac{\partial u}{\partial z}\cdot v\right)
= \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\cdot v\right)
+ \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial y}\cdot v\right)
+ \frac{\partial}{\partial z}\left(\frac{\partial u}{\partial z}\cdot v\right)
\]
\[
= \frac{\partial^2 u}{\partial x^2}\cdot v + \frac{\partial v}{\partial x}\cdot\frac{\partial u}{\partial x}
+ \frac{\partial^2 u}{\partial y^2}\cdot v + \frac{\partial v}{\partial y}\cdot\frac{\partial u}{\partial y}
+ \frac{\partial^2 u}{\partial z^2}\cdot v + \frac{\partial v}{\partial z}\cdot\frac{\partial u}{\partial z}
\]
\[
= v\cdot\left(\nabla\cdot\nabla u_1,\ \nabla\cdot\nabla u_2,\ \nabla\cdot\nabla u_3\right)
+ \left(\nabla v_1\cdot\nabla u_1+\nabla v_2\cdot\nabla u_2+\nabla v_3\cdot\nabla u_3\right)
= v\cdot\nabla^2 u + \nabla v:\nabla u.
\]
Hence,
\[
\int_\Omega \nabla\cdot(\nabla u\, v) = \int_\Omega v\cdot(\nabla\cdot\nabla u) + \int_\Omega \nabla u:\nabla v.
\]
With the help of the divergence theorem, we can rewrite the diffusion term \(\int_\Omega \nabla^2 u\cdot v\) in Navier-Stokes:
\[
\int_\Omega v\cdot\nabla^2 u = -\int_\Omega \nabla u:\nabla v + \int_\Omega \nabla\cdot(\nabla u\, v)
= -\int_\Omega \nabla u:\nabla v + \int_{\partial\Omega} (\nabla u\, v)\cdot n
= -\int_\Omega \nabla u:\nabla v + \int_{\partial\Omega} (n\cdot\nabla u)\cdot v.
\]
The last equality holds since
\[
(\nabla u\, v)\cdot n
= \left(\frac{\partial u}{\partial x}\cdot v,\ \frac{\partial u}{\partial y}\cdot v,\ \frac{\partial u}{\partial z}\cdot v\right)\cdot n
= n_1\, v\cdot\frac{\partial u}{\partial x} + n_2\, v\cdot\frac{\partial u}{\partial y} + n_3\, v\cdot\frac{\partial u}{\partial z}
= \left(n_1\frac{\partial u}{\partial x}+n_2\frac{\partial u}{\partial y}+n_3\frac{\partial u}{\partial z}\right)\cdot v
= (n\cdot\nabla u)\cdot v.
\]
A.1.3 Advection Term
For the advection term, we wish to show
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v = \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v - \frac12\int_\Omega ((u\cdot\nabla)v)\cdot u.
\]
To start,
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v = \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v + \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v.
\]
We wish to show that
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v = -\int_\Omega ((u\cdot\nabla)v)\cdot u.
\]
We have that
\[
((u\cdot\nabla)u)\cdot v =
\begin{bmatrix}
u_1\frac{\partial u_1}{\partial x}+u_2\frac{\partial u_1}{\partial y}+u_3\frac{\partial u_1}{\partial z}\\[4pt]
u_1\frac{\partial u_2}{\partial x}+u_2\frac{\partial u_2}{\partial y}+u_3\frac{\partial u_2}{\partial z}\\[4pt]
u_1\frac{\partial u_3}{\partial x}+u_2\frac{\partial u_3}{\partial y}+u_3\frac{\partial u_3}{\partial z}
\end{bmatrix}\cdot v
= v_1\left(u_1\frac{\partial u_1}{\partial x}+u_2\frac{\partial u_1}{\partial y}+u_3\frac{\partial u_1}{\partial z}\right)
+ v_2\left(u_1\frac{\partial u_2}{\partial x}+u_2\frac{\partial u_2}{\partial y}+u_3\frac{\partial u_2}{\partial z}\right)
+ v_3\left(u_1\frac{\partial u_3}{\partial x}+u_2\frac{\partial u_3}{\partial y}+u_3\frac{\partial u_3}{\partial z}\right)
\tag{A.3}
\]
and that
\[
((u\cdot\nabla)v)\cdot u
= u_1\left(u_1\frac{\partial v_1}{\partial x}+u_2\frac{\partial v_1}{\partial y}+u_3\frac{\partial v_1}{\partial z}\right)
+ u_2\left(u_1\frac{\partial v_2}{\partial x}+u_2\frac{\partial v_2}{\partial y}+u_3\frac{\partial v_2}{\partial z}\right)
+ u_3\left(u_1\frac{\partial v_3}{\partial x}+u_2\frac{\partial v_3}{\partial y}+u_3\frac{\partial v_3}{\partial z}\right).
\tag{A.4}
\]
Using integration by parts and assuming that u = 0 on \(\partial\Omega\), we have
\[
\int_\Omega u_1\frac{\partial u_1}{\partial x_1}v_1
= \int_\Omega (u_1v_1)\frac{\partial u_1}{\partial x_1}
= -\int_\Omega \frac{\partial(v_1u_1)}{\partial x_1}u_1 + \int_{\partial\Omega} v_1u_1u_1\,n_1\,dS
= -\int_\Omega \frac{\partial(v_1u_1)}{\partial x_1}u_1
= -\int_\Omega v_1\frac{\partial u_1}{\partial x_1}u_1 - \int_\Omega u_1\frac{\partial v_1}{\partial x_1}u_1.
\]
Therefore, for each \(i, j \in \{1,2,3\}\),
\[
\int_\Omega v_i\, u_j\frac{\partial u_i}{\partial x_j}
= -\int_\Omega u_j\frac{\partial v_i}{\partial x_j}u_i - \int_\Omega v_i\frac{\partial u_j}{\partial x_j}u_i.
\tag{A.5}
\]
Summing over i and j and collecting terms, we have
\[
\sum_{i,j=1}^{3}\int_\Omega v_i\, u_j\frac{\partial u_i}{\partial x_j}
= -\sum_{i,j=1}^{3}\int_\Omega u_j\frac{\partial v_i}{\partial x_j}u_i
- \sum_{i=1}^{3}\int_\Omega v_i\,(\nabla\cdot u)\,u_i.
\tag{A.6}
\]
Writing this in vector notation, we have
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v
= \int_\Omega (u\cdot\nabla u_1)v_1 + (u\cdot\nabla u_2)v_2 + (u\cdot\nabla u_3)v_3
\]
\[
= -\int_\Omega (u\cdot\nabla v_1)u_1 - \int_\Omega v_1(\nabla\cdot u)u_1
- \int_\Omega (u\cdot\nabla v_2)u_2 - \int_\Omega v_2(\nabla\cdot u)u_2
- \int_\Omega (u\cdot\nabla v_3)u_3 - \int_\Omega v_3(\nabla\cdot u)u_3
\]
\[
= -\int_\Omega (u\cdot\nabla v_1)u_1 - \int_\Omega (u\cdot\nabla v_2)u_2 - \int_\Omega (u\cdot\nabla v_3)u_3
= -\int_\Omega ((u\cdot\nabla)v)\cdot u,
\tag{A.7}
\]
since \(\nabla\cdot u = 0\).
Hence,
\[
\int_\Omega ((u\cdot\nabla)u)\cdot v
= \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v + \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v
= \frac12\int_\Omega ((u\cdot\nabla)u)\cdot v - \frac12\int_\Omega ((u\cdot\nabla)v)\cdot u.
\]
A.2 Ohm's Law
We wish to show
\[
\int_\Omega (u\times B)\cdot K = -\int_\Omega (K\times B)\cdot u.
\]
This amounts to showing that for any vectors a, b, and c,
\[
\int_\Omega (a\times b)\cdot c = -\int_\Omega (c\times b)\cdot a.
\tag{A.8}
\]
To start, recall that the cross product of two vectors a and b is
\[
a\times b = \langle a_1,a_2,a_3\rangle \times \langle b_1,b_2,b_3\rangle
= \begin{vmatrix} i & j & k\\ a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\end{vmatrix}
= \begin{vmatrix} a_2 & a_3\\ b_2 & b_3\end{vmatrix} i
- \begin{vmatrix} a_1 & a_3\\ b_1 & b_3\end{vmatrix} j
+ \begin{vmatrix} a_1 & a_2\\ b_1 & b_2\end{vmatrix} k
= (a_2b_3-a_3b_2)\,i - (a_1b_3-a_3b_1)\,j + (a_1b_2-a_2b_1)\,k.
\tag{A.9}
\]
Now, we wish to show that the identity
\[
\int_\Omega (a\times b)\cdot c = -\int_\Omega (c\times b)\cdot a
\tag{A.10}
\]
is true.
First, note that \(a\cdot(b\times c) = (a\times b)\cdot c\):
\[
a\cdot(b\times c) = \langle a_1,a_2,a_3\rangle\cdot\left(\langle b_1,b_2,b_3\rangle\times\langle c_1,c_2,c_3\rangle\right)
= \langle a_1,a_2,a_3\rangle\cdot\left((b_2c_3-b_3c_2)\,i - (b_1c_3-b_3c_1)\,j + (b_1c_2-b_2c_1)\,k\right)
\]
\[
= a_1(b_2c_3-b_3c_2) - a_2(b_1c_3-b_3c_1) + a_3(b_1c_2-b_2c_1)
= a_1b_2c_3 - a_1b_3c_2 - a_2b_1c_3 + a_2b_3c_1 + a_3b_1c_2 - a_3b_2c_1
\]
\[
= (a_2b_3-a_3b_2)c_1 - (a_1b_3-a_3b_1)c_2 + (a_1b_2-a_2b_1)c_3 = (a\times b)\cdot c.
\tag{A.11}
\]
Since \(a\cdot b = b\cdot a\) and \(a\times b = -b\times a\),
\[
(a\times b)\cdot c = a\cdot(b\times c) = (b\times c)\cdot a = -(c\times b)\cdot a.
\tag{A.12}
\]
Therefore,
\[
\int_\Omega (a\times b)\cdot c = -\int_\Omega (c\times b)\cdot a.
\tag{A.13}
\]
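The pointwise identity \((a\times b)\cdot c = -(c\times b)\cdot a\) underlying (A.13) is easy to sanity-check numerically; a minimal sketch:

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

Vec3 cross(const Vec3& a, const Vec3& b) {
  return {a[1] * b[2] - a[2] * b[1],
          a[2] * b[0] - a[0] * b[2],
          a[0] * b[1] - a[1] * b[0]};
}

double dot(const Vec3& a, const Vec3& b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Returns (a x b).c + (c x b).a, which the identity says must vanish.
double identity_defect(const Vec3& a, const Vec3& b, const Vec3& c) {
  return dot(cross(a, b), c) + dot(cross(c, b), a);
}
```

For integer-valued test vectors the defect is exactly zero in floating point, since every product is computed exactly.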
Appendix B
Newton's Method Linearization
Recalling Newton's method,
\[
X^{(k+1)} = X^{(k)} - \left[H'\left(X^{(k)}\right)\right]^{-1} H\left(X^{(k)}\right),
\tag{B.1}
\]
with
with
\[
H(X) =
\begin{bmatrix}
-\nu\Delta u_1 + (u\cdot\nabla)u_1 + \frac{\partial p}{\partial x} - (J_2B_3 - J_3B_2) - F_1\\[2pt]
-\nu\Delta u_2 + (u\cdot\nabla)u_2 + \frac{\partial p}{\partial y} + (J_1B_3 - J_3B_1) - F_2\\[2pt]
-\nu\Delta u_3 + (u\cdot\nabla)u_3 + \frac{\partial p}{\partial z} - (J_1B_2 - J_2B_1) - F_3\\[2pt]
\eta^{-1}J_1 + \frac{\partial\phi}{\partial x} - (u_2B_3 - u_3B_2) - E_1\\[2pt]
\eta^{-1}J_2 + \frac{\partial\phi}{\partial y} + (u_1B_3 - u_3B_1) - E_2\\[2pt]
\eta^{-1}J_3 + \frac{\partial\phi}{\partial z} - (u_1B_2 - u_2B_1) - E_3\\[2pt]
\frac{\partial u_1}{\partial x} + \frac{\partial u_2}{\partial y} + \frac{\partial u_3}{\partial z}\\[2pt]
\frac{\partial J_1}{\partial x} + \frac{\partial J_2}{\partial y} + \frac{\partial J_3}{\partial z}
\end{bmatrix}
\]
and the variable \(X = (u_1, u_2, u_3, p, J_1, J_2, J_3, \phi)\), obtained from the equations
\[
-\nu\Delta u + (u\cdot\nabla)u + \nabla p - J\times B = F,
\qquad
\eta^{-1}J + \nabla\phi - u\times B = E,
\]
\[
\nabla\cdot u = 0 \quad\text{and}\quad \nabla\cdot J = 0.
\]
Also recall the Gateaux derivative
\[
H'_G\left(X^{(k)}\right)(\delta X) = \lim_{\epsilon\to 0}\frac{H\left(X^{(k)}+\epsilon\,\delta X\right) - H\left(X^{(k)}\right)}{\epsilon},
\tag{B.2}
\]
where \(\delta X = X^{(k+1)} - X^{(k)}\) and \(\delta X = (\delta u_1, \delta u_2, \delta u_3, \delta p, \delta J_1, \delta J_2, \delta J_3, \delta\phi)\). Note that \(\delta\) will be used for the increment, while \(\Delta\) is reserved for the Laplacian.
Since
\[
B = B_0 + B(J) = B_0 - \frac{\mu}{4\pi}\int_\Omega \frac{x-y}{|x-y|^3}\times J(y)\,dy
=
\begin{bmatrix}
(B_0)_1 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_2-y_2)J_3(y) - (x_3-y_3)J_2(y)}{|x-y|^3}\,dy\\[6pt]
(B_0)_2 + \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_3(y) - (x_3-y_3)J_1(y)}{|x-y|^3}\,dy\\[6pt]
(B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2(y) - (x_2-y_2)J_1(y)}{|x-y|^3}\,dy
\end{bmatrix},
\]
we have, writing each entry compactly in vector form (with each component of \(B_0 + B(J^{(k)})\) expanded as above),
\[
H\left(X^{(k)}\right) =
\begin{bmatrix}
-\nu\Delta u^{(k)} + \left(u^{(k)}\cdot\nabla\right)u^{(k)} + \nabla p^{(k)} - J^{(k)}\times\left(B_0 + B\left(J^{(k)}\right)\right) - F\\[2pt]
\eta^{-1}J^{(k)} + \nabla\phi^{(k)} - u^{(k)}\times\left(B_0 + B\left(J^{(k)}\right)\right) - E\\[2pt]
\nabla\cdot u^{(k)}\\[2pt]
\nabla\cdot J^{(k)}
\end{bmatrix}
\]
and
\[
H\left(X^{(k)}+\epsilon\,\delta X\right) =
\begin{bmatrix}
-\nu\Delta\left(u^{(k)}+\epsilon\,\delta u\right) + \left(\left(u^{(k)}+\epsilon\,\delta u\right)\cdot\nabla\right)\left(u^{(k)}+\epsilon\,\delta u\right) + \nabla\left(p^{(k)}+\epsilon\,\delta p\right) - \left(J^{(k)}+\epsilon\,\delta J\right)\times\left(B_0 + B\left(J^{(k)}+\epsilon\,\delta J\right)\right) - F\\[2pt]
\eta^{-1}\left(J^{(k)}+\epsilon\,\delta J\right) + \nabla\left(\phi^{(k)}+\epsilon\,\delta\phi\right) - \left(u^{(k)}+\epsilon\,\delta u\right)\times\left(B_0 + B\left(J^{(k)}+\epsilon\,\delta J\right)\right) - E\\[2pt]
\nabla\cdot\left(u^{(k)}+\epsilon\,\delta u\right)\\[2pt]
\nabla\cdot\left(J^{(k)}+\epsilon\,\delta J\right)
\end{bmatrix}.
\]
Then, since
\[
\left(u^{(k)}+\epsilon\,\delta u\right)\cdot\nabla\left(u_1^{(k)}+\epsilon\,\delta u_1\right)
= u^{(k)}\cdot\nabla u_1^{(k)} + \epsilon\,(\delta u\cdot\nabla)u_1^{(k)} + \epsilon\, u^{(k)}\cdot\nabla\,\delta u_1 + \epsilon^2\,(\delta u\cdot\nabla)\,\delta u_1
\]
and, for a representative Lorentz-force entry,
\[
\left(J_2^{(k)}+\epsilon\,\delta J_2\right)\left((B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)\left(J_2^{(k)}+\epsilon\,\delta J_2\right) - (x_2-y_2)\left(J_1^{(k)}+\epsilon\,\delta J_1\right)}{|x-y|^3}\,dy\right)
\]
\[
= J_2^{(k)}\left((B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2^{(k)} - (x_2-y_2)J_1^{(k)}}{|x-y|^3}\,dy\right)
- \epsilon\, J_2^{(k)}\,\frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)\,\delta J_2 - (x_2-y_2)\,\delta J_1}{|x-y|^3}\,dy
\]
\[
+ \epsilon\,\delta J_2\left((B_0)_3 - \frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)J_2^{(k)} - (x_2-y_2)J_1^{(k)}}{|x-y|^3}\,dy\right)
- \epsilon^2\,\delta J_2\,\frac{\mu}{4\pi}\int_\Omega \frac{(x_1-y_1)\,\delta J_2 - (x_2-y_2)\,\delta J_1}{|x-y|^3}\,dy,
\]
the limit
\[
\lim_{\epsilon\to 0}\frac{H\left(X^{(k)}+\epsilon\,\delta X\right) - H\left(X^{(k)}\right)}{\epsilon}
\]
equals, in vector form,
\[
\begin{bmatrix}
-\nu\Delta(\delta u) + (\delta u\cdot\nabla)u^{(k)} + \left(u^{(k)}\cdot\nabla\right)\delta u + \nabla(\delta p) - J^{(k)}\times B(\delta J) - \delta J\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\eta^{-1}\,\delta J + \nabla(\delta\phi) - u^{(k)}\times B(\delta J) - \delta u\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\nabla\cdot(\delta u)\\[2pt]
\nabla\cdot(\delta J)
\end{bmatrix},
\]
where \(B(\delta J)\) denotes the (linear) induced-field integral applied to \(\delta J\), and the \(\epsilon^2\) terms have vanished in the limit.
So, we have
\[
H'_G\left(X^{(k)}\right)(\delta X) =
\begin{bmatrix}
-\nu\Delta(\delta u) + (\delta u\cdot\nabla)u^{(k)} + \left(u^{(k)}\cdot\nabla\right)\delta u + \nabla(\delta p) - J^{(k)}\times B(\delta J) - \delta J\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\eta^{-1}\,\delta J + \nabla(\delta\phi) - u^{(k)}\times B(\delta J) - \delta u\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\nabla\cdot(\delta u)\\[2pt]
\nabla\cdot(\delta J)
\end{bmatrix}.
\tag{B.3}
\]
Recalling \(\delta X = X^{(k+1)} - X^{(k)}\), \(\delta X = (\delta u_1, \delta u_2, \delta u_3, \delta p, \delta J_1, \delta J_2, \delta J_3, \delta\phi)\), and Newton's method \(H'\left(X^{(k)}\right)(\delta X) = -H\left(X^{(k)}\right)\), we have
\[
\begin{bmatrix}
-\nu\Delta u^{(k+1)} + \left(u^{(k+1)}\cdot\nabla\right)u^{(k)} + \left(u^{(k)}\cdot\nabla\right)u^{(k+1)} + \nabla p^{(k+1)} - J^{(k)}\times B\left(J^{(k+1)}\right) - J^{(k+1)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\eta^{-1}J^{(k+1)} + \nabla\phi^{(k+1)} - u^{(k)}\times B\left(J^{(k+1)}\right) - u^{(k+1)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\nabla\cdot u^{(k+1)}\\[2pt]
\nabla\cdot J^{(k+1)}
\end{bmatrix}
\]
\[
-
\begin{bmatrix}
-\nu\Delta u^{(k)} + \left(u^{(k)}\cdot\nabla\right)u^{(k)} + \left(u^{(k)}\cdot\nabla\right)u^{(k)} + \nabla p^{(k)} - J^{(k)}\times B\left(J^{(k)}\right) - J^{(k)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\eta^{-1}J^{(k)} + \nabla\phi^{(k)} - u^{(k)}\times B\left(J^{(k)}\right) - u^{(k)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\nabla\cdot u^{(k)}\\[2pt]
\nabla\cdot J^{(k)}
\end{bmatrix}
= -H\left(X^{(k)}\right).
\]
Therefore,
\[
\begin{bmatrix}
-\nu\Delta u^{(k+1)} + \left(u^{(k+1)}\cdot\nabla\right)u^{(k)} - \left(u^{(k)}\cdot\nabla\right)u^{(k)} + \left(u^{(k)}\cdot\nabla\right)u^{(k+1)} + \nabla p^{(k+1)} - J^{(k)}\times B\left(J^{(k+1)}\right) + J^{(k)}\times B\left(J^{(k)}\right) - J^{(k+1)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\eta^{-1}J^{(k+1)} + \nabla\phi^{(k+1)} - u^{(k)}\times B\left(J^{(k+1)}\right) + u^{(k)}\times B\left(J^{(k)}\right) - u^{(k+1)}\times\left(B_0 + B\left(J^{(k)}\right)\right)\\[2pt]
\nabla\cdot u^{(k+1)}\\[2pt]
\nabla\cdot J^{(k+1)}
\end{bmatrix}
=
\begin{bmatrix} F\\ E\\ 0\\ 0 \end{bmatrix}.
\]
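The Newton update \(H'\left(X^{(k)}\right)\delta X = -H\left(X^{(k)}\right)\) has the same shape for any nonlinear system; as an illustration only (a toy two-unknown system, not the MHD equations), the sketch below forms an analytic Jacobian and solves the 2x2 linear system for the increment by Cramer's rule at each step.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Newton iteration for the toy system
//   H(x, y) = (x^2 + y^2 - 4, x*y - 1) = 0.
// Each step solves H'(X) dX = -H(X) and sets X <- X + dX.
std::array<double, 2> newton_solve(std::array<double, 2> X,
                                   int max_iters = 50,
                                   double tol = 1e-12) {
  for (int k = 0; k < max_iters; ++k) {
    const double h1 = X[0] * X[0] + X[1] * X[1] - 4.0;
    const double h2 = X[0] * X[1] - 1.0;
    if (std::fabs(h1) + std::fabs(h2) < tol) break;
    // Jacobian H'(X) = [[2x, 2y], [y, x]]
    const double a = 2.0 * X[0], b = 2.0 * X[1];
    const double c = X[1],       d = X[0];
    const double det = a * d - b * c;
    // Cramer's rule for [a b; c d] * (dx, dy) = (-h1, -h2)
    const double dx = (-h1 * d - b * (-h2)) / det;
    const double dy = (a * (-h2) - c * (-h1)) / det;
    X[0] += dx;
    X[1] += dy;
  }
  return X;
}
```

In the dissertation's setting the same structure holds, except that the increment system is the large discretized saddle-point problem above rather than a 2x2 solve.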
Appendix C
Iterative Solvers
C.1 Conjugate Gradient (CG) Method
We recall the conjugate gradient method from the point of view of steepest descent, where we try to find the minimum of the function \(\phi(x) = \frac12 x^tAx - x^tb\), where \(b\in\mathbb{R}^n\) and \(A\in\mathbb{R}^{n\times n}\) is symmetric positive definite. With \(\nabla\phi(x) = \left[\frac{\partial\phi}{\partial x_1}(x),\ldots,\frac{\partial\phi}{\partial x_k}(x),\ldots,\frac{\partial\phi}{\partial x_n}(x)\right]\), consider
\[
\frac{\partial\phi}{\partial x_k}(x)
= \frac{\partial}{\partial x_k}\left(\frac12 x^tAx - x^tb\right)
= \frac{\partial}{\partial x_k}\left(\frac12\langle Ax,x\rangle - \langle b,x\rangle\right)
= \frac{\partial}{\partial x_k}\left(\frac12\sum_{i=1}^n x_i\sum_{j=1}^n a_{ij}x_j - \sum_{i=1}^n b_ix_i\right)
\]
\[
= \frac12\sum_{i=1}^n\sum_{j=1}^n a_{ij}x_j\frac{\partial x_i}{\partial x_k}
+ \frac12\sum_{i=1}^n\sum_{j=1}^n a_{ij}x_i\frac{\partial x_j}{\partial x_k}
- \sum_{i=1}^n b_i\frac{\partial x_i}{\partial x_k}
= \frac12\sum_{j=1}^n a_{kj}x_j + \frac12\sum_{i=1}^n x_ia_{ik} - b_k
= \sum_{j=1}^n a_{kj}x_j - b_k = \mathrm{row}_k(A)\,x - b_k,
\]
where the last simplification uses the symmetry of A.
Then \(\nabla\phi(x) = Ax - b\), and an extreme value is attained when \(\nabla\phi = 0\), that is, when \(Ax = b\), and we have solved our system of equations. Recall that at a location \(x_n\), a function decreases most rapidly in the direction opposite its gradient, \(-\nabla\phi(x_n) = b - Ax_n = r_n\), the direction of steepest descent. If the gradient is nonzero, we can decrease \(\phi(x_n)\) by moving in the direction of \(r_n\). Due to the symmetry and positive definiteness of A, the quadratic form \(\phi(x)\) attains a minimum at some location \(x_n + \alpha r_n\) in this direction. A good example to illustrate \(\phi(x)\) is the case \(n = 2\), \(x = (x_1, x_2)\), where \(\phi(x)\) is an elliptic paraboloid. To determine \(\alpha\) and the location of the minimum in the direction \(r_n\), we find the critical value of
\[
\tilde\phi(\alpha) = \phi(x_n + \alpha r_n)
= \frac12(x_n + \alpha r_n)^tA(x_n + \alpha r_n) - (x_n + \alpha r_n)^tb
= \frac12\langle Ax_n,x_n\rangle + \alpha\langle Ax_n,r_n\rangle + \frac{\alpha^2}{2}\langle Ar_n,r_n\rangle - \langle b,x_n\rangle - \alpha\langle b,r_n\rangle.
\]
Differentiating,
\[
\frac{\partial\tilde\phi}{\partial\alpha}(\alpha) = \langle Ax_n,r_n\rangle + \alpha\langle Ar_n,r_n\rangle - \langle b,r_n\rangle
= \langle Ax_n - b,r_n\rangle + \alpha\langle Ar_n,r_n\rangle
= -\langle r_n,r_n\rangle + \alpha\langle Ar_n,r_n\rangle.
\]
Setting the derivative to zero, \(\frac{\partial\tilde\phi}{\partial\alpha}(\alpha) = 0\), we have \(\alpha = \frac{\langle r_n,r_n\rangle}{\langle Ar_n,r_n\rangle}\). The steepest descent algorithm is given below.
Method of Steepest Descent
int k = 0;
vector x_k;                      // initial guess x_0
vector r_k = b - A x_k;          // initial residual r_0
while (r_k != 0)
    k = k + 1;
    alpha_k = <r_{k-1}, r_{k-1}> / <A r_{k-1}, r_{k-1}>;
    x_k = x_{k-1} + alpha_k * r_{k-1};
    r_k = b - A x_k;
However, if the matrix A is ill-conditioned, determining search directions from the gradients alone, as in steepest descent, may not be the most efficient approach in terms of the number of iterations to convergence. By bringing the structure of A into the algorithm to determine the search directions more effectively, we obtain the conjugate gradient method [44], shown below. With \(\beta_k = 0\) the conjugate gradient algorithm reduces to the method of steepest descent.
Conjugate Gradient (CG) Method
int k = 0;
vector x_k;                      // initial guess x_0
vector r_k = b - A x_k;          // initial residual r_0
while (r_k != 0)
    k = k + 1;
    if (k == 1)
        p_1 = r_0;
    else
        beta_k = <r_{k-1}, r_{k-1}> / <r_{k-2}, r_{k-2}>;
        p_k = r_{k-1} + beta_k * p_{k-1};
    end
    alpha_k = <r_{k-1}, r_{k-1}> / <A p_k, p_k>;
    x_k = x_{k-1} + alpha_k * p_k;
    r_k = r_{k-1} - alpha_k * A p_k;
The operations per iteration for this algorithm are two inner products, one matrix-vector
product, three scalar-vector products, two scalar-scalar multiplications and three additions.
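A dense, standard-library-only C++ sketch of the CG listing above, offered as a small-scale stand-in for the distributed sparse Trilinos solvers used by the dissertation code. The exact test "r != 0" is replaced by the floating-point stopping criterion ||r|| <= tol.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense row-major SPD matrix

double dot(const Vec& a, const Vec& b) {
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

Vec matvec(const Mat& A, const Vec& x) {
  Vec y(x.size(), 0.0);
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < A[i].size(); ++j) y[i] += A[i][j] * x[j];
  return y;
}

// Conjugate gradient for symmetric positive definite A, following the
// listing above: one matvec, two inner products per iteration.
Vec conjugate_gradient(const Mat& A, const Vec& b, double tol = 1e-10) {
  Vec x(b.size(), 0.0);  // x_0 = 0
  Vec r = b;             // r_0 = b - A x_0
  Vec p = r;             // p_1 = r_0
  double rr = dot(r, r);
  for (std::size_t k = 0; k < b.size() && std::sqrt(rr) > tol; ++k) {
    const Vec Ap = matvec(A, p);
    const double alpha = rr / dot(Ap, p);     // alpha_k = <r,r>/<Ap,p>
    for (std::size_t i = 0; i < x.size(); ++i) {
      x[i] += alpha * p[i];                   // x_k = x_{k-1} + alpha p_k
      r[i] -= alpha * Ap[i];                  // r_k = r_{k-1} - alpha A p_k
    }
    const double rr_new = dot(r, r);
    const double beta = rr_new / rr;          // beta_k = <r_k,r_k>/<r_{k-1},r_{k-1}>
    for (std::size_t i = 0; i < p.size(); ++i)
      p[i] = r[i] + beta * p[i];              // p_{k+1} = r_k + beta p_k
    rr = rr_new;
  }
  return x;
}
```

In exact arithmetic CG terminates in at most n iterations, which is why the loop is bounded by the system size; in practice the residual tolerance triggers first for well-preconditioned systems.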
If the system matrix A is sparse with m nonzeros per row and m<