**Parallel Scientific Computing in C++ and MPI**

*A seamless approach to parallel algorithms and their implementation*

George Em Karniadakis and Robert M. Kirby II

### Cambridge University Press

Scientific computing is by its very nature a practical subject - it requires tools and a lot of practice. To solve realistic problems we need not only fast algorithms but also a combination of good tools and fast computers. This is the subject of the current book, which emphasizes all three equally: algorithms, tools, and computers. Oftentimes such concepts and tools are taught *serially* across different courses and different textbooks, and hence the interconnection between them is not immediately apparent. We believe that such a close integration is important from the outset.

The book starts with a heavy dose of C++ and basic mathematical and computational concepts, and it ends by emphasizing advanced parallel algorithms that are used in modern simulations. We have tried to make this book fun to read and to somewhat demystify the subject, and thus the style is sometimes informal and personal. It may seem that this happens at the expense of rigor, and indeed we have tried to limit notation and theorem proving. Instead, we emphasize concepts and useful tricks-of-the-trade with many code segments, remarks, reminders, and warnings throughout the book.

The material of this book has been taught at different times to students in engineering, physics, computer science, and applied mathematics at Princeton University, Brown University, and MIT over the last 15 years. Different segments have been taught to undergraduates and graduates, to novices as well as to experts. To this end, on all three subjects covered, we start with simple introductory concepts and proceed to more advanced topics - bandwidth, we believe, is one strength of this book.

We have been involved in large-scale parallel computing for many years, from benchmarking new systems to solving complex engineering problems in computational mechanics. We represent two different generations of computational science and supercomputing, and our areas of expertise are both overlapping and complementary. The material we selected to include in this book is based on our experiences and needs as computational scientists for high-order accuracy, modular code, and domain decomposition. These are necessary ingredients for pushing the envelope in simulation science, and they allow one to test new theories and concepts or to solve very large specific engineering problems accurately.

In addition to integrating C++ and MPI concepts and programs into the text, we also provide with this book a *software suite* containing all the functions and programs discussed.

It is our belief, as stated earlier, that mastery of this subject requires both a knowledge of the tools and substantial practice using the tools. Part of the integration that we are attempting to achieve is attained when the reader is able to go immediately from the textbook to the computer to experiment with the concepts that have been presented. We envision the software suite allowing the reader to do the following: to verify the concepts presented in the book by using the programs that are provided; to extend the programs in the book to implement concepts that may have been discussed but not programmed; and to tackle problems different from those presented, using the software provided.


The current book is appropriate for use by students in engineering and physics, computer
science, and applied mathematics. It is designed to be more like a textbook and less of
a research monograph. The material can be used to ﬁll two semesters with the following
breakdown: The ﬁrst semester will cover chapters 1 to 5 at the senior undergraduate or ﬁrst
year graduate level. The second semester will cover the remainder of the book in a ﬁrst or
second year graduate course. Chapters 1 to 5 cover all the basic concepts in algorithms, C++,
and MPI. Chapters 6 to 10 cover discretization of diﬀerential equations and corresponding
solvers, and present more advanced C++ and MPI tools. The material in chapter 3 on
approximation of functions and discrete data is fundamental and precedes other topics. In
the basic material on discretization, we separated *explicit* from *implicit*approaches because
the parallel computational complexity of the two is fundamentally diﬀerent.

A lighter course, e.g. a quarter course or a lower level undergraduate course, could be based on chapters 1 to 5 by leaving out the MPI material and possibly other advanced topics such as wavelets, advanced quadrature rules, and systems of nonlinear equations.

There are other possibilities as well. A graduate-level course on numerical linear algebra can be based on sections 4.1.6, 4.1.7 and chapters 7 to 10. Assuming that the student has a background in C++ or even in another high-performance language, the addition of the MPI material in sections 2.3, 3.4, 4.3 and 5.13 to the above will constitute one full-semester course on parallel numerical linear algebra. Another possibility for a quarter course is to simply teach the algorithms in chapters 5 to 8, covering traditional numerical analysis. Supplementary notes from the instructor, e.g. theorem proofs and more case studies, can make this a full-semester course.

The book is designed so that it can be used with or without the C++ and MPI tools and associated concepts, but we strongly encourage the instructor to teach the course as a seamless integration of both algorithms and tools.

**Acknowledgements**

We are grateful to Dr. Ma Xia and Dr. C. Evangelinos for their help and advice regarding the material of this topic and for some of the ﬁgures that they provided. We would also like to thank Ms. Madeline Brewster for her help in formatting the book and for typing a major part of it. The ﬁrst author is grateful for the many years of funding by the Oﬃce of Naval Research, the Air Force Oﬃce of Scientiﬁc Research, and the Department of Energy.

Finally, we would like to thank our families for their continuous love, patience, and understanding, especially during this long project.

Providence, Rhode Island, USA George Em Karniadakis

Salt Lake City, Utah, USA Robert M. Kirby II


**Contents**

**1** **Scientiﬁc Computing and Simulation Science** **2**

1.1 What is Simulation? . . . 2

1.2 A Seamless Approach Path . . . 4

1.3 The Concept of Programming Language . . . 6

1.4 Why C++ and What is MPI? . . . 8

1.5 What About OpenMP? . . . 10

1.6 Algorithms and Top Ten List . . . 10

**2** **Basic Concepts and Tools** **12**
2.1 Introduction to C++ . . . 13

2.1.1 Two Basic Concepts in C++ . . . 14

2.1.2 Learning the Syntax and Other Basic Commands . . . 21

2.1.3 Learning to Print . . . 34

2.1.4 Learning to Read . . . 35

2.1.5 How to Program in Style . . . 37

2.2 Mathematical and Computational Concepts . . . 41

2.2.1 Notation . . . 41

2.2.2 Binary Numbers and Round-oﬀ . . . 41

2.2.3 Condition Number . . . 44

2.2.4 Vector and Matrix Norms . . . 44

2.2.5 Eigenvalues and Eigenvectors . . . 46

2.2.6 Memory Management . . . 48

2.2.7 Basic Linear Algebra - BLAS . . . 52

2.2.8 Exploiting the Structure of Sparse Matrices . . . 61

2.2.9 Gram-Schmidt Vector Orthogonalization . . . 62

2.3 Parallel Computing . . . 70

2.3.1 From Supercomputing to Soupercomputing . . . 70

2.3.2 Mathematical Parallelism and Recursive-Doubling . . . 76

2.3.3 Amdahl’s Law . . . 79

2.3.4 MPI - Message Passing Interface . . . 80

2.4 Homework Problems . . . 91

**3** **Approximation** **94**
3.1 Polynomial Representation . . . 95

3.1.1 Vandermonde and Newton Interpolation . . . 95

3.1.3 Lagrangian Interpolation . . . 114

3.1.4 The Runge Phenomenon . . . 117

3.1.5 Chebyshev Polynomials . . . 120

3.1.6 Hermite Interpolation and Splines . . . 126

3.1.7 Least-Squares Approximation . . . 131

3.1.8 Introduction to Classes . . . 142

3.1.9 Multi-Dimensional Interpolations . . . 153

3.1.10 Simple Domains . . . 154

3.1.11 Curvilinear Domains . . . 160

3.2 Fourier Series Representation . . . 163

3.2.1 Convergence . . . 163

3.2.2 Periodic Extension of Functions . . . 166

3.2.3 Diﬀerentiation and the Lanczos Filter . . . 168

3.2.4 Trigonometric Interpolation . . . 171

3.2.5 Noisy Data . . . 173

3.2.6 Matrix Representation . . . 174

3.2.7 The Fast Fourier Transform (FFT) . . . 176

3.2.8 The Fastest Fourier Transform in the West - FFTW . . . 178

3.3 Wavelet Series Representation . . . 181

3.3.1 Basic Relations . . . 181

3.3.2 Dilation Equation . . . 185

3.3.3 Discrete Wavelet Transform: Mallat’s Algorithm . . . 188

3.3.4 Some Orthonormal Wavelets . . . 190

3.4 Back to Parallel Computing: Send and Receive . . . 197

3.5 Homework Problems . . . 201

3.5.1 Homework Problems for Section 3.1 . . . 201

3.5.2 Homework Problems for Section 3.2 . . . 205

3.5.3 Homework Problems for Section 3.3 . . . 206

**4** **Roots and Integrals** **207**
4.1 Root Finding Methods . . . 208

4.1.1 Polynomial Equations . . . 210

4.1.2 Fixed Point Iteration . . . 213

4.1.3 Newton-Raphson Method . . . 217

4.1.4 Passing Functions to Functions in C++ . . . 221

4.1.5 Secant Method . . . 226

4.1.6 Systems of Nonlinear Equations . . . 227

4.1.7 Solution via Minimization: Steepest Descent and Conjugate Gradients . . . 230

4.2 Numerical Integration Methods . . . 240

4.2.1 Simple Integration Algorithms . . . 240

4.2.2 Advanced Quadrature Rules . . . 248

4.2.3 Multi-Dimensional Integration . . . 265

4.3 Back to Parallel Computing: Reduction . . . 268

4.4.1 Homework Problems for Section 4.1 . . . 275

4.4.2 Homework Problems for Section 4.2 . . . 279

**5** **Explicit Discretizations** **281**
5.1 Explicit Space Discretizations . . . 282

5.1.1 Basics . . . 282

5.1.2 Uniform Grids . . . 285

5.1.3 MPI Parallel Implementation of Finite Diﬀerences . . . 296

5.1.4 Multi-Dimensional Arrays in C++ . . . 304

5.1.5 Non-Uniform Grids . . . 308

5.1.6 One-Dimensional Boundary Value Problem . . . 314

5.1.7 Multi-Dimensional Discretizations . . . 316

5.2 Explicit Time Discretizations . . . 323

5.2.1 Multi-Step Schemes . . . 323

5.2.2 Convergence: Consistency and Stability . . . 326

5.2.3 Stability and Characteristic Polynomials . . . 328

5.2.4 Runge-Kutta Methods . . . 334

5.2.5 Stability of Runge-Kutta Methods . . . 338

5.3 Homework Problems . . . 340

**6** **Implicit Discretizations** **345**
6.1 Implicit Space Discretizations . . . 346

6.1.1 Diﬀerence Operators . . . 346

6.1.2 Method of Undetermined Coeﬃcients . . . 349

6.1.3 One-Dimensional Boundary Value Problem . . . 357

6.1.4 Thomas Algorithm for Tridiagonal Systems . . . 359

6.1.5 Parallel Algorithm for Tridiagonal Systems . . . 367

6.2 Implicit Time Discretizations . . . 378

6.2.1 Fundamental Theorems for Multi-Step Methods . . . 381

6.2.2 Stability of Stiﬀ ODEs . . . 381

6.2.3 Second-Order Initial Value Problems . . . 384

6.2.4 How to March in Time . . . 386

6.3 Homework Problems . . . 387

**7** **Relaxation: Discretization and Solvers** **390**
7.1 Discrete Models of Unsteady Diﬀusion . . . 391

7.1.1 Temporal and Spatial Discretization . . . 392

7.1.2 Accuracy of Diﬀerence Equation . . . 393

7.1.3 Stability of Diﬀerence Equation . . . 394

7.1.4 Spectrum of the Diﬀusion Operator . . . 403

7.1.5 Multi-Dimensional Time-Space Stencils . . . 409

7.2 Iterative Solvers . . . 416

7.2.1 Jacobi Algorithm . . . 416

7.2.3 Gauss-Seidel Algorithm . . . 431

7.2.4 Parallel (Black-Red) Gauss-Seidel Algorithm . . . 433

7.2.5 Successive Acceleration Techniques - SOR . . . 436

7.2.6 Symmetric Successive Acceleration Techniques - SSOR . . . 438

7.2.7 SSOR with Chebyshev Acceleration . . . 439

7.2.8 Convergence Analysis of Iterative Solvers . . . 441

7.2.9 Relaxed Jacobi and Gauss-Seidel . . . 445

7.2.10 The Multigrid Method . . . 449

7.3 Homework Problems . . . 462

**8** **Propagation: Numerical Diﬀusion and Dispersion** **466**
8.1 Advection Equation . . . 467

8.1.1 Dispersion and Diﬀusion . . . 467

8.1.2 Other Advection Equations . . . 469

8.1.3 First-Order Discrete Schemes . . . 470

8.1.4 High-Order Discrete Schemes . . . 482

8.1.5 Eﬀects of Boundary Conditions . . . 493

8.2 Advection-Diﬀusion Equation . . . 497

8.2.1 Discrete Schemes . . . 497

8.2.2 Eﬀects of Boundary Conditions . . . 505

8.3 MPI: Non-Blocking Communications . . . 509

8.4 Homework Problems . . . 514

**9** **Fast Linear Solvers** **517**
9.1 Gaussian Elimination . . . 518

9.1.1 LU Decomposition . . . 520

9.1.2 To Pivot or Not to Pivot? . . . 524

9.1.3 Parallel LU Decomposition . . . 530

9.1.4 Parallel Back Substitution . . . 534

9.1.5 Gaussian Elimination and Sparse Systems . . . 546

9.1.6 Parallel Cyclic Reduction for Tridiagonal Systems . . . 547

9.2 Cholesky Factorization . . . 559

9.3 QR Factorization and Householder Transformation . . . 560

9.3.1 Hessenberg and Tridiagonal Reduction . . . 568

9.4 Preconditioned Conjugate Gradient Method - PCGM . . . 572

9.4.1 Convergence Rate of CGM . . . 572

9.4.2 Preconditioners . . . 573

9.4.3 Toeplitz Matrices and Circulant Preconditioners . . . 577

9.4.4 Parallel PCGM . . . 578

9.5 Non-Symmetric Systems . . . 585

9.5.1 The Arnoldi Iteration . . . 586

9.5.2 GMRES . . . 590

9.5.3 GMRES(k) . . . 594

9.5.4 Preconditioning GMRES . . . 597

9.5.5 Parallel GMRES . . . 597

9.6 What Solver to Choose? . . . 598

9.7 Available Software for Fast Solvers . . . 601

9.8 Homework Problems . . . 602

**10 Fast Eigensolvers** **608**
10.1 Local Eigensolvers . . . 609

10.1.1 Basic Power Method . . . 609

10.1.2 Inverse Shifted Power Method . . . 612

10.2 Householder Deﬂation . . . 616

10.3 Global Eigensolvers . . . 623

10.3.1 The QR Eigensolver . . . 623

10.3.2 The Hessenberg QR Eigensolver . . . 625

10.3.3 Shifted QR Eigensolver . . . 625

10.3.4 The Symmetric QR Eigensolver: Wilkinson Shift . . . 627

10.3.5 Parallel QR Eigensolver: Divide-and-Conquer . . . 627

10.3.6 The Lanczos Eigensolver . . . 635

10.4 Generalized Eigenproblems . . . 638

10.4.1 The QZ Eigensolver . . . 639

10.4.2 Singular Eigenproblems . . . 639

10.4.3 Polynomial Eigenproblems . . . 640

10.5 Arnoldi Method: Non-Symmetric Eigenproblems . . . 640

10.6 Available Software for Eigensolvers . . . 641

10.7 Homework Problems . . . 643

**A C++ Basics** **646**
A.1 Compilation Guide . . . 646

A.2 C++ Basic Data Types . . . 647

A.3 C++ Libraries . . . 647

A.3.1 Input/Output Library – iostream.h . . . 647

A.3.2 Input/Output Manipulation Library – iomanip.h . . . 648

A.3.3 Mathematics Library – math.h . . . 648

A.4 Operator Precedence . . . 649

A.5 C++ and BLAS . . . 649

**B MPI Basics** **651**
B.1 Compilation Guide . . . 651

B.2 MPI Commands . . . 652

B.2.1 Predeﬁned Variable Types in MPI . . . 652

B.2.2 Predeﬁned Reduction Operators in MPI . . . 653

B.2.3 MPI Function Declarations . . . 653

B.2.4 MPI Constants and Deﬁnitions . . . 672

**Chapter 1**

**Scientific Computing and Simulation Science**

**1.1** **What is Simulation?**

Science and engineering have undergone a major transformation at the research as well as at the development and technology level. The modern scientist and engineer spend more and more time in front of a laptop, a workstation, or a parallel supercomputer and less and less time in the physical laboratory or in the workshop. The virtual wind tunnel and the virtual biology lab are not a thing of the future, they are here! The old approach of “cut-and-try” has been replaced by “simulate-and-analyze” in several key technological areas such as aerospace applications, synthesis of new materials, design of new drugs, chip processing and microfabrication, etc. The new discipline of nanotechnology will be based primarily on large-scale computations and numerical experiments. The methods of scientific analysis and engineering design are changing continuously, affecting both our approach to the phenomena that we study as well as the range of applications that we address. While there is a lot of software available to be used as almost a “black-box,” working in new application areas requires good knowledge of fundamentals and mastering of effective new tools.

In the classical scientific approach, the physical system is first simplified and set in a form that suggests what type of phenomena and processes may be important, and correspondingly what experiments are to be conducted. In the absence of any known governing equations, dimensional inter-dependence between physical parameters can guide laboratory experiments in identifying key parametric studies. The database produced in the laboratory is then used to construct a simplified “engineering” model which, after field-test validation, will be used in other research, product development, and design, and may possibly lead to new technological applications. This approach has been used almost invariably in every scientific discipline, i.e., engineering, physics, chemistry, biology, etc.

The simulation approach follows a parallel path but with some signiﬁcant diﬀerences.

*First*, the phase of the physical model analysis is more elaborate: The physical system is cast in a form governed by a set of partial differential equations, which represent continuum approximations to microscopic models. Such approximations are not possible for all systems, and sometimes the microscopic model should be used directly. *Second*, the laboratory experiment is replaced by simulation, i.e., a numerical experiment based on a discrete model. Such a model may represent a discrete approximation of the continuum partial differential equations, or it may simply represent a statistical representation of the microscopic model. Finite difference approximations on a grid are examples of the first case, and Monte Carlo methods are examples of the second case. In either case, these algorithms have to be converted to software using an appropriate computer language, debugged, and run on a workstation or a parallel supercomputer. The output is usually a large number of files of a few Megabytes to hundreds of Gigabytes, being especially large for simulations of time-dependent phenomena.

To be useful, this numerical database needs to be put into graphical form using various vi- sualization tools, which may not always be suited for the particular application considered.

Visualization can be especially useful during simulations where interactivity is required as the grid may be changing or the number of molecules may be increasing.

The simulation approach has already been followed by the majority of researchers across disciplines in the last few decades. The question is whether this is a new science, and how one could formally obtain such skills. Moreover, does this constitute fundamental new knowledge, or is it a “mechanical procedure,” an ordinary skill that a chemist, a biologist or an engineer will acquire easily as part of “training on the job” without specific formal education? It seems that the time has arrived where we need to reconsider boundaries between disciplines and reformulate the education of the future *simulation scientist*, an inter-disciplinary scientist.

Let us re-examine some of the requirements following the various steps in the simulation approach. The *first task* is to select the right representation of the physical system by making consistent assumptions in order to derive the governing equations and the associated boundary conditions. The conservation laws should be satisfied; the entropy condition should not be violated; the uncertainty principle should be honored. The *second task* is to develop the right algorithmic procedure to discretize the continuum model or represent the dynamics of the atomistic model. The choices are many, but which algorithm is the most accurate one, or the simplest one, or the most efficient one? These algorithms do not belong to a discipline! Finite elements, first developed by the famous mathematician Courant and re-discovered by civil engineers, have found their way into every engineering discipline, physics, geology, etc. Molecular dynamics simulations are practiced by chemists, biologists, material scientists, and others. The *third task* is to compute efficiently in the ever-changing world of supercomputing. How efficient the computation is translates into how realistic a problem can be solved, and therefore how useful the results can be to applications. The *fourth task* is to assess the accuracy of the results in cases where no direct confirmation from physical experiments is possible, such as in nanotechnology, in biosystems, or in astrophysics. Reliability of the predicted numerical answer is an important issue in the simulation approach, as some of the answers may lead to new physics, or to false physics contained in the discrete model or induced by the algorithm but not derived from the physical problem. *Finally*, visualizing the simulated phenomenon, in most cases in three-dimensional space and in time, by employing proper computer graphics (a separate specialty on its own) completes the full simulation cycle. The rest of the steps followed are similar to the classical scientific approach.

In classical science we are dealing with matter and therefore *atoms*, but in simulation we are dealing with information and therefore *bits*, so it is atoms versus bits! We should, therefore, recognize the simulation scientist as a separate scientist, the same way we recognized just a few decades ago the computer scientist as different than the electrical engineer or the applied mathematician. The new scientist is certainly not a computer scientist although she should be computer literate in both software and hardware. She is not a physicist although she needs a sound physics background. She is not an applied mathematician although she needs expertise in mathematical analysis and approximation theory.

Figure 1.1: Definition of scientific computing as the intersection of numerical mathematics, computer science, and modeling.

With the rapid and simultaneous advances in software and computer technology, especially commodity computing, the so-called *soupercomputing*, every scientist and engineer will have on her desk an advanced simulation kit of tools consisting of a software library and multi-processor computers that will make analysis, product development, and design more efficient and cost-effective. But what the future scientists and engineers will need, first and foremost, is a solid inter-disciplinary education.

Scientiﬁc computing is the heart of simulation science, and this is the subject of this book. The emphasis is on a balance between classical and modern elements of numerical mathematics and of computer science, but we have selected the topics based on broad mod- eling concepts encountered in physico-chemical and biological sciences, or even economics (see ﬁgure 1.1).

**1.2** **A Seamless Approach Path**

Our aim in writing this book has been to provide the student, the future simulation scientist, with a *seamless* approach to numerical algorithms, modern programming techniques, and parallel computing. Oftentimes such concepts and tools are taught *serially* across different courses and different textbooks, and hence the interconnection between them is not immediately apparent. The necessity of integrating concepts and tools usually comes after such courses are concluded, e.g. during a first job or a thesis project, thus forcing the student to synthesize what is perceived to be three *independent subfields* into one in order to produce a solution. Although this process is undoubtedly valuable, it is time-consuming and in many cases it may not lead to an effective combination of concepts and tools. Moreover, from the pedagogical point of view, the integrated seamless approach can stimulate the student simultaneously through the eyes of multiple disciplines, thus leading to enhanced understanding of subjects in scientific computing.

Figure 1.2: Simultaneous integration of concepts shown in (a) in contrast with the classical serial integration shown in (b).

As discussed in the previous section, in the scientific simulation approach there are several successive stages that lead:

1. From the real-world problem to its mathematical formulation;

2. From the mathematical description to the computer implementation and solution; and

3. From the numerical solution to visualization and analysis.

In this book, we concentrate on stage (2) which includes not only the mathematics of nu- merical linear algebra and discretization but also the implementation of these concepts in C++ and MPI.

There are currently several excellent textbooks and monographs on these topics, but without the type of integration that we propose. For example, the book by Golub & Ortega [45] introduces pedagogically all the basic parallel concepts, but a gap remains between the parallel formulation and its implementation. Similarly, the books by Trefethen & Bau [88] and Demmel [26] provide rigor and great insight into numerical linear algebra algorithms, but they do not provide sufficient material on discretization and implementation. On the other hand, popular books on C++ (e.g., by Stroustrup [86]) and MPI (e.g., by Pacheco [73]) are references that teach programming using disconnected algorithmic examples, which is useful for acquiring general programming skills but not for parallel scientific computing. Our book treats numerics, parallelism, and programming equally and simultaneously by placing the reader at a vantage point between the three areas, as shown in the schematic of figure 1.2(a), and in contrast with the classical approach of connecting the three subjects serially as illustrated in figure 1.2(b).

**1.3** **The Concept of Programming Language**

In studying computer languages, we want to study a new way of interacting with the computer. Most people are familiar with software purchased from the local computer store, ranging from word processors and spreadsheets to interactive games. But have you ever wondered how these things are created? How do you actually “write” software? Throughout this book we will teach, through both lecture and example, how to create computer software that solves scientific problems. Our purpose is not to teach you how to write computer games and the like, but the knowledge gained here can be applied to your own software endeavors.

It has been stated by some that the computer is a pretty dumb device, in that it only understands two things – *on* and *off*. Like sending Morse code over a telegraph wire with signals of dots and dashes, the computer uses sequences of zeros and ones as its language. The zeros and ones may seem inefficient, but it is not just the data used, but the rules applied to the data, which make it powerful. This concept, in theory, is no different than human language. If we were to set before you a collection of symbols, say a b c d ... z, and indicate to you that you can use these to express even the most complex thoughts and emotions of the human mind and heart, you would think we were crazy. Just 26 little symbols? How can this be? We know, however, that it is not merely the symbols that are important, but the rules used to combine the symbols. If you adhere to the rules defined by the English language, then books like this can be written using merely combinations of the 26 characters! How is this similar to the computer? The computer is a complex device for executing instructions. These instructions are articulated by using our *two-base* characters, **0** and **1**, along with a collection of rules for combining them together. This brings us to our first axiom:

**Axiom I:** *Computers are machines which execute instructions. If someone is not telling the computer what to do, it does nothing.*

Most people have had some experience with computers, and immediately they will read this statement and say: “Hey, I have had my computer do all kinds of things that I didn’t want!” Ah, but read the axiom carefully. The key to this axiom is the use of the term *someone*. The one thing to keep in mind is that some human, or collection of humans, developed software to tell the computer what to do. At a relatively low level, this would be the people who wrote the operating system used by the computer. At a higher level, this would be the people who developed the word processor or game that you were using. In both cases, however, someone determined how the computer would act and react to your input. We want you, after reading this book and understanding the concepts herein, to be able to be in the driver’s seat. This leads us to our second axiom:

**Axiom II:** *Computer programming languages allow humans a simplified means of giving the computer instructions.*

We tell you that we want you to be in the driver’s seat, and you tell us “I don’t want to learn how to communicate in zeros and ones ... learning English was hard enough!” You can imagine how slowly the computer age would have progressed if every programming class consisted of the following lecture scheme. Imagine the first day of class. On the first day, the instructor tells you that you will be learning the two basic components of the computer language today: 0 and 1. He may force you to say zero and one a few times, and then write zero and one many times on a piece of paper for practice, but then, what else would there be to learn concerning your character set? Class dismissed. Then, for the rest of the semester, you would spend your time learning how to combine zeros and ones to get the computer to do what you want. Your first assignment might be to add two numbers a and b, and to store the result in c (i.e., c = a + b). You end up with something that looks like:

01011001010001000111010101000100
01011001011100100010100101000011
00111010101000100111010100100101
01011101010101010101010000111101

Seems like a longwinded way of saying

*c* = *a* + *b*?

But this is what the computer understands, so this is how we must communicate with it. However, we do not communicate in this fashion. Human language and thought use a higher abstraction than this. How can we bridge this gap? We do so via programming languages; see figure 1.3.

Figure 1.3: Programming languages provide us a means of bridging the gap between the computer and the human.

The first programming language we will mention is *assembly*. The unique property of assembly is that for each instruction there is a one-to-one correspondence between a command in assembly and a computer-understandable command (in zeros and ones). For instance, instead of writing

01011001010001000111010101000100

as a command, you could write ‘load a $1’. This tells the computer to load the contents of the memory location denoted by ‘a’ into register $1 in the computer’s CPU (Central Processing Unit). This is much better than before: obviously, this sequence of commands is much easier for the human to understand. This was a good start, but assembly is still considered a “low-level language”. By low-level we mean that one instruction in assembly is equal to one computer instruction. But as we said earlier, we want to be able to think on a *higher level*. Hence came the introduction of “higher-level” languages, those in which one instruction in the higher-level language equals one or more computer-level instructions. We want a computer language where we can say ‘c = a + b’, and this would be equivalent to saying:

load a $1
load b $2
add $1 $2 $3
save $3 c

One high-level instruction was equivalent to four lower-level instructions (here written in pseudo-assembly so that you can follow what is going on). This is preferable for many reasons. For one thing, we as humans would like to spend our time thinking about how to solve the problem, not just trying to remember (and write) all the assembly code! Secondly, by writing in a higher-level language, we can write code that can work on multiple computers, because a compiler can translate the higher-level code into the assembly code of the processor on which we are running.

As you read through this book and do the exercises found herein, always be mindful that our goal is to utilize the computer for accomplishing scientiﬁc tasks encountered in simulation science. At a high-level, there is a science or engineering problem to solve, and we want to use the computer as a tool for solving the problem. The means by which we will use the computer is through the writing and execution of programs written using the computing language C++ and the parallel message passing libraries of MPI.

**1.4** **Why C++ and What is MPI?**

The algorithms we present in the book can certainly be implemented in other languages, e.g. FORTRAN or Java, as well as with other communication libraries, e.g. PVM (Parallel Virtual Machine). However, we commit to a specific language and parallel library in order to provide
the student with the immediate ability to experiment with the concepts presented. To this
end, we have chosen C++ as our programming language for a multitude of reasons: *First*, it provides an object-oriented infrastructure that accommodates a natural breakdown of the problem into a collection of data structures and operations on those structures. *Secondly*, the use of C++ transcends many disciplines beyond engineering, where traditionally FORTRAN has been the prevailing language. *Thirdly*, C++ is a language naturally compatible with the basic algorithmic concepts of

*•* partitioning,

*•* recursive function calling,

*•* dynamic memory allocation, and

*•* encapsulation.

Similarly, we commit to MPI (Message Passing Interface) as a message-passing library because it accommodates a natural and easy partitioning of the problem, it provides portability and efficiency, and it has received wide acceptance by academia and industry.

[Figure: relative amounts of new C++, algorithm, and MPI material introduced from Chapter 1 through Chapter 10.]
Figure 1.4: Progression of new material throughout the book in the three areas shown in ﬁgure (1.2).

The simultaneous integration we propose in this book will be accomplished by carefully presenting related concepts from all three sub-areas. Moving from one chapter to the next requires different dosages of new material in algorithms and tools. This is explained graphically in figure 1.4, which shows that while new algorithms are introduced at an approximately constant rate, the introduction of new C++ and MPI material varies inversely. We begin with an emphasis on the basics of the language, which allows the student to work immediately on the simple algorithms introduced initially; as the book progresses and the computational complexity of the algorithms increases, the use of parallel constructs and libraries is emphasized.

More speciﬁcally, to help facilitate the student’s immersion into object-oriented thinking,
we provide a library of *classes* and *functions* for use throughout the book. The classes
contained in this library are used from the very beginning of the book as a natural, user-
deﬁned, extension of C++. As the book progresses, the underlying logic and programming
implementation of these classes are explained, bringing the student to a deeper understanding
of the development of C++ classes. We will denote all classes used within the book and not inherent to C++ with the letters **SC**, such as the classes *SCVector* and *SCMatrix*.

*Software Suite*

This is done to clearly distinguish between C++ deﬁned and
user-deﬁned data types, and also to accentuate the utility of
user-deﬁned types within the C++ programming language. As
students become more familiar and conﬁdent in their ability to
devise and use datatypes, we encourage them to use these facil-
ities provided by the language for more eﬀective programming
and problem solving. All the codes of this book and many
more examples are included in the *software suite, which is dis-*
tributed with this book.

**1.5** **What About OpenMP?**

Due to the recent proliferation of distributed shared-memory (DSM) machines in the scientiﬁc
computing community, there is much interest in how best to appropriately utilize both the
distributed and the shared-memory partitioning of these systems. MPI provides an eﬃcient
means of parallel communication among a *distributed* collection of machines; however, not
all MPI implementations take advantage of *shared-memory* when it is available between
processors (the basic premise being that two processors, which share common memory, can
communicate with each other faster through the use of the shared medium than through
other communication means).

OpenMP (Open Multi Processing) was introduced to provide a means of implementing shared-memory parallelism in FORTRAN and C/C++ programs. It specifies a set of environment variables, compiler directives, and library routines to be used for shared-memory parallelization. OpenMP was designed to exploit certain characteristics of shared-memory architectures, such as the ability to directly access memory throughout the system with low latency, and very fast shared-memory locks. To learn more about OpenMP, visit www.openmp.org.

A new parallel programming paradigm is emerging in which both the MPI and OpenMP are used for parallelization. In a distributed shared-memory architecture, OpenMP would be used for intra-node communication (i.e., between a collection of processors which share the same memory subsystem) and MPI would be used for inter-node communication (i.e., between distinct distributed collections of processors). The combination of these two par- allelization methodologies may provide the most eﬀective means of fully exploiting modern distributed shared-memory (DSM) systems.

**1.6** **Algorithms and Top Ten List**

The Greeks and Romans invented many scientific and engineering algorithms, but it is believed that the term ‘algorithm’ stems from the name of the ninth-century Arab mathematician *al-Khwarizmi*, who wrote the book *al-jabr wa’l muqabalah*, which eventually evolved into today’s high school algebra textbooks. He was perhaps the first to stress systematic
procedures for solving mathematical problems. Since then, some truly ingenious algorithms
have been invented, but the algorithms that have formed the foundations of the scientiﬁc
computing as a separate discipline were developed in the second part of the twentieth cen-
tury. Dongarra & Sullivan put together a list of the *top ten algorithms* of the twentieth
century [33]. According to these authors, these algorithms had the greatest inﬂuence on
science and engineering in the past. They are in chronological order:

1. **1946:** The Monte Carlo method for modeling probabilistic phenomena.

2. **1947:** The Simplex method for linear optimization problems.

3. **1950:** The Krylov subspace iteration method for fast linear solvers and eigensolvers.

4. **1951:** The Householder matrix decomposition to express a matrix as a product of
simpler matrices.

5. **1957:** The FORTRAN compiler that liberated scientists and engineers from program-
ming in assembly.

6. **1959-1961:** The QR algorithm to compute many eigenvalues.

7. **1962:** The Quicksort algorithm to put things in numerical or alphabetical order fast.

8. **1965:** The Fast Fourier Transform to reduce operation count in Fourier series repre-
sentation.

9. **1977:** The Integer relation detection algorithm, which is useful for bifurcations and in
quantum ﬁeld theory.

10. **1987:** The Fast multipole algorithm for N-body problems.

Although there is some debate as to the relative importance of these algorithms, and about the absence from the list of other important methods, e.g. finite differences and finite elements, this selection by Dongarra & Sullivan reflects some of the main thrusts in scientific computing in the past. The appearance of the FORTRAN compiler, for example, represents the historic transition from assembly language to higher-level languages, as discussed earlier. In fact, the first FORTRAN compiler consisted of 23,500 assembly language instructions! FORTRAN has been used extensively in the past, especially in the engineering community, but much recent scientific computing software has been re-written in C++, e.g. the Numerical Recipes [75].

In this book we will cover in detail the algorithms (3), (4), (6) and (8) from the above list, including many more recent versions, which provide more robustness with respect to round-oﬀ errors and eﬃciency in the context of parallel computing. We will also present discretizations of ordinary and partial diﬀerential equations using several ﬁnite diﬀerence formulations.

Many new algorithms will probably be invented in the twenty-first century, hopefully some of them by the readers of this book! As Dongarra & Sullivan noted, “This century will not be very restful for us, but it is not going to be dull either!”

**Chapter 2**

**Basic Concepts and Tools**

In this chapter we introduce the main themes that we will cover in this book and provide an
introduction for each of them. We begin with a brief overview of C++ and deﬁne the two
basic concepts of *functions* and *classes* as well as other syntactic elements of the language. We
then introduce basic mathematical concepts that include elements of linear algebra, vector
orthogonalization, and corresponding codes and software. Finally, we introduce parallel
programming and review some generic parallel architectures as well as standard parallel
algorithms for basic operations, e.g., the *fan-in* algorithm for recursive doubling. We also
provide a brief overview of the main MPI commands.


**2.1** **Introduction to C++**

An ancient proverb states that a journey of a thousand miles begins with a single step. For us, this single step will be a brief overview of the C++ programming language. This introduction is not designed to be all-inclusive; rather, it should provide the scaffolding from which we will build concepts throughout this book. Admittedly, what you will read now may seem daunting in its scope, but as you become more familiar with the concepts found herein, you will be able to use the C++ language as a tool for furthering your understanding of the deeper mathematical and algorithmic concepts presented later in the book. With this in mind, let us begin our thousand mile journey with this first step.

Any programming language can be broken down into two high level concepts:

*•* Data, and

*•* Operations on data.

Though this may seem like a trivial statement, quite often in science and engineering problems the real work that needs to be done is identifying what the relevant data is, and what operations need to be executed on that data to obtain the desired results. From the programming point of view, we will assume that you already have a clear concept of what data is needed to solve the problem, and what algorithms will be acting on the data; we will focus on translating these needs into the programming language.

We have chosen to present this material in a top-down manner; that is, we will start from the high level of the program and work our way down toward lower and lower levels of detail. At the end of this section, we will recapitulate what we have learned, and show how all the pieces do indeed fit together through an example. We start with the idea of a program, or ‘code’ as it is sometimes referred to within the scientific computing community.

A program is a sequence of instructions acting on a collection of data. Just as this chapter had a starting point, every program must have a starting point, and in C++ the starting point of execution is the function called **main**. This tells the computer where to start execution of your program. The simplest C++ code that can be written is the following:

int main(int argc, char ** argv){

}

This piece of code will compile and execute, but it will do absolutely nothing. Though it may receive data through the arguments *argc* and *argv*, it contains no operations on that data. It merely provides the computer with a starting point of execution and then immediately terminates, because all executable commands have been executed (which in this case is none!).

*Software Suite*

This is your ﬁrst C++ program. In keeping with programming tradition, your ﬁrst non-trivial C++ program should be the following:

#include <iostream>
using namespace std;

int main(int argc, char ** argv){

cout << "Hello World" << endl;

}

At this stage, you should type in the program above, compile it using your native C++ compiler, and execute it. The result should be that the statement “Hello World” is printed to your screen. If you have problems with this exercise, see Appendix A.1.

In theory, you now have your first C++ program. You have written the code, compiled and linked the code, and are able to execute the code on your native machine. Now that this first step is behind us, let us jump into discussing some of the basic concepts, one of which we have just gained some experience with: the concept of a function.

**2.1.1** **Two Basic Concepts in C++**

There are two basic concepts used throughout the C++ programming language: the concepts of

*•* *Function*, and

*•* *Class*.

The C programming language, upon its inception, had at least one self-defining feature: *modularity*. The C language was designed to be modular, and this modularity was accomplished through the employment of functions. Almost everything in C is a function, and some have said that “... all C really is, is a big function calling a bunch of other functions”.

Well, this is almost right. The C language basically consists of two components: a core language specification, which contains basic data types and constructs (such as **if** statements, **for** statements, etc., some of which we discuss later on in this chapter), and a collection of libraries, each of which contains many pre-defined functions. C++ built on this philosophy and introduced the “class” as a second fundamental building block of the language. Within C++, functions and classes are intertwined to create the desired program.

We begin by defining what we mean by a *function* and a *class*. Functions and classes can be distinguished by their fundamental premises. The primary premise of a function revolves around what the function does, whereas the fundamental premise of a class revolves around the data that the class contains. Functions are designed to be abstractions of algorithms; classes (at least as presented in this book) are abstractions of data and of the operations on that data. We will clarify this distinction by examining the two concepts in more detail.

**Functions**

Functions are abstractions which encapsulate a concept or algorithm. The concept of a function is probably not new to you. In mathematics, we see functions all the time. We

deﬁne functions so that we can abstract a particular operation or collection of operations into one statement. In mathematics, we note functions in a manner like

f(x) = x^3 − x^2 + 2.

We understand that if we evaluate the function at the point x = 2, denoted f(2), this is equivalent to substituting the number 2 into the expression x^3 − x^2 + 2, yielding 2^3 − 2^2 + 2 = 6. We hence would say that f(2) = 6. In mathematical parlance, we would add rigor to all that we have done so far, and state this collection of operations as follows:

*Given x as a real number, define f(x) as a function returning a real number, where the definition of f(x) is given by the expression f(x) = x^3 − x^2 + 2.*

This example demonstrates the three major components of a function:

*•* *input*, *output*, and *contract* (or algorithm).

We specified the valid range of parameters that can be input into this function (mathematically referred to as the domain of the function); we specified the range in which the output would lie (mathematically referred to as the range of the function); and finally we specified what, given a particular input, the function will do. The same holds true for C++.

For a function, we need to specify the input, output, and contract.

[Figure: a function drawn as a box that receives input, carries out its algorithm/contract, and produces output.]

Figure 2.1: A schematic of a function in C++.

In C++, the process of specifying the input, output, and contract is done in two stages,
see ﬁgure 2.1^{1}. These two stages are specifying the following for each function:

*•* Function declaration, and

*•* Function deﬁnition.

A function’s declaration accomplishes several things in one step. It declares the name of the function and the data types of the input and output of the function. The function deﬁnition speciﬁes what the function will accomplish given particular input data. Hence, the deﬁnition of a function is the algorithmic explanation of the contract of the function.

A schematic for the syntax used for the declaration of a C++ function is given in ﬁgure 2.2. Using this as our guide, let us attempt to formulate our mathematical function into a

1According to the language standard, it is possible to combine both of these items into one statement satisfying the requirement for both simultaneously. For pedagogical clarity, we will always keep the two stages separate.

C++ function. For the purposes of this demonstration, let us assume that we have a data type called *float*, which is the floating point representation of a real number. The function declaration of our function “f” is given by:

float f(float x);

[Figure: the syntax of a function declaration; the output type, then the function name, then the argument list (the input/output “arguments”) enclosed in parentheses.]

Figure 2.2: Schematic of the syntax of a C++ function.

Examining our schematic given in figure 2.2, we can dissect this code to understand what is going on. Let us see if we have met the three components of a function declaration. First, we specified the name of the function, “f”. Next, we specified that the valid input to our function is a floating point value. Finally, we specified that our function returns a floating point value. Hence, we have now declared our function! What does declaration really mean, though? Declaring a function allows us to use it throughout our code with the assumption that it will act upon the contract of the function (later specified by the definition of the function). A key thing to realize is that:

*•* *A function must be declared before it can be used.*

Because of this fact, many programmers place all their function declarations at the beginning of their C++ program so that all functions are accessible everywhere. You will notice throughout this book that we either place our function declarations within a *header file* (the files with a *.h* extension), which is included at the beginning of a program, or we directly insert the function declarations after the include files and prior to the *main* function definition. An example template of this type of file setup is shown at the end of this section.

Another important fact is that within the *argument list* (the list of inputs and outputs
given as arguments to the function), the names speciﬁed are irrelevant. The compiler is only
concerned about the data type. Hence, it would be perfectly valid to write

float f(float);

Why have a variable name there if it is just to be ignored? The most practical reason we put variable names in function declarations is that we normally cut-and-paste the function declaration from the function definition. The function definition *does* require variable names to be used. We will now give the function definition so that this becomes more apparent.

float f(float x){

float y;

y = x*x*x - x*x + 2;

return y;

}

Notice that the function definition has a very similar beginning to the function declaration. As before, you specify the function name and the input and output data types. The difference, however, is that the function definition is the implementation of our contract. For the function definition, including specific variable names within the argument list is essential, because we will use these variable names throughout the definition to refer to the data that was input. In C++, when data is input into a function, the information is *passed by value*. This means that when the function is called, a copy of the information is created for the function to use, not the original data. This is an important and yet subtle point about C++. We will discuss this in more detail later (see section 3.1.2). For now, the thing to remember is that the function takes from its argument list the information passed to it, and it stores a copy of the information in a variable specified by the name given in the argument list.

In this example, we declare a variable y in which we temporarily store the value of our function, given by the expression x*x*x - x*x + 2, and then we return the value of the variable *y*. The *return* statement designates the variable whose value is to be returned from the function back to the caller. If we were to examine a code snippet, we could use our function just as we did mathematically by writing:

float w;

w = f(2);

If we were to print the value of *w*, we would see that the returned value is the floating point value 6. Some of this will become more clear after we have discussed basic data types.

The key items to remember from this discussion are:

*•* Every function must have a function declaration and deﬁnition.

*•* Function declarations specify the name of the function and the data types of the inputs
and outputs.

*•* Function deﬁnitions specify the implementation of the algorithm used to carry out the
contract of the function.

*•* Variable names in function declarations *do not* matter.

*•* Variable names in function deﬁnitions *do*matter because they specify how the data is
to be referred to in the implementation of the function.

*•* Variables passed to C++ functions are passed by *value* unless otherwise specified.

*Software Suite*

*Putting it into Practice*

Recall our little main function we wrote, compiled, and ran at the beginning of this section. Let us now combine that code with our new function.

#include <iostream>   // inclusion of library header file for use of cout
using namespace std;

float f(float x);      // function declaration

int main(int argc, char ** argv){
  float w;

  w = f(2);

  cout << "The value of w is: " << w << endl;
}

float f(float x){      // function definition
  float y;

  y = x*x*x - x*x + 2;

  return y;
}

If you were to compile and run this code, you would obtain a statement on your screen that says: The value of w is: 6.

In the program above, we use an object named *cout*, the declaration of which is found in the system header file *iostream*. The object *cout* is used for printing to standard output, which in many cases is the screen. For now, it is sufficient to know that the << symbols delineate expressions to be printed. In the code above, the first item to be printed is the string “The value of w is:”, then the value associated with the variable *w*, and then the end-of-line character denoted by the term *endl*. We will speak more about *cout* later in this chapter.

**Classes**

Classes are abstractions which encapsulate data and operations on that data. In C++, the concept of classes is used to simplify very complex data structures through encapsulation.

A class consists of two parts: *data* and *methods* operating on that data. What you will find is that methods are merely functions which are “attached” to classes. The concept of a method is analogous to that of a function, with the primary focus of a method being to act on the data of the class, and not on arbitrary input data.

For example, in the Gram-Schmidt routines in section 2.2.9, we utilize several user-deﬁned classes; among those classes is the class SCVector. Vector does not exist as a basic data type in C++; however, the language allows us to deﬁne a new type, called SCVector, which consists of a collection of basic data types and operations on that data. The declaration for the class SCVector is given below. We will not go over every detail now, but will defer explicit explanation until later in this book (see section 3.1.8). However, we call your attention to the two basic concepts of classes in C++:

1. encapsulated data, and

2. methods acting on that data.

In this class, the variables *dimension* and *data* are encapsulated by the class, and all the
remaining methods (in the section marked ‘public’) act on this basic data.

*Software Suite*

We now present the class declaration of the SCVector class:

class SCVector{
  private:
    int dimension;    // dimension of the vector
    double *data;     // pointer to array containing vector components

  public:
    SCVector(int dim);                      // constructor
    SCVector(const SCVector& v);            // copy constructor
    SCVector(int col, const SCMatrix &A);   // secondary constructor

    ~SCVector();                            // destructor

    int Dimension() const;   // dimension of the vector
    double Length();         // Euclidean norm of the vector
    void Normalize();        // normalize vector

    double Norm_l1();
    double Norm_l2();
    double Norm_linf();

    //************************
    // User Defined Operators
    //************************

    int operator==(const SCVector& v) const;
    int operator!=(const SCVector& v) const;

    SCVector & operator=(const SCVector& v);

    double operator()(const int i) const;
    double& operator()(const int i);

    void Print() const;
};

[Figure: a C++ class drawn as a box containing data and methods, with input and output arrows.]
Figure 2.3: A C++ class encapsulates data and methods acting on that data.

We will explain classes more thoroughly later (section 3.1.8), but let us take this oppor- tunity to point out a few features of classes:

*•* In the class above, there are several “constructors”. A **constructor** is the first method which is called when an object is instantiated. These methods can be used to initialize data, set up information within the class, etc.

*•* A **destructor** is the method called just before an object is deleted or goes out of scope. It allows the object to “clean up after itself”, for example by freeing any memory it dynamically allocated, before the memory occupied by the object itself is reclaimed.

*•* Notice that some methods of this class modify the data contained within the object, while others merely compute things based upon that data. For instance, the method *Normalize* does exactly that: it normalizes the vector data contained within the object to have norm one. The method *Norm_l2*, however, does not modify the data contained within the object, but merely computes the Euclidean norm of the vector based upon the information within the object.

Type    Description
short   short integer
int     integer
long    long integer

Table 2.1: Integer data types.

*•* Classes allow us to define what are referred to as **overloaded operators**. In the declaration given above, we have listed these as “user defined operators”. In addition to defining new data types, we can also define (or re-define) how common unary and binary operators act on those objects (such as defining what ‘+’ means when two newly defined objects are involved).

**2.1.2** **Learning the Syntax and Other Basic Commands**

**Getting Past “;” and “{ }”**

As you may have already noticed from the small amount of code that we have presented to you, the symbols “;” and “{ }” are integral to C++. We will briefly describe the purpose of these symbols here.

In general, the “;” is used to terminate an executable statement, which is why you see it at the conclusion of the commands listed above. Having such a symbol denote the end of an executable statement allows the compiler to easily delineate between statements.

The { } brackets (called curly brackets) are used to denote *scope*. We will not go into all the nuances of scope right now, other than to tell you that the scope of a variable or a function is the area of code in which that variable or function can be used.

**Basic Data Types**

In C++, variables are used to store information. In this section we go over some (not all; see Appendix A.2 for more information) of the basic data types available in C++. Just like in mathematics, a variable is a symbol used to denote a particular value, either numerical or character. One way of thinking of a variable is that it is a box in which information can be stored. In C++, all variables must have a *type*. The *type* tells the computer what kind of box to create: the dimension of the box, the shape, etc. The syntax for creating a variable is:

*<type> <variable list>*

Some basic data types are listed in tables 2.1, 2.2 and 2.3. Given the convention above, we see that to declare a variable called *x* of type *int*, we would write:

int x;

Type     Description
float    single precision
double   double precision

Table 2.2: Floating point data types.

Type   Description
char   character

Table 2.3: Character data type.

This allocates a block of memory of the size of an integer, and assigns the symbol *x* to refer to that location in memory. Hence, from now on, if we act on the variable *x*, we are acting on the content of the memory assigned to *x*. This may sound odd at first, but this is one of the subtle differences between computer programming and mathematics. One thing that most people do not realize is that the computer does not have integer memory, floating point memory, character memory, etc.^{2} As far as the computer is concerned, memory is memory. The data type that is used tells the computer *how to interpret memory*. A particular set of four bytes can be used in one program to represent an integer, and in another program to represent a float. The computer does not care. What is important is that the computer needs to know how to interpret the *bit pattern* of the four bytes. Does it interpret the collection of bits as an integer or a float, or some other variable type? Coming to grips with this notion will be very important when we start discussing the idea of addresses, the subject of pointers, etc. So, there are two key points to remember about data types.

The data type you specify tells the computer two things:

*•* The number of bytes to use to hold the data.

*•* How to interpret the bit pattern speciﬁed in those bytes.

Now that you know that variables exist, and you know some of the basic types, here are some rules to remember when using variables:

*•* Variables must be declared before they are used.

*•* Variable names can be of arbitrary length.

*•* Variable names *are* case sensitive.

*•* Variables should be assumed to be uninitialized. *Do not* rely on the operating system to initialize/zero values for you.

2Computers do in fact have specialized memory within the processor, called *registers*, which are either integer or float/double.

Symbol   Interpretation
+        addition
-        subtraction
*        multiplication
/        division

Table 2.4: Binary arithmetic operations.

*•* Variable lists may be used. If we wish to allocate three integer variables *a*, *b* and *c*, we may do so as follows: int a,b,c;. All three variables will be declared as integers. It is also perfectly valid to declare each one with a separate statement.

*•* Variables may be initialized by either constants or expressions within the declaration statement. Hence, if we wanted to initialize the integer variable *a* to zero, we could do the following: int a=0;. You can also do this in the context of variable lists. Suppose you want to initialize the integer variable *b* to one, but not the other variables. You can do this as follows: int a,b=1,c;

**Basic Operations**

Now that we have some concept of variables, the natural question to ask is: “What can we do with them?” C++ provides a collection of basic operations, some of which are listed in table 2.4.

The operations presented look very much like the operations you would expect in mathematics. We must make two special notes, however. The first note concerns the assignment operator, and the second concerns order of precedence and associativity. In C++, the symbol “=”, which we in English pronounce “equals”, should be interpreted as “is assigned the value of”; e.g., x = 2 is to be read x “is assigned the value of” 2. Take the following C++ code example:

int x,y,z;

x = 2;

y = 4;

z = x + y;

The C++ interpretation of this code is as follows: First, declare three variables, x, y,
and z as integers. Next, assign x the value 2, then assign y the value 4. We then add the
values of x and y, and assign to z the newly computed value. The ordering is important. In
mathematics, we may say *p* = *q* or *q* = *p*, and both statements mean that *p* is equal to *q*.

However, in C++, *p* = *q* says that the variable *p* is assigned the same value as the variable
*q*, whereas *q* = *p* says that the variable *q* is assigned the value of *p*. As you can see, these
two statements are not equivalent.

The second item to note is that operators have both precedence and associativity. The C++ operator precedence and associativity are provided in Table 2.5. What do operator precedence and associativity really mean to the programmer? Examine the following example: Assume that we want to add up six numbers: 1, 2, 3, 4, 5, 6. Mathematically, we write this operation as 1 + 2 + 3 + 4 + 5 + 6. However, if we implement this expression, keeping in mind that we are dealing with a binary operator “+”, then we would sum as follows: 1 + 2 = 3, 3 + 3 = 6, 6 + 4 = 10, 10 + 5 = 15, 15 + 6 = 21. We begin by adding the first two numbers together, and then we accumulate as we come to each new value. The same is true for the computer. When the computer is faced with the expression:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y + z;

it interprets this as being equivalent to:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y;

w = w + z;

Notice that in the second expression, each evaluation involves only one binary expression.

Hence associativity is left to right in this case. Now suppose we had the following expression:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y * z;

The computer interprets this to be the following:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = y * z;

w = w + x;

Why is the multiplication performed prior to the addition? The multiplication operator has precedence over the addition operator, and hence all multiplications are done first. This is a very important concept to realize:

### Key Concept

*•* Order of operations is important. Precedence can make all the difference!

| Operators | Associativity |
|-----------|---------------|
| ( ) [ ] -> . | left to right |
| ! ~ ++ -- + - * & (type) sizeof | right to left |
| * / % | left to right |
| + - | left to right |
| << >> | left to right |
| < <= > >= | left to right |
| == != | left to right |

Table 2.5: Operator precedence and associativity, listed highest precedence first. Unary +, -, and * have higher precedence than the binary forms.

One thing we should speak to is the use of () in expressions. Notice in the precedence table that () are at the top of the list. This is for a reason: the use of () allows the programmer to specify the order of evaluation explicitly by placing () within the expression.

For instance, in the following piece of code

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = (x + y) * z;

the computer interprets this to be the following:

int x,y,z,w;