**Parallel Scientific Computing in C++ and MPI**

*A seamless approach to parallel algorithms and their implementation*

George Em Karniadakis and Robert M. Kirby II

### Cambridge University Press

Scientific computing is by its very nature a practical subject - it requires tools and a lot of practice. To solve realistic problems we need not only fast algorithms but also a combination of good tools and fast computers. This is the subject of the current book, which emphasizes all three equally: algorithms, tools, and computers. Oftentimes such concepts and tools are taught *serially* across different courses and different textbooks, and hence the interconnection between them is not immediately apparent. We believe that such a close integration is important from the outset.

The book starts with a heavy dose of C++ and basic mathematical and computational concepts, and it ends by emphasizing advanced parallel algorithms that are used in modern simulations. We have tried to make this book fun to read and to somewhat demystify the subject, and thus the style is sometimes informal and personal. It may seem that this happens at the expense of rigor, and indeed we have tried to limit notation and theorem proving. Instead, we emphasize concepts and useful tricks-of-the-trade with many code segments, remarks, reminders, and warnings throughout the book.

The material of this book has been taught at different times to students in engineering, physics, computer science, and applied mathematics at Princeton University, Brown University, and MIT over the last 15 years. Different segments have been taught to undergraduates and graduates, to novices as well as to experts. To this end, on all three subjects covered, we start with simple introductory concepts and proceed to more advanced topics - bandwidth, we believe, is one strength of this book.

We have been involved in large-scale parallel computing for many years, from benchmarking new systems to solving complex engineering problems in computational mechanics. We represent two different generations of computational science and supercomputing, and our areas of expertise are both overlapping and complementary. The material we selected to include in this book is based on our experiences and needs as computational scientists for high-order accuracy, modular code, and domain decomposition. These are necessary ingredients for pushing the envelope in simulation science, and they allow one to test new theories and concepts or to solve very large specific engineering problems accurately.

In addition to integrating C++ and MPI concepts and programs into the text, we also provide with this book a *software suite* containing all the functions and programs discussed.

It is our belief, as stated earlier, that mastery of this subject requires both a knowledge of the tools and substantial practice using the tools. Part of the integration that we are attempting to achieve is attained when the reader is able to go immediately from the textbook to the computer to experiment with the concepts that have been presented. We envision the software suite allowing the reader to do the following: to verify the concepts presented in the book by using the programs that are provided; to extend the programs in the book to implement concepts that may have been discussed but not programmed; and to tackle problems different from those presented, using the software provided.


The current book is appropriate for use by students in engineering and physics, computer
science, and applied mathematics. It is designed to be more like a textbook and less of
a research monograph. The material can be used to ﬁll two semesters with the following
breakdown: The ﬁrst semester will cover chapters 1 to 5 at the senior undergraduate or ﬁrst
year graduate level. The second semester will cover the remainder of the book in a ﬁrst or
second year graduate course. Chapters 1 to 5 cover all the basic concepts in algorithms, C++,
and MPI. Chapters 6 to 10 cover discretization of diﬀerential equations and corresponding
solvers, and present more advanced C++ and MPI tools. The material in chapter 3 on
approximation of functions and discrete data is fundamental and precedes other topics. In
the basic material on discretization, we separated *explicit* from *implicit*approaches because
the parallel computational complexity of the two is fundamentally diﬀerent.

A lighter course, e.g. a quarter course or a lower level undergraduate course, could be based on chapters 1 to 5 by leaving out the MPI material and possibly other advanced topics such as wavelets, advanced quadrature rules, and systems of nonlinear equations.

There are other possibilities as well. A graduate-level course on numerical linear algebra can be based on sections 4.1.6, 4.1.7 and chapters 7 to 10. Assuming that the student has a background in C++ or even in another high-performance language, the addition of the MPI material in sections 2.3, 3.4, 4.3 and 5.13 to the above will constitute one full-semester course on parallel numerical linear algebra. Another possibility for a quarter course is to simply teach the algorithms in chapters 5 to 8, covering traditional numerical analysis. Supplementary notes from the instructor, e.g. theorem proofs and more case studies, can make this a full-semester course.

The book is designed so that it can be used with or without the C++ and MPI tools and associated concepts, but we strongly encourage the instructor to teach the course as a seamless integration of both algorithms and tools.

**Acknowledgements**

We are grateful to Dr. Ma Xia and Dr. C. Evangelinos for their help and advice regarding the material of this topic and for some of the ﬁgures that they provided. We would also like to thank Ms. Madeline Brewster for her help in formatting the book and for typing a major part of it. The ﬁrst author is grateful for the many years of funding by the Oﬃce of Naval Research, the Air Force Oﬃce of Scientiﬁc Research, and the Department of Energy.

Finally, we would like to thank our families for their continuous love, patience, and understanding, especially during this long project.

Providence, Rhode Island, USA George Em Karniadakis

Salt Lake City, Utah, USA Robert M. Kirby II


**Contents**

**1** **Scientiﬁc Computing and Simulation Science** **2**

1.1 What is Simulation? . . . 2

1.2 A Seamless Approach Path . . . 4

1.3 The Concept of Programming Language . . . 6

1.4 Why C++ and What is MPI? . . . 8

1.5 What About OpenMP? . . . 10

1.6 Algorithms and Top Ten List . . . 10

**2** **Basic Concepts and Tools** **12**
2.1 Introduction to C++ . . . 13

2.1.1 Two Basic Concepts in C++ . . . 14

2.1.2 Learning the Syntax and Other Basic Commands . . . 21

2.1.3 Learning to Print . . . 34

2.1.4 Learning to Read . . . 35

2.1.5 How to Program in Style . . . 37

2.2 Mathematical and Computational Concepts . . . 41

2.2.1 Notation . . . 41

2.2.2 Binary Numbers and Round-oﬀ . . . 41

2.2.3 Condition Number . . . 44

2.2.4 Vector and Matrix Norms . . . 44

2.2.5 Eigenvalues and Eigenvectors . . . 46

2.2.6 Memory Management . . . 48

2.2.7 Basic Linear Algebra - BLAS . . . 52

2.2.8 Exploiting the Structure of Sparse Matrices . . . 61

2.2.9 Gram-Schmidt Vector Orthogonalization . . . 62

2.3 Parallel Computing . . . 70

2.3.1 From Supercomputing to Soupercomputing . . . 70

2.3.2 Mathematical Parallelism and Recursive-Doubling . . . 76

2.3.3 Amdahl’s Law . . . 79

2.3.4 MPI - Message Passing Interface . . . 80

2.4 Homework Problems . . . 91

**3** **Approximation** **94**
3.1 Polynomial Representation . . . 95

3.1.1 Vandermonde and Newton Interpolation . . . 95

3.1.3 Lagrangian Interpolation . . . 114

3.1.4 The Runge Phenomenon . . . 117

3.1.5 Chebyshev Polynomials . . . 120

3.1.6 Hermite Interpolation and Splines . . . 126

3.1.7 Least-Squares Approximation . . . 131

3.1.8 Introduction to Classes . . . 142

3.1.9 Multi-Dimensional Interpolations . . . 153

3.1.10 Simple Domains . . . 154

3.1.11 Curvilinear Domains . . . 160

3.2 Fourier Series Representation . . . 163

3.2.1 Convergence . . . 163

3.2.2 Periodic Extension of Functions . . . 166

3.2.3 Diﬀerentiation and the Lanczos Filter . . . 168

3.2.4 Trigonometric Interpolation . . . 171

3.2.5 Noisy Data . . . 173

3.2.6 Matrix Representation . . . 174

3.2.7 The Fast Fourier Transform (FFT) . . . 176

3.2.8 The Fastest Fourier Transform in the West - FFTW . . . 178

3.3 Wavelet Series Representation . . . 181

3.3.1 Basic Relations . . . 181

3.3.2 Dilation Equation . . . 185

3.3.3 Discrete Wavelet Transform: Mallat’s Algorithm . . . 188

3.3.4 Some Orthonormal Wavelets . . . 190

3.4 Back to Parallel Computing: Send and Receive . . . 197

3.5 Homework Problems . . . 201

3.5.1 Homework Problems for Section 3.1 . . . 201

3.5.2 Homework Problems for Section 3.2 . . . 205

3.5.3 Homework Problems for Section 3.3 . . . 206

**4** **Roots and Integrals** **207**
4.1 Root Finding Methods . . . 208

4.1.1 Polynomial Equations . . . 210

4.1.2 Fixed Point Iteration . . . 213

4.1.3 Newton-Raphson Method . . . 217

4.1.4 Passing Functions to Functions in C++ . . . 221

4.1.5 Secant Method . . . 226

4.1.6 Systems of Nonlinear Equations . . . 227

4.1.7 Solution via Minimization: Steepest Descent and Conjugate Gradients . . . 230

4.2 Numerical Integration Methods . . . 240

4.2.1 Simple Integration Algorithms . . . 240

4.2.2 Advanced Quadrature Rules . . . 248

4.2.3 Multi-Dimensional Integration . . . 265

4.3 Back to Parallel Computing: Reduction . . . 268

4.4.1 Homework Problems for Section 4.1 . . . 275

4.4.2 Homework Problems for Section 4.2 . . . 279

**5** **Explicit Discretizations** **281**
5.1 Explicit Space Discretizations . . . 282

5.1.1 Basics . . . 282

5.1.2 Uniform Grids . . . 285

5.1.3 MPI Parallel Implementation of Finite Diﬀerences . . . 296

5.1.4 Multi-Dimensional Arrays in C++ . . . 304

5.1.5 Non-Uniform Grids . . . 308

5.1.6 One-Dimensional Boundary Value Problem . . . 314

5.1.7 Multi-Dimensional Discretizations . . . 316

5.2 Explicit Time Discretizations . . . 323

5.2.1 Multi-Step Schemes . . . 323

5.2.2 Convergence: Consistency and Stability . . . 326

5.2.3 Stability and Characteristic Polynomials . . . 328

5.2.4 Runge-Kutta Methods . . . 334

5.2.5 Stability of Runge-Kutta Methods . . . 338

5.3 Homework Problems . . . 340

**6** **Implicit Discretizations** **345**
6.1 Implicit Space Discretizations . . . 346

6.1.1 Diﬀerence Operators . . . 346

6.1.2 Method of Undetermined Coeﬃcients . . . 349

6.1.3 One-Dimensional Boundary Value Problem . . . 357

6.1.4 Thomas Algorithm for Tridiagonal Systems . . . 359

6.1.5 Parallel Algorithm for Tridiagonal Systems . . . 367

6.2 Implicit Time Discretizations . . . 378

6.2.1 Fundamental Theorems for Multi-Step Methods . . . 381

6.2.2 Stability of Stiﬀ ODEs . . . 381

6.2.3 Second-Order Initial Value Problems . . . 384

6.2.4 How to March in Time . . . 386

6.3 Homework Problems . . . 387

**7** **Relaxation: Discretization and Solvers** **390**
7.1 Discrete Models of Unsteady Diﬀusion . . . 391

7.1.1 Temporal and Spatial Discretization . . . 392

7.1.2 Accuracy of Diﬀerence Equation . . . 393

7.1.3 Stability of Diﬀerence Equation . . . 394

7.1.4 Spectrum of the Diﬀusion Operator . . . 403

7.1.5 Multi-Dimensional Time-Space Stencils . . . 409

7.2 Iterative Solvers . . . 416

7.2.1 Jacobi Algorithm . . . 416

7.2.3 Gauss-Seidel Algorithm . . . 431

7.2.4 Parallel (Black-Red) Gauss-Seidel Algorithm . . . 433

7.2.5 Successive Acceleration Techniques - SOR . . . 436

7.2.6 Symmetric Successive Acceleration Techniques - SSOR . . . 438

7.2.7 SSOR with Chebyshev Acceleration . . . 439

7.2.8 Convergence Analysis of Iterative Solvers . . . 441

7.2.9 Relaxed Jacobi and Gauss-Seidel . . . 445

7.2.10 The Multigrid Method . . . 449

7.3 Homework Problems . . . 462

**8** **Propagation: Numerical Diﬀusion and Dispersion** **466**
8.1 Advection Equation . . . 467

8.1.1 Dispersion and Diﬀusion . . . 467

8.1.2 Other Advection Equations . . . 469

8.1.3 First-Order Discrete Schemes . . . 470

8.1.4 High-Order Discrete Schemes . . . 482

8.1.5 Eﬀects of Boundary Conditions . . . 493

8.2 Advection-Diﬀusion Equation . . . 497

8.2.1 Discrete Schemes . . . 497

8.2.2 Eﬀects of Boundary Conditions . . . 505

8.3 MPI: Non-Blocking Communications . . . 509

8.4 Homework Problems . . . 514

**9** **Fast Linear Solvers** **517**
9.1 Gaussian Elimination . . . 518

9.1.1 LU Decomposition . . . 520

9.1.2 To Pivot or Not to Pivot? . . . 524

9.1.3 Parallel LU Decomposition . . . 530

9.1.4 Parallel Back Substitution . . . 534

9.1.5 Gaussian Elimination and Sparse Systems . . . 546

9.1.6 Parallel Cyclic Reduction for Tridiagonal Systems . . . 547

9.2 Cholesky Factorization . . . 559

9.3 QR Factorization and Householder Transformation . . . 560

9.3.1 Hessenberg and Tridiagonal Reduction . . . 568

9.4 Preconditioned Conjugate Gradient Method - PCGM . . . 572

9.4.1 Convergence Rate of CGM . . . 572

9.4.2 Preconditioners . . . 573

9.4.3 Toeplitz Matrices and Circulant Preconditioners . . . 577

9.4.4 Parallel PCGM . . . 578

9.5 Non-Symmetric Systems . . . 585

9.5.1 The Arnoldi Iteration . . . 586

9.5.2 GMRES . . . 590

9.5.3 GMRES(k) . . . 594

9.5.4 Preconditioning GMRES . . . 597

9.5.5 Parallel GMRES . . . 597

9.6 What Solver to Choose? . . . 598

9.7 Available Software for Fast Solvers . . . 601

9.8 Homework Problems . . . 602

**10 Fast Eigensolvers** **608**
10.1 Local Eigensolvers . . . 609

10.1.1 Basic Power Method . . . 609

10.1.2 Inverse Shifted Power Method . . . 612

10.2 Householder Deﬂation . . . 616

10.3 Global Eigensolvers . . . 623

10.3.1 The QR Eigensolver . . . 623

10.3.2 The Hessenberg QR Eigensolver . . . 625

10.3.3 Shifted QR Eigensolver . . . 625

10.3.4 The Symmetric QR Eigensolver: Wilkinson Shift . . . 627

10.3.5 Parallel QR Eigensolver: Divide-and-Conquer . . . 627

10.3.6 The Lanczos Eigensolver . . . 635

10.4 Generalized Eigenproblems . . . 638

10.4.1 The QZ Eigensolver . . . 639

10.4.2 Singular Eigenproblems . . . 639

10.4.3 Polynomial Eigenproblems . . . 640

10.5 Arnoldi Method: Non-Symmetric Eigenproblems . . . 640

10.6 Available Software for Eigensolvers . . . 641

10.7 Homework Problems . . . 643

**A C++ Basics** **646**
A.1 Compilation Guide . . . 646

A.2 C++ Basic Data Types . . . 647

A.3 C++ Libraries . . . 647

A.3.1 Input/Output Library – iostream.h . . . 647

A.3.2 Input/Output Manipulation Library – iomanip.h . . . 648

A.3.3 Mathematics Library – math.h . . . 648

A.4 Operator Precedence . . . 649

A.5 C++ and BLAS . . . 649

**B MPI Basics** **651**
B.1 Compilation Guide . . . 651

B.2 MPI Commands . . . 652

B.2.1 Predeﬁned Variable Types in MPI . . . 652

B.2.2 Predeﬁned Reduction Operators in MPI . . . 653

B.2.3 MPI Function Declarations . . . 653

B.2.4 MPI Constants and Deﬁnitions . . . 672

**Chapter 1**

**Scientific Computing and Simulation Science**

**1.1** **What is Simulation?**

Science and engineering have undergone a major transformation at the research as well as at the development and technology level. The modern scientist and engineer spend more and more time in front of a laptop, a workstation, or a parallel supercomputer and less and less time in the physical laboratory or in the workshop. The virtual wind tunnel and the virtual biology lab are not a thing of the future, they are here! The old approach of “cut-and-try” has been replaced by “simulate-and-analyze” in several key technological areas such as aerospace applications, synthesis of new materials, design of new drugs, chip processing and microfabrication, etc. The new discipline of nanotechnology will be based primarily on large-scale computations and numerical experiments. The methods of scientific analysis and engineering design are changing continuously, affecting both our approach to the phenomena that we study as well as the range of applications that we address. While there is a lot of software available to be used as almost a “black-box,” working in new application areas requires good knowledge of fundamentals and mastering of effective new tools.

In the classical scientific approach, the physical system is first simplified and set in a form that suggests what type of phenomena and processes may be important, and correspondingly what experiments are to be conducted. In the absence of any known governing equations, dimensional inter-dependence between physical parameters can guide laboratory experiments in identifying key parametric studies. The database produced in the laboratory is then used to construct a simplified “engineering” model which, after field-test validation, will be used in other research, product development, and design, and may possibly lead to new technological applications. This approach has been used almost invariably in every scientific discipline, i.e., engineering, physics, chemistry, biology, etc.

The simulation approach follows a parallel path but with some signiﬁcant diﬀerences.

*First*, the phase of the physical model analysis is more elaborate: The physical system is cast in a form governed by a set of partial differential equations, which represent continuum approximations to microscopic models. Such approximations are not possible for all systems, and sometimes the microscopic model should be used directly. *Second*, the laboratory experiment is replaced by simulation, i.e., a numerical experiment based on a discrete model. Such a model may represent a discrete approximation of the continuum partial differential equations, or it may simply represent a statistical representation of the microscopic model. Finite difference approximations on a grid are examples of the first case, and Monte Carlo methods are examples of the second case. In either case, these algorithms have to be converted to software using an appropriate computer language, debugged, and run on a workstation or a parallel supercomputer. The output is usually a large number of files of a few Megabytes to hundreds of Gigabytes, being especially large for simulations of time-dependent phenomena.

To be useful, this numerical database needs to be put into graphical form using various vi- sualization tools, which may not always be suited for the particular application considered.

Visualization can be especially useful during simulations where interactivity is required as the grid may be changing or the number of molecules may be increasing.

The simulation approach has already been followed by the majority of researchers across disciplines in the last few decades. The question is whether this is a new science, and how one could formally obtain such skills. Moreover, does this constitute fundamental new knowledge, or is it a “mechanical procedure,” an ordinary skill that a chemist, a biologist or an engineer will acquire easily as part of “training on the job” without specific formal education? It seems that the time has arrived where we need to reconsider boundaries between disciplines and reformulate the education of the future *simulation scientist*, an inter-disciplinary scientist.

Let us re-examine some of the requirements following the various steps in the simulation approach. The *first task* is to select the right representation of the physical system by making consistent assumptions in order to derive the governing equations and the associated boundary conditions. The conservation laws should be satisfied; the entropy condition should not be violated; the uncertainty principle should be honored. The *second task* is to develop the right algorithmic procedure to discretize the continuum model or represent the dynamics of the atomistic model. The choices are many, but which algorithm is the most accurate one, or the simplest one, or the most efficient one? These algorithms do not belong to a discipline! Finite elements, first developed by the famous mathematician Courant and re-discovered by civil engineers, have found their way into every engineering discipline, physics, geology, etc. Molecular dynamics simulations are practiced by chemists, biologists, material scientists, and others. The *third task* is to compute efficiently in the ever-changing world of supercomputing. How efficient the computation is translates into how realistic a problem can be solved, and therefore how useful the results can be to applications. The *fourth task* is to assess the accuracy of the results in cases where no direct confirmation from physical experiments is possible, such as in nanotechnology, in biosystems, or in astrophysics. Reliability of the predicted numerical answer is an important issue in the simulation approach, as some of the answers may lead to new physics, or to false physics contained in the discrete model or induced by the algorithm but not derived from the physical problem. *Finally*, visualizing the simulated phenomenon, in most cases in three-dimensional space and in time, by employing proper computer graphics (a separate specialty on its own) completes the full simulation cycle. The rest of the steps followed are similar to the classical scientific approach.

In classical science we are dealing with matter and therefore *atoms*, but in simulation we are dealing with information and therefore *bits*, so it is atoms versus bits! We should, therefore, recognize the simulation scientist as a separate scientist, the same way we recognized just a few decades ago the computer scientist as different than the electrical engineer or the applied mathematician. The new scientist is certainly not a computer scientist although she should be computer literate in both software and hardware. She is not a physicist although she needs a sound physics background. She is not an applied mathematician although she needs expertise in mathematical analysis and approximation theory.

Figure 1.1: Definition of scientific computing as the intersection of numerical mathematics, computer science, and modeling.

With the rapid and simultaneous advances in software and computer technology, especially commodity computing, the so-called *soupercomputing*, every scientist and engineer will have on her desk an advanced simulation kit of tools consisting of a software library and multi-processor computers that will make analysis, product development, and design more efficient and cost-effective. But what the future scientists and engineers will need, first and foremost, is a solid inter-disciplinary education.

Scientiﬁc computing is the heart of simulation science, and this is the subject of this book. The emphasis is on a balance between classical and modern elements of numerical mathematics and of computer science, but we have selected the topics based on broad mod- eling concepts encountered in physico-chemical and biological sciences, or even economics (see ﬁgure 1.1).

**1.2** **A Seamless Approach Path**

Our aim in writing this book has been to provide the student, the future simulation scientist, with a *seamless* approach to numerical algorithms, modern programming techniques, and parallel computing. Oftentimes such concepts and tools are taught *serially* across different courses and different textbooks, and hence the interconnection between them is not immediately apparent. The necessity of integrating concepts and tools usually comes after such courses are concluded, e.g. during a first job or a thesis project, thus forcing the student to synthesize what is perceived to be three *independent subfields* into one in order to produce a solution. Although this process is undoubtedly valuable, it is time-consuming and in many cases it may not lead to an effective combination of concepts and tools. Moreover, from the pedagogical point of view, the integrated seamless approach can stimulate the student simultaneously through the eyes of multiple disciplines, thus leading to enhanced understanding of subjects in scientific computing.

Figure 1.2: Simultaneous integration of concepts shown in (a) in contrast with the classical serial integration shown in (b).

As discussed in the previous section, in the scientific simulation approach there are several successive stages that lead:

1. From the real-world problem to its mathematical formulation;

2. From the mathematical description to the computer implementation and solution; and

3. From the numerical solution to visualization and analysis.

In this book, we concentrate on stage (2) which includes not only the mathematics of nu- merical linear algebra and discretization but also the implementation of these concepts in C++ and MPI.

There are currently several excellent textbooks and monographs on these topics, but without the type of integration that we propose. For example, the book by Golub & Ortega [45] introduces pedagogically all the basic parallel concepts, but a gap remains between the parallel formulation and its implementation. Similarly, the books by Trefethen & Bau [88] and Demmel [26] provide rigor and great insight into numerical linear algebra algorithms, but they do not provide sufficient material on discretization and implementation. On the other hand, popular books on C++ (e.g., by Stroustrup [86]) and MPI (e.g., by Pacheco [73]) are references that teach programming using disconnected algorithmic examples, which is useful for acquiring general programming skills but not for parallel scientific computing. Our book treats numerics, parallelism, and programming equally and simultaneously by placing the reader at a vantage point between the three areas, as shown in the schematic of figure 1.2(a), and in contrast with the classical approach of connecting the three subjects serially as illustrated in figure 1.2(b).

**1.3** **The Concept of Programming Language**

In studying computer languages, we want to study a new way of interacting with the computer. Most people are familiar with software purchased from the local computer store, ranging from word processors and spreadsheets to interactive games. But have you ever wondered how these things are created? How do you actually “write” software? Throughout this book we will teach, through both lecture and example, how to create computer software that solves scientific problems. Our purpose is not to teach you how to write computer games and the like, but the knowledge gained here can be applied to your own software endeavors.

It has been stated by some that the computer is a pretty dumb device, in that it only understands two things – *on* and *off*. Like sending Morse code over a telegraph wire with signals of dots and dashes, the computer uses sequences of zeros and ones as its language. The zeros and ones may seem inefficient, but it is not just the data used, but the rules applied to the data, which make it powerful. This concept, in theory, is no different than human language. If we were to set before you a collection of symbols, say a b c d ... z, and indicate to you that you can use these to express even the most complex thoughts and emotions of the human mind and heart, you would think we were crazy. Just 26 little symbols? How can this be? We know, however, that it is not merely the symbols that are important, but the rules used to combine the symbols. If you adhere to the rules defined by the English language, then books like this can be written using merely combinations of the 26 characters! How is this similar to the computer? The computer is a complex device for executing instructions. These instructions are articulated by using our *two-base* characters, **0** and **1**, along with a collection of rules for combining them together. This brings us to our first axiom:

**Axiom I:** *Computers are machines which execute instructions. If someone is not telling the computer what to do, it does nothing.*

Most people have had some experience with computers, and immediately they will read this statement and say: “Hey, I have had my computer do all kinds of things that I didn’t want!” Ah, but read the axiom carefully. The key to this axiom is the use of the term *someone*. The one thing to keep in mind is that some human, or collection of humans, developed software to tell the computer what to do. At a relatively low level, this would be the people who wrote the operating system used by the computer. At a higher level, this would be the people who developed the word processor or game that you were using. In both cases, however, someone determined how the computer would act and react to your input. We want you, after reading this book and understanding the concepts herein, to be able to be in the driver’s seat. This leads us to our second axiom:

**Axiom II:** *Computer programming languages allow humans a simplified means of giving the computer instructions.*

We tell you that we want you to be in the driver’s seat, and you tell us “I don’t want to learn how to communicate in zeros and ones ... learning English was hard enough!” You can imagine how slowly the computer age would have progressed if every programming class consisted of the following lecture scheme. Imagine the first day of class. On the first day, the instructor tells you that you will be learning the two basic components of the computer language today: 0 and 1. He may force you to say zero and one a few times, and then write zero and one many times on a piece of paper for practice, but then, what else would there be to learn concerning your character set? Class dismissed. Then, for the rest of the semester, you would spend your time learning how to combine zeros and ones to get the computer to do what you want. Your first assignment might be to add two numbers a and b, and to store the result in c (i.e., c = a + b). You end up with something that looks like:

01011001010001000111010101000100
01011001011100100010100101000011
00111010101000100111010100100101
01011101010101010101010000111101

Seems like a longwinded way of saying

*c* = *a* + *b*?

But this is what the computer understands, so this is how we must communicate with it. However, we do not communicate in this fashion. Human language and thought use a higher abstraction than this. How can we bridge this gap? We do so via programming languages; see figure 1.3.

Figure 1.3: Programming languages provide us a means of bridging the gap between the computer and the human.

The first programming language we will mention is *assembly*. The unique property of assembly is that for each instruction there is a one-to-one correspondence between a command in assembly and a computer-understandable command (in zeros and ones). For instance, instead of writing

01011001010001000111010101000100

as a command, you could write ‘load a $1’. This tells the computer to load the contents of the memory location denoted by ‘a’ into register $1 in the computer’s CPU (Central Processing Unit). This is much better than before: obviously, this sequence of commands is much easier for the human to understand. This was a good start, but assembly is still considered a “low-level language”. By low-level we mean that one instruction in assembly is equal to one computer instruction. But as we said earlier, we want to be able to think on a *higher level*. Hence came the introduction of “higher-level” languages, those in which one instruction in the higher-level language equals one or more computer-level instructions. We want a computer language where we can say ‘c = a + b’, and this would be equivalent to saying:

load a $1
load b $2
add $1 $2 $3
save $3 c

One high-level instruction was equivalent to four lower-level instructions (here written in pseudo-assembly so that you can follow what is going on). This is preferable for many reasons. For one thing, we as humans would like to spend our time thinking about how to solve the problem, not just trying to remember (and write) all the assembly code! Secondly, by writing in a higher-level language, we can write code that can work on multiple computers, because a compiler can translate the higher-level code into the assembly code of the processor on which we are running.

As you read through this book and do the exercises found herein, always be mindful that our goal is to utilize the computer for accomplishing scientiﬁc tasks encountered in simulation science. At a high-level, there is a science or engineering problem to solve, and we want to use the computer as a tool for solving the problem. The means by which we will use the computer is through the writing and execution of programs written using the computing language C++ and the parallel message passing libraries of MPI.

**1.4** **Why C++ and What is MPI?**

The algorithms we present in the book can certainly be implemented in other languages, e.g. FORTRAN or Java, as well as with other communication libraries, e.g. PVM (Parallel Virtual Machine). However, we commit to a specific language and parallel library in order to provide
the student with the immediate ability to experiment with the concepts presented. To this
end, we have chosen C++ as our programming language for a multitude of reasons: *First*, it provides an object-oriented infrastructure that accommodates a natural breakdown of the problem into a collection of data structures and operations on those structures. *Secondly*, the use of C++ transcends many disciplines beyond engineering, where traditionally FORTRAN has been the prevailing language. *Thirdly*, C++ is a language naturally compatible with the basic algorithmic concepts of

*•* partitioning,

*•* recursive function calling,

*•* dynamic memory allocation, and

*•* encapsulation.

Similarly, we commit to MPI (Message Passing Interface) as a message-passing library because it accommodates a natural and easy partitioning of the problem, it provides portability and efficiency, and it has received wide acceptance by academia and industry.

[Figure: relative amounts of new C++, algorithm, and MPI material introduced from Chapter 1 through Chapter 10.]
Figure 1.4: Progression of new material throughout the book in the three areas shown in ﬁgure (1.2).

The simultaneous integration we propose in this book will be accomplished by carefully presenting related concepts from all three sub-areas. Moving from one chapter to the next requires different dosages of new material in algorithms and tools. This is explained graphically in figure 1.4, which shows that while new algorithms are introduced at an approximately constant rate, the introduction of new C++ and MPI material varies inversely. We begin with an emphasis on the basics of the language, which allows the student to work immediately on the simple algorithms introduced initially; as the book progresses and the computational complexity of the algorithms increases, the use of parallel constructs and libraries is emphasized.

More speciﬁcally, to help facilitate the student’s immersion into object-oriented thinking,
we provide a library of *classes* and *functions* for use throughout the book. The classes
contained in this library are used from the very beginning of the book as a natural, user-
deﬁned, extension of C++. As the book progresses, the underlying logic and programming
implementation of these classes are explained, bringing the student to a deeper understanding
of the development of C++ classes. We will denote all classes used within the book and not inherent to C++ with the letters **SC**, such as the classes *SCVector* and *SCMatrix*.

*Software Suite*

This is done to clearly distinguish between C++ deﬁned and
user-deﬁned data types, and also to accentuate the utility of
user-deﬁned types within the C++ programming language. As
students become more familiar and conﬁdent in their ability to
devise and use datatypes, we encourage them to use these facil-
ities provided by the language for more eﬀective programming
and problem solving. All the codes of this book and many
more examples are included in the *software suite, which is dis-*
tributed with this book.

**1.5** **What About OpenMP?**

Due to the recent proliferation of distributed shared-memory (DSM) machines in the scientiﬁc
computing community, there is much interest in how best to appropriately utilize both the
distributed and the shared-memory partitioning of these systems. MPI provides an eﬃcient
means of parallel communication among a *distributed* collection of machines; however, not
all MPI implementations take advantage of *shared-memory* when it is available between
processors (the basic premise being that two processors, which share common memory, can
communicate with each other faster through the use of the shared medium than through
other communication means).

OpenMP (Open Multi Processing) was introduced to provide a means of implementing shared-memory parallelism in FORTRAN and C/C++ programs. It specifies a set of environment variables, compiler directives, and library routines to be used for shared-memory parallelization. OpenMP was designed to exploit certain characteristics of shared-memory architectures, such as the ability to directly access memory throughout the system with low latency, and very fast shared-memory locks. To learn more about OpenMP, visit www.openmp.org.

A new parallel programming paradigm is emerging in which both the MPI and OpenMP are used for parallelization. In a distributed shared-memory architecture, OpenMP would be used for intra-node communication (i.e., between a collection of processors which share the same memory subsystem) and MPI would be used for inter-node communication (i.e., between distinct distributed collections of processors). The combination of these two par- allelization methodologies may provide the most eﬀective means of fully exploiting modern distributed shared-memory (DSM) systems.

**1.6** **Algorithms and Top Ten List**

The Greeks and Romans invented many scientific and engineering algorithms, but it is believed that the term ‘algorithm’ stems from the name of the ninth-century Arab mathematician *al-Khwarizmi*, who wrote the book *al-jabr wa’l muqabalah*, which eventually evolved into today’s high school algebra textbooks. He was perhaps the first to stress systematic
procedures for solving mathematical problems. Since then, some truly ingenious algorithms
have been invented, but the algorithms that have formed the foundations of the scientiﬁc
computing as a separate discipline were developed in the second part of the twentieth cen-
tury. Dongarra & Sullivan put together a list of the *top ten algorithms* of the twentieth
century [33]. According to these authors, these algorithms had the greatest inﬂuence on
science and engineering in the past. They are in chronological order:

1. **1946:** The Monte Carlo method for modeling probabilistic phenomena.

2. **1947:** The Simplex method for linear optimization problems.

3. **1950:** The Krylov subspace iteration method for fast linear solvers and eigensolvers.

4. **1951:** The Householder matrix decomposition to express a matrix as a product of
simpler matrices.

5. **1957:** The FORTRAN compiler that liberated scientists and engineers from program-
ming in assembly.

6. **1959-1961:** The QR algorithm to compute many eigenvalues.

7. **1962:** The Quicksort algorithm to put things in numerical or alphabetical order fast.

8. **1965:** The Fast Fourier Transform to reduce operation count in Fourier series repre-
sentation.

9. **1977:** The Integer relation detection algorithm, which is useful for bifurcations and in
quantum ﬁeld theory.

10. **1987:** The Fast multipole algorithm for N-body problems.

Although there is some debate as to the relative importance of these algorithms, and about the absence from the list of other important methods, e.g. finite differences and finite elements, this selection by Dongarra & Sullivan reflects some of the main thrusts in scientific computing in the past. The appearance of the FORTRAN compiler, for example, represents the historic transition from assembly language to higher-level languages, as discussed earlier. In fact, the first FORTRAN compiler consisted of 23,500 assembly language instructions! FORTRAN has been used extensively in the past, especially in the engineering community, but much recent scientific computing software has been re-written in C++, e.g. the Numerical Recipes [75].

In this book we will cover in detail the algorithms (3), (4), (6) and (8) from the above list, including many more recent versions, which provide more robustness with respect to round-oﬀ errors and eﬃciency in the context of parallel computing. We will also present discretizations of ordinary and partial diﬀerential equations using several ﬁnite diﬀerence formulations.

Many new algorithms will probably be invented in the twenty-first century, hopefully some of them by the readers of this book! As Dongarra & Sullivan noted, “This century will not be very restful for us, but it is not going to be dull either!”

**Chapter 2**

**Basic Concepts and Tools**

In this chapter we introduce the main themes that we will cover in this book and provide an
introduction for each of them. We begin with a brief overview of C++ and deﬁne the two
basic concepts of *functions* and *classes* as well as other syntactic elements of the language. We
then introduce basic mathematical concepts that include elements of linear algebra, vector
orthogonalization, and corresponding codes and software. Finally, we introduce parallel
programming and review some generic parallel architectures as well as standard parallel
algorithms for basic operations, e.g., the *fan-in* algorithm for recursive doubling. We also
provide a brief overview of the main MPI commands.


**2.1** **Introduction to C++**

An ancient proverb states that a journey of a thousand miles begins with a single step. For us, this single step will be a brief overview of the C++ programming language. This introduction is not designed to be all-inclusive; rather, it should provide the scaffolding from which we will build concepts throughout this book. Admittedly, what you will read now may seem daunting in its scope, but as you become more familiar with the concepts found herein, you will be able to use the C++ language as a tool for furthering your understanding of the deeper mathematical and algorithmic concepts presented later in the book. With this in mind, let us begin our thousand mile journey with this first step.

Any programming language can be broken down into two high level concepts:

*•* Data, and

*•* Operations on data.

Though this may seem like a trivial statement, quite often in science and engineering problems the real work that needs to be done is identifying what the relevant data is, and what operations need to be executed on that data to obtain the desired results. From the programming point of view, we will assume that you already have a clear concept of what data is needed to solve the problem, and what algorithms will be acting on the data; we will focus on translating these needs into the programming language.

We have chosen to present this material in a top-down manner; that is, we will start from the high level of the program and work our way down toward lower and lower levels of detail. At the end of this section, we will recapitulate what we have learned, and show how all the pieces do indeed fit together through an example. We start with the idea of a program, or ‘code’ as it is sometimes referred to within the scientific computing community.

A program is a sequence of instructions acting on a collection of data. Just as this chapter had a starting point, every program must have a starting point, and in C++ the starting point of execution is the function called **main**. This tells the computer where to start execution of your program. The simplest C++ code that can be written is the following:

int main(int argc, char ** argv){

}

This piece of code will compile and execute, but it will do absolutely nothing. Though it may receive data through the arguments *argc* and *argv*, it contains no operations on that data. It merely provides the computer with a starting point of execution and then immediately terminates, because all executable commands have been executed (which in this case is none!).

*Software Suite*

This is your ﬁrst C++ program. In keeping with programming tradition, your ﬁrst non-trivial C++ program should be the following:

#include <iostream>
using namespace std;

int main(int argc, char ** argv){

cout << "Hello World" << endl;

}

At this stage, you should type in the program above, compile it using your native C++ compiler, and execute it. The result should be that the statement “Hello World” is printed to your screen. If you have problems with this exercise, see Appendix A.1.

In theory, you now have your first C++ program. You have written the code, compiled and linked the code, and are able to execute the code on your native machine. Now that this first step is behind us, let us jump into discussing some of the basic concepts, one of which we have just gained some experience with: the concept of a function.

**2.1.1** **Two Basic Concepts in C++**

There are two basic concepts used throughout the C++ programming language: the concepts of

*•* *Function*, and

*•* *Class*.

The C programming language, upon its inception, had at least one self-defining feature: *modularity*. The C language was designed to be modular, and this modularity was accomplished through the employment of functions. Almost everything in C is a function, and some have said that “... all C really is, is a big function calling a bunch of other functions”.

Well, this is almost right. The C language basically consists of two components: a core language specification, which contains basic data types and constructs (such as **if** statements, **for** statements, etc., some of which we discuss later on in this chapter), and a collection of libraries, each of which contains many pre-defined functions. C++ built on this philosophy and introduced the “class” as a second fundamental building block of the language. Within C++, functions and classes are intertwined to create the desired program.

We begin by defining what we mean by a *function* and a *class*. Functions and classes can be distinguished by their fundamental premises. The primary premise of a function revolves around what the function does, whereas the fundamental premise of a class revolves around the data that the class contains. Functions are designed to be abstractions of algorithms; classes (at least as presented in this book) are abstractions of data and of the operations on that data. We will clarify this distinction by examining the two concepts in more detail.

**Functions**

Functions are abstractions which encapsulate a concept or algorithm. The concept of a function is probably not new to you. In mathematics, we see functions all the time. We

deﬁne functions so that we can abstract a particular operation or collection of operations into one statement. In mathematics, we note functions in a manner like

f(x) = x^3 − x^2 + 2.

We understand that if we evaluate the function at the point x = 2, denoted f(2), this is equivalent to substituting the number 2 into the expression x^3 − x^2 + 2, yielding 2^3 − 2^2 + 2 = 6. We hence would say that f(2) = 6. In mathematical parlance, we would add rigor to all that we have done so far, and state this collection of operations as follows:

*Given x as a real number, define f(x) as a function returning a real number, where the definition of f(x) is given by the expression f(x) = x^3 − x^2 + 2.*

This example demonstrates the three major components of a function:

*•* *input*, *output*, and *contract* (or algorithm).

We specified the valid range of parameters that can be input into this function (mathematically referred to as the domain of the function); we specified the range in which the output would lie (mathematically referred to as the range of the function); and finally we specified what, given a particular input, the function will do. The same holds true for C++.

For a function, we need to specify the input, output, and contract.

[Figure: a function drawn as a box that receives input, carries out its algorithm/contract, and produces output.]

Figure 2.1: A schematic of a function in C++.

In C++, the process of specifying the input, output, and contract is done in two stages,
see ﬁgure 2.1^{1}. These two stages are specifying the following for each function:

*•* Function declaration, and

*•* Function deﬁnition.

A function’s declaration accomplishes several things in one step. It declares the name of the function and the data types of the input and output of the function. The function deﬁnition speciﬁes what the function will accomplish given particular input data. Hence, the deﬁnition of a function is the algorithmic explanation of the contract of the function.

A schematic for the syntax used for the declaration of a C++ function is given in ﬁgure 2.2. Using this as our guide, let us attempt to formulate our mathematical function into a

1According to the language standard, it is possible to combine both of these items into one statement satisfying the requirement for both simultaneously. For pedagogical clarity, we will always keep the two stages separate.

C++ function. For the purposes of this demonstration, let us assume that we have a data type called *float*, which is the floating point representation of a real number. The function declaration of our function “f” is given by:

float f(float x);

[Figure: the syntax of a function declaration; the output type, then the function name, then the argument list (the input/output “arguments”) enclosed in parentheses.]

Figure 2.2: Schematic of the syntax of a C++ function.

Examining our schematic given in figure 2.2, we can dissect this code to understand what is going on. Let us see if we have met the three components of a function declaration. First, we specified the name of the function, “f”. Next, we specified that the valid input to our function is a floating point value. Finally, we specified that our function returns a floating point value. Hence, we have now declared our function! What does declaration really mean, though? Declaring a function allows us to use it throughout our code with the assumption that it will act upon the contract of the function (later specified by the definition of the function). A key thing to realize is that:

*•* *A function must be declared before it can be used.*

Because of this fact, many programmers place all their function declarations at the beginning of their C++ program so that all functions are accessible everywhere. You will notice throughout this book that we either place our function declarations within a *header file* (the files with a *.h* extension), which is included at the beginning of a program, or we directly insert the function declarations after the include files and prior to the *main* function definition. An example template of this type of file setup is shown at the end of this section.

Another important fact is that within the *argument list* (the list of inputs and outputs
given as arguments to the function), the names speciﬁed are irrelevant. The compiler is only
concerned about the data type. Hence, it would be perfectly valid to write

float f(float);

Why have a variable name there if it is just to be ignored? The most practical reason we put variable names in function declarations is that we normally cut-and-paste the function declaration from the function definition. The function definition *does* require variable names to be used. We will now give the function definition so that this becomes more apparent.

float f(float x){

float y;

y = x*x*x - x*x + 2;

return y;

}

Notice that the function definition has a very similar beginning to the function declaration. As before, you specify the function name and the input and output data types. The difference, however, is that the function definition is the implementation of our contract. For the function definition, including specific variable names within the argument list is essential, because we will use these variable names throughout the definition to refer to the data that was input. In C++, when data is input into a function, the information is *passed by value*. This means that when the function is called, a copy of the information is created for the function to use, not the original data. This is an important and yet subtle point about C++. We will discuss this in more detail later (see section 3.1.2). For now, the thing to remember is that the function takes from its argument list the information passed to it, and it stores a copy of the information in a variable specified by the name given in the argument list.

In this example, we declare a variable y in which we temporarily store the value of our function, given by the expression x*x*x - x*x + 2, and then we return the value of the variable *y*. The *return* statement designates the variable whose value is to be returned from the function back to the caller. If we were to examine a code snippet, we could use our function just as we did mathematically by writing:

float w;

w = f(2);

If we were to print the value of *w*, we would see that the returned value is the floating point value 6. Some of this will become more clear after we have discussed basic data types.

The key items to remember from this discussion are:

*•* Every function must have a function declaration and deﬁnition.

*•* Function declarations specify the name of the function and the data types of the inputs
and outputs.

*•* Function deﬁnitions specify the implementation of the algorithm used to carry out the
contract of the function.

*•* Variable names in function declarations *do not* matter.

*•* Variable names in function deﬁnitions *do*matter because they specify how the data is
to be referred to in the implementation of the function.

*•* Variables passed to C++ functions are passed by *value* unless otherwise specified.

*Software Suite*

*Putting it into Practice*

Recall our little main function we wrote, compiled, and ran at the beginning of this section. Let us now combine that code with our new function.

#include <iostream>   // inclusion of library header file for use of cout
using namespace std;

float f(float x);      // function declaration

int main(int argc, char ** argv){
  float w;

  w = f(2);

  cout << "The value of w is: " << w << endl;
}

float f(float x){      // function definition
  float y;

  y = x*x*x - x*x + 2;

  return y;
}

If you were to compile and run this code, you would obtain a statement on your screen that says: The value of w is: 6.

In the program above, we use an object named *cout*, the declaration of which is found in the system header file *iostream*. The object *cout* is used for printing to standard output, which in many cases is the screen. For now, it is sufficient to know that the << symbols delineate expressions to be printed. In the code above, the first item to be printed is the string “The value of w is:”, then the value associated with the variable *w*, and then the end-of-line character denoted by the term *endl*. We will speak more about *cout* later in this chapter.

**Classes**

Classes are abstractions which encapsulate data and operations on that data. In C++, the concept of classes is used to simplify very complex data structures through encapsulation.

A class consists of two parts: *data* and *methods* operating on that data. What you will find is that methods are merely functions which are “attached” to classes. The concept of a method is analogous to that of a function, with the primary focus of a method being to act on the data of the class, and not on arbitrary input data.

For example, in the Gram-Schmidt routines in section 2.2.9, we utilize several user-deﬁned classes; among those classes is the class SCVector. Vector does not exist as a basic data type in C++; however, the language allows us to deﬁne a new type, called SCVector, which consists of a collection of basic data types and operations on that data. The declaration for the class SCVector is given below. We will not go over every detail now, but will defer explicit explanation until later in this book (see section 3.1.8). However, we call your attention to the two basic concepts of classes in C++:

1. encapsulated data, and

2. methods acting on that data.

In this class, the variables *dimension* and *data* are encapsulated by the class, and all the
remaining methods (in the section marked ‘public’) act on this basic data.

*Software Suite*

We now present the class declaration of the SCVector class:

class SCVector{
  private:
    int dimension;    // dimension of the vector
    double *data;     // pointer to array containing vector components

  public:
    SCVector(int dim);                      // constructor
    SCVector(const SCVector& v);            // copy constructor
    SCVector(int col, const SCMatrix &A);   // secondary constructor

    ~SCVector();                            // destructor

    int Dimension() const;   // dimension of the vector
    double Length();         // Euclidean norm of the vector
    void Normalize();        // normalize vector

    double Norm_l1();
    double Norm_l2();
    double Norm_linf();

    //************************
    // User Defined Operators
    //************************

    int operator==(const SCVector& v) const;
    int operator!=(const SCVector& v) const;

    SCVector & operator=(const SCVector& v);

    double operator()(const int i) const;
    double& operator()(const int i);

    void Print() const;
};

[Figure: a C++ class drawn as a box containing data and methods, with input and output arrows.]
Figure 2.3: A C++ class encapsulates data and methods acting on that data.

We will explain classes more thoroughly later (section 3.1.8), but let us take this oppor- tunity to point out a few features of classes:

*•* In the class above, there are several “constructors”. A **constructor** is the first method which is called when an object is instantiated. These methods can be used to initialize data, set up information within the class, etc.

*•* A **destructor** is the method called just before an object is deleted or goes out of scope. It allows the object to “clean up after itself”, for example by freeing any memory it dynamically allocated, before the memory occupied by the object itself is reclaimed.

*•* Notice that some methods of this class modify the data contained within the object, while others merely compute things based upon that data. For instance, the method *Normalize* does exactly that: it normalizes the vector data contained within the object to have norm one. The method *Norm_l2*, however, does not modify the data contained within the object, but merely computes the Euclidean norm of the vector based upon the information within the object.

Type    Description
short   short integer
int     integer
long    long integer

Table 2.1: Integer data types.

*•* Classes allow us to define what are referred to as **overloaded operators**. In the declaration given above, we have listed these as “user defined operators”. In addition to defining new data types, we can also define (or re-define) how common unary and binary operators act on those objects (such as defining what ‘+’ means when two newly defined objects are involved).

**2.1.2** **Learning the Syntax and Other Basic Commands**

**Getting Past “;” and “{ }”**

As you may have already noticed from the small amount of code that we have presented to you, the symbols “;” and “{ }” are integral to C++. We will briefly describe the purpose of these symbols here.

In general, the “;” is used to terminate an executable statement, which is why you see it at the conclusion of the commands listed above. Having such a symbol denote the end of an executable statement allows the compiler to easily delineate between statements.

The { } brackets (called curly brackets) are used to denote *scope*. We will not go into all the nuances of scope right now, other than to tell you that the scope of a variable or a function is the area of code in which that variable or function can be used.

**Basic Data Types**

In C++, variables are used to store information. In this section we go over some (not all; see Appendix A.2 for more information) of the basic data types available in C++. Just like in mathematics, a variable is a symbol used to denote a particular value, either numerical or character. One way of thinking of a variable is that it is a box in which information can be stored. In C++, all variables must have a *type*. The *type* tells the computer what kind of box to create: the dimension of the box, the shape, etc. The syntax for creating a variable is:

*<type> <variable list>*

Some basic data types are listed in tables 2.1, 2.2 and 2.3. Given the convention above, we see that to declare a variable called *x* of type *int*, we would write:

int x;

Type     Description
float    single precision
double   double precision

Table 2.2: Floating point data types.

Type   Description
char   character

Table 2.3: Character data type.

This allocates a block of memory of the size of an integer, and assigns the symbol *x* to refer to that location in memory. Hence, from now on, if we act on the variable *x*, we are acting on the content of the memory assigned to *x*. This may sound odd at first, but this is one of the subtle differences between computer programming and mathematics. One thing that most people do not realize is that the computer does not have integer memory, floating point memory, character memory, etc.^{2} As far as the computer is concerned, memory is memory. The data type that is used tells the computer *how to interpret memory*. A particular set of four bytes can be used in one program to represent an integer, and in another program to represent a float. The computer does not care. What is important is that the computer needs to know how to interpret the *bit pattern* of the four bytes. Does it interpret the collection of bits as an integer or a float, or some other variable type? Coming to grips with this notion will be very important when we start discussing the idea of addresses, the subject of pointers, etc. So, there are two key points to remember about data types.

The data type you specify tells the computer two things:

*•* The number of bytes to use to hold the data.

*•* How to interpret the bit pattern speciﬁed in those bytes.

Now that you know that variables exist, and you know some of the basic types, here are some rules to remember when using variables:

*•* Variables must be declared before they are used.

*•* Variable names can be of arbitrary length.

*•* Variable names *are* case sensitive.

*•* Variables should be assumed to be uninitialized. *Do not* rely on the operating system to initialize/zero values for you.

2Computers do in fact have specialized memory within the processor, called *registers*, which are either integer or float/double.

Symbol   Interpretation
+        addition
-        subtraction
*        multiplication
/        division

Table 2.4: Binary arithmetic operations.

*•* Variable lists may be used. If we wish to allocate three integer variables *a*, *b* and *c*, we may do so as follows: int a,b,c;. All three variables will be declared as integers. It is also perfectly valid to declare each one with a separate statement.

*•* Variables may be initialized by either constants or expressions within the declaration statement. Hence, if we wanted to initialize the integer variable *a* to zero, we could do the following: int a=0;. You can also do this in the context of variable lists. Suppose you want to initialize the integer variable *b* to one, but not the other variables. You can do this as follows: int a,b=1,c;

**Basic Operations**

Now that we have some concept of variables, the natural question to ask is: “What can we do with them?” C++ provides a collection of basic operations, some of which are listed in table 2.4.

The operations presented look very much like the operations you would expect in mathematics. We must make two special notes, however. The first note concerns the assignment operator, and the second concerns order of precedence and associativity. In C++, the symbol “=”, which we in English pronounce “equals”, should be interpreted as “is assigned the value of”; e.g., x = 2 is to be read x “is assigned the value of” 2. Take the following C++ code example:

int x,y,z;

x = 2;

y = 4;

z = x + y;

The C++ interpretation of this code is as follows: First, declare three variables, x, y,
and z as integers. Next, assign x the value 2, then assign y the value 4. We then add the
values of x and y, and assign to z the newly computed value. The ordering is important. In
mathematics, we may say *p* = *q* or *q* = *p*, and both statements mean that *p* is equal to *q*.

However, in C++, *p* = *q* says that the variable *p* is assigned the same value as the variable
*q*, whereas *q* = *p* says that the variable *q* is assigned the value of *p*. As you can see, these
two statements are not equivalent.

The second item to note is that operators have both precedence and associativity. The C++ operator precedence and associativity are provided in Table 2.5. What do operator precedence and associativity really mean to the programmer? Examine the following example: Assume that we want to add up six numbers: 1, 2, 3, 4, 5, 6. Mathematically, we write this operation as 1 + 2 + 3 + 4 + 5 + 6. However, if we implement this expression, keeping in mind that we are dealing with a binary operator “+”, then we would sum as follows: 1 + 2 = 3, 3 + 3 = 6, 6 + 4 = 10, 10 + 5 = 15, 15 + 6 = 21. We begin by adding the first two numbers together, and then we accumulate as we come to each new value. The same is true for the computer. When the computer is faced with the expression:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y + z;

it interprets this as being equivalent to:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y;

w = w + z;

Notice that in the second expression, each evaluation involves only one binary expression.

Hence associativity is left to right in this case. Now suppose we had the following expression:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = x + y * z;

The computer interprets this to be the following:

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = y * z;

w = w + x;

Why is the multiplication performed prior to the addition? The multiplication operator has precedence over the addition operator, and hence all multiplications are done first. This is a very important concept to realize:

### Key Concept

*•* Order of operations is important. Precedence can make all the difference!

| Operators | Associativity |
|-----------|---------------|
| ( ) [ ] -> . | left to right |
| ! ~ ++ -- + - * & (type) sizeof | right to left |
| * / % | left to right |
| + - | left to right |
| << >> | left to right |
| < <= > >= | left to right |
| == != | left to right |

Table 2.5: Operator precedence and associativity, listed highest precedence first. Unary +, -, and * have higher precedence than the binary forms.

One thing we should speak to is the use of () in expressions. Notice in the precedence table that () are at the top of the list. This is for a reason: the use of () allows the programmer to specify the order of evaluation explicitly by placing () within the expression.

For instance, in the following piece of code

int x,y,z,w;

x = 2;

y = 3;

z = 4;

w = (x + y) * z;

the computer interprets this to be the following:

int x,y,z,w;