Introduction to High Performance Scientific Computing

Victor Eijkhout

Language: English

Pages: 482

ISBN: 1257992546

Format: PDF / Kindle (mobi) / ePub


This textbook teaches the topics that bridge numerical analysis, parallel computing, code performance, and large-scale applications.

physically distributed memory; the distributed nature of it is just not apparent to the programmer. With logically and physically distributed memory, the only way one processor can exchange information with another is through passing information explicitly through the network. You will see more about this in section 2.5.3.3. This type of architecture has the significant advantage that it can scale up to large numbers of processors: the IBM BlueGene has been built with over 200,000 processors. On

access only to local data – everything else needs to be communicated with send and receive operations – and the processor knows its own number. One possible way of writing this would be:
• If I am processor 0 do nothing, otherwise receive a y element from the left, add it to my x element.
• If I am the last processor do nothing, otherwise send my y element to the right.
At first we look at the case where sends and receives are so-called blocking communication instructions: a send instruction does
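The two steps listed in the excerpt above can be sketched as a small simulation. This is not MPI code: it assumes Python threads stand in for processors and `queue.Queue` objects stand in for network links, and the names `run_chain`, `worker`, and `links` are hypothetical. The receive (`get`) blocks until data arrives, while the send (`put`) is buffered here, so the sketch does not model a send that waits for its matching receive.

```python
import queue
import threading


def run_chain(x, y):
    """Simulate the receive-then-send chain; processor p owns x[p] and y[p]."""
    n = len(x)
    # links[p] carries one value from processor p to processor p + 1.
    links = [queue.Queue() for _ in range(n - 1)]
    result = list(x)

    def worker(p):
        # Processor 0 does nothing here; every other processor waits for
        # the y element of its left neighbour and adds it to its own x.
        if p > 0:
            y_left = links[p - 1].get()  # blocks until the value arrives
            result[p] = x[p] + y_left
        # The last processor does nothing here; every other processor
        # sends its y element to the right.
        if p < n - 1:
            links[p].put(y[p])  # buffered send (an assumption of this sketch)

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Because every processor except the first receives before it sends, the data moves down the chain one processor at a time in this ordering.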

This style of programming is further encouraged by the existence of Remote Direct Memory Access (RDMA) support on some hardware. An early example was the Cray T3E. These days, one-sided communication is widely available through its incorporation in the MPI-2 library; section 2.5.3.7. Let us take a brief look at one-sided communication in MPI-2, using averaging of array values as an example: ∀i : a_i ← (a_i

time of a parallel implementation on a hypercube. Show that the theoretical speedup from the example is attained (up to a factor) for the implementation on a hypercube. 2.6.4.1 Embedding grids in a hypercube Above we made the argument that mesh-connected processors are a logical choice for many applications that model physical phenomena. Hypercubes do not look like a mesh, but they have enough connections that they can simply pretend to be a mesh by ignoring certain connections. Let’s say that
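One standard way to make a hypercube "pretend to be a mesh" is the binary-reflected Gray code, in which consecutive integers map to labels that differ in exactly one bit. The sketch below assumes a 2D mesh whose side lengths are powers of two; the excerpt is cut off before it names a construction, so this may not be the one the book goes on to describe, and the function names are hypothetical.

```python
def gray(i):
    """Binary-reflected Gray code of i."""
    return i ^ (i >> 1)


def hypercube_node(row, col, col_bits):
    """Hypercube label for mesh point (row, col): two Gray codes side by side."""
    return (gray(row) << col_bits) | gray(col)


def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def check_mesh_embedding(row_bits, col_bits):
    """Check that mesh neighbours land on directly connected hypercube nodes."""
    rows, cols = 1 << row_bits, 1 << col_bits
    labels = set()
    for r in range(rows):
        for c in range(cols):
            node = hypercube_node(r, c, col_bits)
            labels.add(node)
            # Hypercube nodes are connected exactly when they differ in one bit.
            if r + 1 < rows:
                if hamming_distance(node, hypercube_node(r + 1, c, col_bits)) != 1:
                    return False
            if c + 1 < cols:
                if hamming_distance(node, hypercube_node(r, c + 1, col_bits)) != 1:
                    return False
    # Every mesh point must receive its own hypercube node.
    return len(labels) == rows * cols
```

For example, `check_mesh_embedding(2, 3)` checks a 4 by 8 mesh placed in a 5-dimensional hypercube; the hypercube links that no mesh edge uses are simply ignored.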

operations can be done in exact arithmetic. However, it is good to become aware of some of the potential problems due to our finite precision computer arithmetic. This allows us to design algorithms that minimize the effect of roundoff. A more rigorous approach to the topic of numerical linear algebra includes a full-fledged error analysis of the algorithms we discuss; however, that is beyond the scope of this course. Error analysis of computations in computer arithmetic is the focus of
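A small illustration of the roundoff problem mentioned above, assuming IEEE 754 double precision as used by Python's `float`: the order of additions changes the result, and a compensated summation such as the standard library's `math.fsum` avoids the loss. The helper `naive_sum` is a hypothetical name used only for this sketch.

```python
import math


def naive_sum(values):
    """Add the values left to right in ordinary floating point arithmetic."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [1e16, 1.0, -1e16]

# 1e16 + 1.0 rounds back to 1e16, so the small term is lost before
# the large terms cancel.
lost = naive_sum(values)

# Cancelling the large terms first keeps the small term.
kept = naive_sum([1e16, -1e16, 1.0])

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum regardless of the order.
accurate = math.fsum(values)

print(lost, kept, accurate)  # 0.0 1.0 1.0
```

The exact answer is 1, so reordering the additions is one simple instance of designing an algorithm to limit the effect of roundoff.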
