Sunday, March 11, 2012

MPI for Python

This week for the laboratory I tried out MPI for Python (mpi4py) to get a better understanding of the MPI functions. It is easier to use than its C++ counterpart: you don't have to initialize or finalize MPI explicitly, because mpi4py does that for you when the MPI module is imported and when the program exits, and it provides simple calls to find out which process you are and how many processes there are. Below I explain a few simple examples of how those functions work.

Rank, Size, Name

The first code I'll explain is the MPI for Python Hello World from the mpi4py-1.3/demo directory. It is a simple way to see how the rank, size and name functions work.

Code:


from mpi4py import MPI
import sys

size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()

sys.stdout.write("Hello, World! I am process %d of %d on %s.\n" % (rank, size, name))

Explanation:


from mpi4py import MPI
import sys

As usual, we first need to import the MPI module. In this example we also import sys to write the output to the terminal.


size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()

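With MPI.COMM_WORLD.Get_size() we get the number of processes that are running the program. This number is specified when the code is launched. For example, if we launch hello.py with 10 processes (with something like mpiexec -n 10 python hello.py; the exact launcher command depends on your MPI installation):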

  • The size would be 10, because we specified that we want 10 processes to run hello.py. 
  • The rank is the number of the current process, from 0 to size - 1, so we can tell which one is which.
  • And the name is the processor name to identify in which machine the process is running, which would be useful if we're running code in a computer cluster.
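A common use of rank and size together is to split work between the processes. The following is only a minimal sketch of that idea, not something from the mpi4py demo: the helper items_for_rank is a hypothetical name, and the loop fakes 4 ranks in a single process so it can be run without MPI. In a real mpi4py script, rank and size would come from MPI.COMM_WORLD.Get_rank() and MPI.COMM_WORLD.Get_size().

```python
def items_for_rank(rank, size, n_items):
    """Return the item indices one rank would handle (round-robin split).

    Hypothetical helper for illustration; it is not part of mpi4py.
    """
    return list(range(rank, n_items, size))


# Simulate the split for 4 processes and 10 items without launching MPI.
# In an mpi4py script, rank and size would come from
# MPI.COMM_WORLD.Get_rank() and MPI.COMM_WORLD.Get_size() instead.
size = 4
for rank in range(size):
    print("rank %d handles items %s" % (rank, items_for_rank(rank, size, 10)))
```

Every item is handled by exactly one rank, and no communication is needed to decide who does what, because each process can compute its own share from its rank alone.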


sys.stdout.write("Hello, World! I am process %d of %d on %s.\n"% (rank, size, name))

And finally we write all of the previous data (rank, size, name) to see that this really works. Each process should print one line of this form, and the lines can appear in any order.


Send-Receive

The previous example showed how the rank, size and name functions let us differentiate processes. Now let's see how we can use the rank to send data from one process to another. This is point-to-point communication between two processes, so no more than 2 will be involved.

Code: 


from mpi4py import MPI
import sys

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
name = MPI.Get_processor_name()

if rank == 0:
  hello = "Hello"
  world = "World!"
  comm.send(hello, dest = 1, tag = 1)
  comm.send(world, dest = 1, tag = 2)
elif rank == 1:
  hello = comm.recv(source = 0, tag = 1)
  world = comm.recv(source = 0, tag = 2)
  sys.stdout.write("%s %s I am process %d on %s.\n"%(hello, world, rank, name))

Explanation:


from mpi4py import MPI
import sys


Again, first we import the modules, MPI and sys.


comm = MPI.COMM_WORLD
rank = comm.Get_rank()
name = MPI.Get_processor_name()

This is pretty much the same as before, but the code is simplified a little by storing MPI.COMM_WORLD in the variable comm, which makes it shorter to call the rank function and, later on, the send and receive functions. The processor name still comes from MPI.Get_processor_name().


if rank == 0: 
  hello = "Hello" 
  world = "World!" 
  comm.send(hello, dest = 1, tag = 1)
  comm.send(world, dest = 1, tag = 2)

This is point-to-point communication, so we only need two processes. Ranks start at 0, so this condition makes sure that only process 0 runs this block. Here we define the strings hello and world (I split the message in two so we can see how processes tell which data is which when they send and receive it). Process 0 then sends each string separately to process 1. Each send call has a different tag, so the receiver can tell which message is the string hello and which is the string world.


elif rank == 1: 
  hello = comm.recv(source = 0, tag = 1) 
  world = comm.recv(source = 0, tag = 2)   
  sys.stdout.write("%s %s I am process %d on %s.\n"%(hello, world, rank, name))

Process 1 runs this block. It waits to receive the data from process 0 tagged with 1 (the string hello), and then waits again for the data from process 0 tagged with 2 (the string world). Finally we print both to check that they were really sent successfully.
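To make the role of the tags more concrete, here is a toy model of how a receive picks a message by source and tag. This is only a sketch under my own assumptions: ToyMailbox is a hypothetical stand-in written in plain Python, it is not part of mpi4py, and unlike the real comm.recv it raises an error instead of waiting when nothing matches.

```python
class ToyMailbox:
    """Hypothetical stand-in for the messages waiting at one receiver.

    Not part of mpi4py; it only imitates matching by (source, tag).
    """

    def __init__(self):
        self._pending = []  # messages in arrival order

    def send(self, obj, source, tag):
        # Store the message together with who sent it and its tag.
        self._pending.append((source, tag, obj))

    def recv(self, source, tag):
        # Return the oldest pending message with this source and tag.
        for i, (src, t, obj) in enumerate(self._pending):
            if src == source and t == tag:
                del self._pending[i]
                return obj
        # The real comm.recv would block here until a match arrives.
        raise LookupError("no message from %d with tag %d" % (source, tag))


box = ToyMailbox()
box.send("Hello", source=0, tag=1)
box.send("World!", source=0, tag=2)

# Asking for tag 2 first still returns the right string.
world = box.recv(source=0, tag=2)
hello = box.recv(source=0, tag=1)
print(hello, world)
```

The point of the sketch is that the receiver selects data by tag, not just by arrival order, which is why the two strings in the example cannot get mixed up.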

We can run this code with any number of processes from 2 upwards, but using more than 2 would be unnecessary, since the extra processes do nothing. So launch it with 2 processes.


Broadcast

To explain the broadcast function I made an example where a string is sent to all processes through broadcast, and then each of them prints that string.

Code:


from mpi4py import MPI
import sys

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
 data = "Hello World!"
else:
 data = None

data = comm.bcast(data, root=0)
sys.stdout.write("Process %s received %s from 0\n"%(rank, data))

Explanation:



from mpi4py import MPI
import sys

As before, we first import the needed modules: MPI for the message passing functions and sys for writing output.


comm = MPI.COMM_WORLD
rank = comm.Get_rank()

Again we simplify things by storing MPI.COMM_WORLD in comm, so we can call the rank and broadcast functions through comm.


if rank == 0:
 data = "Hello World!"
else:
 data = None

Then we check whether the current process is process number 0. If it is, data will be the string "Hello World!"; if not, data will be None. This lets us see later on how the root argument of the broadcast function works.


data = comm.bcast(data, root=0)
sys.stdout.write("Process %s received %s from 0\n"%(rank, data))

Finally, we broadcast the string using process 0 as root, so process 0 sends the contents of its variable data, which is the string "Hello World!". Running this code, every process should print that it received "Hello World!".



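If we change the root in the broadcast function to 1 or larger (it must be smaller than the specified number of processes, since ranks go from 0 to size - 1), that process sends its own data, which is None instead of the string "Hello World!", so every process prints None. Note that the message still says "from 0", because that part of the string is hard-coded.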
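The effect of the root argument can also be imitated without MPI. The sketch below is a plain Python model of what each rank holds after a broadcast; toy_bcast is a hypothetical helper of mine, not the mpi4py function, and it assumes 4 processes where only rank 0 starts with the string.

```python
def toy_bcast(values_by_rank, root):
    """Return what every rank holds after a broadcast from `root`.

    Hypothetical model for illustration; the real call is comm.bcast.
    """
    if not 0 <= root < len(values_by_rank):
        raise ValueError("root must be a valid rank")
    # Every rank ends up with a copy of the root's value.
    return [values_by_rank[root]] * len(values_by_rank)


# Values before the broadcast: only rank 0 has the string.
before = ["Hello World!", None, None, None]

print(toy_bcast(before, root=0))  # every rank gets the string
print(toy_bcast(before, root=1))  # every rank gets None, even rank 0
```

With root=1 the string held by rank 0 is overwritten too, because the broadcast replaces every process's data with whatever the root had.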



