Parallel Environment

MPI vs OpenMP

  • Both are designed to let the programmer use more CPUs through parallel processing.
  • Perhaps the biggest difference is memory locality: whether the CPUs doing the work share one memory or each has its own.
  • OpenMP is newer than MPI and leverages multi-core CPUs and multi-CPU machines. At the basic level, an OpenMP program runs within a single machine, so it is a shared-memory system: memory is local to the machine and accessible to all "threads" of the program.
  • MPI is an older API and does NOT assume shared memory. Each host was kinda expected to have a single CPU/core. So each MPI process (a "rank" rather than a thread) is independent and has no direct access to the memory of other processes.
  • With the above mindset, it is easier to see who the MPI and OpenMP audiences are.

    MPI

  • MPI is considered a lower-level API than OpenMP. To coordinate distributed processing, an MPI program needs to ship data to the remote node/process (because memory is not shared).
  • The main programming paradigm is scatter-gather: ship data to remote nodes, have them run a specific process, then collect the results from them. This often lends itself to SIMD-style processing (same program on every node, different slice of data; strictly speaking this is SPMD).
  • To carry out the scatter-gather processing, a diverse set of functions is provided via the message-passing approach (which means they are low-level programming constructs).
  • The diversity of constructs means that an MPI program isn't really restricted to scatter/gather. It can be MIMD; it is up to the programmer how to use the communication API to process data.
  • Because of the programming model, MPI tends to require its own "mindset", and a program pretty much has to be written from the ground up with this MPI mindset.

    OpenMP

  • OpenMP was built with a focus on symmetric multiprocessing: leverage multi-core CPUs and multiple CPUs on the same host, with memory readily available to every thread. So parallelization is simpler.
  • Uses a multi-threading model. Different threads can do different things; it is not limited to SIMD.
  • The multi-threading model lets the programmer adapt a core piece of the program to use OpenMP gradually, instead of writing the whole program in MPI-style code.
  • The code consists of #pragma directives that guide the compiler to use OpenMP. A non-OpenMP compiler ignores them and produces serial code.
  • gcc 4.2 and later support OpenMP.
  • At least when running on a "single-host" shared-memory SMP machine, there is probably no need to do anything at the sys admin level once the program is compiled with the proper compiler (and the appropriate LD_LIBRARY_PATH stuff is included).
  • Cluster OpenMP and other works target distributed-memory systems.
  • https://computing.llnl.gov/tutorials/openMP/exercise.html has a simple hello_world exercise for OMP.
  • SGE -pe smp and OpenMP are largely the same. No special daemon is needed... so, on a shared-memory system, not sure what difference OpenMP has vs pthreads. Perhaps only the ability to scale to other nodes when an appropriate compiler/library is used?

    Further notes on OpenMP:
    	* by itself it is for shared-memory designs, not distributed memory
    	* runs within a single (SMP) computer
    	* does not span machines by default, so was not told about it in grad school (also, v1.0 for C/C++ was only released in Oct 1998)
    	* so really think of it as an alternative to pthreads...
    	* in an HPC/cluster environment, an SGE PE called OpenMP has limited things it needs to do, since the generated code just runs.  Mostly, it would set the correct OMP_NUM_THREADS for the node, and know where to launch the job.
    	* a couple of web examples of OpenMP in SGE suggest just specifying a PE that allows defining how many cores to take, eg -pe threads 2-8, then in the qsub script define OMP_NUM_THREADS=$NSLOTS   (NSLOTS is defined by SGE inside the qsub job)
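
    The last two notes can be sketched as an SGE job script. This is only a minimal sketch, not a tested site config: the PE name "threads" and the 2-8 slot range are taken from the example above and will differ per site, ./a.out stands in for whatever OpenMP binary was compiled, and the fallback to 1 thread when NSLOTS is unset is an assumption added so the script also runs outside the scheduler.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -pe threads 2-8

# NSLOTS is set by SGE to the number of slots actually granted to the job.
# Falling back to 1 is an assumption, so the script still works outside SGE.
OMP_NUM_THREADS="${NSLOTS:-1}"
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# PROG is a placeholder for the compiled OpenMP program.
PROG="${PROG:-./a.out}"
if [ -x "$PROG" ]; then
    "$PROG"
else
    echo "skipping run: $PROG not found" >&2
fi
```

    Saved as, say, omp_job.sh (a hypothetical name), it would be submitted with qsub omp_job.sh; the scheduler picks a slot count within the requested range and the program inherits the matching thread count.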
    
    
    
    sky/code/openMP > cat hello_slac.c
    
    /* http://www.slac.stanford.edu/comp/unix/farm/openmp.html */
    #include <stdio.h>
    #if defined (_OPENMP)
    #include <omp.h>
    #endif
    
    int main(int argc, char *argv[]) {
      int iam = 0, np = 1;
    
      #pragma omp parallel default(shared) private(iam, np)
      {
        #if defined (_OPENMP)
          np = omp_get_num_threads();
          iam = omp_get_thread_num();
        #endif
        printf("Hello from thread %d out of %d\n", iam, np);
      }
    }
    
    
    
    # http://www.dartmouth.edu/~rc/classes/intro_openmp/compile_run.html
    # gcc 4.4 and above
    cc  -fopenmp hello_llbl.c
    gcc -fopenmp hello_llbl.c
    icc -openmp  hello_slac.c      # newer Intel compilers use -qopenmp instead
    
    
    export OMP_NUM_THREADS=4
    # if not specified, runs as many threads as there are available cpu cores (or hyperthreads if enabled)
    # if asking for more threads than available cores, the OS will time-slice them on the cores it has.
    ./a.out
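
    The two comments about thread count versus core count can be checked before a run. The snippet below is a rough sketch under assumptions: it takes OMP_NUM_THREADS to be a plain integer (not a nested-parallelism list), assumes the runtime defaults to one thread per online core when the variable is unset, and the variable names cores and threads are made up for illustration.

```shell
# Count online cores; getconf _NPROCESSORS_ONLN works on Linux.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Threads that would be used: the explicit setting, else assume one per core.
threads="${OMP_NUM_THREADS:-$cores}"

if [ "$threads" -gt "$cores" ]; then
    echo "oversubscribed: $threads threads on $cores cores"
else
    echo "ok: $threads threads on $cores cores"
fi
```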
    
    

    ScaleMP

  • Marketed as vSMP Foundation; scales across multiple physical servers rather than depending on a single shared-memory or NUMA host.
  • http://www.scalemp.com/products/product-comparison/ there is a vSMP Foundation Free for up to 8 nodes / 1TB shared memory, but it is essentially a commercial product.
  • Can use IB as the interconnect to speed up data/memory transfer, even bonding IB links in the advanced version.
  • But then presumably need to run some sort of daemon process on each node. The cheaper licensing model is node-locked, so it creates quite a complex env to use with a batch scheduler.
    This and the cost issue may be why folks just use MPI? Don't seem to see much ScaleMP or OpenMP, at least not in the life science space.
  • http://www.scalemp.com/industries/lifescience/computational-chemistry/ lists Schrodinger Jaguar, DOCK, Glide, Amber, Gaussian, OpenEye Fred and Omega, HMMER, mpiBLAST (but not the GPU blast?). Touted for departments without dedicated IT.

    MPI API Implementation Details

    MPICH versions

    There are many other implementations, including commercial ones, MATLAB, Java, etc. See: wikipedia MPI Implementation

    
    
    
    
    
    

    MPICH v1

    (See config-backup/sw/mpi/mpich1.test.txt for more info).
    
    

    Starting MPICH

    Environment VARs:
    MPI_HOME
    MPI_USEP4SSPORT=yes
    MPI_P4SSPORT=4644

    /etc/hosts.equiv or .rhosts need to be setup, even if using ssh !!
    some sys call in MPICH needs this for auth.

    $MPI_HOME/share/machines.LINUX
    	# host (+cpu) definition file
    	# node1:2 would be a 2 cpu machine, but then indicates shared memory
    	# which parallel Jaguar doesn't support.  Instead, repeat lines per node
    	# for number of CPU, eg :
    	# node1
    	# node1
    	# node2
    	# node2

    To start a shared daemon as root:
    ssh node1 "serv_p4 -o -p 1235 -l /nfs/mpilogs/node1.log"
    ssh node2 "serv_p4 -o -p 1235 -l /nfs/mpilogs/node2.log"
    	# rc script on each node to start up would be good,
    	# but centralized script in above form to start/kill would also be useful.
    	# Alternatively, Schrodinger mpich utility can start serv_p4 correctly
    	# (without the problem of chp4_servs which results in non-sharable daemons).

    For a per-user process, can start/monitor MPICH as:
    tools (scripts) in $MPI_HOME/sbin/
    chp4_servs -port=4644
    	# script to start serv_p4 on all nodes, DOESN'T obey MPI_P4SSPORT (def to 1234)
    	# at some point in the past also used port 1235
    chp4_servs -hosts=filename
    	# use filename to get list of hosts to start serv_p4 (def to machines.LINUX)
    chp4_servs -hunt
    	# list all serv_p4 processes on all mpi nodes (on all ports)
    chkserv -port 4644
    	# see which nodes don't have the mpi daemon running
    	# DOESN'T obey MPI_P4SSPORT (def to 1234)
    	# no output = all good.
    	# (parallel jaguar will trigger it to start anyway)
    NOTE: schrodinger has mpich utility to monitor MPICH status also.
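
    The "repeat lines per node" form of the machines file described above can be generated from the compact host:ncpu form. This is a minimal sketch: expand_machines is a made-up helper name, not an MPICH tool, and it assumes one host or host:ncpu entry per line with # marking comment lines.

```shell
# expand_machines: read "host" or "host:ncpu" lines on stdin and print
# each host name once per CPU, the form that avoids the shared-memory hint.
expand_machines() {
    while IFS=: read -r host ncpu; do
        # skip blank lines and comment lines
        case "$host" in
            ''|\#*) continue ;;
        esac
        i=0
        while [ "$i" -lt "${ncpu:-1}" ]; do
            printf '%s\n' "$host"
            i=$((i + 1))
        done
    done
}

# Example input: two nodes with 2 CPUs each.
printf 'node1:2\nnode2:2\n' | expand_machines
```

    The output could be reviewed and then redirected into $MPI_HOME/share/machines.LINUX.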

    Testing MPICH

    
    $MPI_HOME/sbin/tstmachines -v
    	# see if daemons are fine.
    
    cat $HOME/.server_apps
    	exact path to each binary, should be populated automatically.
    
    
    cd $MPI_HOME/examples
    mpirun -np 16 cpi
    	# run pi calculation test on 16 procs.
            # doesn't really start a serv_p4 process, so can't use to test sharing daemon b/w users.
    
    

    Per-User Environment

    (There is no need for this unless the shared root daemon process doesn't work.)
    MPICH allows a per-user instance of the MPICH daemon ring instead of depending on a shared daemon run by root. This has been tested to work with Parallel Jaguar. To use it, add an environment variable defining the port you want to use for your own MPI daemon ring. Your 4-digit phone extension would be a good number to use. It may be best to add it to your $HOME/.cshrc, like this:
      setenv MPI_P4SSPORT     4644                    #change number to a unique port for yourself
    
    After this, parallel jaguar (or mpirun) jobs should work. If there are problems, check that you have sourced /protos/package/skels/local.cshrc.linux.apps and that these variables are defined:
      setenv MPI_HOME /protos/package/linux/mpich	
      setenv PATH "${PATH}:${MPI_HOME}/bin"	
      setenv MPI_USEP4SSPORT yes
      setenv P4_RSHCOMMAND ssh	
      setenv SCHRODINGER_MPI_START yes                  # Parallel Jaguar to start its own MPICH serv_p4 on user defined port	
    
    After this setup, a parallel jaguar job can be run like:
    $SCHRODINGER/jaguar run -HOST "vic1 vic2 vic3 vic4" -PROCS 4 piperidine
    
    
    
    


    PVM

    Ref: http://www.csm.ornl.gov/pvm/
    
    
    
    source pvm.env           # get PVM_ROOT, etc
    pvm                      # starts monitor, starting pvmd* daemon if needed.
    
    $PVM_ROOT/lib/pvmd   pvmhost.conf
    # starts PVM daemon on the hosts listed in the conf file, one host per line.
    # may want to put it in the background.  ^C will end everything.
    # it uses rsh (ssh if defined correctly) to log in to the remote host and start the process
    # Need to ensure the ssh login sources the env correctly for pvm/pvmd to run.
    # Can be started by any user   (what about more than one user??)
    
    
    kill -SIGTERM    can be used to kill the daemon
    if use kill -9 (or other non-catchable signal), be sure to clean up /tmp/pvmd.
    
    
    pvm> commands
    ps 
    conf
    halt
    exit
    
    
    To run an OpenEye omega/rocs job, the
    $PVM_ROOT/bin/$PVM_ARCH dir must have access to the desired binary (eg a sym link to omega).
    PATH from .login will not be sourced.
    
    run command as:
    omega  -pvmconf omega.pvmconf -in carboxylic_acids_1--100.smi -out carboxylic_acids_1--100.oeb.gz -log omega_pvm.log
    
    
    Each user that starts pvm will have her own independent instance of pvmd3.
    pvm uses rsh/ssh to reach the remote host and start itself there, so port numbers are likely not going to be static.
    It uses UDP for communication.
    
    										   
    from lsof -i4 -n
    										   
    process name / pid uid   ...
    pvmd3     27808    tinh    7u  IPv4 17619158       UDP 10.220.3.20:33430
    pvmd3     27808    tinh    8u  IPv4 17619159       UDP 10.220.3.20:33431
    
    
    tin     27808     1  0 14:25 pts/29   00:00:00 /app/pvm/pvm345/lib/LINUX/pvmd3
    
    
    ## omega.pvmconf
    ## host = req keyword
    ## hostname, sometime may need to be FQDN, depending on what command "hostname" returns
    ## n = number of instance of PVM to run
    host  phpc-cn01 1
    host  phpc-cn02 2
    host  phpc-cn03 2
    
    
    ##/home/common/Environments/pvm.env
    
    # csh environment setup for PVM 3.4.5
    # currently only available for LINUX64 (LSF cluster)
    
    setenv PVM_ROOT /app/pvm/pvm345
    
    source ${PVM_ROOT}/lib/cshrc.stub
    
    # http://mail.hudat.com/~ken/help/unix/.cshrc
    #alias ins2path  'if ("$path:q" !~ *"\!$"* ) set path=( \!$ $path )'
    #alias add2path  'if ("$path:q" !~ *"\!$"* ) set path=( $path \!$ )'
    ##add2path ${PVM_ROOT}/bin
    
    ## : has special meaning in cshrc, so need to escape it for it to be taken verbatim
    ## there is no auto shell conversion between $manpath and $MANPATH as it does for PATH
    ## csh is convoluted.
    setenv MANPATH $MANPATH\:${PVM_ROOT}/man
    
    
    

    Links



    [Doc URL]
    http://tiny.cc/mpi2
    https://tin6150.github.io/psg/mpi.html
    (cc) Tin Ho. See main page for copyright info.
