...

delete(gcp);
exit;

Expected Output

The batch bash script used for this example is very similar to the one used for Running in Parallel on One Node, except that ntasks was 4 instead of 20. When the job completes, the output file will contain the MATLAB version information, followed by information about the cluster and the pool. Next are the results of the jobs:

...

As you can see, each worker is aware of its own rank. A simple way to use this is to name your data sets something along the lines of dataset_1, dataset_2, ..., etc. and have each worker use

load(['dataset_' num2str(labindex) '.ascii']).
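The rank-based naming scheme above can be sketched as a small spmd block. This is only an illustration, not the contents of spmd_example.m: it assumes a pool is already open and that files named dataset_1.ascii, dataset_2.ascii, and so on (the hypothetical names used above) sit in the current directory. Workers whose file is missing report that instead of failing.

```matlab
% Sketch only: each worker picks its own input file from its rank.
% Assumes a pool is already open and that dataset_<rank>.ascii files
% (hypothetical names) exist in the current directory.
spmd
  fileName = sprintf('dataset_%d.ascii', labindex);
  if exist(fileName, 'file')
    data = load(fileName);   % non-.mat extensions are read as ASCII text
    fprintf('Worker %d of %d loaded %s\n', labindex, numlabs, fileName);
  else
    fprintf('Worker %d of %d could not find %s\n', labindex, numlabs, fileName);
  end
end
```

Using sprintf rather than string concatenation keeps the file-name pattern in one place, which makes it easier to change later.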

Download spmd_example.m.

Download spmd_example.sh.

parfeval

parfeval is useful when you want to run a loop that you can stop early. For example, if you are analyzing a very large data set, you may want to stop when the results are 'good enough' instead of waiting for the entire set to be completed. parfeval is also useful for running functions in the background, because it doesn't block MATLAB from continuing to work. parfeval distributes the function calls among the workers in the pool by itself.

Syntax

f = parfeval(fcn,numout,in1,in2,...)

fcn: the function to execute

numout: expected number of outputs from the function

in1, in2: the parameters for the function 

f: a future object. By itself it doesn't mean much; the data has to be extracted from it, once it is ready, with fetchNext(f).

If you want to break out of a loop using parfeval, use cancel(f) to stop the evaluation of the future object.
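The early-exit pattern described above can be sketched as follows. This is a minimal illustration, assuming a pool is already open. The 'good enough' test (a magic square of at least 5 rows) and the names numTasks and futures are made up for this example.

```matlab
% Sketch only: stop collecting results once one is "good enough".
% Assumes a pool is already open; the stopping rule below is hypothetical.
numTasks = 10;
futures(1:numTasks) = parallel.FevalFuture;   % preallocate the future array
for idx = 1:numTasks
  futures(idx) = parfeval(@magic, 1, idx);
end

for k = 1:numTasks
  [completedIdx, value] = fetchNext(futures);
  fprintf('Got result with index %d.\n', completedIdx);
  if size(value, 1) >= 5      % made-up "good enough" criterion
    cancel(futures);          % stop everything still queued or running
    break;
  end
end
```

Calling cancel on the whole array is safe here because futures that have already finished are left unchanged.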

Example

This example shows how you can use parfeval to evaluate a function and get the results as they become available.

%=================================
% Simple example of parfeval
% From MATLAB documentation
% must run setPool before this
%=================================

% evaluate the magic function 10 times

for idx = 1:10
  f(idx) = parfeval(@magic, 1, idx);
end

% preallocate place to store results

magicResults = cell(1,10);

% get the results and put them in the array

for idx = 1:10
  [completedIdx, value] = fetchNext(f);
  magicResults{completedIdx} = value;
  fprintf('Got results with index %d.\n', completedIdx);
end

%clean up the pool and exit

delete(gcp);

exit;

Expected Output

The bash script for this example is identical to the script for the previous spmd example. The output file has MATLAB's version information, followed by the cluster and pool properties. The actual results of the MATLAB program should be 'Got results with index 1.', then 2, then 3, etc., up to 10. If this were a much larger job, the indexes might not be in order; it would all depend on which future object was ready for fetchNext(f) first.

Quick Guide

A brief explanation of terms to know when using MATLAB's Parallel Computing Toolbox.

worker: The MATLAB computational engine that processes the code. Can also be called a lab. Each worker is assigned a number called its rank.

numlabs: Returns the total number of workers available.

labindex: Returns the rank of the worker.

parpool: The parallel pool of workers. It is created in the MATLAB program with parpool('local', #ofWorkers). The number of workers is one less than the number of CPUs requested on the node.

gcp: MATLAB function that gets the current pool. At the end of the parallel code, delete(gcp) will neatly shut down all the workers.

parfor: The parallel for loop. Splits the iterations of the for loop among the workers to be done in parallel. The step of the iteration must be +1, the iterations cannot rely on one another, parfor loops cannot be nested, and you cannot break out of the loop early.

spmd: Single Program Multiple Data. Allows for control over each worker. Use the worker's rank to assign jobs. Useful when you want to do the same thing to different data sets. Like parfor, spmd blocks cannot be nested in each other and you cannot break out of them.

parfeval: Parallel Function Evaluation (not official, just assuming that is what it stands for). Lets you run functions in parallel without blocking MATLAB from running other things. Call parfeval in a loop as many times as you want the function to run, and call fetchNext to get the results.
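As a reminder of what the parfor restrictions listed above look like in practice, here is a minimal sketch. It is not taken from the examples on this page, and the variable names are illustrative. Each iteration writes only to its own slot of the output, so no iteration depends on another, and the loop index advances by +1.

```matlab
% Sketch only: independent iterations, unit step, no early exit.
n = 100;
squares = zeros(1, n);          % sliced output: iteration i writes squares(i)
parfor i = 1:n
  squares(i) = i^2;             % no iteration reads another iteration's result
end
total = sum(squares);           % combine the results after the loop
fprintf('Sum of squares: %d\n', total);   % prints 338350
```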


Troubleshooting

Sometimes things just don't go right. Here are some tips to help. If none of these tips help, email us at rc-help@rit.edu.

  1. Check the .err files for your job; they will help you the most. Often they explicitly state the error that occurred, in a specific file on a specific line. If you don't understand the error message, try looking it up online. Always double-check your spelling and capitalization.
  2. Run sacct. This command tells you the state and exit code of each of your jobs. You might get a state of OUT_OF_ME+ (sacct's truncated display of OUT_OF_MEMORY) and an exit code of 125. This means you need to increase the amount of memory you request in your .sh file. Remember that #SBATCH --mem uses megabytes by default. Append M, G, or T to the end of the number to request megabytes, gigabytes, or terabytes.
  3. Your job might not be running because there are no resources available. Run squeue and look under the REASON column. If you see (Resources), you need to wait for other jobs to finish before yours can run. You can also see this by running sinfo: in the rows for the tier you are trying to run your job in, none will have the STATE idle or mix.

More Reading

The topics touched on in this documentation will be enough to get you up and running with parallel code for MATLAB. However, there is much more to MATLAB's Parallel Computing Toolbox, such as sending specific messages between workers and increasing the performance of your parfor loops. 

If there are any further questions, or there is an issue with the documentation, please contact rc-help@rit.edu for additional assistance.