Create parallel pool on cluster

Description

parpool starts a parallel pool of workers using the default profile. With default preferences, MATLAB® starts a pool on the local machine with one worker per physical CPU core up to the limit set in the default profile. For more information on parallel preferences, see Specify Your Parallel Preferences.

In general, the pool size is specified by the PreferredPoolNumWorkers property of the default profile. Other factors can also affect the pool size.

parpool enables the full functionality of the parallel language features in MATLAB by creating a special job on a pool of workers, and connecting the MATLAB client to the parallel pool. Parallel language features include parfor, parfeval, parfevalOnAll, spmd, and distributed. If possible, the working folder on the workers is set to match that of the MATLAB client session.

parpool(poolsize) creates and returns a pool with the specified number of workers. poolsize can be a positive integer or a range specified as a 2-element vector of integers. If poolsize is a range, the resulting pool is as large as possible within the requested range.

Specifying poolsize overrides any value specified in the PreferredPoolNumWorkers property, and starts a pool of exactly that number of workers, even if it has to wait for them to be available. Most clusters have a maximum number of workers they can start. If the profile specifies a MATLAB Job Scheduler cluster, parpool reserves its workers from among those already running and available under that MATLAB Job Scheduler. If the profile specifies a local or third-party scheduler, parpool instructs the scheduler to start the workers for the pool.

parpool(resources) or parpool(resources,poolsize) starts a worker pool on the resources specified by resources.

parpool(___,Name,Value) applies the specified values for certain properties when starting the pool.

poolobj = parpool(___) returns a pool object to the client workspace representing the pool on the cluster. You can use the pool object to programmatically delete the pool or to access its properties. Use delete(poolobj) to shut down the parallel pool.

Examples

Start a parallel pool using the default profile to define the number of workers. With default preferences, the default pool is on the local machine.

parpool

You can create pools on different types of parallel environments on your local machine.

  • Start a parallel pool of process workers.

    parpool('Processes')
  • Start a parallel pool of thread workers.

    parpool('Threads')

For more information on these parallel environments, see the description of the resources input argument.

Start a parallel pool of 16 workers using a profile called myprof.

parpool('myprof',16)

Create an object representing the cluster identified by the default profile, and use that cluster object to start a parallel pool. The pool size is determined by the default profile.

c = parcluster
parpool(c)

Start a parallel pool with the default profile, and pass two code files to the workers.

parpool('AttachedFiles',{'mod1.m','mod2.m'})

If you have access to several GPUs, you can perform your calculations on multiple GPUs in parallel using a parallel pool.

To determine the number of GPUs that are available for use in MATLAB, use the gpuDeviceCount function.

availablegpus = gpuDeviceCount("available")
availablegpus = 3

Start a parallel pool with as many workers as available GPUs. For best performance, MATLAB assigns a different GPU to each worker by default.

parpool('Processes',availablegpus);
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 3).

To identify which GPU each worker is using, call gpuDevice inside an spmd block. The spmd block runs gpuDevice on every worker.

spmd
    gpuDevice
end

Use parallel language features, such as parfor or parfeval, to distribute your computations to workers in the parallel pool. If you use gpuArray-enabled functions in your computations, these functions run on the GPU of the worker. For more information, see Run MATLAB Functions on a GPU. For an example, see Run MATLAB Functions on Multiple GPUs.

When you are done with your computations, shut down the parallel pool. You can use the gcp function to obtain the current parallel pool.

delete(gcp('nocreate'));

If you want to use a different choice of GPUs, then you can use gpuDevice to select a particular GPU on each worker, using the GPU device index. You can obtain the index of each GPU device in your system using the gpuDeviceCount function.

Suppose you have three GPUs available in your system, but you want to use only two for a computation. Obtain the indices of the devices.

[availablegpus,gpuindx] = gpuDeviceCount("available")
availablegpus = 3
gpuindx = 1×3
     1     2     3

Define the indices of the devices you want to use.

usegpus = [1 3];

Start your parallel pool. Use an spmd block and gpuDevice to associate each worker with one of the GPUs you want to use, using the device index. The spmdIndex function identifies the index of each worker.

parpool('Processes',numel(usegpus));
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 2).
spmd
    gpuDevice(usegpus(spmdIndex));
end

As a best practice, and for best performance, assign a different GPU to each worker.

When you are done with your computations, shut down the parallel pool.

delete(gcp('nocreate'));

Create a parallel pool with the default profile, and later delete the pool.

poolobj = parpool;
delete(poolobj)

Find the number of workers in the current parallel pool.

poolobj = gcp('nocreate'); % If no pool exists, do not create a new one.
if isempty(poolobj)
    poolsize = 0;
else
    poolsize = poolobj.NumWorkers
end

Input Arguments

poolsize – Size of the parallel pool, specified as a positive integer or a range specified as a 2-element vector of integers. If poolsize is a range, the resulting pool is as large as possible within the requested range. Set the default preferred number of workers in the cluster profile.

Example: parpool('Processes',2)

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
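
The range form of poolsize has no example on this page, so the following is a minimal sketch of it. It assumes Parallel Computing Toolbox is installed and that a 'Processes' profile is available in your release; the bounds [1 4] are illustrative only.

```matlab
% Sketch only: request a pool of between 1 and 4 process workers.
% The bounds are illustrative; choose values that suit your machine.
delete(gcp('nocreate'));                 % close any pool that is already open
pool = parpool('Processes',[1 4]);       % pool size is chosen within the range
numStarted = pool.NumWorkers;            % number of workers actually started
fprintf('Pool started with %d workers.\n', numStarted);
delete(pool);                            % shut the pool down again
```

The NumWorkers property reports the size that parpool settled on, which can be smaller than the upper bound if fewer workers are available.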

resources – Resources to start the pool on, specified as 'Processes', 'Threads', a cluster profile name, or a cluster object.

  • 'Processes' – Starts a pool of process workers on the local machine.

  • 'Threads' – Starts a pool of thread workers on the local machine.

  • Profile name – Starts a pool on the cluster specified by the profile. For more information on cluster profiles, see Discover Clusters and Use Cluster Profiles.

  • Cluster object – Starts a pool on the cluster specified by the cluster object. Use parcluster to get a cluster object.

Example: parpool('Processes')

Example: parpool('Threads')

Example: parpool('myclusterprofile',16)

Example: c = parcluster; parpool(c)

Data Types: char | string | parallel.Cluster

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'AttachedFiles',{'myfun.m'}

AttachedFiles – Files to attach to the pool, specified as a character vector, string or string array, or cell array of character vectors.

With this argument pair, parpool starts a parallel pool and passes the identified files to the workers in the pool. The files specified here are appended to the AttachedFiles property specified in the applicable parallel profile to form the complete list of attached files. The 'AttachedFiles' property name is case sensitive, and must appear as shown.

Example: {'myfun.m','myfun2.m'}

Data Types: char | cell

AutoAddClientPath – Flag to specify if user-added entries on the client path are added to the path of each worker at startup, specified as a logical value.

Data Types: logical

EnvironmentVariables – Names of environment variables to copy from the client session to the workers, specified as a character vector, string or string array, or cell array of character vectors. The names specified here are appended to the 'EnvironmentVariables' property specified in the applicable parallel profile to form the complete list of environment variables. Any listed variables that are not set are not copied to the workers. These environment variables are set on the workers for the duration of the parallel pool.

Data Types: char | cell
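
As a rough illustration of copying an environment variable to the workers, the sketch below sets a variable on the client, starts a pool that copies it, and reads it back on a worker. The variable name MY_SETTING and its value are hypothetical, and the sketch assumes a 'Processes' profile is available.

```matlab
% Sketch only: MY_SETTING is a hypothetical variable used for illustration.
setenv('MY_SETTING','hello');                 % set the variable on the client
delete(gcp('nocreate'));                      % close any pool that is already open
pool = parpool('Processes',1, ...
    'EnvironmentVariables',{'MY_SETTING'});   % copy the variable to the workers
f = parfeval(pool,@getenv,1,'MY_SETTING');    % read the variable on a worker
workerValue = fetchOutputs(f);                % blocks until the worker replies
disp(workerValue)
delete(pool);
```

Using parfeval with getenv keeps the check to a single function call on one worker; an spmd block would work equally well when spmd support is enabled on the pool.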

SpmdEnabled – Flag to specify if spmd support is enabled on the pool, specified as a logical value. You can disable support only on a local or MATLAB Job Scheduler cluster. parfor iterations do not involve communication between workers. Therefore, if 'SpmdEnabled' is false, a parfor-loop continues even if one or more workers aborts during loop execution.

Data Types: logical
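
The following sketch shows one way to start a pool with spmd support disabled and still run a parfor-loop on it. It assumes a 'Processes' profile is available; the pool size range and the loop body are illustrative.

```matlab
% Sketch only: start a local pool without spmd support and run a parfor-loop.
delete(gcp('nocreate'));                          % close any pool that is already open
pool = parpool('Processes',[1 2],'SpmdEnabled',false);
spmdFlag = pool.SpmdEnabled;                      % false for this pool

total = 0;
parfor k = 1:10
    total = total + k;                            % reduction across iterations
end
disp(total)
delete(pool);
```

Because parfor iterations are independent, the loop does not need the communication between workers that spmd relies on.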

IdleTimeout – Time in minutes after which the pool shuts down if idle, specified as an integer greater than zero. A pool is idle if it is not running code on the workers. By default, 'IdleTimeout' is the same as the value in your parallel preferences. For more information on parallel preferences, see Specify Your Parallel Preferences.

Example: pool = parpool('IdleTimeout',120)

Output Arguments

poolobj – Access to the parallel pool from the client, returned as a pool object.

Tips

  • The pool status indicator in the lower-left corner of the desktop shows the client session connection to the pool and the pool status. Click the icon for a menu of supported pool actions.

    With a pool running: [image: parallel pool status indicator showing that a pool is running]

    With no pool running: [image: parallel pool status indicator showing that no pool is running]

  • If you set your parallel preferences to automatically create a parallel pool when necessary, you do not need to explicitly call the parpool command. You might explicitly create a pool to control when you incur the overhead time of setting it up, so the pool is ready for subsequent parallel language constructs.

  • delete(poolobj) shuts down the parallel pool. Without a parallel pool, spmd and parfor run as a single thread in the client, unless your parallel preferences are set to automatically start a parallel pool for them.

  • When you use the MATLAB Editor to update files on the client that are attached to a parallel pool, those updates automatically propagate to the workers in the pool. (This automatic updating does not apply to Simulink® model files. To propagate updated model files to the workers, use the updateAttachedFiles function.)

  • If possible, the working folder on the workers is initially set to match that of the MATLAB client session. Subsequently, the following commands entered in the client Command Window also execute on all the workers in the pool:

    • cd

    • addpath

    • rmpath

    This behavior allows you to set the working folder and the command search path on all the workers, so that subsequent pool activities such as parfor-loops execute in the proper context.

    When changing folders or adding a path with cd or addpath on clients with Windows® operating systems, the value sent to the workers is the UNC path for the folder if possible. For clients with Linux® operating systems, it is the absolute folder location.

    If any of these commands does not work on the client, it is not executed on the workers either. For example, if addpath specifies a folder that the client cannot access, the addpath command is not executed on the workers. However, if the working folder can be set on the client, but cannot be set as specified on any of the workers, you do not get an error message returned to the client Command Window.

    Be careful of this slight difference in behavior in a mixed-platform environment where the client is not the same platform as the workers, where folders local to or mapped from the client are not available in the same way to the workers, or where folders are in a nonshared file system. For example, if you have a MATLAB client running on a Microsoft® Windows operating system while the MATLAB workers are all running on Linux operating systems, the same argument to addpath cannot work on both. In this situation, you can use the pctRunOnAll function to ensure that a command runs on all the workers.

    Another difference between client and workers is that any addpath arguments that are part of the matlabroot folder are not set on the workers. The assumption is that the MATLAB install base is already included in the workers’ paths. The rules for addpath regarding workers in the pool are:

    • Subfolders of the matlabroot folder are not sent to the workers.

    • Any folders that appear before the first occurrence of a matlabroot folder are added to the top of the path on the workers.

    • Any folders that appear after the first occurrence of a matlabroot folder are added after the matlabroot group of folders on the workers’ paths.

    For example, suppose that matlabroot on the client is c:\applications\matlab\. With an open parallel pool, execute the following to set the path on the client and all workers:

    addpath('p1', ...
            'p2', ...
            'c:\applications\matlab\t3', ...
            'c:\applications\matlab\t4', ...
            'p5', ...
            'c:\applications\matlab\t6', ...
            'p7', ...
            'p8');

    Because t3, t4, and t6 are subfolders of matlabroot, they are not set on the workers’ paths. So on the workers, the pertinent part of the path resulting from this command is:

    p1
    p2
    <worker matlabroot folders>
    p5
    p7
    p8
  • If you are using Macintosh or Linux and see problems during large parallel pool creation, review the system limits recommended for those platforms.
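
The addpath ordering rules in the tips above can be mimicked with plain string operations. The sketch below is a hypothetical illustration only: it does not call parpool or addpath, it reuses the example folder names from the tip, and the variable names are assumptions made for this sketch.

```matlab
% Hypothetical illustration of the documented ordering rules.
% It only manipulates strings; no pool or search path is changed.
root = 'c:\applications\matlab\';        % assumed matlabroot on the client
requested = {'p1','p2',[root 't3'],[root 't4'],'p5',[root 't6'],'p7','p8'};

underRoot = startsWith(requested, root); % rule 1: these folders are not sent
firstRoot = find(underRoot, 1);          % position of the first matlabroot entry
if isempty(firstRoot)
    firstRoot = numel(requested) + 1;    % no matlabroot entry: everything goes on top
end

idx = 1:numel(requested);
beforeRoot = requested(~underRoot & idx < firstRoot);  % rule 2: top of the worker path
afterRoot  = requested(~underRoot & idx > firstRoot);  % rule 3: after the matlabroot group

disp(beforeRoot)
disp(afterRoot)
```

With the example input, beforeRoot holds p1 and p2 and afterRoot holds p5, p7, and p8, which matches the worker path listing in the tip.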

Version History

Introduced in R2013b