Configuration

pcluster uses the file ~/.parallelcluster/config by default for all configuration parameters.

You can find an example configuration file at site-packages/aws-parallelcluster/examples/config

Layout

Configuration is defined in multiple sections. Required sections are “global”, “aws”, one “cluster” section, and one “vpc” section.

A section starts with the section name in brackets, followed by its parameter settings:

[global]
cluster_template = default
update_check = true
sanity_check = true
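
For orientation, here is a minimal sketch of a complete file (the section names default and public, the key pair name, and the placeholder IDs are all illustrative):

[global]
cluster_template = default

[aws]
aws_region_name = us-east-1

[cluster default]
key_name = mykey
vpc_settings = public

[vpc public]
vpc_id = vpc-xxxxxx
master_subnet_id = subnet-xxxxxx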

Configuration Options

global

Global configuration options related to pcluster.

[global]

cluster_template

The name of the cluster section used for the cluster.

See the Cluster Definition.

cluster_template = default

update_check

Whether or not to check for updates to pcluster.

update_check = true

sanity_check

Attempts to validate that resources defined in parameters actually exist.

sanity_check = true

aws

This is the AWS credentials/region section (required). These settings apply to all clusters.

We highly recommend using environment variables, EC2 IAM Roles, or the AWS CLI credential store for your credentials, rather than saving them in the AWS ParallelCluster config file.

[aws]
aws_access_key_id = #your_aws_access_key_id
aws_secret_access_key = #your_secret_access_key

# Defaults to us-east-1 if not defined in environment or below
aws_region_name = #region

aliases

This is the aliases section. Use this section to customize the ssh command.

CFN_USER is set to the default username for the OS. MASTER_IP is set to the IP address of the master instance. ARGS is set to whatever arguments the user provides after pcluster ssh cluster_name.

[aliases]
# This is the aliases section, you can configure
# ssh alias here
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
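
For example, with the default alias above, a command such as the following (the cluster name and key path are illustrative):

pcluster ssh mycluster -i ~/.ssh/mykey.pem

expands to something like:

ssh ec2-user@1.1.1.1 -i ~/.ssh/mykey.pem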

cluster

You can define one or more clusters for different types of jobs or workloads.

Each cluster has its own configuration based on your needs.

The format is [cluster <clustername>].

[cluster default]

key_name

Name of an existing EC2 KeyPair to enable SSH access to the instances.

key_name = mykey

template_url

Overrides the path to the CloudFormation template used to create the cluster.

Defaults to https://s3.amazonaws.com/<aws_region_name>-aws-parallelcluster/templates/aws-parallelcluster-<version>.cfn.json.

template_url = https://s3.amazonaws.com/us-east-1-aws-parallelcluster/templates/aws-parallelcluster.cfn.json

compute_instance_type

The EC2 instance type used for the cluster compute nodes.

If you’re using awsbatch, please refer to the Compute Environment creation in the AWS Batch UI for the list of supported instance types.

Defaults to t2.micro, or to optimal when the scheduler is awsbatch.

compute_instance_type = t2.micro

master_instance_type

The EC2 instance type used for the master node.

This defaults to t2.micro.

master_instance_type = t2.micro

initial_queue_size

The initial number of EC2 instances to launch as compute nodes in the cluster for traditional schedulers.

If you’re using awsbatch, use min_vcpus.

The default is 2.

initial_queue_size = 2

max_queue_size

The maximum number of EC2 instances that can be launched in the cluster for traditional schedulers.

If you’re using awsbatch, use max_vcpus.

This defaults to 10.

max_queue_size = 10

maintain_initial_size

Boolean flag to set autoscaling group to maintain initial size for traditional schedulers.

If you’re using awsbatch, use desired_vcpus.

If set to true, the Auto Scaling group will never have fewer members than the value of initial_queue_size. It will still allow the cluster to scale up to the value of max_queue_size.

Setting to false allows the Auto Scaling group to scale down to 0 members, so resources will not sit idle when they aren’t needed.

Defaults to false.

maintain_initial_size = false
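
Taken together with initial_queue_size and max_queue_size, this pins the Auto Scaling group between a floor and a ceiling. A sketch that always keeps at least 2 nodes and allows up to 10:

[cluster default]
initial_queue_size = 2
max_queue_size = 10
maintain_initial_size = true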

min_vcpus

If scheduler is awsbatch, the compute environment won’t have fewer than min_vcpus.

Defaults to 0.

min_vcpus = 0

desired_vcpus

If scheduler is awsbatch, the compute environment will initially have desired_vcpus.

Defaults to 4.

desired_vcpus = 4

max_vcpus

If scheduler is awsbatch, the compute environment will at most have max_vcpus.

Defaults to 20.

max_vcpus = 20

scheduler

Scheduler to be used with the cluster. Valid options are sge, torque, slurm, or awsbatch.

If you’re using awsbatch, please take a look at the networking setup.

Defaults to sge.

scheduler = sge

cluster_type

Type of cluster to launch, i.e. ondemand or spot.

Defaults to ondemand.

cluster_type = ondemand

spot_price

If cluster_type is set to spot, you can optionally set the maximum spot price for the ComputeFleet on traditional schedulers. If you do not specify a value, you are charged the Spot price, capped at the On-Demand price.

If you’re using awsbatch, use spot_bid_percentage.

See the Spot Bid Advisor for assistance finding a bid price that meets your needs:

spot_price = 1.50

spot_bid_percentage

If you’re using awsbatch as your scheduler, this optional parameter sets the Spot bid as a percentage of the On-Demand price. If not specified, you’ll be charged the current Spot market price, capped at the On-Demand price.

spot_bid_percentage = 85

custom_ami

ID of a custom AMI to use instead of the default published AMIs.

custom_ami = NONE

s3_read_resource

Specify an S3 resource for which AWS ParallelCluster nodes will be granted read-only access.

For example, ‘arn:aws:s3:::my_corporate_bucket/*’ would provide read-only access to all objects in the my_corporate_bucket bucket.

See working with S3 for details on format.

Defaults to NONE.

s3_read_resource = NONE
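
For instance, to grant the read-only access described above (the bucket name is illustrative):

s3_read_resource = arn:aws:s3:::my_corporate_bucket/*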

s3_read_write_resource

Specify an S3 resource for which AWS ParallelCluster nodes will be granted read-write access.

For example, ‘arn:aws:s3:::my_corporate_bucket/Development/*’ would provide read-write access to all objects in the Development folder of the my_corporate_bucket bucket.

See working with S3 for details on format.

Defaults to NONE.

s3_read_write_resource = NONE

pre_install

URL to a preinstall script. This is executed before any of the boot_as_* scripts are run.

This only gets executed on the master node when using awsbatch as your scheduler.

Can be specified in “http://hostname/path/to/script.sh” or “s3://bucketname/path/to/script.sh” format.

Defaults to NONE.

pre_install = NONE

pre_install_args

Quoted list of arguments to be passed to the preinstall script.

Defaults to NONE.

pre_install_args = NONE
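
Putting the two together, a sketch that fetches a script from S3 and passes it two arguments (the bucket, path, and arguments are hypothetical):

pre_install = s3://my-bucket/scripts/pre_install.sh
pre_install_args = "arg1 arg2"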

post_install

URL to a postinstall script. This is executed after any of the boot_as_* scripts are run.

This only gets executed on the master node when using awsbatch as your scheduler.

Can be specified in “http://hostname/path/to/script.sh” or “s3://bucketname/path/to/script.sh” format.

Defaults to NONE.

post_install = NONE

post_install_args

Arguments to be passed to the postinstall script.

Defaults to NONE.

post_install_args = NONE

proxy_server

HTTP(S) proxy server, typically http://x.x.x.x:8080

Defaults to NONE.

proxy_server = NONE

placement_group

Cluster placement group. It can be one of three values: NONE, DYNAMIC, or the name of an existing placement group. When DYNAMIC is set, a unique placement group will be created as part of the cluster and deleted when the cluster is deleted.

This does not apply to awsbatch.

Defaults to NONE. More information on placement groups can be found in the EC2 documentation:

placement_group = NONE

placement

Cluster placement logic. This determines whether the whole cluster or only the compute nodes use the placement group.

Can be cluster or compute.

This does not apply to awsbatch.

Defaults to cluster.

placement = cluster
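
For example, to create a dynamic placement group used only by the compute fleet:

placement_group = DYNAMIC
placement = compute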

ephemeral_dir

If instance store volumes exist, this is the path/mountpoint they will be mounted on.

Defaults to /scratch.

ephemeral_dir = /scratch

shared_dir

Path/mountpoint for the shared EBS volume. Do not use this option when using multiple EBS volumes; provide shared_dir under each EBS section instead.

Defaults to /shared. The example below mounts to /myshared. See EBS Section for details on working with multiple EBS volumes:

shared_dir = myshared

encrypted_ephemeral

Encrypted ephemeral drives. If true, AWS ParallelCluster will generate an ephemeral encryption key in memory and use LUKS encryption to encrypt your instance store volumes. Because the key exists only in memory, the data is non-recoverable after the instance stops.

Defaults to false.

encrypted_ephemeral = false

master_root_volume_size

MasterServer root volume size in GB. (AMI must support growroot)

Defaults to 15.

master_root_volume_size = 15

compute_root_volume_size

ComputeFleet root volume size in GB. (AMI must support growroot)

Defaults to 15.

compute_root_volume_size = 15

base_os

OS type used in the cluster.

Defaults to alinux. Available options are: alinux, centos6, centos7, ubuntu1404 and ubuntu1604.

Note: The base_os determines the username used to log into the cluster:

  • CentOS 6 & 7: centos

  • Ubuntu: ubuntu

  • Amazon Linux: ec2-user

Supported OSes by region are shown below. Note that commercial covers all supported regions such as us-east-1 and us-west-2.

============== ====== ======= ======= ========== ==========
region         alinux centos6 centos7 ubuntu1404 ubuntu1604
============== ====== ======= ======= ========== ==========
commercial     True   True    True    True       True
us-gov-west-1  True   False   False   True       True
us-gov-east-1  True   False   False   True       True
cn-north-1     True   False   False   True       True
cn-northwest-1 True   False   False   False      False
============== ====== ======= ======= ========== ==========

base_os = alinux

ec2_iam_role

The given name of an existing EC2 IAM Role that will be attached to all instances in the cluster. Note that the given name of a role and its Amazon Resource Name (ARN) are different; the latter cannot be used as an argument to ec2_iam_role.

Defaults to NONE.

ec2_iam_role = NONE

extra_json

Extra JSON that will be merged into the dna.json used by Chef.

Defaults to {}.

extra_json = {}
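
For example, one common use is overriding how many scheduler slots each node advertises (the exact JSON keys here are an assumption; check the documentation for the version you are running):

extra_json = { "cluster" : { "cfn_scheduler_slots" : "cores" } }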

additional_cfn_template

An additional CloudFormation template to launch along with the cluster. This allows you to create resources that exist outside of the cluster but are part of the cluster’s life cycle.

Must be an HTTP URL to a public template, with all parameters provided.

Defaults to NONE.

additional_cfn_template = NONE

vpc_settings

Settings section relating to the VPC to be used.

See VPC Section.

vpc_settings = public

ebs_settings

Settings section relating to EBS volume mounted on the master. When using multiple EBS volumes, enter multiple settings as a comma separated list. Up to 5 EBS volumes are supported.

See EBS Section.

ebs_settings = custom1, custom2, ...

scaling_settings

Settings section relating to scaling.

See Scaling Section.

scaling_settings = custom

efs_settings

Settings section relating to the EFS file system.

See EFS Section.

efs_settings = customfs

raid_settings

Settings section relating to RAID drive configuration.

See RAID Section.

raid_settings = rs

tags

Defines tags to be used in CloudFormation.

If command line tags are specified via --tags, they get merged with config tags.

Command line tags overwrite config tags that have the same key.

Tags are JSON formatted and should not have quotes outside the curly braces.

See AWS CloudFormation Resource Tags Type.

tags = {"key" : "value", "key2" : "value2"}

vpc

VPC Configuration Settings:

[vpc public]
vpc_id = vpc-xxxxxx
master_subnet_id = subnet-xxxxxx

vpc_id

ID of the VPC you want to provision the cluster into.

vpc_id = vpc-xxxxxx

master_subnet_id

ID of an existing subnet you want to provision the Master server into.

master_subnet_id = subnet-xxxxxx

ssh_from

CIDR-formatted IP range from which to allow SSH access.

This is only used when AWS ParallelCluster creates the security group.

Defaults to 0.0.0.0/0.

ssh_from = 0.0.0.0/0

additional_sg

Additional VPC security group ID for all instances.

Defaults to NONE.

additional_sg = sg-xxxxxx

compute_subnet_id

ID of an existing subnet you want to provision the compute nodes into.

If the subnet is private, you need to set up a NAT for web access.

compute_subnet_id = subnet-xxxxxx
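
For example, to place the compute fleet in its own (possibly private) subnet while the master stays in another (the IDs are placeholders):

[vpc public-private]
vpc_id = vpc-xxxxxx
master_subnet_id = subnet-xxxxxx
compute_subnet_id = subnet-yyyyyy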

compute_subnet_cidr

If you want AWS ParallelCluster to create a compute subnet, this is the CIDR block that will be used.

compute_subnet_cidr = 10.0.100.0/24

use_public_ips

Defines whether or not to assign public IP addresses to compute EC2 instances.

If true, an Elastic IP will be associated with the Master instance. If false, the Master instance will have a public IP or not according to the value of the “Auto-assign Public IP” subnet configuration parameter.

See networking configuration for some examples.

Defaults to true.

use_public_ips = true

vpc_security_group_id

Use an existing security group for all instances.

Defaults to NONE.

vpc_security_group_id = sg-xxxxxx

ebs

EBS Volume configuration settings for the volumes mounted on the master node and shared via NFS to compute nodes.

[ebs custom1]
shared_dir = vol1
ebs_snapshot_id = snap-xxxxx
volume_type = io1
volume_iops = 200
...

[ebs custom2]
shared_dir = vol2
...

...

shared_dir

Path/mountpoint for the shared EBS volume. Required when using multiple EBS volumes. When using one EBS volume, this option overrides the shared_dir specified under the cluster section. The example below mounts to /vol1:

shared_dir = vol1

ebs_snapshot_id

ID of the EBS snapshot, if using a snapshot as the source for the volume.

Defaults to NONE.

ebs_snapshot_id = snap-xxxxx

volume_type

The API name for the type of volume you wish to launch.

Defaults to gp2.

volume_type = io1

volume_size

Size of volume to be created (if not using a snapshot).

Defaults to 20 GB.

volume_size = 20

volume_iops

Number of IOPS for io1 type volumes.

volume_iops = 200

encrypted

Whether or not the volume should be encrypted (should not be used with snapshots).

Defaults to false.

encrypted = false

ebs_volume_id

EBS Volume Id of an existing volume that will be attached to the MasterServer.

Defaults to NONE.

ebs_volume_id = vol-xxxxxx

scaling

Settings which define how the compute nodes scale.

[scaling custom]
scaledown_idletime = 10

scaledown_idletime

Amount of time in minutes without a job after which the compute node will terminate.

This does not apply to awsbatch.

Defaults to 10.

scaledown_idletime = 10

examples

Let’s say you want to launch a cluster with the awsbatch scheduler and let AWS Batch pick the optimal instance type based on your jobs’ resource needs.

The following allows a maximum of 40 concurrent vCPUs, and scales down to zero when you have no jobs running for 10 minutes.

[global]
update_check = true
sanity_check = true
cluster_template = awsbatch

[aws]
aws_region_name = [your_aws_region]

[cluster awsbatch]
scheduler = awsbatch
compute_instance_type = optimal # optional, defaults to optimal
min_vcpus = 0                   # optional, defaults to 0
desired_vcpus = 0               # optional, defaults to 4
max_vcpus = 40                  # optional, defaults to 20
base_os = alinux                # optional, defaults to alinux, controls the base_os of the master instance and the docker image for the compute fleet
key_name = [your_ec2_keypair]
vpc_settings = public

[vpc public]
master_subnet_id = [your_subnet]
vpc_id = [your_vpc]

EFS

EFS file system configuration settings for the EFS file system mounted on the master and compute nodes via NFSv4.

[efs customfs]
shared_dir = efs
encrypted = false
performance_mode = generalPurpose

shared_dir

Shared directory that the file system will be mounted to on the master and compute nodes.

This parameter is REQUIRED; the EFS section will only be used if this parameter is specified. The example below mounts to /efs. Do not use NONE or /NONE as the shared directory:

shared_dir = efs

encrypted

Whether or not the file system will be encrypted.

Defaults to false.

encrypted = false

performance_mode

Performance Mode of the file system. We recommend generalPurpose performance mode for most file systems. File systems using the maxIO performance mode can scale to higher levels of aggregate throughput and operations per second with a trade-off of slightly higher latencies for most file operations. This can’t be changed after the file system has been created.

Defaults to generalPurpose. Valid Values are generalPurpose | maxIO (case sensitive).

performance_mode = generalPurpose

throughput_mode

The throughput mode for the file system to be created. There are two throughput modes to choose from for your file system: bursting and provisioned.

Valid Values are provisioned | bursting.

throughput_mode = provisioned

provisioned_throughput

The throughput, measured in MiB/s, that you want to provision for a file system that you’re creating. The limit on throughput is 1024 MiB/s. You can get these limits increased by contacting AWS Support.

Valid range: minimum of 0.0. To use this option, throughput_mode must be set to provisioned.

provisioned_throughput = 1024
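
Putting the two together, a sketch of an EFS section using provisioned throughput (the throughput value is illustrative):

[efs customfs]
shared_dir = efs
throughput_mode = provisioned
provisioned_throughput = 200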

efs_fs_id

File system ID of an existing file system. Specifying this option voids all other EFS options except shared_dir. Config sanity will only allow file systems that either have no mount target in the stack’s availability zone, or have an existing mount target in the stack’s availability zone with inbound and outbound NFS traffic allowed from 0.0.0.0/0.

Note: the sanity check for validating efs_fs_id requires the IAM role to have permission for the following actions: efs:DescribeMountTargets, efs:DescribeMountTargetSecurityGroups, ec2:DescribeSubnets, ec2:DescribeSecurityGroups. Please add these permissions to your IAM role, or set sanity_check = false to avoid errors.

CAUTION: having a mount target with inbound and outbound NFS traffic allowed from 0.0.0.0/0 exposes the file system to NFS mount requests from anywhere in the mount target’s availability zone. We recommend not having a mount target in the stack’s availability zone and letting AWS ParallelCluster create the mount target. If you must have a mount target in the stack’s availability zone, consider using a custom security group by providing a vpc_security_group_id option under the vpc section, adding that security group to the mount target, and turning off config sanity to create the cluster.

Defaults to NONE. Needs to be an available EFS file system:

efs_fs_id = fs-12345
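
For example, a minimal section that mounts an existing file system (the file system ID is a placeholder):

[efs customfs]
shared_dir = efs
efs_fs_id = fs-12345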

RAID

RAID drive configuration settings for creating a RAID array from a number of identical EBS volumes. The RAID drive is mounted on the master node and exported to compute nodes via NFS.

[raid rs]
shared_dir = raid
raid_type = 1
num_of_raid_volumes = 2
encrypted = true

shared_dir

Shared directory that the RAID drive will be mounted to on the master and compute nodes.

This parameter is REQUIRED; the RAID drive will only be created if this parameter is specified. The example below mounts to /raid. Do not use NONE or /NONE as the shared directory:

shared_dir = raid

raid_type

RAID type for the RAID array. Only RAID 0 and RAID 1 are currently supported. For more information on RAID types, see: RAID info

This parameter is REQUIRED; the RAID drive will only be created if this parameter is specified. The example below will create a RAID 0 array:

raid_type = 0

num_of_raid_volumes

The number of EBS volumes to assemble the RAID array from. A maximum of 5 volumes and a minimum of 2 are currently supported.

Defaults to 2.

num_of_raid_volumes = 2

volume_type

The type of volume you wish to launch. See: Volume type for details.

Defaults to gp2.

volume_type = io1

volume_size

Size of volume to be created.

Defaults to 20 GB.

volume_size = 20

volume_iops

Number of IOPS for io1 type volumes.

volume_iops = 500

encrypted

Whether or not the file system will be encrypted.

Defaults to false.

encrypted = false
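
Pulling these options together, a sketch of a striped scratch array built from two io1 volumes (the size and IOPS values are illustrative):

[raid rs0]
shared_dir = raid
raid_type = 0
num_of_raid_volumes = 2
volume_type = io1
volume_size = 100
volume_iops = 500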