Spot

Overview

Spot Instances are Amazon Web Services' line of discounted EC2 instance options. They let you pay a substantially lower price for instances. The tradeoff is that the instances are not guaranteed: they may be shut down at any time. The current design of the Deadline Balancer does not allow users to take advantage of these discounted instances. This event plugin tries to replicate the functionality of the Balancer while using Spot Fleet Instances. Spot Fleet is a convenient way to manage many Spot Instances simultaneously. You can learn more about Spot Instances on the AWS Spot Instance Website.

Workflow

Here is an overview of how the Spot event plugin works.

The Spot event plugin triggers after House Cleaning. After House Cleaning, we aggregate the number of queued tasks (the work to be done) for every Deadline Group, taking into account any applicable Limits per Deadline job. Next, we either create or modify a Spot Fleet Request for every Deadline Group that has a Spot Fleet Configuration. Each Spot Fleet Request will ask AWS for a number of Spot Instances equal to the number of queued tasks, up to the maximum capacity set by the user. We use the TargetCapacity value of the Spot Fleet Request Configuration as our user-editable maximum. That means a different maximum can be set for each configured group.
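
The capping rule described above can be sketched in Python. This is a minimal illustration, not the plugin's actual code: the function name `requested_capacity` and the `queued_by_group` input are hypothetical, and the sketch assumes the request is simply the smaller of the queued task count and the configured TargetCapacity.

```python
def requested_capacity(queued_by_group, fleet_configs):
    """Return the number of Spot Instances to ask for, per configured group.

    queued_by_group: hypothetical dict of Deadline Group name -> queued task count.
    fleet_configs:   dict of Deadline Group name -> Spot Fleet Request configuration.
    """
    targets = {}
    # Only groups that have a Spot Fleet Configuration get a request.
    for group, config in fleet_configs.items():
        maximum = int(config.get("TargetCapacity", 0))
        queued = queued_by_group.get(group, 0)
        # Assumption: never ask for more than the user-set maximum.
        targets[group] = min(queued, maximum)
    return targets


configs = {
    "deadline_group": {"TargetCapacity": 10},
    "another_deadline_group": {"TargetCapacity": 4},
}
print(requested_capacity({"deadline_group": 3, "another_deadline_group": 25}, configs))
```

With these inputs, the first group is limited by its 3 queued tasks and the second by its TargetCapacity of 4.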

Once the Spot Fleet Requests have been fulfilled and the Workers have started, they will automatically assign themselves to the Group they were created for. Optionally, you can add a Pools entry, a JSON array of one or more Deadline Pool names, to the Spot Fleet Configuration for a group. The listed Pools are then automatically added to the Worker instances that are started for that group. If configured, the Pool names are added to the Worker in the order they are entered in the Spot Fleet JSON config file.

Finally, once the Workers have been in a state other than Rendering for a certain amount of time, they will shut themselves down. The Worker will mark itself as offline and remove itself from the Worker List. When a Spot Instance is shut down, it is terminated. The Idle Shutdown event plugin setting controls how many minutes an idle Worker waits before it is terminated.

Determining Targets

For every Group you have configured, the Spot Event Plugin will scale the target number of instances based on the state of your farm during House Cleaning. Here are some of the factors it takes into account:

  • State: Only Queued Tasks are considered eligible to work on. Additionally, if a job has a WhiteList then no instances will be started for it.
  • Limits: The number of available Limit stubs you have will constrain the number of instances to start. Unlimited Limits are ignored. Machine Limits are taken into consideration.
  • Concurrent Tasks: If Concurrent Tasks are enabled on a Job and on its Plugin, they will be taken into account too. However, Worker-level Task Limits are not taken into account. That means if you have selected an instance type with a small number of CPUs for a Job with a high number of concurrent tasks, you may not receive the number of instances you are expecting. This is done because we can’t determine which instance types will start when using a Spot Fleet with mixed instance types.
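
As a rough illustration of how these factors could combine for a single Job, consider the sketch below. The `Job` fields and the `instances_for_job` function are hypothetical names, and the arithmetic (dividing queued tasks by concurrent tasks, then capping by available Limit stubs and the Machine Limit) is an assumption about the described behavior rather than the plugin's actual algorithm.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    queued_tasks: int
    has_whitelist: bool = False
    concurrent_tasks: int = 1                    # 1 models Concurrent Tasks not in use
    available_limit_stubs: Optional[int] = None  # None models an unlimited Limit
    machine_limit: Optional[int] = None          # None models no Machine Limit


def instances_for_job(job: Job) -> int:
    """Estimate how many instances one Job could keep busy."""
    # Jobs with a WhiteList never cause instances to start.
    if job.has_whitelist or job.queued_tasks <= 0:
        return 0
    # Assumption: each instance picks up `concurrent_tasks` tasks at once.
    wanted = math.ceil(job.queued_tasks / max(job.concurrent_tasks, 1))
    # Unlimited Limits (None) are ignored; finite ones cap the estimate.
    for cap in (job.available_limit_stubs, job.machine_limit):
        if cap is not None:
            wanted = min(wanted, cap)
    return wanted


print(instances_for_job(Job(queued_tasks=10, concurrent_tasks=4)))
```

Note that the sketch, like the plugin, knows nothing about the CPU count of the instance type that will actually start, which is why a small instance type can leave you with fewer effective task slots than expected.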

Setup

Warning

Ensure the Deadline client software is installed locally on your render node image (AMI) and NOT centrally mounted if you are using Linux.

Warning

Ensure that “House Cleaning Interval” under Configure Repository Options -> “House Cleaning” is NOT set below 30 seconds (default=60 seconds), and that the “Allow Workers to Perform House Cleaning If Pulse is not Running” setting is disabled, as shown below in Configure Repository Options. Consider running at least one other instance of Pulse on another machine for redundancy.

../_images/sep_house_cleaning.png

Note

Deadline’s Resource Tracker introduced in 10.0.27 will also be used to monitor the health of Spot instances started by Deadline. It’s designed to prevent incurring extra costs for instances that aren’t correctly connected to Deadline.

Before you can use the Spot event plugin, you are going to need an AWS account. You will need to get an API Access Key and Secret Key from the IAM Console under Users and Security Credentials. You can find more information about that in this AWS blog entry and the AWS documentation.

To access these settings, enter Super User mode and select Tools -> Configure Events from the Monitor’s menu. From there, select the Spot entry from the list on the left.

../_images/cep_spot.png

The Spot event plugin settings are:

Options

  • State: How this event plug-in should respond to events. If Global, all jobs and Workers will trigger the events for this plugin. If Disabled, no events are triggered for this plugin. Default: Disabled.

    Note

    To cleanly shut down a running Spot Event Plugin environment, disable the Spot Event Plugin via Deadline Monitor, shut down all Pulse instances, and then terminate any Spot Fleet Requests in the AWS EC2 Management Console.

  • Enable Resource Tracker: (Only disable for AMIs with Deadline 10.0.26 or earlier). Use of the Deadline Resource Tracker requires that your custom AMIs have Deadline Client 10.0.27 or later installed on them. If enabled, the Deadline Resource Tracker will help optimize your resources by terminating instances and Spot Fleet requests that don’t appear to be behaving as expected. We recommend upgrading your AMIs and enabling the Deadline Resource Tracker. Default: True.

Security Credentials

  • Use Local Credentials: Whether or not you wish to use your local AWS credentials (found in ~/.aws/credentials (Linux & Mac) or %USERPROFILE%\.aws\credentials (Windows)). If you are using local credentials, you may leave the Access Key ID and Secret Access Key fields blank. We recommend using local credentials.
  • AWS Named Profile: The AWS Named Profile that contains your Spot Credentials. (Required when Use Local Credentials is set to True.)
    • Note: The credential and config files (~/.aws/credentials and ~/.aws/config (Linux & Mac) or %USERPROFILE%\.aws\credentials and %USERPROFILE%\.aws\config (Windows)) must be on the machine that performs house cleaning (which is usually the machine running Pulse).
    • For more information on AWS Named Profiles and how to create them, visit: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html
  • Access Key ID: The AWS Access Key ID for your Spot Credentials. (Required when Use Local Credentials is set to False.)
  • Secret Access Key: The AWS Secret Access Key for your Spot Credentials. (Required when Use Local Credentials is set to False.)
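
Before enabling Use Local Credentials, it can help to confirm that the named profile is actually present on the machine that performs house cleaning. The sketch below is one way to do that with the Python standard library. The function name `profile_exists` is hypothetical, and the sketch assumes the profile stores its keys directly in the credentials file as `aws_access_key_id` and `aws_secret_access_key`; profiles that obtain credentials some other way would be reported as missing.

```python
import configparser
from pathlib import Path


def profile_exists(profile, credentials_path=None):
    """Return True if `profile` has both key entries in the AWS credentials file."""
    # Path.home() resolves to ~ on Linux and Mac and to %USERPROFILE% on Windows.
    path = Path(credentials_path) if credentials_path else Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently skipped, leaving no sections
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section
```

For example, `profile_exists("spot")` run as the account that runs Pulse checks that account's default credentials file for a profile named `spot` (a placeholder name).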

Configuration

  • Logging Level: Different logging levels. Select Verbose for detailed logging about the inner workings of the Spot Event Plugin. Select Debug for all Verbose logs plus additional information on AWS API calls that are used. Default: Standard.

  • Region: The AWS region in which to start the spot fleet requests. Default: eu-west-1.

  • Spot Fleet Request Configurations: A mapping between your Groups and Spot Fleet Requests (SFRs). One request per group formatted as so: { "group_name":{spot_fleet_request}, "2nd_group_name":{spot_fleet_request} }. Spot Fleet Requests (SFRs) are JSON formatted and can be downloaded from the AWS console. Default: {}.

    Warning

    If you modify your JSON configuration after deployment, then ensure any active SFRs are terminated in the AWS console and Pulse is restarted. It is best to only modify the JSON configuration when you are not actively rendering on AWS.

  • Idle Shutdown: Number of minutes that an AWS Worker will wait in a non-rendering state before it is shut down. Default: 10.

  • Delete SEP Terminated Workers: If enabled, AWS Workers terminated by the Deadline Spot Event Plugin will be deleted from the Workers Panel on the next House Cleaning cycle. Warning: Each terminated Worker’s reports will also be deleted, which may be undesirable if you later need to debug a render job issue. Default: False.

  • Delete EC2 Spot Interrupted Workers: If enabled, AWS Workers interrupted by EC2 Spot will be deleted from the Workers Panel on the next House Cleaning cycle. Warning: Each terminated Worker’s reports will also be deleted, which may be undesirable if you later need to debug a render job issue. Default: False.

  • Strict Hard Cap: If enabled, any active instances greater than the target capacity for each group will be terminated. Workers may be terminated even while rendering. Default: False.

  • Maximum Instances Started Per Cycle: The Spot Plugin will request this maximum number of instances per House Cleaning cycle. Default: 50.

  • Pre Job Task Mode: How the Spot Event Plugin handles Pre Job Tasks. Conservative will only start 1 Spot instance for the pre job task and ignore any other tasks for that job until the pre job task is completed. Ignore will not take the pre job task into account when calculating target capacity. Normal will treat the pre job task like a regular job queued task. Default: Conservative.

  • AWS Instance Status: If enabled, the Worker Extra Info X column will be used to display AWS Instance Status if the instance has been marked to be stopped or terminated by EC2 or Spot Event Plugin. All timestamps are displayed in UTC format. Default: Disabled.
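
The three Pre Job Task Mode values can be read as three ways of counting a job's tasks while its pre job task is still pending. The sketch below shows one such reading. The function `tasks_to_count` and its parameters are hypothetical, and `queued_tasks` is assumed to exclude the pre job task itself.

```python
def tasks_to_count(mode, queued_tasks, pre_job_task_pending):
    """Return how many tasks of one job count toward target capacity."""
    if not pre_job_task_pending:
        return queued_tasks
    if mode == "Conservative":
        # Only the pre job task counts until it has completed.
        return 1
    if mode == "Ignore":
        # The pre job task is left out of the calculation entirely.
        return queued_tasks
    if mode == "Normal":
        # The pre job task counts like any other queued task.
        return queued_tasks + 1
    raise ValueError(f"Unknown Pre Job Task Mode: {mode}")


for mode in ("Conservative", "Ignore", "Normal"):
    print(mode, tasks_to_count(mode, queued_tasks=20, pre_job_task_pending=True))
```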

Spot Fleet Configuration

The Spot Fleet Configuration setting is a JSON dictionary. It represents a one-to-one mapping between Deadline Groups and Spot Fleet Configurations.

The dictionary has Deadline Group names as keys and Spot Fleet Configurations as values. To download a Spot Fleet Configuration, start by creating a new Spot Fleet Request.

../_images/events_spot_start.png

Warning

The Spot Event Plugin only supports ‘Maintain’ Spot Fleets. Be sure to enable that feature, otherwise the fleets will not be able to scale correctly.

Warning

As of Deadline 10.1.8, our default Spot Credentials IAM policy only allows IAM instance profiles with IAM role names that begin with DeadlineSpot.

At the bottom of the “Request Spot Instances” page, there is a button to download the Configuration as JSON:

../_images/events_spot_download_json_config.png

Copy everything in the downloaded file and place it into the Spot Fleet Configuration for the desired Deadline Group.

Here’s an example with two Deadline Groups, called: deadline_group and another_deadline_group. Each Group also has an [optional] Deadline Pools list added:

{
    "deadline_group":{
        "Pools": [
            "pool1",
            "pool2"
        ],
        "IamFleetRole": "arn:aws:iam::357432474442:role/aws-ec2-spot-fleet-tagging-role",
        "AllocationStrategy": "diversified",
        "TargetCapacity": 0,
        "SpotPrice": "0.105",
        "ValidFrom": "2016-12-13T16:48:12Z",
        "ValidUntil": "2017-12-13T16:48:12Z",
        "TerminateInstancesWithExpiration": true,
        "LaunchSpecifications": [
        {
            "ImageId": "ami-b04s92d0",
            "InstanceType": "c3.large",
            "KeyName": "key_pair",
            "SpotPrice": "0.105",
            "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/xvda",
                "Ebs": {
                    "DeleteOnTermination": true,
                    "VolumeType": "gp2",
                    "VolumeSize": 8,
                    "SnapshotId": "snap-c87f35ec"
                }
            }
            ],
            "NetworkInterfaces": [
            {
                "DeviceIndex": 0,
                "SubnetId": "subnet-3efcba4b",
                "DeleteOnTermination": true,
                "Groups": [
                   "sg-2d623a54"
                ],
                "AssociatePublicIpAddress": true
            }
            ]
        }
        ],
        "Type": "maintain"
    },
    "another_deadline_group":{
        "Pools": [
            "pool2",
            "pool1"
        ],
        "IamFleetRole": "arn:aws:iam::357466224442:role/aws-ec2-spot-fleet-tagging-role",
        "AllocationStrategy": "diversified",
        "TargetCapacity": 0,
        "SpotPrice": "0.133",
        "ValidFrom": "2016-12-15T16:47:06Z",
        "ValidUntil": "2017-12-15T16:47:06Z",
        "TerminateInstancesWithExpiration": true,
        "LaunchSpecifications": [
        {
            "ImageId": "ami-d722f0b7",
            "InstanceType": "m3.medium",
            "SpotPrice": "0.067",
            "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sda1",
                "Ebs": {
                    "DeleteOnTermination": true,
                    "VolumeType": "gp2",
                    "VolumeSize": 8,
                    "SnapshotId": "snap-42713105"
                }
            }
            ],
            "SecurityGroups": [
            {
                "GroupId": "sg-06b82060"
            },
            {
                "GroupId": "sg-5058c026"
            }
            ],
            "SubnetId": "subnet-2040e466"
        },
        {
            "ImageId": "ami-d722f0b7",
            "InstanceType": "m4.large",
            "SpotPrice": "0.12",
            "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sda1",
                "Ebs": {
                    "DeleteOnTermination": true,
                    "VolumeType": "gp2",
                    "VolumeSize": 8,
                    "SnapshotId": "snap-47113105"
                }
            }
            ],
            "SecurityGroups": [
            {
                "GroupId": "sg-06b44060"
            },
            {
                "GroupId": "sg-5238c036"
            }
            ],
            "SubnetId": "subnet-1010e466"
        },
        {
            "ImageId": "ami-d722b0b7",
            "InstanceType": "m3.large",
            "SpotPrice": "0.133",
            "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sda1",
                "Ebs": {
                    "DeleteOnTermination": true,
                    "VolumeType": "gp2",
                    "VolumeSize": 8,
                    "SnapshotId": "snap-47743105"
                }
            }
            ],
            "SecurityGroups": [
            {
                "GroupId": "sg-06b84063"
            },
            {
                "GroupId": "sg-5058c016"
            }
            ],
            "SubnetId": "subnet-1042e466"
        }
        ],
    "Type": "maintain"
    }
}
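
Because the whole setting is pasted in as a single JSON string, a quick check before saving can catch common mistakes. The following is a minimal sketch, assuming only the rules stated in this section: the top level is keyed by Deadline Group name, each request must be of type "maintain", and the optional Pools entry is a list of names. The function name `check_spot_fleet_config` is hypothetical, and the sketch does not validate anything against AWS.

```python
import json


def check_spot_fleet_config(text):
    """Return a list of problems found in a Spot Fleet Configuration string."""
    config = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(config, dict):
        return ["top level must be a JSON object keyed by Deadline Group name"]
    problems = []
    for group, request in config.items():
        if not isinstance(request, dict):
            problems.append(f"{group}: value must be a JSON object")
            continue
        # The Spot Event Plugin only supports 'maintain' Spot Fleets.
        if request.get("Type") != "maintain":
            problems.append(f"{group}: Type must be 'maintain'")
        # Pools is optional; when present it is an ordered list of Pool names.
        pools = request.get("Pools", [])
        if not isinstance(pools, list) or not all(isinstance(p, str) for p in pools):
            problems.append(f"{group}: Pools must be a list of Pool names")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that AWS will accept the request.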

Wildcards

As of Deadline 10.0.27, the Spot Event Plugin supports wildcards in group names in the configuration. This allows you to use the same Spot Fleet Request configuration for many groups matching the same pattern. For example, to use the same configuration for any group starting with “spot_”, you can set up a configuration like the one below:

{
        "spot_*":{
                "Pools": [
                        "pool1",
                        "pool2"
                ],
                "IamFleetRole": "arn:aws:iam::357432474442:role/aws-ec2-spot-fleet-tagging-role",
                "AllocationStrategy": "diversified",
                "TargetCapacity": 0,
                "SpotPrice": "0.105",
                "ValidFrom": "2016-12-13T16:48:12Z",
                "ValidUntil": "2017-12-13T16:48:12Z",
                "TerminateInstancesWithExpiration": true,
                "LaunchSpecifications": [
                {
                        "ImageId": "ami-b04s92d0",
                        "InstanceType": "c3.large",
                        "KeyName": "key_pair",
                        "SpotPrice": "0.105",
                        "BlockDeviceMappings": [
                        {
                                "DeviceName": "/dev/xvda",
                                "Ebs": {
                                        "DeleteOnTermination": true,
                                        "VolumeType": "gp2",
                                        "VolumeSize": 8,
                                        "SnapshotId": "snap-c87f35ec"
                                }
                        }
                        ],
                        "NetworkInterfaces": [
                        {
                                "DeviceIndex": 0,
                                "SubnetId": "subnet-3efcba4b",
                                "DeleteOnTermination": true,
                                "Groups": [
                                   "sg-2d623a54"
                                ],
                                "AssociatePublicIpAddress": true
                        }
                        ]
                }
                ],
                "Type": "maintain"
        }
}

When using wildcards in group names, consider the following caveats:

  • If a Deadline group matches two wildcard groups, an error will be thrown. No Spot Fleets will be started for that group.
  • If a Deadline group is explicitly configured in the SEP configuration, but also matches a wildcard group, the Spot Event Plugin will use the Spot Fleet Request configuration for the group that is configured explicitly.
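
These two caveats amount to a small resolution rule, sketched below. The function `resolve_config_key` is a hypothetical name, and the use of shell-style, case-sensitive matching from Python's `fnmatch` module is an assumption; the plugin's exact matching rules may differ.

```python
from fnmatch import fnmatchcase


def resolve_config_key(group, config_keys):
    """Pick the configuration key that applies to a Deadline Group, or None."""
    if group in config_keys:
        # An explicitly configured group wins over any wildcard entry.
        return group
    matches = [key for key in config_keys if "*" in key and fnmatchcase(group, key)]
    if len(matches) > 1:
        # Ambiguous: no Spot Fleet would be started for this group.
        raise ValueError(f"Group '{group}' matches several wildcard entries: {matches}")
    return matches[0] if matches else None


keys = ["spot_*", "spot_gpu"]
print(resolve_config_key("spot_cpu", keys))  # matched by the wildcard entry
print(resolve_config_key("spot_gpu", keys))  # matched by the explicit entry
```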

IAM Policies

IAM Policies are crucial to ensure the security and permissions of the identities and resources in your AWS account. The Spot event plugin requires four different IAM roles to function: one for the user whose access key and secret key you’ve entered (in the Spot Event Plugin Configure Events…Spot…Login), one for the Worker instances (defined in the IAM instance profile in a Spot Fleet Request), one for the Resource Tracker, and one for the Spot Fleet Request itself (defined in the IAM fleet role in a Spot Fleet Request). Click Here for more information about IAM Policies.

Spot Credentials

The IAM user whose credentials have been entered will be responsible for starting and stopping Spot Fleet Requests via your primary Pulse.

../_images/events_spot_iam_user_permissions.png

You can grant the required permissions by attaching the following AWS managed IAM policies to your IAM user:

  • AWSThinkboxDeadlineSpotEventPluginAdminPolicy
  • AWSThinkboxDeadlineResourceTrackerAdminPolicy

Warning

In Deadline 10.1.8 and later, our AWSThinkboxDeadlineSpotEventPluginAdminPolicy includes an iam:PassRole permission that only allows IAM Roles that have a name beginning with DeadlineSpot in your IAM Instance Profile.

Note

For further security, consider modifying these policies with an IP address condition on each policy statement. Place this text after the “Resource” entry in each of the statements in the policies.

"Condition": {
    "IpAddress" : {
        "aws:SourceIp" : ["<your_public_ip_address>"]
    }
}

This way, only API calls from your IP address will be accepted by AWS.
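
If you keep your own copy of a policy document as JSON, the condition can be added to every statement programmatically rather than by hand. The sketch below is one way to do so. The function name `add_source_ip_condition` is hypothetical, and the sample policy and IP address are placeholders rather than the contents of the managed policies.

```python
import copy
import json


def add_source_ip_condition(policy, ip_addresses):
    """Return a copy of a policy document with an IpAddress condition on every statement."""
    restricted = copy.deepcopy(policy)  # leave the caller's document untouched
    statements = restricted["Statement"]
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]
        restricted["Statement"] = statements
    for statement in statements:
        condition = statement.setdefault("Condition", {})
        condition.setdefault("IpAddress", {})["aws:SourceIp"] = list(ip_addresses)
    return restricted


# Placeholder policy used only to demonstrate the transformation.
sample = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "ec2:DescribeInstances", "Resource": "*"}],
}
print(json.dumps(add_source_ip_condition(sample, ["203.0.113.10"]), indent=4))
```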

Creating the Spot Credentials IAM User

To create the Spot Credentials IAM user:

  1. Sign in to the AWS Console.

  2. Click on Services at the top of the AWS Console, and then click on IAM under “Security, Identity and Compliance”.

  3. On the “IAM” panel that appears, select Users, and then click the Add user button.

    ../_images/events_spot_iam_user_creation_1.png
  4. In the “Add user” panel that appears:

    • For the User name, enter DeadlineSpotEventPluginAdmin (or any name of your choosing).
    • For Access type, choose Programmatic access.
  5. Click the Next: Permissions button to continue.

    ../_images/events_spot_iam_user_creation_2.png
  6. Search for the AWSThinkboxDeadlineSpotEventPluginAdminPolicy and AWSThinkboxDeadlineResourceTrackerAdminPolicy policies in the search box, and check both of them in the list below.

  7. Click the Next: Tags button to continue.

    ../_images/events_spot_iam_user_creation_3.png
  8. You can add tags with additional data to the user, but they are not required.

  9. Click the Next: Review button to continue.

    ../_images/events_spot_iam_user_creation_4.png
  10. Verify that the information is correct, and click Create user.

    ../_images/events_spot_iam_user_creation_5.png
  11. Once the user is created you’ll see a confirmation screen. Here you can get your Access key and Secret access key, which you’ll need to enter in the “Security Credentials” field of the Configure Event Plugin dialog.

Warning

This secret access key can be used to access your AWS account. It’s important that you keep it stored securely. Please refer to the AWS Guidelines for standard best practices for management of AWS access keys.

IAM Instance Profile

The Worker instance needs to determine which Spot Fleet Request it came from, and to report its status to the Resource Tracker.

../_images/events_spot_iam_worker_statement.png

You can assign the required permissions by attaching the following AWS-managed IAM policy to an IAM role:

  • AWSThinkboxDeadlineSpotEventPluginWorkerPolicy

You must choose a Role name that begins with “DeadlineSpot”, such as DeadlineSpotWorker, in order to work with the permissions in our Spot Event Plugin Admin IAM Policy.

Warning

If you use our default IAM policy for your Spot Credentials, then the IAM role that you attach to your instance profile must have a name beginning with DeadlineSpot.

Creating the Instance IAM Role

To create the IAM role for your IAM instance profile:

  1. Sign in to the AWS Console.

  2. Click on Services at the top of the AWS Console, and then click on IAM under “Security, Identity and Compliance”.

  3. On the “IAM” panel that appears, click Roles, and then click the Create role button.

    ../_images/events_spot_instance_role_creation_1.png
  4. In the “Create role” panel that appears, choose the following:

    • Under Select type of trusted entity, choose AWS service.
    • Under Choose a use case, choose EC2.
  5. Click the Next: Permissions button to continue.

    ../_images/events_spot_instance_role_creation_2.png
  6. Search for AWSThinkboxDeadlineSpotEventPluginWorkerPolicy in the search box, and check it in the list below.

  7. Click the Next: Tags button to continue.

    ../_images/events_spot_instance_role_creation_3.png
  8. You can add tags with additional data to the role, but they are not required.

  9. Click the Next: Review button to continue.

    ../_images/events_spot_instance_role_creation_4.png
  10. In the Role name field, enter DeadlineSpotWorker. You can choose a different name for this role; however, the name must begin with “DeadlineSpot” in order to work with our default Spot Credentials IAM policies.

  11. Verify that the information is correct, and click Create role.

  12. Choose this role as the “IAM instance profile” in your Spot Fleet Configuration.

Resource Tracker Role

The Deadline Resource Tracker requires an IAM Role in order to access AWS resources in your account. Learn more about how to create the required IAM role.

IAM Fleet Role

The Spot Fleet Request itself needs an IAM role too.

../_images/events_spot_iam_fleet_role.png

By default, the IAM Fleet Role is set to aws-ec2-spot-fleet-tagging-role. Leave this setting as is.

VPC Endpoints

If one or more of your EC2 instances are running inside a private subnet in your VPC, then you will need to provide a mechanism for those instances to be able to access the AWS web service endpoints used in the above IAM policies. A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Instances in your VPC do not require public IP addresses to communicate with resources in the service. Traffic between your VPC and the other service does not leave the Amazon network. AWS typically provides an endpoint per service/region.

To create a VPC endpoint for an AWS service, follow the procedure for creating an Interface or Gateway Endpoint. The Spot Event Plugin requires the following endpoints:

  • HouseCleaningStatement:

    com.amazonaws.<region>.cloudformation (interface)
    com.amazonaws.<region>.dynamodb (gateway)
    com.amazonaws.<region>.ec2 (interface)
    com.amazonaws.<region>.events (interface)
    com.amazonaws.<region>.s3 (gateway)
    com.amazonaws.<region>.sqs (interface)
    com.amazonaws.<region>.sts (interface)
    
  • WorkerStatement:

    com.amazonaws.<region>.ec2 (interface)
    com.amazonaws.<region>.sqs (interface)
    

where <region> represents the region identifier for an AWS region, such as eu-west-2 for the EU (London) Region.
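
To avoid typing each service name by hand, the two lists can be expanded for a given region with a few lines of Python. This is a convenience sketch only: the constant and function names are hypothetical, and the service suffixes and endpoint types are copied from the lists above.

```python
# Service suffixes and endpoint types copied from the two lists above.
HOUSE_CLEANING_ENDPOINTS = {
    "cloudformation": "interface",
    "dynamodb": "gateway",
    "ec2": "interface",
    "events": "interface",
    "s3": "gateway",
    "sqs": "interface",
    "sts": "interface",
}
WORKER_ENDPOINTS = {"ec2": "interface", "sqs": "interface"}


def endpoint_service_names(region, services):
    """Return (service name, endpoint type) pairs for one AWS region."""
    return [(f"com.amazonaws.{region}.{name}", kind) for name, kind in services.items()]


for service_name, kind in endpoint_service_names("eu-west-2", WORKER_ENDPOINTS):
    print(service_name, kind)
```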

Warning

The HouseCleaningStatement, which is run by the Remote Connection Server or Pulse, also depends on access to the following public facing AWS API endpoints:

  • iam.amazonaws.com, for which no VPC endpoint is available
  • sts.amazonaws.com

You can grant the required access by using a NAT gateway.

FAQ

Can the Spot Event Plugin be used in combination with the Deadline Balancer?

No. The Deadline Balancer and Spot Event plugin can technically be run at the same time but they do not complement each other. They will create Workers for the same tasks.

Does the AWS Portal work with this Spot Event Plugin?

No. Both components can technically be run at the same time but they do not complement each other.

Can the Spot Event Plugin be used in combination with the Amazon Cloud Plugin?

No. The Amazon Cloud Plugin starts EC2 On-Demand/Reserved instances whereas the Spot Event Plugin uses the heavily discounted EC2 Spot instances for compute.

Does the Spot Event Plugin take Deadline ‘Pools’ into account when calculating number of EC2 Spot Instances to request?

No. You should use Deadline Groups to configure which Deadline jobs are run on Spot instances in AWS EC2. However, you can configure an ordered list of Pool names (the Pools JSON array in the Spot Fleet Configuration) to be assigned to all the Deadline Workers that are started on EC2 Spot for the relevant Deadline Group.

Can I use Limits to control the number of Spot instances provisioned by the Spot Event Plugin?

Yes. Deadline Resource and License Limits at the app plugin and job level are taken into account when calculating the number of Spot instances to be requested per Deadline Group configured in your Spot Fleet JSON configuration file.

Do I need Deadline licenses (floating or Usage-based) to render with Spot Event Plugin?

No! Deadline does not require any licenses for it to run while on AWS. You will still need licenses for your rendering software if it requires them.