Factory Plans
Simple local plan
The plan below is the simplest factory. It defines the two bots from a PyPI source and a single stage called test with a local connection.
# simple-local-plan.yml
factory:
  name: qalx-orcaflex-local
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot

  stages:
    test:
      local_connection:
        type: local
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 2
          batch_bot:
            queue-name: test-batch-queue
            processes: 1
Build this factory with the command below:
qalx factory-build --plan simple-local-plan.yml --stage test
This starts sim_bot in two processes and batch_bot in one process. Each bot reads from the queue named by its queue-name key.
Local multiple stage plan
The plan below adds an additional stage to the plan above.
# local-multiple-stage-plan.yml
factory:
  name: qalx-orcaflex-local
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot

  stages:
    test:
      local_connection:
        type: local
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 2
          batch_bot:
            queue-name: test-batch-queue
            processes: 1
    production:
      local_connection:
        type: local
        bots:
          sim_bot:
            queue-name: sim-queue
            processes: 8
          batch_bot:
            queue-name: batch-queue
            processes: 1
Build the production stage of this factory with the command below:
qalx factory-build --plan local-multiple-stage-plan.yml --stage production
This now starts sim_bot in eight processes. Both bots also read from different queues (sim-queue and batch-queue) than in the test stage.
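To make the effect of the --stage option concrete, the sketch below writes the stages of the plan above as a plain Python dict and lists what each stage would start. It is only an illustration: summarise_stage is a hypothetical helper and not part of qalx, and the nesting (stage, then connection, then bots) is assumed from the plan layout.

```python
# Stages copied from local-multiple-stage-plan.yml, written as a plain dict.
# The bot definitions are omitted because they are the same for every stage.
stages = {
    "test": {
        "local_connection": {
            "type": "local",
            "bots": {
                "sim_bot": {"queue-name": "test-sim-queue", "processes": 2},
                "batch_bot": {"queue-name": "test-batch-queue", "processes": 1},
            },
        }
    },
    "production": {
        "local_connection": {
            "type": "local",
            "bots": {
                "sim_bot": {"queue-name": "sim-queue", "processes": 8},
                "batch_bot": {"queue-name": "batch-queue", "processes": 1},
            },
        }
    },
}


def summarise_stage(stages, stage_name):
    """Hypothetical helper: list (bot, queue name, processes) for one stage."""
    rows = []
    for connection in stages[stage_name].values():
        for bot_name, settings in connection["bots"].items():
            rows.append((bot_name, settings["queue-name"], settings["processes"]))
    return rows


for stage_name in stages:
    print(stage_name, summarise_stage(stages, stage_name))
```

Only the stage named on the command line is built, so the test and production settings can live side by side in one plan file.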
Remote plan
The plan below shows a stage which will deploy the bots to a server on AWS.
# remote-aws-plan.yml
factory:
  name: qalx-orcaflex-remote
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot

  stages:
    test:
      remote_on_aws:
        type: aws
        parameters:
          ImageId: ami-1a2bc3d4e5f6g7h8i
          InstanceType: m5.12xlarge
          # key-pair is needed if you want to connect to the remote server
          KeyName: my-key-pair
          NetworkInterfaces:
            - DeviceIndex: 0
              AssociatePublicIpAddress: true
              # subnet needs to exist and have access to the internet
              SubnetId: subnet-11aa22b
              # security group needs to exist and at least allow outbound traffic
              GroupSet:
                - sg-q9w8e7r6t5yg2g1h0
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 2
          batch_bot:
            queue-name: test-batch-queue
            processes: 1
The key parameters are:
ImageId: the image needs to be a pyqalx base image (contact us if you don’t have access to these) with OrcaFlex installed and with access to a licence.
InstanceType: the type of server that the bots will be deployed to.
KeyName: if you want to be able to remote desktop to the servers you need to specify an existing key pair name.
NetworkInterfaces: the server will need to be placed in a subnet with access to the internet and the licence server. It will also need a security group that allows this and RDP access if required.
Build this stage with the command below:
qalx factory-build --plan remote-aws-plan.yml --stage test
Multiple clusters
This plan is a more complete example. It allows you to run bots locally during development, deploy them to a small cloud server for testing, and then choose between production clusters in three different sizes.
# full-cluster-plan.yml
factory:
  name: qalx-orcaflex-cluster
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot

  stages:
    dev: # this is the stage to use when you are developing your automation locally
      development:
        type: local
        bots:
          sim_bot:
            queue-name: dev-sim-queue
            processes: 2
          batch_bot:
            queue-name: dev-batch-queue
            processes: 1
    test: # this is the stage to use when you are testing small batches and remote processing
      remote_on_aws:
        type: aws
        parameters:
          ImageId: ami-1a2bc3d4e5f6g7h8i
          InstanceType: c6i.2xlarge
          KeyName: my-key-pair
          NetworkInterfaces: &network_interface
            - DeviceIndex: 0
              AssociatePublicIpAddress: true
              SubnetId: subnet-11aa22b
              GroupSet:
                - sg-q9w8e7r6t5yg2g1h0
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 8
          batch_bot:
            queue-name: test-batch-queue
            processes: 1

    cluster_small: # this is a small production cluster with 1 large 128 core server
      node_1: &cluster_node
        type: aws
        parameters:
          ImageId: ami-1a2bc3d4e5f6g7h8i
          InstanceType: c6i.32xlarge
          KeyName: my-key-pair
          NetworkInterfaces: *network_interface
        bots:
          sim_bot:
            queue-name: prod-sim-queue
            processes: 128
          batch_bot:
            queue-name: prod-batch-queue
            processes: 1

    cluster_medium: # this cluster has two large servers
      node_1: *cluster_node
      node_2: *cluster_node

    cluster_large: # this cluster has 5 large servers
      node_1: *cluster_node
      node_2: *cluster_node
      node_3: *cluster_node
      node_4: *cluster_node
      node_5: *cluster_node
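This plan uses YAML anchors and aliases to avoid repetition. The anchors &network_interface and &cluster_node are defined on the NetworkInterfaces key of the test stage and on node_1 of cluster_small. They are then re-used through the aliases *network_interface and *cluster_node lower down the plan.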
Calling the command:
qalx factory-build --plan full-cluster-plan.yml --stage cluster_large
will start five servers in AWS with 128 cores each.
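Because every node is an alias of the same anchored definition, each node carries the full bot configuration of node_1. The sketch below mimics this with a plain Python dict that is reused for every node and totals the processes per cluster. It is an illustration under that assumption and does not call qalx; total_processes is a hypothetical helper.

```python
# One node definition reused for every node, as the *cluster_node alias does in the plan.
# The AWS parameters are left out because they do not affect the process counts.
cluster_node = {
    "type": "aws",
    "bots": {
        "sim_bot": {"queue-name": "prod-sim-queue", "processes": 128},
        "batch_bot": {"queue-name": "prod-batch-queue", "processes": 1},
    },
}

stages = {
    "cluster_small": {"node_1": cluster_node},
    "cluster_medium": {"node_1": cluster_node, "node_2": cluster_node},
    "cluster_large": {f"node_{i}": cluster_node for i in range(1, 6)},
}


def total_processes(stage, bot_name):
    """Hypothetical helper: sum the processes of one bot over all nodes in a stage."""
    return sum(node["bots"][bot_name]["processes"] for node in stage.values())


for stage_name, stage in stages.items():
    print(
        stage_name,
        "servers:", len(stage),
        "sim_bot processes:", total_processes(stage, "sim_bot"),
        "batch_bot processes:", total_processes(stage, "batch_bot"),
    )
```

Under this reading cluster_large runs 640 sim_bot processes, and it also runs one batch_bot process on each of its five servers, since the alias copies the whole node definition.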
Terminating servers
You can _demolish_ a factory using the command:
qalx factory-demolish --name my-factory-name
Where my-factory-name is the factory name, given by the name key on line 3 of each plan above.
This command will gracefully terminate the bots on each server and then terminate the server. Each bot should finish the
job it was doing before the server terminates. This is great when you know that your batches are all finished, but if
you are running batches overnight you might have some pricey servers sitting idle until someone checks that they are
finished.
The plan below shows how you can use CloudWatch alarms to ensure that each server will terminate (and stop costing money) once it has been idle for a certain period.
# remote-aws-alarm-plan.yml
factory:
  name: qalx-orcaflex-remote
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot

  stages:
    test:
      remote_on_aws:
        type: aws
        alarm: terminate_on_idle_2pc
        parameters:
          ImageId: ami-1a2bc3d4e5f6g7h8i
          InstanceType: m5.12xlarge
          # key-pair is needed if you want to connect to the remote server
          KeyName: my-key-pair
          NetworkInterfaces:
            - DeviceIndex: 0
              AssociatePublicIpAddress: true
              # subnet needs to exist and have access to the internet
              SubnetId: subnet-11aa22b
              # security group needs to exist and at least allow outbound traffic
              GroupSet:
                - sg-q9w8e7r6t5yg2g1h0
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 2
          batch_bot:
            queue-name: test-batch-queue
            processes: 1

  alarms:
    terminate_on_idle_2pc:
      MetricName: CPUUtilization
      Statistic: Average
      Period: 60 # seconds
      ComparisonOperator: LessThanThreshold
      EvaluationPeriods: 10
      Threshold: '2'
      Namespace: AWS/EC2
      AlarmActions:
        - terminate
In this plan, the server is terminated once its average CPU utilisation has stayed below 2% for ten consecutive 60-second periods, that is, for ten minutes.
Attention
The CPU levels that define an “idle” state will vary based on the server size and the jobs the bots are doing. You should be careful not to define an alarm that will terminate a server that is finishing off one final long-running job. These alarms do not wait for jobs to finish before terminating the server.
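The alarm condition can be sketched in a few lines of Python to see when it would be met. This is a simplified model and not how CloudWatch is implemented: it assumes one datapoint per 60-second period with no missing data, and first_alarm_period is a hypothetical helper. The second example assumes a single remaining job that keeps one core of a 128-core server fully busy.

```python
def first_alarm_period(cpu_averages, threshold=2.0, evaluation_periods=10):
    """Return the 1-based period in which the alarm condition is first met, else None.

    cpu_averages holds one average CPU percentage per 60-second period.
    The condition is met after `evaluation_periods` consecutive values below `threshold`.
    """
    consecutive = 0
    for period, cpu in enumerate(cpu_averages, start=1):
        # Count consecutive periods below the threshold; any busy period resets the count.
        consecutive = consecutive + 1 if cpu < threshold else 0
        if consecutive >= evaluation_periods:
            return period
    return None


# 30 busy minutes followed by an idle server.
print(first_alarm_period([95.0] * 30 + [0.5] * 15))

# Assumption: one last job keeps a single core of a 128-core server fully busy.
one_core_busy = 100.0 / 128  # roughly 0.78 %, which is below the 2 % threshold
print(first_alarm_period([95.0] * 30 + [one_core_busy] * 15))
```

Both calls print 40. In this model the server with one job still running looks just as idle as the empty one, which is the situation the note above warns about.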
Workflows
Often when a batch completes you will want to process the information in the batch to create plots or populate a report.
The plan below shows how to use the workflows feature of factories to send jobs from batch_bot on to a plot_bot.
# simple-workflow-plan.yml
factory:
  name: qalx-orcaflex-local
  bots:
    batch_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:batch_bot
    sim_bot:
      source:
        pypi: qalx_orcaflex
      bot_path: qalx_orcaflex.bots:sim_bot
    plot_bot:
      source:
        path: examples/bots
      bot_path: plots:plot_bot

  stages:
    test:
      workflow: plot_flow
      local_connection:
        type: local
        bots:
          sim_bot:
            queue-name: test-sim-queue
            processes: 2
          batch_bot:
            queue-name: test-batch-queue
            processes: 1
          plot_bot:
            queue-name: test-plot-queue
            processes: 1

  workflows:
    plot_flow:
      - batch_bot:
          plot_bot
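One way to read the plot_flow entry is as a mapping from a bot to the bot that receives its jobs next. The sketch below shows that reading with the workflow written as a plain Python structure. It is an assumption about how the entry is interpreted, not qalx code, and downstream_bots is a hypothetical helper.

```python
# The workflows section of the plan above, written as a plain Python structure.
workflows = {"plot_flow": [{"batch_bot": "plot_bot"}]}


def downstream_bots(workflow, bot_name):
    """Hypothetical helper: list the bots that receive jobs after bot_name."""
    targets = []
    for step in workflow:
        value = step.get(bot_name)
        if value is None:
            continue
        # Accept a single bot name or a list of bot names.
        targets.extend([value] if isinstance(value, str) else value)
    return targets


print(downstream_bots(workflows["plot_flow"], "batch_bot"))
print(downstream_bots(workflows["plot_flow"], "sim_bot"))
```

The first call lists plot_bot as the bot after batch_bot. The second returns an empty list because sim_bot has no entry in the workflow.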
More details on creating custom bots can be found in Custom Bots.