Building Batches
While it is possible to send individual simulations to a Sim Bot, the best way to run simulations is to build them into batches.
To do this, you create an OrcaFlexBatch and then add data files to it with the add method:
import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, DirectorySource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",  # queue our batch bot will look at for jobs
    sim_queue="sim bot",  # queue our sim bot will look at for jobs
)
with OrcaFlexBatch(name="My Batch", session=qfx, batch_options=batch_options) as batch:
batch.add(source=DirectorySource(r"S:\Project123\OFX\Batch1"))
The reason for the with block is explained in Context manager below, but it is important to know that once the code has successfully exited the with block, the batch has been built and submitted.
Sources
There are various options for the source argument:
DirectorySource
Data files will be submitted for every OrcaFlex file in a directory.
The following code will include all .dat files in the Batch1 folder:
from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1")
The following code will include all .dat, .yml and .yaml files in Batch1:
Warning
qalx_orcaflex
does not support the concept of BaseFile
in text data files (yet)
from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1", include_yaml=True)
The following code will include all .dat files in Batch1 and all the subdirectories of Batch1:
from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1", recursive=True)
ModelSource
Saves the model data from an OrcFxAPI.Model
instance. You need to give the model a name:
from qalx_orcaflex.core import ModelSource
import OrcFxAPI as ofx
model = ofx.Model()
my_files = ModelSource(model, "My Model")
ModelSource will save the data in the model instance at the time it is created; any future changes to the OrcFxAPI.Model instance will not be reflected.
This means you can happily use the same instance for all your load cases without worrying about loading the base model from disk each time.
from qalx_orcaflex.core import ModelSource
import OrcFxAPI as ofx
model = ofx.Model(r"C:\MY_MASSIVE_MODEL.dat")
model['My Line'].Length[0] = 100
my_100_model_source = ModelSource(model, "Model with l=100")
model['My Line'].Length[0] = 200
my_200_model_source = ModelSource(model, "Model with l=200")
The sources above will have different line lengths even though the line in the model
variable
has a length of 200.
Note
There is no need to add “.dat” to the name of the model. This will be added automatically.
FileSource
This will add a single file from a path.
from qalx_orcaflex.core import FileSource
my_files = FileSource(path=r"S:\Project 123\OFX\Tests\MY_TEST_MODEL.dat")
DataFileItemsSource
This is for when you already have data files in qalx and you want to re-run them for some reason.
Coming soon: this source will become more useful when we implement “model deltas”, which will allow you to load a base model from qalx and specify all the load case details as they are added to the batch.
Batches with restarts
Batches which contain models that restart from another simulation should “just work”. That is, those models should run in the same way as every other model in the batch and the results will be processed in the same way. However, there are a few things that you should understand about how qalx-OrcaFlex processes these to ensure you do not suffer unexpected behaviours.
Use OrcFxAPI on build
By default, qalx-OrcaFlex uses OrcFxAPI to determine whether a file is a restart file and to obtain the chain of parent models. This means that you need to have an OrcaFlex licence available on the machine you are using to build batches.
If you do not have a licence available, or do not want to use one, it is possible to pass use_orcfxapi=False to any of the Sources detailed above. This will result in no licence being used, at the cost of less robust acquisition of the parent chain.
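For example, a sketch of building a DirectorySource without using an OrcaFlex licence:
from qalx_orcaflex.core import DirectorySource
# No OrcaFlex licence is used when building this source; restart parent chains
# are acquired by a less robust method than OrcFxAPI.
my_files = DirectorySource(r"S:\Project123\OFX\Batch1", use_orcfxapi=False)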
Parents “Inside” and “Outside” batches
qalx-OrcaFlex needs to have the parent simulation available to run a restart model. The way it treats this parent is slightly different if the parent model is not included within the same batch as the child. In this situation the parent simulation will be run and saved to qalx, but none of the post-processing will be applied to it.
Consider the batch of models below:
c:\Project1\base.dat
c:\Project1\m1.yml [parent=c:\Project1\base.dat]
c:\Project1\m2.yml [parent=c:\Project1\m1.yml]
If you create a batch with DirectorySource
from c:\Project1
then all the simulations will be run and any
post-processing you request will be performed on all three simulations as they are considered “inside” the batch.
However, if you create a batch using a single FileSource
for c:\Project1\m2.yml
then all three simulations will
be run (they have to be for m2.yml
to run) but only the m2.yml
simulation will be subjected to post-processing.
The base.dat and m1.yml simulations are considered to be “outside” the batch.
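As a sketch, the two situations described above look like this when building the batch:
from qalx_orcaflex.core import DirectorySource, FileSource
# "Inside" the batch: base.dat, m1.yml and m2.yml are all run and post-processed.
all_models = DirectorySource(r"c:\Project1")
# Only m2.yml is "inside" the batch: base.dat and m1.yml are still run (they have
# to be for m2.yml to run) but are not post-processed.
only_m2 = FileSource(path=r"c:\Project1\m2.yml")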
Distributed processing of complex trees
Consider the following directory of models that you want to run as a batch.
base.dat
m1.yml [parent=base.dat]
m2.yml [parent=m1.yml]
another_base.dat
q1.yml [parent=another_base.dat]
q2.yml [parent=q1.yml]
m3.yml [parent=m2.yml]
x1.yml [parent=m1.yml]
qalx-OrcaFlex is used to run batches in multiple processes across distributed servers in various environments. However, for restarts to work, the parent simulation file must be available on the server. Rather than running the parents of every model on every server, the system will group chains of restarts together and run them sequentially on the same server. If a batch contains unrelated chains, however, it will split these to be run in parallel.
Given the example batch above, it is clear that the following all depend on base.dat:
base.dat
m1.yml [parent=base.dat]
m2.yml [parent=m1.yml]
m3.yml [parent=m2.yml]
x1.yml [parent=m1.yml]
So you can expect these all to be run sequentially in the same process. The remaining chain, which depends on another_base.dat, may be run in another process on the same server or on another server altogether.
another_base.dat
q1.yml [parent=another_base.dat]
q2.yml [parent=q1.yml]
Batch Options
When you create your batch you need to give some options and configuration settings. Firstly, you need to say which queues you want to submit to. These should correspond to the queues that are specified when you start your Bots. This flexibility allows you to create separate queues for different projects or teams and have multiple bots running in parallel.
batch_queue: the name of the batch queue
sim_queue: the name of the simulation queue
The other options are passed to Batch Bot:

wait_between_completion_checks: (default=30) if wait_to_complete has been set to True (which it is by default), this is the number of seconds that Batch Bot will wait between checks that all the jobs have completed.
summarise_results: (default=True) set this to False if you want Batch Bot to skip making Results summaries.
build_threads: (default=10) the cases in the batch will be added to qalx in parallel. Because the bottleneck in submitting cases is usually waiting for an HTTP response from the API, it should be fine to use lots of threads. If you find that your machine grinds to a halt during this process you might want to reduce the number of threads. Equally, if you have to create many thousands of cases and are using a powerful machine you can increase the number of threads. Having too many threads will probably cause you to hit rate-limiting on the API.
send_batch_to: (default=[]) a list of names of queues that the batch will be sent to once it has finished processing. This will happen regardless of other settings such as summarise_results.
send_sim_to: (default=[]) a list of names of queues that every sim will be sent to once it has completed.
notifications: see Notifications below.
timeout: time in seconds by which this batch is expected to be complete. See Notifications below.
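As a sketch, a BatchOptions instance that sets several of these options might look like the following (it is assumed here that all of the options above are keyword arguments of BatchOptions; the values are illustrative only):
import qalx_orcaflex.data_models as dm

batch_options = dm.BatchOptions(
    batch_queue="batch bot",            # queue the batch bot watches
    sim_queue="sim bot",                # queue the sim bot watches
    wait_between_completion_checks=60,  # check for completion every 60 seconds
    summarise_results=False,            # skip Results summaries
    build_threads=20,                   # add cases to qalx with 20 threads
    send_batch_to=["reporting bot"],    # forward the finished batch to another queue
    timeout=4 * 60 * 60,                # expect the batch to complete within 4 hours
)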
Notifications
There is an option to get qalx-OrcaFlex to send notifications when certain events happen to a batch. There are three events that have templated notifications that are sent by email:
data_models.notifications.NotificationSubmitted: this will be sent once the BatchBot has submitted all the simulations to the queue for simulation.
data_models.notifications.NotificationCompleted: this will be sent once the batch has been marked as complete by the BatchBot.
data_models.notifications.NotificationTimedOut: this will be sent if the batch has not been marked as complete by the BatchBot after the specified timeout, as described above.
By default, no notifications are sent. They can be enabled by adding them to the BatchOptions
as an argument like so:
import OrcFxAPI as ofx
import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource
qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
batch_queue="batch bot",
sim_queue="sim bot",
notifications=dm.Notifications(
notify_submitted=dm.notifications.NotificationSubmitted(),
notify_completed=dm.notifications.NotificationCompleted(),
notify_timed_out=dm.notifications.NotificationTimedOut(),
)
)
You can adjust who gets the notification email through keyword arguments to each notification, e.g.:
notifications=dm.Notifications(
notify_submitted=dm.notifications.NotificationSubmitted(
include_creator=True,
to=['bob@analysiscorp.co'],
cc=['anne@analysiscorp.co'],
bcc=['project_1235566@proj.analysiscorp.co'],
subject='I think we finally have this working Bob!'
)
)
For more details see the Notifications API docs.
Context manager
OrcaFlexBatch is a context manager, which means that you need to use it in a with block. Doing this means that the data files and associated items will only be created in qalx when the code you run to build your batch has completed successfully.
For example, the following code will error when the line length is set with the wrong data type:
import OrcFxAPI as ofx
import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource
qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
batch_queue="batch bot",
sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")
with OrcaFlexBatch(name="I will never get built", session=qfx,
batch_options=batch_options) as batch:
for length in [100, '120']:
line.Length[0] = length
batch.add(ModelSource(m, f"Case l={length}"))
In the above code, “Case l=100” will not be added to qalx, so you don’t have to worry about creating resources that contain errors or partial batches.
Complete example
The following example shows that you can add to a batch from multiple sources:
import OrcFxAPI as ofx
import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource, \
DirectorySource, FileSource
qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
batch_queue="batch bot",
sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")
with OrcaFlexBatch(name="My Batch", session=qfx,
batch_options=batch_options) as batch:
for length in [100, 120]:
line.Length[0] = length
batch.add(ModelSource(m, f"Case l={length}"))
batch.add(DirectorySource(r"S:\Project 123\OFX\140-160m models"))
batch.add(FileSource(r"C:\User\AnneAlysis\My Models\180m.dat"))
Advanced concepts
OrcaFlexJob
The OrcaFlexBatch
object manages the creation of any number of OrcaFlexJob
objects. These are entities of type pyqalx.Set, which means that each is a collection of references to pyqalx.Item objects containing all the information about the simulation that you want to run. It may be useful to know the structure of this object so that you know where to find certain information about the jobs in your batch.
Note
Some fields exist on qalx_orcaflex.data_models.OrcaFlexJob that are not detailed below; that is because they are not used or implemented in this version of qalx_orcaflex.
job_options: a set of options for the Sim Bot:
- time_to_wait: (default=0) jobs will pause for this number of seconds before starting; this is useful if you are using a network dongle and the server hosting it can get overwhelmed by lots of simultaneous licence requests.
- record_progress: (default=True) send updates on simulation progress
- save_simulation: (default=True) save the simulation in qalx
- licence_attempts: (default=3600) number of times to try getting a licence before failing
- max_search_attempts: (default=10) number of attempts at Smart Statics
- max_wait: (default=60) the longest time to wait in seconds between attempts to get a licence
- update_interval: (default=5) the time to wait between sending progress updates to qalx. It is better to set this to be longer if you are hitting your usage limits or the API rate limit.
- delete_message_on_load: (default=False) delete the queue message in the bot onload function. This is useful to avoid the job being duplicated in the queue if it takes more than 12 hours to process. See https://docs.qalx.net/bots#onload.
data_file: an item containing an OrcaFlex data file:
- file: the file item
- file_name: the name of the file, used when saving back to disk
- meta:
  - data_file_name: the full path to the file if it came from disk
sim_file: the saved simulation file
results: a mapping of result names to guids of the items that contain Results
model_views: a mapping of model view name to details about the model view
saved_views: a mapping of model view name to guid of the item with the image file
progress: a structure with information about the progress of the simulation:
- progress: a summary of the current progress
- start_time: time the job started
- end_time: time the job ended
- current_time: current time in the job
- time_to_go: estimated time to completion in seconds
- percent: progress as a percentage
- pretty_time: a nice string of time to go e.g. “3 hours, 4 mins”
warnings: an item containing all the text warnings from OrcaFlex as well as any warnings created by Sim Bot
load_case_info: all the Load Case Information
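As a purely illustrative sketch (not actual API code), the information referenced by an OrcaFlexJob can be pictured as a nested structure built from the fields above:
# Illustrative only: shows how the fields described above relate to each other.
orcaflex_job = {
    "job_options": {"time_to_wait": 0, "record_progress": True, "save_simulation": True},
    "data_file": {
        "file": "<the file item>",
        "file_name": "case_1.dat",
        "meta": {"data_file_name": r"S:\Project123\OFX\case_1.dat"},
    },
    "sim_file": "<the saved simulation file>",
    "results": {"my line tension": "<guid of a Results item>"},
    "model_views": {"overall view": "<details of the model view>"},
    "saved_views": {"overall view": "<guid of an item with the image file>"},
    "progress": {"progress": "Running dynamics", "percent": 42.0, "pretty_time": "3 hours, 4 mins"},
    "warnings": "<item with OrcaFlex and Sim Bot warnings>",
    "load_case_info": "<the Load Case Information>",
}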
JobState
Information about the state of a job is saved on the metadata of the OrcaFlexJob. There is a Python Enum provided as qalx_orcaflex.data_models.JobState with the values relating to the states of the job as per the table below.
Enum | Value | Description
---|---|---
JobState.NEW | “New” | When a job has been created
JobState.QUEUED | “Queued” | When the job has been added to the queue
JobState.PRE_PROCESSING | “Pre-processing” | The job has been loaded by a bot
JobState.PROCESSING | “Processing” | The job is about to be run by a bot
JobState.LOADING_MODEL_DATA | “Loading model data” | The model data is about to be loaded
JobState.MODEL_DATA_LOADED | “Model data loaded” | The model data has loaded
JobState.RUNNING_STATICS | “Running statics” | Trying to find a static solution
JobState.STATICS_FAILED | “Statics failed” | Couldn’t find a static solution
JobState.RUNNING_DYNAMICS | “Running dynamics” | Running simulation dynamics
JobState.SAVING_SIMULATION | “Saving simulation” | Saving the simulation data to qalx
JobState.SIMULATION_SAVED | “Simulation saved” | Simulation data saved
JobState.EXTRACTING_RESULTS | “Extracting results” | Extracting results
JobState.RESULTS_EXTRACTED | “Results extracted” | All results extracted
JobState.EXTRACTING_MODEL_VIEWS | “Extracting model views” | Extracting model views
JobState.MODEL_VIEWS_EXTRACTED | “Model views extracted” | All model views extracted
JobState.EXTRACTING_MODEL_VIDEOS | “Extracting model videos” | Extracting videos
JobState.MODEL_VIDEOS_EXTRACTED | “Model videos extracted” | Videos extracted
JobState.SIMULATION_UNSTABLE | “Simulation unstable” | Simulation was unstable
JobState.ERROR | “Error” | There was an error
JobState.USER_CANCELLED | “User cancelled” | A user cancelled the job
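A minimal sketch of using the enum, assuming only that each member’s value is the human-readable state string shown in the table:
from qalx_orcaflex.data_models import JobState

# Each enum member's value is the state string stored on the job metadata.
print(JobState.RUNNING_DYNAMICS.value)  # "Running dynamics"
print(JobState.STATICS_FAILED.value)    # "Statics failed"
# Compare a state string read from job metadata against the enum:
assert "Error" == JobState.ERROR.value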
Custom sources
Perhaps you have a separate system for storing OrcaFlex data files with fancy features, and you want to add data files to a batch from that system without having to download them locally. You can do this by creating a custom source; the only rules are that it must inherit from qalx_orcaflex.core.BaseSource and implement a to_jobs instance method. The code below should provide a rough idea of how this can be achieved.
from io import BytesIO
from typing import Mapping
from my_fancy_system import get_batch_list, get_file_object
from qalx_orcaflex.core import BaseSource, QalxOrcaFlex, OrcaFlexBatch
from qalx_orcaflex.helpers import clean_set_key
import qalx_orcaflex.data_models as dm
class FancySource(BaseSource):
def __init__(self, project_code, batch_name):
super(FancySource, self).__init__() # initialise the parent class
self.project_code = project_code
self.batch_name = batch_name
def to_jobs(self, base_job: Mapping) -> Mapping:
for case in get_batch_list(f"{self.project_code}/{self.batch_name}"):
# Here we assume that case is something like "Project123/Batch456/Case1.dat"
case_name = case.split("/")[-1]
# We need to create a specific structure that can be passed to
# `QalxSession().item.add`
data_file = {
"input_file": BytesIO(get_file_object(case)), # needs to allow `.read()`
"meta": {
"_class": "orcaflex.job.data_file", # this is standard class
"data_file_name": case, # this can be the full path
},
"file_name": case_name, # this is how it will be saved if it's
# downloaded later
}
job = self._update_copy(
# the `base_job` will contain all the info that is being passed to
# all the jobs like results etc. so we update a copy of that with our
# data file
base_job, {
"data_file": data_file,
# case_name is used to store the set on the group. It cannot have
# @ or . in the string so we clean it.
"case_name": clean_set_key(case_name)
}
)
# MAKE THIS A GENERATOR!
yield job
qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
batch_queue="batch bot",
sim_queue="sim bot",
)
with OrcaFlexBatch(name="My Batch", session=qfx,
batch_options=batch_options) as batch:
batch.add(FancySource("Project 123", "Batch 3"))
Batch waiter
A waiter for a batch provides the functionality to wait until all the processing of a batch is complete before the next section of code is executed. This can be useful when some additional post-processing is required after a batch has completed. Normally, some manual checking for completion would be needed before the reporting or post-processing code could be executed. The batch waiter automates this workflow and can be run as a context manager from the when_complete method on a batch, as shown in the example below.
import OrcFxAPI as ofx
import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource, \
DirectorySource, FileSource
qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
batch_queue="batch bot",
sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")
with OrcaFlexBatch(name="My Batch", session=qfx,
batch_options=batch_options) as batch:
for length in [100, 120]:
line.Length[0] = length
batch.add(ModelSource(m, f"Case l={length}"))
batch.add(DirectorySource(r"S:\Project 123\OFX\140-160m models"))
batch.add(FileSource(r"C:\User\AnneAlysis\My Models\180m.dat"))
with batch.when_complete(
interval=20, timeout=1*60*60, run_with_gui=False
):
pass
# This section of the code will be executed once the batch processing is complete.
# The waiter checks the status of the batch every 20 seconds. There is a specified
# timeout of one hour after which the waiter will exit anyway. The option `run_with_gui`
# can be set to True, which will show the progress of the batch visually in a window.