Building Batches

While it is possible to send individual simulations to a Sim Bot, the best way to run simulations is to build them into batches.

To do this, you create an OrcaFlexBatch and then add data files to it with the add method:

import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, DirectorySource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",  # queue our batch bot will look at for jobs
    sim_queue="sim bot",  # queue our sim bot will look at for jobs
)
with OrcaFlexBatch(name="My Batch", session=qfx, batch_options=batch_options) as batch:
    batch.add(source=DirectorySource(r"S:\Project123\OFX\Batch1"))

The reason for the with block is explained in Context manager, but it is important to know that once the code has successfully exited the with block, the batch has been built and submitted.

Sources

There are various options for the source argument:

DirectorySource

Data files will be submitted for every OrcaFlex file in a directory.

The following code will include all .dat files in the Batch1 folder:

from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1")

Warning

qalx_orcaflex does not support the concept of BaseFile in text data files (yet)

The following code will include all .dat, .yml and .yaml files in Batch1:

from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1", include_yaml=True)

The following code will include all .dat files in Batch1 and all the subdirectories of Batch1:

from qalx_orcaflex.core import DirectorySource
my_files = DirectorySource(r"S:\Project 123\OFX\Batch1", recursive=True)
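The include_yaml and recursive rules above can be pictured as a filtered glob. A minimal sketch using pathlib (an illustration of the described behaviour, not DirectorySource's actual implementation):

```python
from pathlib import Path


def find_data_files(directory, include_yaml=False, recursive=False):
    """Collect OrcaFlex data files as described for DirectorySource:
    .dat always; .yml/.yaml only when include_yaml=True; subdirectories
    only when recursive=True."""
    suffixes = {".dat"}
    if include_yaml:
        suffixes |= {".yml", ".yaml"}
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in Path(directory).glob(pattern)
        if p.suffix.lower() in suffixes
    )
```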

ModelSource

Saves the model data from an OrcFxAPI.Model instance. You need to give the model a name:

from qalx_orcaflex.core import ModelSource
import OrcFxAPI as ofx

model = ofx.Model()
my_files = ModelSource(model, "My Model")

ModelSource saves the data in the model instance at the time it is created; any future changes to the OrcFxAPI.Model instance will not be reflected. This means you can happily use the same instance for all your load cases without reloading the base model from disk each time.

from qalx_orcaflex.core import ModelSource
import OrcFxAPI as ofx

model = ofx.Model(r"C:\MY_MASSIVE_MODEL.dat")
model['My Line'].Length[0] = 100
my_100_model_source = ModelSource(model, "Model with l=100")
model['My Line'].Length[0] = 200
my_200_model_source = ModelSource(model, "Model with l=200")

The sources above will have different line lengths even though the line in the model variable has a length of 200.
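The snapshot semantics can be demonstrated with plain Python. SnapshotSource below is a hypothetical stand-in that copies its input at construction time, which is the behaviour described for ModelSource:

```python
import copy


class SnapshotSource:
    """Copies the model data when constructed, so later mutations of the
    original object are not reflected (mirrors the described ModelSource
    behaviour; not the real class)."""

    def __init__(self, model_data, name):
        self.name = name
        self.data = copy.deepcopy(model_data)


model = {"My Line": {"Length": [100]}}
source_100 = SnapshotSource(model, "Model with l=100")
model["My Line"]["Length"][0] = 200
source_200 = SnapshotSource(model, "Model with l=200")
# source_100 still holds the length captured at construction time
```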

Note

There is no need to add “.dat” to the name of the model. This will be added automatically.

FileSource

This will add a single file from a path.

from qalx_orcaflex.core import FileSource
my_files = FileSource(path=r"S:\Project 123\OFX\Tests\MY_TEST_MODEL.dat")

DataFileItemsSource

This is for when you have data files already in qalx and you want to re-run them for some reason.

coming soon!

This source will become more useful when we implement “model deltas” which will allow you to load a base model from qalx and specify all the load case details as they are added to the batch.

Batches with restarts

Batches which contain models that restart from another simulation should “just work”. That is, those models should run in the same way as every other model in the batch and the results will be processed in the same way. However, there are a few things that you should understand about how qalx-OrcaFlex processes these to ensure you do not suffer unexpected behaviours.

Use OrcFxAPI on build

By default, qalx-OrcaFlex uses OrcFxAPI to determine whether a file is a restart file and, if it is, to obtain the chain of parent models. This means that you need an OrcaFlex licence available on the machine you use to build batches.

If you do not have a licence available, or do not want to use one, you can pass use_orcfxapi=False to any of the Sources detailed above. No licence will be used, but at the cost of less robust acquisition of the parent chain.
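A purely text-based parent lookup might look something like the sketch below. The "RestartingFrom" key name is an assumption for illustration only; whatever the real file format uses, a plain text scan is inherently less robust than asking OrcFxAPI, which is the trade-off described above:

```python
def find_parent(data_file_text):
    """Rough text-based parent lookup: scan a text data file for a
    restart-parent line. The key name "RestartingFrom" is a hypothetical
    placeholder; a real implementation must match whatever OrcaFlex
    actually writes."""
    for line in data_file_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("RestartingFrom:"):
            return stripped.split(":", 1)[1].strip()
    return None  # not a restart file (as far as this scan can tell)
```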

Parents “Inside” and “Outside” batches

qalx-OrcaFlex needs the parent simulation to be available in order to run a restart model. The parent is treated slightly differently if it is not included in the same batch as the child: the parent simulation will still be run and saved to qalx, but none of the post-processing will be applied to it.

Consider the batch of models below:

c:\Project1\base.dat
c:\Project1\m1.yml [parent=c:\Project1\base.dat]
c:\Project1\m2.yml [parent=c:\Project1\m1.yml]

If you create a batch with DirectorySource from c:\Project1 then all the simulations will be run and any post-processing you request will be performed on all three simulations as they are considered “inside” the batch.

However, if you create a batch using a single FileSource for c:\Project1\m2.yml then all three simulations will be run (they have to be for m2.yml to run) but only the m2.yml simulation will be subjected to post-processing. The base.dat and m1.yml simulations are considered to be “outside” the batch.
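The inside/outside split can be sketched in plain Python. This mirrors the behaviour described above, not qalx-OrcaFlex's own code:

```python
def classify_jobs(batch_files, parents):
    """Split the simulations that must run into "inside" the batch
    (post-processed) and "outside" (run only because a child needs them).
    `parents` maps each file to its restart parent, or None for a base
    model."""
    inside = set(batch_files)
    to_run = set()
    for f in batch_files:
        # walk up the restart chain: every ancestor must also be run
        while f is not None:
            to_run.add(f)
            f = parents.get(f)
    outside = to_run - inside
    return inside, outside
```

With the example above, a batch built only from m2.yml runs all three files but post-processes just m2.yml.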

Distributed processing of complex trees

Consider the following directory of models, that you want to run as a batch.

base.dat
m1.yml [parent=base.dat]
m2.yml [parent=m1.yml]
another_base.dat
q1.yml [parent=another_base.dat]
q2.yml [parent=q1.yml]
m3.yml [parent=m2.yml]
x1.yml [parent=m1.yml]

qalx-OrcaFlex is used to run batches in multiple processes across distributed servers in various environments. However, for restarts to work the parent simulation file must be available on the server. Rather than running the parents of every model on every server, the system groups chains of restarts together and runs them sequentially on the same server. If a batch contains unrelated chains, however, it will split these to be run in parallel.

Given the example batch from above it is clear that the following all depend on base.dat:

base.dat
m1.yml [parent=base.dat]
m2.yml [parent=m1.yml]
m3.yml [parent=m2.yml]
x1.yml [parent=m1.yml]

So you can expect these all to be run sequentially in the same process. The remaining chain, which depends on another_base.dat, may be run in another process on the same server, or on another server altogether.

another_base.dat
q1.yml [parent=another_base.dat]
q2.yml [parent=q1.yml]
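The grouping described above can be sketched as follows (illustrative only; in practice each group would additionally be ordered parents-first before running):

```python
def group_restart_chains(parents):
    """Group models by the root of their restart chain, so each group can
    run sequentially on one server while unrelated groups run in parallel.
    `parents` maps each model to its parent (None for a base model)."""
    def root(model):
        # walk up until we reach a model with no parent
        while parents.get(model) is not None:
            model = parents[model]
        return model

    groups = {}
    for model in parents:
        groups.setdefault(root(model), []).append(model)
    return groups
```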

Batch Options

When you create your batch you need to give some options and configuration settings. Firstly, you need to say which queues you want to submit to. These should correspond to the queues specified when you start your Bots. This flexibility allows you to create separate queues for different projects or teams and have multiple bots running in parallel.

  • batch_queue: name of the batch queue

  • sim_queue: name of the simulation queue

The other options are passed to the Batch Bot:

  • wait_between_completion_checks:

    (default=30) if wait_to_complete has been set to True (which it is by default), this is the number of seconds that Batch Bot will wait before checking whether the jobs have all completed.

  • summarise_results:

    (default=True) set this to False if you want Batch Bot to skip making Results summaries.

  • build_threads:

    (default=10) the cases in the batch will be added to qalx in parallel. Because the bottleneck in submitting cases is usually waiting for a HTTP response from the API, it should be ok to use lots of threads. If you find that your machine grinds to a halt during this process you might want to reduce the number of threads. Equally, if you have to create many thousands of cases and are using a powerful machine you can increase the number of threads. Having too many threads will probably cause you to hit rate-limiting on the API.

  • send_batch_to:

    (default=[]) this is a list of names of queues that the batch will be sent to once it has finished processing. This will happen regardless of other settings such as summarise_results.

  • send_sim_to:

    (default=[]) this is a list of names of queues that every sim will be sent to once it has completed.

  • notifications:

    see Notifications below.

  • timeout:

    time in seconds by which this batch is expected to be complete. See Notifications below.
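The role of build_threads can be pictured with concurrent.futures. This is a generic sketch of parallel, I/O-bound submission, not the code inside OrcaFlexBatch:

```python
from concurrent.futures import ThreadPoolExecutor


def submit_cases(cases, submit_one, build_threads=10):
    """Submit cases in parallel. Because each submission mostly waits on
    an HTTP response, threads (not processes) are the right tool; more
    threads raise throughput until the API starts rate-limiting."""
    with ThreadPoolExecutor(max_workers=build_threads) as pool:
        # pool.map preserves the input order of the results
        return list(pool.map(submit_one, cases))
```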

Notifications

There is an option to get qalx-OrcaFlex to send notifications when certain events happen to a batch. There are three events that have templated notifications that are sent by email:

  • data_models.notifications.NotificationSubmitted: this will be sent once the BatchBot has submitted all the simulations to the queue for simulation.

  • data_models.notifications.NotificationCompleted: this will be sent once the batch has been marked as complete by the BatchBot.

  • data_models.notifications.NotificationTimedOut: this will be sent if the batch has not been marked as complete by the BatchBot after the specified timeout described above.

By default, no notifications are sent. They can be enabled by adding them to the BatchOptions as an argument like so:

import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",
    sim_queue="sim bot",
    notifications=dm.Notifications(
        notify_submitted=dm.notifications.NotificationSubmitted(),
        notify_completed=dm.notifications.NotificationCompleted(),
        notify_timed_out=dm.notifications.NotificationTimedOut(),
    )
)

You can adjust who gets the notification email through keyword arguments to each notification, e.g.:

notifications=dm.Notifications(
    notify_submitted=dm.notifications.NotificationSubmitted(
        include_creator=True,
        to=['bob@analysiscorp.co'],
        cc=['anne@analysiscorp.co'],
        bcc=['project_1235566@proj.analysiscorp.co'],
        subject='I think we finally have this working Bob!'
    )
)

For more details see the Notifications API docs.

Context manager

OrcaFlexBatch is a context manager, which means you need to use it in a with block. Doing this means that the data files and associated items will only be created in qalx when the code you run to build your batch has completed successfully.

For example, the following code will error when the line length is set with the wrong data type:

import OrcFxAPI as ofx

import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",
    sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")

with OrcaFlexBatch(name="I will never get built", session=qfx,
                   batch_options=batch_options) as batch:
    for length in [100, '120']:
        line.Length[0] = length
        batch.add(ModelSource(m, f"Case l={length}"))

In the above code, “Case l=100” will not be added to qalx, so you don’t have to worry about creating resources that contain errors, or about partial batches.
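The commit-on-success pattern that OrcaFlexBatch relies on can be shown with a minimal, self-contained sketch. DeferredBatch is hypothetical and far simpler than the real class, but the mechanism is the same: __exit__ only commits when the block raised no exception:

```python
class DeferredBatch:
    """Minimal sketch of commit-on-success semantics: work is staged in
    memory and only "submitted" if the with block exits cleanly."""

    def __init__(self):
        self.staged = []
        self.submitted = None

    def add(self, job):
        self.staged.append(job)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:  # only commit on a clean exit
            self.submitted = list(self.staged)
        return False  # let any exception propagate
```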

Complete example

The following example shows that you can add to a batch from multiple sources:

import OrcFxAPI as ofx

import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource, \
    DirectorySource, FileSource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",
    sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")

with OrcaFlexBatch(name="My Batch", session=qfx,
                   batch_options=batch_options) as batch:
    for length in [100, 120]:
        line.Length[0] = length
        batch.add(ModelSource(m, f"Case l={length}"))
    batch.add(DirectorySource(r"S:\Project 123\OFX\140-160m models"))
    batch.add(FileSource(r"C:\User\AnneAlysis\My Models\180m.dat"))

Advanced concepts

OrcaFlexJob

The OrcaFlexBatch object manages the creation of any number of OrcaFlexJob objects. These are entities of type pyqalx.Set, which means they are collections of references to pyqalx.Item entities containing all the information about the simulation that you want to run. It may be useful to know the structure of this object so that you know where to find certain information about the jobs in your batch.

Note

Some fields exist on qalx_orcaflex.data_models.OrcaFlexJob that are not detailed below; they are not used or implemented in this version of qalx_orcaflex.

  • job_options a set of options for the Sim Bot:

    • time_to_wait:

      (default = 0) jobs will pause for this number of seconds before starting. This is useful if you are using a network dongle and the server hosting it could be overwhelmed by lots of simultaneous licence requests.

    • record_progress:

      (default = True) send updates on simulation progress

    • save_simulation:

      (default = True) save the simulation in qalx

    • licence_attempts:

      (default = 3600) number of times to try getting a licence before failing

    • max_search_attempts:

      (default = 10) number of attempts at Smart Statics

    • max_wait:

      (default = 60) the longest time to wait in seconds between trying to get a licence

    • update_interval:

      (default=5) the time to wait between sending progress updates to qalx. It is better to set this to be longer if you are hitting your usage limits or the API rate limit.

    • delete_message_on_load:

      (default=False) delete the queue message in the bot onload function. This is useful to avoid the job being duplicated in the queue if it takes more than 12 hours to process. See https://docs.qalx.net/bots#onload.

  • data_file an item containing an OrcaFlex data file.

    • file: the file item

    • file_name: the name of the file, used when saving back to disk

    • meta:
      • data_file_name: the full path to the file if it came from disk

  • sim_file the saved simulation file

  • results a mapping of result names to guids of the item that contain Results

  • model_views a mapping of model view name to details about the model view

  • saved_views a mapping of model view name to guid of item with image file.

  • progress a structure with information about the progress of the simulation:

    • progress: a summary of the current progress

    • start_time: time the job started

    • end_time: time the job ended

    • current_time: current time in the job

    • time_to_go: how long estimated to completion in seconds

    • percent: progress as a percentage

    • pretty_time: a nice string of time to go e.g. “3 hours, 4 mins”

  • warnings:

    an item containing all the text warnings from OrcaFlex as well as any warnings created by Sim Bot

  • load_case_info: all the Load Case Information
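For orientation, the fields above can be summarised as a plain dictionary with placeholder values. Field names follow the list above; this is not a literal dump of a real OrcaFlexJob:

```python
# Placeholder values; field names and job_options defaults follow the
# documentation above.
orcaflex_job_shape = {
    "job_options": {
        "time_to_wait": 0,
        "record_progress": True,
        "save_simulation": True,
        "licence_attempts": 3600,
        "max_search_attempts": 10,
        "max_wait": 60,
        "update_interval": 5,
        "delete_message_on_load": False,
    },
    "data_file": {
        "file": None,  # the file item
        "file_name": "case.dat",
        "meta": {"data_file_name": r"S:\case.dat"},
    },
    "sim_file": None,     # the saved simulation file
    "results": {},        # result name -> guid of the Results item
    "model_views": {},    # model view name -> details
    "saved_views": {},    # model view name -> guid of image item
    "progress": {
        "progress": "", "start_time": None, "end_time": None,
        "current_time": None, "time_to_go": None, "percent": None,
        "pretty_time": "",
    },
    "warnings": None,       # item of OrcaFlex and Sim Bot warnings
    "load_case_info": None, # all the Load Case Information
}
```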

JobState

Information about the state of a job is saved on the metadata of the OrcaFlexJob. There is a Python Enum provided as qalx_orcaflex.data_models.JobState with values relating to the states of the job as described below.

  • JobState.NEW (“New”): when a job has been created

  • JobState.QUEUED (“Queued”): the job has been added to the queue

  • JobState.PRE_PROCESSING (“Pre-processing”): the job has been loaded by a bot

  • JobState.PROCESSING (“Processing”): the job is about to be run by a bot

  • JobState.LOADING_MODEL_DATA (“Loading model data”): the model data is about to be loaded

  • JobState.MODEL_DATA_LOADED (“Model data loaded”): the model data has loaded

  • JobState.RUNNING_STATICS (“Running statics”): trying to find a static solution

  • JobState.STATICS_FAILED (“Statics failed”): couldn’t find a static solution

  • JobState.RUNNING_DYNAMICS (“Running dynamics”): running simulation dynamics

  • JobState.SAVING_SIMULATION (“Saving simulation”): saving the simulation data to qalx

  • JobState.SIMULATION_SAVED (“Simulation saved”): simulation data saved

  • JobState.EXTRACTING_RESULTS (“Extracting results”): extracting results

  • JobState.RESULTS_EXTRACTED (“Results extracted”): all results extracted

  • JobState.EXTRACTING_MODEL_VIEWS (“Extracting model views”): extracting model views

  • JobState.MODEL_VIEWS_EXTRACTED (“Model views extracted”): all model views extracted

  • JobState.EXTRACTING_MODEL_VIDEOS (“Extracting model videos”): extracting videos

  • JobState.MODEL_VIDEOS_EXTRACTED (“Model videos extracted”): videos extracted

  • JobState.SIMULATION_UNSTABLE (“Simulation unstable”): simulation was unstable

  • JobState.ERROR (“Error”): there was an error

  • JobState.USER_CANCELLED (“User cancelled”): a user cancelled the job
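The states above can be written out as a Python Enum, mirroring the documented qalx_orcaflex.data_models.JobState:

```python
from enum import Enum


class JobState(Enum):
    """The job states as documented (a local mirror of
    qalx_orcaflex.data_models.JobState for illustration)."""
    NEW = "New"
    QUEUED = "Queued"
    PRE_PROCESSING = "Pre-processing"
    PROCESSING = "Processing"
    LOADING_MODEL_DATA = "Loading model data"
    MODEL_DATA_LOADED = "Model data loaded"
    RUNNING_STATICS = "Running statics"
    STATICS_FAILED = "Statics failed"
    RUNNING_DYNAMICS = "Running dynamics"
    SAVING_SIMULATION = "Saving simulation"
    SIMULATION_SAVED = "Simulation saved"
    EXTRACTING_RESULTS = "Extracting results"
    RESULTS_EXTRACTED = "Results extracted"
    EXTRACTING_MODEL_VIEWS = "Extracting model views"
    MODEL_VIEWS_EXTRACTED = "Model views extracted"
    EXTRACTING_MODEL_VIDEOS = "Extracting model videos"
    MODEL_VIDEOS_EXTRACTED = "Model videos extracted"
    SIMULATION_UNSTABLE = "Simulation unstable"
    ERROR = "Error"
    USER_CANCELLED = "User cancelled"
```

Given a state string read from job metadata, JobState("Queued") recovers the enum member JobState.QUEUED.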

Custom sources

Perhaps you have a separate system for storing OrcaFlex data files with fancy features, and you want to add data files to a batch from that system without having to download them locally. You can do this by creating a custom source; the only rules are that it must inherit from qalx_orcaflex.core.BaseSource and implement a to_jobs instance method. The code below should provide a rough idea of how this can be achieved.

from io import BytesIO
from typing import Iterator, Mapping

from my_fancy_system import get_batch_list, get_file_object

from qalx_orcaflex.core import BaseSource, QalxOrcaFlex, OrcaFlexBatch
from qalx_orcaflex.helpers import clean_set_key
import qalx_orcaflex.data_models as dm

class FancySource(BaseSource):

    def __init__(self, project_code, batch_name):
        super().__init__()  # initialise the parent class
        self.project_code = project_code
        self.batch_name = batch_name

    def to_jobs(self, base_job: Mapping) -> Iterator[Mapping]:
        # `to_jobs` is a generator: yield one job at a time rather than
        # building the whole list in memory
        for case in get_batch_list(f"{self.project_code}/{self.batch_name}"):
            # Here we assume that case is something like "Project123/Batch456/Case1.dat"
            case_name = case.split("/")[-1]
            # We need to create a specific structure that can be passed to
            # `QalxSession().item.add`

            data_file = {
                "input_file": BytesIO(get_file_object(case)),  # needs to allow `.read()`
                "meta": {
                    "_class": "orcaflex.job.data_file",  # this is the standard class
                    "data_file_name": case,  # this can be the full path
                },
                "file_name": case_name,  # this is how it will be saved if it's
                # downloaded later
            }
            job = self._update_copy(
                # the `base_job` contains all the info that is passed to
                # all the jobs (results etc.) so we update a copy of it
                # with our data file
                base_job, {
                    "data_file": data_file,
                    # case_name is used to store the set on the group. It cannot
                    # have @ or . in the string so we clean it.
                    "case_name": clean_set_key(case_name)
                }
            )
            yield job

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",
    sim_queue="sim bot",
)
with OrcaFlexBatch(name="My Batch", session=qfx,
                   batch_options=batch_options) as batch:
    batch.add(FancySource("Project 123", "Batch 3"))
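The clean_set_key call in the example above exists because set keys may not contain “@” or “.”. An illustrative stand-in is shown below; the real helper in qalx_orcaflex.helpers may use a different replacement rule, this just shows why the call is there:

```python
def clean_key(name):
    """Hypothetical stand-in for qalx_orcaflex.helpers.clean_set_key:
    strip the characters that are not allowed in set keys by replacing
    "@" and "." with underscores."""
    return name.replace("@", "_").replace(".", "_")
```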

Batch waiter

A waiter for a batch provides the functionality to wait until all the processing of a batch is complete before the next section of the code is executed. This can be useful where some additional post-processing is required after a batch has completed. Normally, some manual checking for completion would be needed before the reporting or post-processing code could be run. The batch waiter automates this workflow and can be run as a context manager from the when_complete method on a batch, as shown in the example below.

import OrcFxAPI as ofx

import qalx_orcaflex.data_models as dm
from qalx_orcaflex.core import QalxOrcaFlex, OrcaFlexBatch, ModelSource, \
    DirectorySource, FileSource

qfx = QalxOrcaFlex()
batch_options = dm.BatchOptions(
    batch_queue="batch bot",
    sim_queue="sim bot",
)
m = ofx.Model()
line = m.CreateObject(ofx.otLine, "My Line")

with OrcaFlexBatch(name="My Batch", session=qfx,
                   batch_options=batch_options) as batch:
    for length in [100, 120]:
        line.Length[0] = length
        batch.add(ModelSource(m, f"Case l={length}"))
    batch.add(DirectorySource(r"S:\Project 123\OFX\140-160m models"))
    batch.add(FileSource(r"C:\User\AnneAlysis\My Models\180m.dat"))
with batch.when_complete(
    interval=20, timeout=1*60*60, run_with_gui=False
):
    pass
# This section of the code will be executed once the batch processing is
# complete. The waiter checks the status of the batch every 20 seconds and
# will exit anyway after the specified timeout of one hour. Setting
# `run_with_gui=True` shows the progress of the batch visually in a window.