This section serves as a glossary for technical terminology (read: confusing words that Jon made up) used in the LoonyBin documentation.
Base directory | Directory on the home machine where pointers to all files and logs created during workflow execution are placed. |
Dependency edge | An edge providing N (one or more) output files and parameters from a parent tool to N inputs and parameters of a child tool. |
Design machine | Machine where the workflow was designed (i.e. using the LoonyBin Workflow Designer GUI). |
DAG | Directed Acyclic Graph. In LoonyBin, a representation of a workflow in which each vertex is a tool that, given a set of parameter values, transforms a set of inputs into a set of outputs. |
Realization | A single DAG that can be unpacked from a LoonyBin HyperDAG workflow. This usually corresponds to a single empirical experiment in which each tool is given a single set of parameter values and there is only one data processing path. |
HyperDAG | A Directed Acyclic Hypergraph. In LoonyBin, it is a packed representation of multiple DAGs in which multiple Realizations are encoded via Packing Tools. This usually corresponds to a set of many empirical experiments in which many parameter values were tried (e.g. parameter sweeps or carpet bombing for hyperparameters) or in which multiple processing paths exist so that the user can compare the effect of using different tools (or chains of tools) on the overall workflow. |
Home machine | Machine where 1) we run the central process that is responsible for launching processes on machines when their dependencies have been satisfied and where 2) the base directory will be created and populated. |
Loon log | LoonyBin’s log format, containing plaintext tab-separated key-value pairs, one record per line. Each loon log records all log events from the beginning of a workflow through the current vertex. |
Packing tool | A LoonyBin tool, such as an OR tool, that can create more Realizations in its outgoing Dependency Edges than are present in its incoming Dependency Edges. |
Packing vertex | A vertex representing a packing tool. |
Path directory | A directory pointed to by a path file. |
Path file | A file with a name of the form {requiredPathName}.path containing a single string: a single absolute path. All files at this path will be symlinked into the tool working directory of tool vertices listing this path file as a requirement. |
Parameter box | A special tool that runs no commands, but instead only holds arbitrary parameters. These are useful for sharing parameters across various tools or conducting parameter sweeps via packing tools. |
Preanalyzer | A command that is defined by a tool descriptor and is run before the main commands of a tool to either log some information about the inputs or parameters to a vertex or check the sanity of the input data or parameters. |
Postanalyzer | A command that is defined by a tool descriptor and is run after the main commands of a tool to either log some information about the inputs, outputs, parameters, or program logs or otherwise check sanity. |
Realization directory | A subdirectory under a vertex directory containing the tool working directory and the tool final directory. |
Realization edge | A dependency edge from a tool vertex into a packing vertex. |
Realization set | A set of realization edges feeding into a single packing vertex such as an OR vertex. Each realization set takes the name of its packing vertex. |
Target working directory | The directory on each target execution machine under which subdirectories will be created for each vertex name executed on this machine during the workflow. |
Tool | A user-defined UNIX program (e.g. a shell script, binary, or Java application). |
Tool pack directory | The directory where LoonyBin should recursively search for tool descriptors to populate the toolbox shown in the left pane of the Workflow Designer GUI. |
Tool working directory | A subdirectory under a realization directory containing all of the input files and required executables necessary to run a tool’s commands, preanalyzers, and postanalyzers. |
Tool final directory | A subdirectory under a realization directory containing the output files of a tool and the loon log for this vertex. |
Tool descriptor | A Python program that documents the inputs, outputs, and parameters of a tool and can produce the commands to run the tool given values for all of the input, output, and parameter names. The descriptor can also invoke Preanalyzers and Postanalyzers to log information and check sanity. |
Tool pack | A set of tool descriptors with some common theme. |
Vertex directory | A directory created under the base directory or target working directory containing all of the realization directories for this vertex. |
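
To make the relationship between a HyperDAG and its Realizations more concrete, the sketch below enumerates the Realizations implied by a small parameter sweep. It is not LoonyBin's actual unpacking code: it assumes each swept parameter acts as an independent choice and simply takes every combination. The function name `unpack_realizations` and the parameter names are hypothetical.

```python
from itertools import product


def unpack_realizations(sweeps):
    """Return one parameter assignment (a dict) per realization.

    `sweeps` maps a hypothetical parameter name to the list of values
    tried for it. Assumption: every combination of values counts as
    one realization.
    """
    names = sorted(sweeps)
    value_lists = [sweeps[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Hypothetical sweep: two values for one parameter, three for another.
sweeps = {"beam_size": [10, 100], "smoothing": ["kn", "wb", "none"]}
realizations = unpack_realizations(sweeps)
print(len(realizations))  # 2 * 3 = 6 combinations
```

Each dict in `realizations` plays the role of a single experiment in which every tool sees exactly one value per parameter.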
(Updated as of V0.4.0)
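
As an illustration of the Loon log entry, here is a minimal parsing sketch. It assumes that each line holds exactly one key, a tab character, and a value; the glossary does not spell out the layout in more detail, so treat that as a guess. The function name `parse_loon_log` and the sample keys are hypothetical.

```python
def parse_loon_log(text):
    """Parse tab-separated key-value lines into a list of (key, value) tuples.

    Assumption: one record per line, laid out as key<TAB>value.
    Blank lines are skipped.
    """
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first tab only, so values may contain further tabs.
        key, _, value = line.partition("\t")
        records.append((key, value))
    return records


# Hypothetical log content; real loon logs may use different keys.
sample = "vertex\ttokenize\nlines_in\t1000\n"
records = parse_loon_log(sample)
print(records)
```

Keeping the records as an ordered list rather than a dict preserves repeated keys, which matters if a log accumulates events from every vertex up to the current one.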