The Workflow Fragment Description Ontology (wf-fd) is a simple ontology designed to link the common workflow fragments detected by applying graph mining techniques to a collection of workflows to the original workflow collection. That is, wf-fd links the common workflow fragments to the workflows where they appear.
The latest OWL encoding of the wf-fd ontology can be found here
A scientific workflow can be seen as a digital instrument that allows scientists to encode a scientific experiment in the form of a set of computational or data manipulation steps. Scientific workflows play an important role in the reproducibility and replicability of scientific experiments, as well as in repurposing and reusing results from previous experiments [Goderis et al.].
Given their importance in the research lifecycle, scientific workflows are beginning to be included in scientific publications, together with datasets and other elements used in the context of an experiment. At the same time, repositories of workflows like myExperiment, Crowdlabs or Galaxy facilitate workflow publication, exchange and reuse. These repositories currently store thousands of workflows (referred to as workflow templates), which have been uploaded by scientists in many different domains (ranging from life sciences to text analytics or astronomy).
These workflow templates and the provenance associated with their executions are used for different purposes: detecting the source of an error in a particular execution, determining the similarity among workflows, mining workflows automatically to assist workflow design, or detecting common workflow fragments in the workflow dataset.
The Workflow Fragment Description ontology (Wf-fd) aims to model workflow fragments by connecting them to the different workflows to which they correspond. The objective of the ontology is twofold:
The Wf-fd ontology complements the work described in [Garijo et al.], which presents an approach for detecting common workflow fragments. Some of the descriptions and motivation used in this document are extracted from that publication.
wffd | <http://purl.org/net/wf-fd#> |
p-plan | <http://purl.org/net/p-plan#> |
owl | <http://www.w3.org/2002/07/owl#> |
rdfs | <http://www.w3.org/2000/01/rdf-schema#> |
dcterms | <http://purl.org/dc/terms/> |
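As a minimal sketch, the prefixes declared above can be kept in a small lookup table so that prefixed names such as wffd:foundAs are expanded into full IRIs. The helper name `expand` is hypothetical and is not part of the ontology.

```python
# Prefix table copied from the namespace declarations above.
PREFIXES = {
    "wffd": "http://purl.org/net/wf-fd#",
    "p-plan": "http://purl.org/net/p-plan#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name into a full IRI (hypothetical helper)."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local_name


print(expand("wffd:foundAs"))  # http://purl.org/net/wf-fd#foundAs
```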
Wf-fd extends the Ontology for Provenance and Plans (P-plan) to link workflow fragments to the workflow templates where they can be found. The next tables summarize the classes and properties that have been used to extend or complement P-plan to adapt it to our particular domain. No data properties are included, because Wf-fd doesn't define any:
As stated, the Wf-fd ontology aims to link workflow fragments to their occurrences in workflow templates. A wffd:WorkflowFragment extends p-plan:Plan, since a fragment of a workflow is a workflow itself (which is also a type of p-plan:Plan). A workflow fragment has steps (p-plan:Step), which represent the individual data manipulation steps of a particular fragment.
There are two types of wffd:WorkflowFragments: wffd:DetectedResultWorkflowFragment and wffd:TiedWorkflowFragment. The former refers to those workflow fragments found as a result of applying graph mining techniques to a workflow collection (i.e., the results of the algorithms). The latter is used to represent how a particular result workflow fragment is bound to a part of a workflow template. This separation is necessary to properly point to the different parts of a workflow where a fragment appears. For instance, if we find that a fragment appears twice in a workflow, then we need two wffd:TiedWorkflowFragments to group the workflow steps belonging to each occurrence. For this we use the relationship wffd:foundAs, which links a wffd:DetectedResultWorkflowFragment to the wffd:TiedWorkflowFragment that represents it in the workflow.
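The case of a fragment appearing twice in one workflow can be sketched with plain (subject, predicate, object) tuples. This is only an illustration under stated assumptions: the `http://example.org/` instance IRIs and the helper `occurrences` are hypothetical, and the tied fragment class uses the local name given by its IRI in the cross-reference section (TiedResultWorkflowFragment).

```python
WFFD = "http://purl.org/net/wf-fd#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the example instances

FOUND_AS = WFFD + "foundAs"

# One detected fragment that occurs twice in the same workflow:
# each occurrence gets its own tied fragment.
triples = {
    (EX + "fragment1", RDF_TYPE, WFFD + "DetectedResultWorkflowFragment"),
    (EX + "tied1", RDF_TYPE, WFFD + "TiedResultWorkflowFragment"),
    (EX + "tied2", RDF_TYPE, WFFD + "TiedResultWorkflowFragment"),
    (EX + "fragment1", FOUND_AS, EX + "tied1"),
    (EX + "fragment1", FOUND_AS, EX + "tied2"),
}


def occurrences(graph, fragment):
    """Return the tied fragments a detected fragment is found as, sorted by IRI."""
    return sorted(o for s, p, o in graph if s == fragment and p == FOUND_AS)


tied = occurrences(triples, EX + "fragment1")
print(len(tied))  # 2
```

Keeping one tied fragment per occurrence means the steps of each occurrence can be grouped separately, even when both occurrences sit in the same workflow template.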
Workflow fragments may be included in other workflow fragments. In order to capture this overlap among the detected result workflow fragments, we use the relationship wffd:isPartOfWorkflowFragment. This makes the results easier to query, since fragments related to each other can be retrieved efficiently.
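A minimal sketch of such a retrieval, assuming the results are available as (subject, predicate, object) tuples. The instance IRIs and the helper `related_fragments` are hypothetical, and only direct links are followed, since the text does not state whether the property is transitive.

```python
WFFD = "http://purl.org/net/wf-fd#"
EX = "http://example.org/"  # hypothetical namespace for the example instances

PART_OF = WFFD + "isPartOfWorkflowFragment"

# fragmentA and fragmentC are both included in fragmentB.
triples = [
    (EX + "fragmentA", PART_OF, EX + "fragmentB"),
    (EX + "fragmentC", PART_OF, EX + "fragmentB"),
]


def related_fragments(graph, fragment):
    """Return (fragments that include `fragment`, fragments included in it)."""
    included_in = {o for s, p, o in graph if p == PART_OF and s == fragment}
    includes = {s for s, p, o in graph if p == PART_OF and o == fragment}
    return included_in, includes


included_in, includes = related_fragments(triples, EX + "fragmentB")
print(sorted(includes))
```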
Another property that helps relate fragments is wffd:foundIn. This property connects a wffd:DetectedResultWorkflowFragment to the workflow (p-plan:Plan) where it was found (i.e., the workflow containing one or more of the wffd:TiedWorkflowFragments that the detected fragment is linked to through wffd:foundAs).
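As an illustration, wffd:foundIn statements can be inverted to answer the question "which detected fragments occur in this workflow?". The instance IRIs and the helper `fragments_by_workflow` below are hypothetical.

```python
from collections import defaultdict

WFFD = "http://purl.org/net/wf-fd#"
EX = "http://example.org/"  # hypothetical namespace for the example instances

FOUND_IN = WFFD + "foundIn"

triples = [
    (EX + "fragment1", FOUND_IN, EX + "workflow1"),
    (EX + "fragment1", FOUND_IN, EX + "workflow2"),
    (EX + "fragment2", FOUND_IN, EX + "workflow2"),
]


def fragments_by_workflow(graph):
    """Group detected fragments by the workflow they were found in."""
    index = defaultdict(set)
    for fragment, prop, workflow in graph:
        if prop == FOUND_IN:
            index[workflow].add(fragment)
    return dict(index)


index = fragments_by_workflow(triples)
print(sorted(index[EX + "workflow2"]))
```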
Workflow fragments are composed of steps (p-plan:Step). Since we are interested in representing the ordering of the steps in the result fragment, we use the property p-plan:isPrecededBy between detected result fragment steps. The ordering of other workflow fragment steps, such as the ones belonging to a tied workflow fragment, is out of the scope of Wf-fd.
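The step ordering encoded by these statements can be recovered with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The step IRIs and the helper `ordered_steps` are hypothetical, and the property name is spelled as in the text above, so it should be checked against the P-plan specification before use.

```python
from graphlib import TopologicalSorter

PPLAN = "http://purl.org/net/p-plan#"
EX = "http://example.org/"  # hypothetical namespace for the example steps

# Property name spelled as in this document (assumption: verify against P-plan).
IS_PRECEDED_BY = PPLAN + "isPrecededBy"

# stepA -> stepB -> stepC, stated as "X is preceded by Y".
triples = [
    (EX + "stepB", IS_PRECEDED_BY, EX + "stepA"),
    (EX + "stepC", IS_PRECEDED_BY, EX + "stepB"),
]


def ordered_steps(graph):
    """Return the steps in an order that respects every precedence statement."""
    predecessors = {}
    for step, prop, previous in graph:
        if prop == IS_PRECEDED_BY:
            predecessors.setdefault(step, set()).add(previous)
    # TopologicalSorter expects a mapping from node to its predecessors.
    return list(TopologicalSorter(predecessors).static_order())


order = ordered_steps(triples)
print(order)
```

Because a workflow fragment is a directed acyclic graph, a topological order always exists. For fragments with parallel branches, several valid orders are possible and this sketch returns one of them.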
The Wf-fd complete diagram can be seen in Figure 1 below.
An example of usage of the Wf-fd ontology can be seen in Figure 2. Three workflows are represented on the top of the figure (Workflow 1, Workflow 2 and Workflow 3) with their steps coloured in orange. Each step has its URI represented on the top (e.g., :step1W1), while their type is represented in angle brackets. In order to avoid adding complexity to the figure, only the relevant type of the step for detecting the fragments is shown (e.g., <A>, <B>, etc).
At the bottom of the figure the two detected result workflow fragments are shown. As depicted in Figure 1, each workflow fragment step belongs to a workflow fragment. The detected result workflow fragments (shown in yellow) are linked to their tied workflow fragments (parts of the original workflows where they appear, shown in green) with the wffd:foundAs relationship. Each fragment is also directly connected to the original workflow with the property wffd:foundIn.
IRI: http://purl.org/net/wf-fd#TiedResultWorkflowFragment
IRI: http://purl.org/net/wf-fd#DetectedResultWorkflowFragment
A Detected Result Workflow Fragment is a Workflow Fragment detected automatically by using graph matching techniques over a workflow collection. This fragment is a result of the algorithms and it is composed of Detected Result Steps.
IRI: http://purl.org/net/wf-fd#WorkflowFragment
A Workflow Fragment is a set of connected steps that forms part of a scientific workflow. A Workflow Fragment is a directed acyclic graph (DAG) and may have one or more Steps.
There are two types of Workflow Fragments: the Tied Result Workflow Fragments and the Detected Result Workflow Fragments. On the one hand, the Detected Result Workflow Fragments are used to describe the results of the graph matching algorithms applied to a workflow collection. On the other hand, the Tied Result Workflow Fragments are auxiliary structures that point out the Steps of a Workflow belonging to a Detected Result Workflow Fragment.
IRI: http://purl.org/net/wf-fd#foundAs
Property that links a Detected Result Workflow Fragment to a Tied Result Workflow Fragment. That is, this property links a workflow fragment found as a result of applying graph mining techniques to a collection of workflows to the corresponding fragment of the workflow itself.
IRI: http://purl.org/net/wf-fd#foundIn
Property used to state in which workflows (p-plan:Plans) a wffd:DetectedResultWorkflowFragment has been found.
IRI: http://purl.org/net/wf-fd#isPartOfWorkflowFragment
Property that specifies which Workflow Fragments overlap with each other. Here, an overlap means that a fragment is included partially (or completely) in another fragment.
IRI: http://purl.org/net/wf-fd#detectedByAlgorithm
Property used to link a wffd:DetectedResultWorkflowFragment with the algorithm used to detect it. The name of the algorithm will be a Literal.
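A small sketch of how such a statement could be written out as an N-Triples line with a literal object. The fragment IRI, the algorithm name "exampleAlgorithm" and the helper `literal_triple` are hypothetical, and only backslashes and double quotes are escaped, which is enough for simple names but not for arbitrary text.

```python
WFFD = "http://purl.org/net/wf-fd#"
EX = "http://example.org/"  # hypothetical namespace for the example instance

DETECTED_BY = WFFD + "detectedByAlgorithm"


def literal_triple(subject, predicate, text):
    """Serialize one triple with a plain literal object as an N-Triples line."""
    # Minimal escaping only: backslash and double quote.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


line = literal_triple(EX + "fragment1", DETECTED_BY, "exampleAlgorithm")
print(line)
```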
The authors would like to thank Silvio Peroni for developing LODE, a Live OWL Documentation Environment used for representing the Cross Referencing Section of this document.
A Tied Result Workflow Fragment is a Workflow Fragment used as an auxiliary structure to point to Workflow Steps of a collection of workflows.