The Workflow Fragment Description Ontology

Release 12 March 2014

This version:
http://vocab.linkeddata.es/wffd/version/13032014/
Previous version:
http://vocab.linkeddata.es/wffd/version/17092013/
Latest version:
http://purl.org/net/wf-fd
Revision
1.1
Authors:
Daniel Garijo, Ontology Engineering Group, Universidad Politécnica de Madrid
Contributors:
Idafen Santana, Ontology Engineering Group, Universidad Politécnica de Madrid
Oscar Corcho, Ontology Engineering Group, Universidad Politécnica de Madrid
Yolanda Gil, Information Sciences Institute, University of Southern California
Imported Ontologies:
P-plan: The Ontology for Provenance and Plans
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic License.

Abstract

The Workflow Fragment Description Ontology (wf-fd) is a simple ontology that links the common workflow fragments, detected by applying graph mining techniques to a collection of workflows, back to the workflows in which they appear.

The latest OWL encoding of the wf-fd ontology can be found here.

1. Introduction

A scientific workflow can be seen as a digital instrument that allows scientists to encode a scientific experiment as a set of computational or data manipulation steps. Scientific workflows play an important role in the reproducibility and replicability of scientific experiments, as well as in repurposing and reusing results from previous experiments [Goderis et al.].

Given their importance in the research lifecycle, scientific workflows are beginning to be included in scientific publications, together with datasets and other elements used in the context of an experiment. At the same time, repositories of workflows like myExperiment, Crowdlabs or Galaxy facilitate workflow publication, exchange and reuse. These repositories currently store thousands of workflows (referred to as workflow templates), which have been uploaded by scientists in many different domains (ranging from life sciences to text analytics or astronomy).

These workflow templates and the provenance associated with their executions are used for different purposes: detecting the source of an error in a particular execution, determining the similarity among workflows, mining workflows automatically to help in workflow design, or detecting common workflow fragments in the workflow dataset.

The Workflow Fragment Description ontology (Wf-fd) aims to model workflow fragments by connecting them to the different workflows to which they correspond. The objective of the ontology is twofold:

  1. Model the workflow fragments and how they overlap with each other, independently of their connection to workflow templates.
  2. Model how the workflow fragments can be found within workflow templates.

The Wf-fd ontology complements the work described in [Garijo et al.], where an approach for detecting common workflow fragments is presented. Some of the descriptions and motivation used in this document are extracted from that publication.

1.1. Namespace declarations

Table 1: Namespaces used in the document
Prefix     Namespace
wffd       <http://purl.org/net/wf-fd#>
p-plan     <http://purl.org/net/p-plan#>
owl        <http://www.w3.org/2002/07/owl#>
rdfs       <http://www.w3.org/2000/01/rdf-schema#>
dcterms    <http://purl.org/dc/terms/>
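
The prefixes in Table 1 are shorthand for full IRIs. As a minimal sketch (plain Python, no RDF library; the helper name `expand` is an assumption made for this illustration), a prefixed name can be expanded as follows:

```python
# Prefix table copied from Table 1.
NAMESPACES = {
    "wffd": "http://purl.org/net/wf-fd#",
    "p-plan": "http://purl.org/net/p-plan#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'wffd:foundAs' into a full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return NAMESPACES[prefix] + local_name


print(expand("wffd:foundAs"))  # http://purl.org/net/wf-fd#foundAs
```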

1.2. Terminology

In this document we use workflow-related terminology. The following list covers the main concepts used across the document:

2. Wf-fd Overview

Wf-fd extends the Ontology for Provenance and Plans (P-plan) to link workflow fragments to the workflow templates where they can be found. The following lists summarize the classes, object properties and data properties that Wf-fd defines to extend or complement P-plan for this domain. Each term is detailed in Section 4:

4.1 Classes
wffd:DetectedResultWorkflowFragment, wffd:TiedResultWorkflowFragment, wffd:WorkflowFragment

4.2 Properties
wffd:foundAs, wffd:foundIn, wffd:isPartOfWorkflowFragment

4.3 Data Properties
wffd:detectedByAlgorithm

3. Wf-fd Description

As stated, the Wf-fd ontology aims to link workflow fragments to their occurrences in workflow templates. wffd:WorkflowFragment is a subclass of p-plan:Plan, since a fragment of a workflow is itself a workflow (which is also a type of p-plan:Plan). A workflow fragment has steps (p-plan:Step), which represent the individual data manipulation steps of that fragment.

There are two types of wffd:WorkflowFragment: wffd:DetectedResultWorkflowFragment and wffd:TiedResultWorkflowFragment. The former refers to the workflow fragments found by applying graph mining techniques to a workflow collection (i.e., the results of the algorithms). The latter represents how a particular detected result workflow fragment is bound to a part of a workflow template. This separation is necessary to point to each part of a workflow where a fragment appears. For instance, if a fragment appears twice in a workflow, we need two wffd:TiedResultWorkflowFragments to group the workflow steps belonging to each occurrence. For this we use the relationship wffd:foundAs, which links a wffd:DetectedResultWorkflowFragment to the wffd:TiedResultWorkflowFragment that represents it in the workflow.
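
The case of a fragment appearing twice in one workflow can be sketched with triples held as plain Python tuples. This is only an illustration: every ":"-prefixed identifier is hypothetical, the class name follows the IRI given in Section 4.1, and grouping steps under a tied fragment with p-plan:isStepOfPlan is an assumption about how P-plan would be used here.

```python
# Triples as (subject, predicate, object) tuples. All ":"-prefixed names are
# hypothetical identifiers invented for this sketch.
TRIPLES = {
    (":fragment1", "rdf:type", "wffd:DetectedResultWorkflowFragment"),
    (":tied1", "rdf:type", "wffd:TiedResultWorkflowFragment"),
    (":tied2", "rdf:type", "wffd:TiedResultWorkflowFragment"),
    # One wffd:foundAs link per occurrence of the fragment in the workflow.
    (":fragment1", "wffd:foundAs", ":tied1"),
    (":fragment1", "wffd:foundAs", ":tied2"),
    # Assumption: workflow steps are grouped with p-plan:isStepOfPlan.
    (":step1W1", "p-plan:isStepOfPlan", ":tied1"),
    (":step2W1", "p-plan:isStepOfPlan", ":tied1"),
    (":step5W1", "p-plan:isStepOfPlan", ":tied2"),
    (":step6W1", "p-plan:isStepOfPlan", ":tied2"),
}


def occurrences(fragment, triples):
    """Map each tied fragment of `fragment` to the sorted steps it groups."""
    tied = {o for s, p, o in triples if s == fragment and p == "wffd:foundAs"}
    return {
        t: sorted(s for s, p, o in triples
                  if p == "p-plan:isStepOfPlan" and o == t)
        for t in sorted(tied)
    }


for tied_fragment, steps in occurrences(":fragment1", TRIPLES).items():
    print(tied_fragment, steps)
```

Keeping one tied fragment per occurrence means the two groups of steps never get mixed, even though both come from the same detected fragment.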

Workflow fragments may be included in other workflow fragments. To capture this overlap among the detected result workflow fragments, we use the relationship wffd:isPartOfWorkflowFragment. This makes the results easier to query, since the fragments related to each other can be retrieved efficiently.
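
As a rough sketch of such a query (plain Python; the fragment identifiers are hypothetical and only directly asserted links are followed, since this document does not state whether the property is transitive), the fragments related to a given one can be looked up from two small indexes:

```python
from collections import defaultdict

# (part, whole) pairs, each standing for the triple
#   part wffd:isPartOfWorkflowFragment whole
# The identifiers are hypothetical.
PART_OF = [
    (":fragment1", ":fragment2"),
    (":fragment1", ":fragment3"),
    (":fragment2", ":fragment3"),
]


def build_indexes(pairs):
    """Index the pairs in both directions for constant-time lookups."""
    wholes_of = defaultdict(set)  # part  -> fragments that include it
    parts_of = defaultdict(set)   # whole -> fragments included in it
    for part, whole in pairs:
        wholes_of[part].add(whole)
        parts_of[whole].add(part)
    return wholes_of, parts_of


def related(fragment, wholes_of, parts_of):
    """Fragments directly linked to `fragment` in either direction."""
    return wholes_of.get(fragment, set()) | parts_of.get(fragment, set())


wholes_of, parts_of = build_indexes(PART_OF)
print(sorted(related(":fragment2", wholes_of, parts_of)))
```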

Another property that facilitates relating fragments is wffd:foundIn. It connects a wffd:DetectedResultWorkflowFragment to the workflow (p-plan:Plan) where it was found, i.e., the workflow containing one or more of the wffd:TiedResultWorkflowFragments that the detected fragment is wffd:foundAs.
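
Wf-fd does not prescribe how wffd:foundIn statements are produced. One possible sketch, assuming that each workflow step is linked with p-plan:isStepOfPlan both to its workflow and to the tied fragment that groups it (an assumption of this example, as are all identifiers), derives them from the wffd:foundAs links:

```python
from collections import defaultdict

# Hypothetical data: one detected fragment found once in each of two workflows.
TRIPLES = {
    (":tied1", "rdf:type", "wffd:TiedResultWorkflowFragment"),
    (":tied2", "rdf:type", "wffd:TiedResultWorkflowFragment"),
    (":fragment1", "wffd:foundAs", ":tied1"),
    (":fragment1", "wffd:foundAs", ":tied2"),
    (":step1W1", "p-plan:isStepOfPlan", ":workflow1"),
    (":step1W1", "p-plan:isStepOfPlan", ":tied1"),
    (":step1W2", "p-plan:isStepOfPlan", ":workflow2"),
    (":step1W2", "p-plan:isStepOfPlan", ":tied2"),
}


def derive_found_in(triples):
    """Derive (fragment, wffd:foundIn, workflow) triples from wffd:foundAs."""
    tied = {s for s, p, o in triples
            if p == "rdf:type" and o == "wffd:TiedResultWorkflowFragment"}
    plans_of_step = defaultdict(set)
    for s, p, o in triples:
        if p == "p-plan:isStepOfPlan":
            plans_of_step[s].add(o)

    derived = set()
    for fragment, p, tied_fragment in triples:
        if p != "wffd:foundAs":
            continue
        for plans in plans_of_step.values():
            if tied_fragment in plans:
                # The other plans of the same step are the enclosing workflows.
                for plan in plans - tied:
                    derived.add((fragment, "wffd:foundIn", plan))
    return derived


for triple in sorted(derive_found_in(TRIPLES)):
    print(triple)
```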

Workflow fragments are composed of steps (p-plan:Step). Since we are interested in representing the ordering of the steps in the result fragment, we use the property p-plan:isPrecededBy between detected result fragment steps. The ordering of other workflow fragment steps, such as the ones belonging to a tied workflow fragment, is out of the scope of Wf-fd.
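
Because p-plan:isPrecededBy points from a step to the step before it, an execution order for the steps of a detected result fragment can be recovered with a topological sort. The sketch below uses `graphlib` from the Python standard library (Python 3.9 or later); the step identifiers are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (step, predecessor) pairs, each standing for the triple
#   step p-plan:isPrecededBy predecessor
# The step identifiers are hypothetical.
PRECEDED_BY = [
    (":fragStepB", ":fragStepA"),
    (":fragStepC", ":fragStepB"),
]


def step_order(preceded_by):
    """Return the steps so that every step comes after its predecessors."""
    graph = {}
    for step, predecessor in preceded_by:
        graph.setdefault(step, set()).add(predecessor)
    # TopologicalSorter expects a mapping: node -> set of predecessors.
    return list(TopologicalSorter(graph).static_order())


print(step_order(PRECEDED_BY))
```

If the links contained a cycle, `static_order()` would raise `graphlib.CycleError`, which can serve as a cheap check that the fragment is acyclic.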

The Wf-fd complete diagram can be seen in Figure 1 below.

Figure 1. Wf-fd as an extension of p-plan: Workflow fragments are represented as plans with steps. The extension is shown in blue and the original classes in orange.

An example of usage of the Wf-fd ontology can be seen in Figure 2. Three workflows are represented at the top of the figure (Workflow 1, Workflow 2 and Workflow 3) with their steps coloured in orange. Each step has its URI shown above it (e.g., :step1W1), while its type is shown in angle brackets. To avoid adding complexity to the figure, only the step type relevant for detecting the fragments is shown (e.g., <A>, <B>, etc.).

At the bottom of the figure, the two detected result workflow fragments are shown. As depicted in Figure 1, each workflow fragment step belongs to a workflow fragment. The detected result workflow fragments (shown in yellow) are linked to their tied workflow fragments (the parts of the original workflows where they appear, shown in green) with the wffd:foundAs relationship. Each fragment is also directly connected to the original workflow with the property wffd:foundIn.

Figure 2. Wf-fd: example of usage. At the top of the figure, some workflows of the workflow collection are represented. The names of their steps are different, but some of their types are common. At the bottom of the figure we can see the detected fragments (yellow plus orange). With the help of wf-fd, all the fragments are properly represented and bound to their corresponding fragments in the workflow templates.

4. Cross reference for Wf-fd classes and properties

This section provides details for each class and property defined by Wf-fd.

4.1 Classes

wffd:TiedResultWorkflowFragment

IRI: http://purl.org/net/wf-fd#TiedResultWorkflowFragment

A Tied Result Workflow Fragment is a Workflow Fragment used as an auxiliary structure to point to Workflow Steps of a collection of workflows.

has super-classes
wffd:WorkflowFragment c
is in range of
wffd:foundAs op

wffd:DetectedResultWorkflowFragment

IRI: http://purl.org/net/wf-fd#DetectedResultWorkflowFragment

A Detected Result Workflow Fragment is a Workflow Fragment detected automatically by applying graph matching techniques to a workflow collection. This fragment is a result of the algorithms and is composed of Detected Result Steps.

has super-classes
wffd:WorkflowFragment c
is in domain of
wffd:foundAs op, wffd:foundIn op, wffd:isPartOfWorkflowFragment op, wffd:detectedByAlgorithm dp
is in range of
wffd:isPartOfWorkflowFragment op

wffd:WorkflowFragment

IRI: http://purl.org/net/wf-fd#WorkflowFragment

A Workflow Fragment is a set of connected steps that forms part of a scientific workflow. A Workflow Fragment is a directed acyclic graph (DAG) and may have one or more Steps.

There are two types of Workflow Fragments: the Tied Result Workflow Fragments and the Detected Result Workflow Fragments. On the one hand, the Detected Result Workflow Fragments describe the results of the graph matching algorithms applied to a workflow collection. On the other hand, the Tied Result Workflow Fragments are auxiliary structures that point out the Steps of a Workflow belonging to a Detected Result Workflow Fragment.

has super-classes
p-plan:Plan c
has sub-classes
wffd:TiedResultWorkflowFragment c, wffd:DetectedResultWorkflowFragment c
is in domain of
wffd:isPartOfWorkflowFragment op
is in range of
wffd:isPartOfWorkflowFragment op

4.2 Object Properties

wffd:foundAs

IRI: http://purl.org/net/wf-fd#foundAs

Property that links a Detected Result Workflow Fragment to a Tied Result Workflow Fragment. That is, this property links a workflow fragment found by applying graph mining techniques to a collection of workflows to the corresponding fragment of the workflow itself.

wffd:foundIn

IRI: http://purl.org/net/wf-fd#foundIn

Property used to state the workflows (p-plan:Plans) in which a wffd:DetectedResultWorkflowFragment has been found.

wffd:isPartOfWorkflowFragment

IRI: http://purl.org/net/wf-fd#isPartOfWorkflowFragment

Property that specifies which Workflow Fragments overlap with each other. Here, an overlap means that a fragment is included, partially or completely, in another fragment.

has domain
wffd:WorkflowFragment c
has range
wffd:WorkflowFragment c
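
In RDFS and OWL, domain and range declarations drive inference and are not validation constraints. Still, a simple lint over a dataset can flag wffd:isPartOfWorkflowFragment statements whose subject or object is not explicitly typed as a workflow fragment. The sketch below is one such check in plain Python; the subclass table mirrors Section 4.1, while the function names and instance identifiers are assumptions made for the example.

```python
# Subclass links taken from Section 4.1 (child -> parent).
SUBCLASS_OF = {
    "wffd:TiedResultWorkflowFragment": "wffd:WorkflowFragment",
    "wffd:DetectedResultWorkflowFragment": "wffd:WorkflowFragment",
    "wffd:WorkflowFragment": "p-plan:Plan",
}


def is_a(cls, target):
    """True if `cls` equals `target` or is one of its (transitive) subclasses."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def lint_part_of(triples, types):
    """Return part-of triples with an end not typed as a workflow fragment.

    `types` maps each resource to its asserted class. This is a lint, not
    OWL semantics: a reasoner would infer the missing types instead.
    """
    flagged = []
    for s, p, o in triples:
        if p != "wffd:isPartOfWorkflowFragment":
            continue
        if not (is_a(types.get(s), "wffd:WorkflowFragment")
                and is_a(types.get(o), "wffd:WorkflowFragment")):
            flagged.append((s, p, o))
    return flagged


# Hypothetical instance data.
TYPES = {
    ":fragment1": "wffd:DetectedResultWorkflowFragment",
    ":fragment2": "wffd:DetectedResultWorkflowFragment",
    ":workflow1": "p-plan:Plan",
}
DATA = [
    (":fragment1", "wffd:isPartOfWorkflowFragment", ":fragment2"),
    (":fragment1", "wffd:isPartOfWorkflowFragment", ":workflow1"),
]
print(lint_part_of(DATA, TYPES))
```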

4.3 Data Properties

wffd:detectedByAlgorithm

IRI: http://purl.org/net/wf-fd#detectedByAlgorithm

Property used to link a wffd:DetectedResultWorkflowFragment with the algorithm used to detect it. The name of the algorithm will be a Literal.

has domain
wffd:DetectedResultWorkflowFragment c
has range
xsd:string

5. References

6. Acknowledgements

The authors would like to thank Silvio Peroni for developing LODE, a Live OWL Documentation Environment, used for representing the Cross Reference section of this document.

Changes from last version