Name: pywdl
Owner: Broad Institute
Description: Python bindings for WDL
Created: 2015-11-24 14:06:37.0
Updated: 2018-01-15 00:50:36.0
Pushed: 2017-11-09 19:51:35.0
Homepage: null
Size: 127
Language: Python
There is a very low probability that you're in the right place. If you're looking for a Python WDL parser, see the WDL repo.
This repository is deprecated. The intention of this repository was to provide a Python object model on top of parsed WDL. It is out of date, but we're leaving it here in case someone wants an example of how to do such a thing. If you'd like to pick up the torch, please let us know.
NOTE AGAIN: If you're reading below this, you're almost certainly in the wrong place!
A Python implementation of a WDL parser and language bindings.
For Scala language bindings, use WDL4S.
PyWDL works with Python 2 or Python 3. Install via setup.py:
    $ python setup.py install
Or via pip:
    $ pip install wdl
The main wdl package provides an interface to turn WDL source code into native Python objects. This means that a workflow {} block in WDL would become a Workflow object in Python and a task {} block becomes a Task object.
To parse WDL source code into a WdlDocument object, import the wdl package and load a WDL string with wdl.loads("wdl code") or WDL from a file-like object using wdl.load(fp, resource_name).
For example:
    import wdl
    import wdl.values

    wdl_code = """
    task my_task {
      File file
      command {
        ./my_binary --input=${file} > results
      }
      output {
        File results = "results"
      }
    }

    workflow my_wf {
      call my_task
    }
    """

    # Use the language bindings to parse WDL into Python objects
    wdl_namespace = wdl.loads(wdl_code)

    for workflow in wdl_namespace.workflows:
        print('Workflow "{}":'.format(workflow.name))
        for call in workflow.calls():
            print(' Call: {} (task {})'.format(call.name, call.task.name))

    for task in wdl_namespace.tasks:
        name = task.name
        abstract_command = task.command
        def lookup(name):
            if name == 'file': return wdl.values.WdlFile('/path/to/file.txt')
        instantiated_command = task.command.instantiate(lookup)
        print('Task "{}":'.format(name))
        print(' Abstract Command: {}'.format(abstract_command))
        print(' Instantiated Command: {}'.format(instantiated_command))
Using the language bindings as shown above is the recommended way to use PyWDL. One can also directly access the parser to parse WDL source code into an abstract syntax tree using the wdl.parser package:
    import wdl.parser

    wdl_code = """
    task my_task {
      File file
      command {
        ./my_binary --input=${file} > results
      }
      output {
        File results = "results"
      }
    }

    workflow my_wf {
      call my_task
    }
    """

    # Parse source code into abstract syntax tree
    ast = wdl.parser.parse(wdl_code).ast()

    # Print out abstract syntax tree
    print(ast.dumps(indent=2))

    # Access the first task definition, print out its name
    print(ast.attr('definitions')[0].attr('name').source_string)

    # Find all 'Task' ASTs
    task_asts = wdl.find_asts(ast, 'Task')
    for task_ast in task_asts:
        print(task_ast.dumps(indent=2))

    # Find all 'Workflow' ASTs
    workflow_asts = wdl.find_asts(ast, 'Workflow')
    for workflow_ast in workflow_asts:
        print(workflow_ast.dumps(indent=2))
An AST is the output of the parsing algorithm. It is a tree structure in which the root node is always a Document AST.
The best way to get started working with ASTs is to visualize them by using the wdl parse subcommand to see the AST as text. For example, consider the following WDL file, example.wdl:
    task a {
      command {./foo_bin}
    }
    task b {
      command {./bar_bin}
    }
    task c {
      command {./baz_bin}
    }
    workflow w {}
Then, use the command line to parse and output the AST:
    $ wdl parse example.wdl
    (Document:
      imports=[],
      definitions=[
        (Task:
          name=<string:1:6 identifier "YQ==">,
          declarations=[],
          sections=[
            (RawCommand:
              parts=[
                <string:2:12 cmd_part "Li9mb29fYmlu">
              ]
            )
          ]
        ),
        (Task:
          name=<string:4:6 identifier "Yg==">,
          declarations=[],
          sections=[
            (RawCommand:
              parts=[
                <string:5:12 cmd_part "Li9iYXJfYmlu">
              ]
            )
          ]
        ),
        (Task:
          name=<string:7:6 identifier "Yw==">,
          declarations=[],
          sections=[
            (RawCommand:
              parts=[
                <string:8:12 cmd_part "Li9iYXpfYmlu">
              ]
            )
          ]
        ),
        (Workflow:
          name=<string:10:10 identifier "dw==">,
          body=[]
        )
      ]
    )
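The quoted strings in the terminals above are base64-encoded source text, which is the default b64_source=True behaviour that the dumps documentation further down describes. As a minimal sketch, they can be decoded with Python's standard library alone. The helper name decode_terminal_source is hypothetical and is not part of PyWDL:

```python
import base64

def decode_terminal_source(encoded):
    """Decode a base64 source string as printed in an AST dump (hypothetical helper)."""
    return base64.b64decode(encoded).decode('utf-8')

# Values copied from the AST dump of example.wdl
for encoded in ['YQ==', 'Li9mb29fYmlu', 'dw==']:
    print('{} -> {}'.format(encoded, decode_terminal_source(encoded)))
```

The three values decode to the first task name (a), its command (./foo_bin), and the workflow name (w).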
Programmatically, if one wanted to traverse this AST to pull out data:
    import wdl.parser
    import wdl

    with open('example.wdl') as fp:
        ast = wdl.parser.parse(fp.read()).ast()

    # The first three definitions are the tasks a, b and c
    task_a = ast.attr('definitions')[0]
    task_b = ast.attr('definitions')[1]
    task_c = ast.attr('definitions')[2]

    # Find the command section of task a
    for section in task_a.attr('sections'):
        if section.name == 'RawCommand':
            task_a_command = section

    # Command parts are either literal strings (Terminal) or parameters (Ast)
    for part in task_a_command.attr('parts'):
        if isinstance(part, wdl.parser.Terminal):
            print('command string: ' + part.source_string)
        else:
            print('command parameter: ' + part.dumps())
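To experiment with this traversal pattern without PyWDL installed, the sketch below rebuilds the example.wdl tree from small stand-in classes. Everything in it is an assumption made for illustration: the Terminal and Ast classes only imitate the name, attributes, attr() and source_string members used above, and find_by_name is a hypothetical helper in the spirit of wdl.find_asts, not its actual implementation.

```python
class Terminal(object):
    """Stand-in leaf node holding a piece of source text (not PyWDL's class)."""
    def __init__(self, source_string):
        self.source_string = source_string

class Ast(object):
    """Stand-in tree node with a name and a dict of attributes (not PyWDL's class)."""
    def __init__(self, name, attributes):
        self.name = name
        self.attributes = attributes

    def attr(self, name):
        return self.attributes[name]

def find_by_name(node, name):
    """Recursively collect every Ast whose name matches (hypothetical helper)."""
    found = []
    if isinstance(node, Ast):
        if node.name == name:
            found.append(node)
        for child in node.attributes.values():
            found.extend(find_by_name(child, name))
    elif isinstance(node, list):
        for child in node:
            found.extend(find_by_name(child, name))
    return found  # Terminals are leaves and contribute nothing

def make_task(task_name, command):
    """Build a Task node shaped like the ones in the AST dump."""
    raw_command = Ast('RawCommand', {'parts': [Terminal(command)]})
    return Ast('Task', {
        'name': Terminal(task_name),
        'declarations': [],
        'sections': [raw_command],
    })

document = Ast('Document', {
    'imports': [],
    'definitions': [
        make_task('a', './foo_bin'),
        make_task('b', './bar_bin'),
        make_task('c', './baz_bin'),
        Ast('Workflow', {'name': Terminal('w'), 'body': []}),
    ],
})

task_names = [task.attr('name').source_string
              for task in find_by_name(document, 'Task')]
commands = [command.attr('parts')[0].source_string
            for command in find_by_name(document, 'RawCommand')]
print(task_names)
print(commands)
```

A recursive search like this avoids indexing definitions by position, which matters once a document contains more than a handful of tasks.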
The Ast class is a syntax tree with a name and children nodes.
Attributes:
- name is a string that refers to the type of AST (e.g. Workflow, Task, Document, RawCommand).
- attributes is a dictionary where the keys are the names of the attributes and the values can be one of three types: Ast, AstList, Terminal.
Methods:
- def attr(self, name). ast.attr('name') is the same as ast.attributes['name'].
- def dumps(self, indent=None, b64_source=True) - returns a String representation of this Ast. The indent parameter takes an integer for the indent level. Omitting this value will cause there to be no newlines in the resulting string. b64_source will be passed to recursive invocations of dumps.
The wdl.parser.Terminal object represents a literal piece of the original source code. Terminals always show up as leaf nodes on Ast objects.
Attributes:
- source_string - String segment from the source code.
- line - Line number where source_string was in the source code.
- col - Column number where source_string was in the source code.
- resource - Name of the location for the source code. Usually a file system path or perhaps a URI.
- id - Numeric identifier, unique to the top-level Ast. Used mostly internally.
- str - String identifier of this terminal. Used mostly internally.
Methods:
- def dumps(self, b64_source=True, **kwargs) - returns a String representation of this terminal. b64_source means that the source code will be base64 encoded, because sometimes the source contains newlines or special characters that make it difficult to read when a whole AST is string-ified.
class AstList(list) represents a sequence of Ast, AstList, and Terminal objects.
Methods:
- def dumps(self, indent=None, b64_source=True) - returns a String representation of this AstList. The indent parameter takes an integer for the indent level. Omitting this value will cause there to be no newlines in the resulting string. b64_source will be passed to recursive invocations of dumps.
Parsing a WDL file will result in unevaluated expressions. For example:
    workflow test {
      Int a = (1 + 2) * 3
      call my_task {
        input: var=a*2, var2="file"+".txt"
      }
    }
This workflow definition has three expressions in it: (1 + 2) * 3, a*2, and "file"+".txt".
Expressions are stored in wdl.binding.Expression objects. The AST for the expression is stored in this object.
Expressions can be evaluated with the eval() method on the Expression class.
    import wdl

    # Manually parse expression into wdl.binding.Expression
    expression = wdl.parse_expr("(1 + 2) * 3")

    # Evaluate the expression.
    # Returns a WdlValue, specifically a WdlIntegerValue(9)
    evaluated = expression.eval()

    # Get the Python value
    print(evaluated.value)
Sometimes expressions contain references to variables or functions. In order for these to be resolved, one must pass a lookup function and an implementation of the functions that you want to support:
    import wdl
    from wdl.values import WdlInteger, WdlUndefined
    # NOTE: EvalException (raised below) is not imported in this example;
    # import it from the PyWDL module that defines it before running.

    def test_lookup(identifier):
        if identifier == 'var':
            return WdlInteger(4)
        else:
            return WdlUndefined

    def test_functions():
        def add_one(parameters):
            # assume at least one parameter exists, for simplicity
            return WdlInteger(parameters[0].value + 1)
        def get_function(name):
            if name == 'add_one': return add_one
            else: raise EvalException("Function {} not defined".format(name))
        return get_function

    # WdlInteger(12)
    print(wdl.parse_expr("var * 3").eval(test_lookup))

    # WdlInteger(8)
    print(wdl.parse_expr("var + var").eval(test_lookup))

    # WdlInteger(9)
    print(wdl.parse_expr("add_one(var + var)").eval(test_lookup, test_functions()))
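The callback pattern above (one function that resolves variable names, another that resolves function names) can be illustrated without PyWDL. The following is only a sketch under stated assumptions: it uses Python's own ast module and Python expression syntax rather than WDL's grammar, works on plain Python integers instead of WdlValue objects, requires Python 3.8 or later, and the names evaluate, lookup_variable and get_function are hypothetical.

```python
import ast
import operator

# Supported binary operators (a deliberately small subset).
BINARY_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

def evaluate(source, lookup, get_function=None):
    """Evaluate a small arithmetic expression, resolving names via callbacks."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            # Variables are resolved by the caller-supplied lookup function
            return lookup(node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPERATORS:
            return BINARY_OPERATORS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if get_function is None:
                raise ValueError('No functions available')
            # Functions receive their evaluated arguments as a list
            function = get_function(node.func.id)
            return function([visit(argument) for argument in node.args])
        raise ValueError('Unsupported expression: {}'.format(ast.dump(node)))
    return visit(ast.parse(source, mode='eval'))

def lookup_variable(identifier):
    if identifier == 'var':
        return 4
    raise KeyError(identifier)

def get_function(name):
    if name == 'add_one':
        return lambda parameters: parameters[0] + 1
    raise ValueError('Function {} not defined'.format(name))

print(evaluate('var * 3', lookup_variable))                          # 12
print(evaluate('var + var', lookup_variable))                        # 8
print(evaluate('add_one(var + var)', lookup_variable, get_function))  # 9
```

Keeping name resolution in callbacks means the evaluator itself needs no knowledge of where values come from, which is the same separation the eval() method relies on.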
    $ wdl --help
    usage: wdl [-h] [--version] [--debug] [--no-color] {run,parse} ...

    Workflow Description Language (WDL)

    positional arguments:
      {run,parse}  WDL Actions
        run        Run you a WDL
        parse      Parse a WDL file, print parse tree

    optional arguments:
      -h, --help   show this help message and exit
      --version    show program's version number and exit
      --debug      Open the floodgates
      --no-color   Don't colorize output
Parse a WDL file:
    $ wdl parse examples/ex2.wdl
    (Document:
      definitions=[
        (Task:
          name=<ex2.wdl:1:6 identifier "c2NhdHRlcl90YXNr">,
          declarations=[],
          sections=[
            (RawCommand:
A WDL file can be converted to the dot format in order to visualize the pipeline as a graph. For example:
    $ wdl2dot -i hello.wdl -o hello.dot
Then use the interactive renderer xdot, or save to an image:
    $ xdot hello.dot
    $ dot -Tsvg hello.dot -o hello.svg
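For readers unfamiliar with the dot format, the sketch below shows the kind of text that Graphviz tools such as dot consume. It is not how wdl2dot is implemented: the to_dot helper and the call names prepare, analyze and report are hypothetical and chosen only for illustration.

```python
def to_dot(graph_name, edges):
    """Render (upstream, downstream) pairs as a DOT digraph (hypothetical helper)."""
    lines = ['digraph {} {{'.format(graph_name)]
    for upstream, downstream in edges:
        lines.append('  "{}" -> "{}";'.format(upstream, downstream))
    lines.append('}')
    return '\n'.join(lines)

# Two made-up dependencies between three made-up calls
dot_text = to_dot('hello', [('prepare', 'analyze'), ('analyze', 'report')])
print(dot_text)
```

Writing dot_text to a file such as hello.dot would make it usable with the xdot and dot commands shown above.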