Tutorial

In this tutorial, we go through the steps of installing λ-blocks, writing a computation graph, and proceed to execute it.

Installation

Dependencies

If you’re using Debian, Ubuntu, or a system of this family, the required dependencies should all be available in your package manager:

sudo apt install python3 python3-venv libyaml-dev

If you’re not using Debian or a Debian-based system, be sure to install Python 3 and the development headers of libyaml, this is necessary for pip to compile PyYAML.

Finally, if you want to use the Spark blocks, you will need Spark and pyspark to be installed on your system (but this is not required for this tutorial).

λ-blocks

While λ-blocks is still in its early days of development, it is not available through pip, nor in any distribution package manager. Therefore, the best is to install it in a virtual environment this way:

git clone https://github.com/lambdablocks/lambdablocks.git
cd lambdablocks
pyvenv VENV
source VENV/bin/activate
python3 setup.py install

Also not required for this tutorial, these dependencies are needed for some blocks in the included library:

pip install matplotlib requests-oauthlib

Verification

Don’t leave your activated virtual environment, and try executing:

blocks.py --help

If you get the help page of this executable, all is set!

Writing a computation graph

Now that everything is installed, let’s dive into writing a first λ-blocks program.

Such a program, also called a computation graph, is written in YAML, a simple data representation format. Create a new file and name it wordcount.yml: it will contain the description of a computation graph to perform a Wordcount. Add this content:

---
name: wordcount
description: Counts words
modules: [lb.blocks.unixlike]
---
- block: cat
  name: cat
  args:
    filename: examples/wordlist

This YAML file contains two parts: the first one is a key/value list giving information on the computation graph (such as its name, description, and used modules). The second part is more interesting: it contains the list of the code blocks that are the vertices of our graph. For now, there is only one vertice: it uses the block lb.blocks.unixlike.cat(). It has a unique name cat (since we use only once the block cat in this program, the vertice name can be the same as the block name), and one argument, a path to a file. As you may have guessed, this block acts like the Unix cat utility: it reads a file.

This program won’t do much, except for reading a file. You can try to execute it this way:

blocks.py -f wordcount.yml

If nothing happens, it is normal: the file has been read by λ-blocks, but it isn’t supposed to be displayed on the console. If you get an error, the path you provided may be incorrect: be sure to execute the command within in the lambdablocks folder, or to change the filename argument.

Let’s add a few vertices in our graph, and link them together to compute a Wordcount implementation:

---
name: wordcount
description: counts words
modules: [lb.blocks.unixlike]
---
- block: cat
  name: cat
  args:
    filename: examples/wordlist

- block: group_by_count
  name: group
  inputs:
    data: cat.result

- block: sort
  name: sort
  args:
    key: "lambda x: x[1]"
    reverse: true
  inputs:
    data: group.result

- block: show_console
  name: show
  inputs:
   data: sort.result

We now have 4 blocks (or vertices):

  • cat reads a file and outputs a list of lines found in this file;
  • group_by_count reads a list, and outputs a list of unique items, along with the number of times they appear in the list;
  • sort reads a list, and outputs a sorted list, sorted by the second item of each element;
  • show_console displays its inputs on the user console.

A block has named inputs and named outputs. To link two blocks together, we specify the inputs of a block in the inputs key. For example, the block group_by_count takes one input, data, that is the output result of the block cat.

Let’s try to execute this graph:

blocks.py -f wordcount.yml

That’s it! You should get a list of fruits, along with their number of occurences.

Using plugins

λ-blocks, while processing a computation graph, can execute plugins, which are pieces of Python code able to act on the graph. For example, let’s try the included lb.plugins.debug plugin:

blocks.py -f wordcount.yml -p lb.plugins.debug

This plugin will display an excerpt of the results produced by each block, which allows you to effectively see what every block is doing. This is useful to follow the data as it is transformed from the entry of the graph to all the following vertices.

You can also try to execute the lb.plugins.instrumentation plugin the same way, which will measure the time taken by every block to compute, useful to detect bottlenecks:

blocks.py -f wordcount.yml -p lb.plugins.debug lb.plugins.instrumentation

Unsurprisingly, the cat block should be the slowest, because it requires to read a file on disk.

Next steps

Now that we’ve seen some possibilities of λ-blocks and how it works, you can look at some examples, check the list of available blocks, the list of available plugins, write your own blocks or write your own plugins.