Output streams and groupings

Single output streams

You can define a component with a single output stream just assigning a list or a tuple to the class attribute OUTPUT_FIELDS.

class CountingBolt(Bolt):

    OUTPUT_FIELDS = ['word', 'counter']

Pyleus also accepts namedtuple for output fields schema declaration:

Record = namedtuple('Record', 'year month day value')

class SampleBolt(Bolt):

   OUTPUT_FIELDS = Record

This allows you to import your schema in the downstream component and to build code that is more resilient to schema changes (at the price of building nametuples from incoming tuple values). You can find some examples of this technique in the examples folder.

When you define a single output stream, all emitted tuples will go to the default stream and you are then allowed to use the short definition syntax for groupings:

- bolt:
    name: my-first-bolt
    module: my_topology.my_first_bolt
    groupings:
        - shuffle_grouping: my-spout

- bolt:
    name: my-second-bolt
    module: my_topology.my_second_bolt
    groupings:
        - fields_grouping:
            component: my-first-bolt
            fields:
                - year
                - month

Note

Please note that, while spouts must define at least an output stream, bolts do not have to.

Multiple output streams

Storm allows components to define an arbitrary numbers of output streams. For doing that in Pyleus, you need to define your output streams in a dict as the following:

class MultipleBolt(Bolt):

    OUTPUT_FIELDS = {
        "stream-id": ["id", "value"],
        "stream-date": ["year", "month", "day", "value"],
    }

As a consequence you need to use the complete definition syntax for groupings in the topology YAML file:

- bolt:
    name: my-first-bolt
    module: my_topology.my_first_bolt
    groupings:
        - shuffle_grouping:
            component: my-first-spout
            stream: stream-A

- bolt:
    name: my-second-bolt
    module: my_topology.my_second_bolt
    groupings:
        - fields_grouping:
            component: my-first-bolt
            stream: stream-date
            fields:
                - year
                - month

See also

See GitHub for an example topology declaring multiple output streams.

Available stream groupings

  • Shuffle grouping:

    - shuffle_grouping:
        component: a-component
        stream: a-stream
    
  • Local or shuffle grouping:

    - local_or_shuffle_grouping:
        component: a-component
        stream: a-stream
    
  • Global grouping:

    - global_grouping:
        component: a-component
        stream: a-stream
    
  • All grouping:

    - all_grouping:
        component: a-component
        stream: a-stream
    
  • None grouping:

    - none_grouping:
        component: a-component
        stream: a-stream
    
  • Fields grouping:

    - fields_grouping:
        component: a-component
        stream: a-stream
        fields:
            - a-field
            - another-field
    

Danger

Storm direct grouping is not yet supported.

See also

For a complete reference of Storm groupings see Apache Storm Documentation.