Tags

What are Tags?

Tags are ordered sets of symbols associated with a field in a payload.

In JSON, these appear as arrays of strings:

{
        "msg": "This is a regular string field",
        "tag1": ["A", "B", "C", "D", "E", "F"],
        "tag2": ["ERROR", "INFO"]
}

Tags are defined in the schema using the ‘tags’ type. Using the Streams API, the schema for the payload above would look like:

{
        "schema" : {
                "topDef" : {
                        "type" : "record",
                        "properties" : {
                                "msg"  : { "type" : "string" },
                                "tag1" : { "type" : "tags" },
                                "tag2" : { "type" : "tags" }
                        }
                }
        }
}

Note

Tags are always optional!

Tagging

Valo supports adding tags to a payload after it has been posted to the stream. This is called ‘Tagging’. One or more tag functions can be defined using the Streams API. These functions are invoked by Valo after a payload is posted to the stream, but before it is sent to the execution engine or repositories.

Defining Tagging functions

Tagging functions can be created or updated for a stream using the /streams/:tenant/:collection/tags endpoint in the Streams API.

Example tag functions:

{
        "tags": [
          {
                "field" : "level",
                "lang" :  "javascript",
                "enabled" : true,
                "script" : "function tag(payload) { if (payload.msg.indexOf('broken') !== -1) return ['error']; return []; }"
          },
          {
                "field" : "orderNumbers",
                "sourceFields" : ["msg"],
                "lang" :  "regex",
                "enabled" : false,
                "script" : "[A-Z]{3}\\d{4}"
          }
        ]
}

Each function is invoked against every payload on the stream, and the results of all functions targeting the same field are aggregated into a sorted set of values. If the tag field already contains values, these are merged with the tagging function output.

If an error occurs in the tagging function, the result of that function will be ignored (the payload will remain unchanged).
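The merge and error-handling behaviour described above can be pictured with a small local sketch. This is not Valo's implementation: `applyTagFunctions` and the in-memory shape `{ field, enabled, fn }` are hypothetical names used only for illustration, and the sketch assumes plain string ordering for the sort and that each function sees the payload as it was posted.

```javascript
// Hypothetical local model of how tagging results are combined.
// Not Valo code: the names and data shapes here are assumptions.
function applyTagFunctions(payload, tagFunctions) {
  var result = Object.assign({}, payload);   // shallow copy; the input is not modified
  tagFunctions.forEach(function (def) {
    if (!def.enabled) return;                // disabled functions are never invoked
    var produced;
    try {
      produced = def.fn(payload);
    } catch (err) {
      return;                                // a failing function is ignored
    }
    if (!Array.isArray(produced)) return;    // assumption: a non-array result adds nothing
    var merged = (result[def.field] || []).concat(produced);
    if (merged.length === 0) return;         // do not create an empty tag field
    // Remove duplicates, then sort (assumed: default string ordering).
    result[def.field] = merged.filter(function (tag, i) {
      return merged.indexOf(tag) === i;
    }).sort();
  });
  return result;
}

var tagged = applyTagFunctions(
  { msg: "disk broken", level: ["warn"] },
  [
    { field: "level", enabled: true,
      fn: function (p) { return p.msg.indexOf("broken") !== -1 ? ["error"] : []; } },
    { field: "level", enabled: true,
      fn: function () { throw new Error("boom"); } },
    { field: "level", enabled: false,
      fn: function () { return ["never"]; } }
  ]
);
console.log(tagged.level); // [ 'error', 'warn' ]
```

In this sketch the existing value "warn" survives, the throwing function contributes nothing, and the disabled function is skipped.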

Tag function fields

Field         Description
field         The target field that will be modified by the tagging function. This field must be of type ‘tags’.
lang          The script language. Currently supported: ‘javascript’ and ‘regex’.
script        The actual source script.
sourceFields  The input source fields for the script (currently only used by ‘regex’).
enabled       If false, the tagging function will not be invoked.

JavaScript Tagging

You can define a tagging function using JavaScript:

function tag(payload) {
        // indexOf returns -1 when 'broken' does not occur in msg
        if (payload.msg.indexOf('broken') !== -1) return ['error'];
        return [];
}

Please note the following restrictions. The function:

  • Must be called ‘tag’.
  • Can optionally take a single argument, which will be the target payload. The script can access the payload's fields as input for the function (see below).
  • Must return an array of strings. Duplicates will be removed and the result will be sorted, so order is irrelevant.

The JavaScript function can access the payload in an idiomatic way. Given a payload like so:

{
        "msg": "This is a regular string field",
        "letters" : ["A", "B", "C", "D", "E", "F"],
        "person" : {
                        "name" : "Fred",
                        "age" : 42
        }
}

The function can access the fields like so:

function tag(payload) {
        // Split the message field by space, then take the 4th element ("regular")
        var msg = payload.msg.split(" ")[3];
        var second = payload.letters[1]; // second is "B"
        var nameAndAge = payload.person.name + " is " + payload.person.age; // nameAndAge is "Fred is 42"
        // Return some random stuff...
        return [msg, second, nameAndAge];
}

Warning

Try to make your tagging functions short and sweet! The tagging function is invoked for every payload, so a slow function will directly affect performance.

Regex Tagging

If all you need to do is extract one or more values from a string, regex tagging is a reasonable alternative to a script.

The regex specified must conform to the Java regex Pattern syntax (java.util.regex.Pattern). All matching values will be used.
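To get a feel for what a regex tag function produces, the following sketch emulates it locally. `regexTags` is a hypothetical helper, not part of Valo, and it uses the JavaScript regex engine rather than Java's; the two agree for simple patterns such as the one used here, but not for every construct. It assumes that each whole match in each source field becomes one tag.

```javascript
// Hypothetical local emulation of a 'regex' tag function.
// Assumption: every whole match in each listed source field becomes a tag.
function regexTags(payload, sourceFields, pattern) {
  var regex = new RegExp(pattern, "g");      // "g" so that all matches are returned
  var tags = [];
  sourceFields.forEach(function (name) {
    var value = payload[name];
    if (typeof value !== "string") return;   // skip missing or non-string fields
    var matches = value.match(regex) || [];  // match() returns null when nothing matches
    matches.forEach(function (m) {
      if (tags.indexOf(m) === -1) tags.push(m);
    });
  });
  return tags.sort();
}

var orders = regexTags(
  { msg: "Something has broken in orders ABC1234 and XYZ6789" },
  ["msg"],
  "[A-Z]{3}\\d{4}"
);
console.log(orders); // [ 'ABC1234', 'XYZ6789' ]
```

Note that the backslash is doubled both here and in the JSON definition, because in each case the pattern is written inside a string literal.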

Querying tags

There are a few functions that operate on the tags type. Given the schema on a stream /streams/demo/infrastructure/logs:

{
        "schema" : {
                "topDef" : {
                        "type" : "record",
                        "properties" : {
                                "msg"  : { "type" : "string" },
                                "level" : { "type": "tags" },
                                "orders" : { "type": "tags" }
                        }
                }
        }
}

To stream all documents where the tags field level has some tags (not empty) use the isTagged function:

FROM /streams/demo/infrastructure/logs where isTagged(level)

Conversely, to stream all documents where the tags field orders has no tags (empty), use the notTagged function:

FROM /streams/demo/infrastructure/logs where notTagged(orders)

To stream all documents that contain the order ABC123 use the hasTag function:

FROM /streams/demo/infrastructure/logs where hasTag(orders, 'ABC123')

hasTag can be combined with and/or as you might expect:

FROM /streams/demo/infrastructure/logs where hasTag(orders, 'ABC123') or hasTag(orders, 'XYZ456')
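As a way of reasoning about these predicates, the sketch below mirrors them as plain JavaScript functions over in-memory documents. These are hypothetical local equivalents, not the query engine itself, and they assume that a document with no tags field at all counts as not tagged.

```javascript
// Hypothetical local equivalents of the query predicates, for reasoning only.
// Assumption: a missing tags field is treated the same as an empty one.
function isTagged(doc, field) {
  return Array.isArray(doc[field]) && doc[field].length > 0;
}

function notTagged(doc, field) {
  return !isTagged(doc, field);
}

function hasTag(doc, field, tag) {
  return isTagged(doc, field) && doc[field].indexOf(tag) !== -1;
}

var docs = [
  { msg: "a", orders: ["ABC123"] },
  { msg: "b", orders: ["XYZ456", "ZZZ999"] },
  { msg: "c", orders: [] },
  { msg: "d" }
];

// Mirrors: where hasTag(orders, 'ABC123') or hasTag(orders, 'XYZ456')
var selected = docs.filter(function (d) {
  return hasTag(d, "orders", "ABC123") || hasTag(d, "orders", "XYZ456");
});
console.log(selected.map(function (d) { return d.msg; })); // [ 'a', 'b' ]

// Mirrors: where notTagged(orders)
var untagged = docs.filter(function (d) { return notTagged(d, "orders"); });
console.log(untagged.map(function (d) { return d.msg; })); // [ 'c', 'd' ]
```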

Example

Given a schema for stream /streams/demo/example/mystream:

{
        "schema" : {
                "topDef" : {
                        "type" : "record",
                        "properties" : {
                                "msg"  : { "type" : "string" },
                                "level" : { "type": "tags" },
                                "orders" : { "type": "tags" }
                        }
                }
        }
}

We will define 2 tagging functions.

  • The first will be a JavaScript function that sets the ‘level’ field to [“CRITICAL”] if the ‘msg’ field contains the word broken, and to [“INFO”] otherwise.
  • The second will be a regex that extracts order numbers and puts them in the ‘orders’ field.

We define our functions at the /streams/demo/example/mystream/tags endpoint like so:

{
        "tags": [
          {
                "field" : "level",
                "lang" :  "javascript",
                "enabled" : true,
                "script" : "function tag(payload) { if (payload.msg.indexOf('broken') !== -1) return ['CRITICAL']; else return ['INFO']; }"
          },
          {
                "field" : "orders",
                "sourceFields" : ["msg"],
                "lang" :  "regex",
                "enabled" : true,
                "script" : "[A-Z]{3}\\d{4}"
          }
        ]
}

Note

For clarity, the JavaScript function here is defined on one line and uses single quotes for strings. This avoids having to escape characters in the JSON.
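If a tagging function is too long to keep on one line, one option is to write it as ordinary JavaScript and let JSON.stringify produce the escaped "script" value. The sketch below only builds the request body; how it is sent to the tags endpoint is left out. It uses indexOf for the substring test, which is available in any JavaScript engine.

```javascript
// Write the tag function as ordinary, multi-line JavaScript...
function tag(payload) {
  if (payload.msg.indexOf("broken") !== -1) return ["CRITICAL"];
  return ["INFO"];
}

// ...then let JSON.stringify escape the quotes and newlines in its source text.
var body = JSON.stringify({
  tags: [
    {
      field: "level",
      lang: "javascript",
      enabled: true,
      script: tag.toString()          // the function's source as a string
    },
    {
      field: "orders",
      sourceFields: ["msg"],
      lang: "regex",
      enabled: true,
      script: "[A-Z]{3}\\d{4}"        // serialised as [A-Z]{3}\\d{4} in the JSON
    }
  ]
}, null, 2);

console.log(body);
```

Parsing the resulting JSON gives back the original function source and regex unchanged, so double quotes and line breaks in the script no longer need to be escaped by hand.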

If we now post some payloads to /streams/demo/example/mystream:

[
    {
        "msg" : "Regular message, nothing to see"
    },
    {
        "msg" : "Something has broken in orders ABC1234 and XYZ6789"
    },
    {
        "msg" : "Order DEF4567 was successful"
    }
]

The tagging functions would transform these to:

[
    {
        "msg" : "Regular message, nothing to see",
        "level" : ["INFO"]
    },
    {
        "msg" : "Something has broken in orders ABC1234 and XYZ6789",
        "level" : ["CRITICAL"],
        "orders" : ["ABC1234", "XYZ6789"]
    },
    {
        "msg" : "Order DEF4567 was successful" ,
        "level" : ["INFO"],
        "orders" : ["DEF4567"]
    }
]