Valo is a distributed computation engine for streams of data. It combines analytical functions and machine learning algorithms with a unified approach to real-time and historical queries, backed by heterogeneous storage engines and an advanced execution engine, all within a single easy-to-deploy platform.
Just click the button to download a Valo node and get started. All the APIs are ready to use out of the box, from creating streams of data to running queries.
The only prerequisite is the Oracle JRE 8. Valo runs on Linux, Windows and OS X.
You bet. Queries are handled by a new generation of execution engine which automatically runs the same processing either in memory on real-time data streams or against the built-in storage engines. For historical queries this means excellent locality: the analysis runs against the data on disk and takes advantage of any indices present.
No complex interaction between multiple systems or nasty surprises after streaming terabytes of data across the network.
In most cases, moving from real-time to historical is as simple as adding “from historical” to the start of the query.
Data ingestion is simple: just an HTTP POST. We support JSON, CSV, YAML and CBOR documents. To perform historical queries, you need to decide which repository to use. We currently have two:
- Semi-Structured Repo - Indexes document-style data using Lucene
- Time-series Repo - Column-oriented store
Selecting the repository that most closely matches the structure of your data allows Valo to execute your queries as optimally as possible. See Repositories for more information.
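As a rough illustration of the HTTP POST ingestion mentioned above, here is a minimal Python sketch that builds a request carrying a single JSON document. The host, port, stream path and document fields are all placeholders assumed for the example, not documented Valo endpoints; check the API documentation for the real URL layout. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical values: host, port and stream path are placeholders,
# not documented Valo endpoints.
BASE_URL = "http://localhost:8888"
STREAM_PATH = "/streams/demo/sensors/temperature"


def build_ingest_request(document, base_url=BASE_URL, stream_path=STREAM_PATH):
    """Build (but do not send) an HTTP POST carrying one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=base_url + stream_path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The document fields here are made up for illustration.
request = build_ingest_request({"sensor": "t-1", "celsius": 21.5})

# To send it to a running node, something like:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The same pattern should carry over to the other supported formats by changing the body encoding and the `Content-Type` header, assuming the node selects a parser based on that header.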
The SDK to add custom functions will be available in the second half of 2016. Once added, custom functions are treated as “native”, optimally distributed and executed just like the built-ins.
Valo is accessible through an HTTP REST API, which can be called from any language and from a wealth of command line tools.
We’re putting the finishing touches to a Scala client, with Java and .NET clients coming this year. We also plan to integrate with Python soon.
Get in touch at firstname.lastname@example.org and let us know what languages you’d like to see supported.
Please get in touch at email@example.com. A single node can handle a lot of data, but for full scalability and resiliency you’ll need to run a cluster of nodes. We’re looking to run trial deployments in the first half of 2016 to keep risk to a minimum.
See cluster architecture for more information.
Valo is an eventually consistent (AP) system that is highly available for writes. There is no single point of failure and stream data is replicated on multiple nodes.
It is elastically scalable, allowing the addition of new nodes to handle increased demand and storage.
Clustering is undergoing extensive testing right now to check these properties hold up against production loads. Please get in touch with us for more information.