Rapid IoT Prototyping with Google Cloud Platform, RuuviTags and InfluxDB Cloud 2.0

20 August 2019, 11:43

Many of our ML projects are based on IoT data. Having an easy way to set up and iterate on our IoT system helps tremendously in delivering value faster.

This blog post explains a quick and easy way to set up such an IoT system. Specifically, we want to check out the new version of InfluxDB, which removes the need to manage infrastructure. InfluxDB 2.0 will be available as a managed service on Google Cloud Platform later this year.

In about an hour, I managed to build a realtime dashboard using RuuviTags and InfluxDB Cloud 2.0.

. . .

Hardware

RuuviTags are low-power Bluetooth beacons that broadcast temperature, humidity, pressure and accelerometer data. The hardware and design are fully open-source.

Prototype IoT architecture

I’m using the open-source Ruuvi Station application on an old Android phone to publish the data from three RuuviTags to a Google Cloud Function.

In the Cloud Function, I check whether the data comes from my Ruuvi Station, convert the date-time to epoch time in UTC and transform the JSON data into InfluxDB line protocol. I use the write REST endpoint to write multiple lines to InfluxDB Cloud 2.0 in one request.

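import requests
from datetime import datetime

# `url` and `token` are not shown in this post and are assumed to be defined at
# module level: `url` is the InfluxDB Cloud 2.0 write endpoint and `token` is the
# dict of request headers that carries the API token. The timestamps below are
# epoch seconds, so `url` is assumed to include precision=s (the write API
# defaults to nanoseconds).

TS_FORMAT = '%Y-%m-%dT%H:%M:%S%z'

def writetoinfluxcloud(ruuvidata):
    ruuvistationts = ruuvidata['time']
    # ISO timestamp with UTC offset (CET) to epoch seconds (UTC)
    ts = int(datetime.strptime(ruuvistationts, TS_FORMAT).timestamp())

    batteryLevel = ruuvidata['batteryLevel']
    deviceId = ruuvidata['deviceId']
    eventId = ruuvidata['eventId']
    locationaccuracy = ruuvidata['location']['accuracy']
    lat = ruuvidata['location']['latitude']
    lon = ruuvidata['location']['longitude']

    # First line: one point for the phone running Ruuvi Station
    lines = [f'RuuviStation,deviceId={deviceId},eventId={eventId},lat={lat},lon={lon} batteryLevel={batteryLevel},locationaccuracy={locationaccuracy} {ts}']

    # One additional line per RuuviTag, each with its own timestamp
    for tag in ruuvidata['tags']:
        accelX = tag['accelX']
        accelY = tag['accelY']
        accelZ = tag['accelZ']
        dataFormat = tag['dataFormat']
        humidity = tag['humidity']
        tag_id = tag['id']  # renamed from `id` to avoid shadowing the builtin
        # Spaces in tag values must be escaped in line protocol
        name = tag['name'].replace(" ", "\\ ")
        pressure = tag['pressure']
        rssi = tag['rssi']
        temperature = tag['temperature']
        updateAt = tag['updateAt']
        voltage = tag['voltage']
        ts = int(datetime.strptime(updateAt, TS_FORMAT).timestamp())
        lines.append(f'RuuviTag,id={tag_id},name={name},deviceId={deviceId} voltage={voltage},temperature={temperature},rssi={rssi},pressure={pressure},humidity={humidity},dataFormat={dataFormat},accelX={accelX},accelY={accelY},accelZ={accelZ} {ts}')

    # Write all lines in a single request and fail loudly on an HTTP error
    r = requests.post(url, data="\n".join(lines), headers=token)
    r.raise_for_status()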
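
The string formatting in the function above can be factored into a few small helpers. The following is only a sketch under my own assumptions: the helper names `escape_tag`, `to_line` and `iso_to_epoch` are hypothetical and not part of the original Cloud Function. It also escapes commas and equals signs in tag values, which line protocol treats as special characters in addition to spaces.

```python
from datetime import datetime, timezone


def escape_tag(value):
    """Escape the characters that are special in line protocol tag values."""
    text = str(value)
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, fields, ts):
    """Build one line: measurement,tag=... field=... timestamp."""
    tag_part = ",".join(f"{k}={escape_tag(v)}" for k, v in tags.items())
    # Numeric field values only; string fields would need double quotes.
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {ts}"


def iso_to_epoch(stamp):
    """Convert an ISO timestamp with UTC offset to epoch seconds."""
    return int(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z").timestamp())


line = to_line(
    "RuuviTag",
    {"id": "AA:BB", "name": "Living Room"},
    {"temperature": 21.5},
    iso_to_epoch("2019-08-20T11:43:00+0200"),
)
print(line)
```

Keeping the escaping in one place makes it harder to forget it for a tag value that happens to contain a comma or an equals sign.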

Because data retention in the free tier is 72 hours, I also save the JSON data in Google Cloud Storage for long-term analysis, so I can access it in BigQuery as an external table. I could create a dashboard with a long-term view in Google Data Studio or do a bit more ETL to push the data to a properly partitioned, nested and clustered BigQuery table.
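
BigQuery external tables over JSON expect newline-delimited JSON, so each message should be stored as a single line. The sketch below shows one possible way to prepare an object for upload. The `ruuvi/dt=.../` object layout and the function names are my assumptions for illustration, not the layout used in this project, and the actual upload call is left out.

```python
import json
from datetime import datetime, timezone


def object_name(device_id, epoch_seconds):
    """Hypothetical object path, grouped per UTC day."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"ruuvi/dt={day:%Y-%m-%d}/{device_id}-{epoch_seconds}.json"


def to_ndjson(payload):
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(payload, separators=(",", ":")) + "\n"


print(object_name("phone1", 1566294180))
print(to_ndjson({"deviceId": "phone1", "batteryLevel": 87}), end="")
```

Grouping objects per day keeps the option open to partition the external table by date later on.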

InfluxDB 2.0 — first impressions

I’m impressed with InfluxDB Cloud 2.0:
- It’s easy to register and get started
- The UI is consistent
- The data explorer is great for quickly browsing and filtering time-series data
- Flux is an advantage because you only have to learn a single language to write queries, more advanced transformations/calculations and alerts
- The pay-as-you-go pricing is affordable, so it’s great for quick prototypes or for companies that want a fully managed service

The documentation is still a bit rough around the edges, and more Telegraf connections, for example to Google PubSub, need to be available.
For more complex data visualisation I still prefer Grafana, but that’s covered by an InfluxDB data source with Flux support.
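
To give an idea of what a Flux query looks like, here is a small Python helper that builds one as a string. It is a sketch under assumptions: the bucket name `ruuvi` is hypothetical, while the `RuuviTag` measurement and `temperature` field match the lines written by the Cloud Function above. The resulting text could be pasted into the data explorer or used in a Grafana panel.

```python
def mean_temperature_query(bucket, window="5m", lookback="-1h"):
    """Build a Flux query that averages temperature per time window."""
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f"  |> range(start: {lookback})",
        '  |> filter(fn: (r) => r._measurement == "RuuviTag" and r._field == "temperature")',
        f"  |> aggregateWindow(every: {window}, fn: mean)",
    ])


print(mean_temperature_query("ruuvi"))
```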

Google Cloud Functions are a fast, affordable and scalable way to integrate with all kinds of services without having to think about infrastructure. With a few lines of Python, Go or Node.js I could extend the function to write data to, for example, openweather.org or an IoT platform.

With a low number of RuuviTags, a lot of projects will fall into the free tier on Google Cloud Platform.

Production architecture

In production, the IoT architecture has to be more secure and robust. Depending on the requirements and budget, the Ruuvi Station application has to be replaced by a dedicated gateway.
If the RuuviTags might end up in a lot of different environments, it’s recommended to secure and monitor the solution using Google IoT Core.

Using the Google IoT Core MQTT broker, with telemetry and device state messages in Avro or Protocol Buffers instead of JSON, will reduce the load on the network.
In well-secured networks and for low-scale use cases, I would publish the data from the gateway directly to Google PubSub. Google IoT Core publishes all device telemetry and states on Google PubSub topics, so it’s very easy, flexible and cost-efficient to use multiple subscribers to process the data.

To write to InfluxDB, it’s recommended to transform the data to line protocol and use another Google PubSub topic to push the data to Telegraf.
Alerts or more complex calculations can be done in different components of the solution, for example in Google Cloud Functions, in Google Dataflow, within InfluxDB using Flux, or in Grafana.
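
The transformation step mentioned above could be sketched as a Pub/Sub-triggered Cloud Function. This is an illustration under assumptions, not production code: it assumes the message body is JSON with the same `tags` structure that Ruuvi Station sends, it only maps two fields, and the function name is hypothetical. Publishing the result to the second topic is left out.

```python
import base64
import json
from datetime import datetime


def pubsub_to_line_protocol(event, context=None):
    """Decode one Pub/Sub message and return line protocol as bytes."""
    # Pub/Sub delivers the message body base64-encoded in event["data"].
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    lines = []
    for tag in payload["tags"]:
        ts = int(
            datetime.strptime(tag["updateAt"], "%Y-%m-%dT%H:%M:%S%z").timestamp()
        )
        lines.append(
            f'RuuviTag,id={tag["id"]} '
            f'temperature={tag["temperature"]},humidity={tag["humidity"]} {ts}'
        )
    return "\n".join(lines).encode("utf-8")


# Local usage example with a hand-built event
message = {
    "tags": [
        {
            "id": "AA:BB",
            "temperature": 21.5,
            "humidity": 40.0,
            "updateAt": "2019-08-20T11:43:00+0200",
        }
    ]
}
event = {"data": base64.b64encode(json.dumps(message).encode("utf-8"))}
print(pubsub_to_line_protocol(event))
```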

This offers enough flexibility to use the best tool for the job or to run certain parts of the solution on the edge.
For very large-scale IoT projects, Google Bigtable is a great alternative for storing the time series.

Google Dataflow (Apache Beam) is a great fit for any IoT project due to its advanced windowing functionality and auto-scaling. More advanced calculations are also possible within InfluxDB using Flux.
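
To illustrate what a fixed (tumbling) window does, here is a plain-Python stand-in. It is not Apache Beam code and it ignores late or out-of-order data, which is exactly the part Dataflow handles for you. The function name and the `(epoch_seconds, value)` input shape are assumptions for this example.

```python
from collections import defaultdict


def fixed_window_mean(readings, window_seconds=60):
    """Average (epoch_seconds, value) readings per fixed time window."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Every reading belongs to exactly one window, keyed by its start time.
        window_start = ts - ts % window_seconds
        buckets[window_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


readings = [(0, 20.0), (30, 22.0), (60, 24.0)]
print(fixed_window_mean(readings))
```

The first two readings fall into the window starting at 0 and the third into the window starting at 60.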

ML models, for example for forecasting or anomaly detection, can be called from Google Cloud Functions and/or Google Dataflow.
ML results such as time series or individual events can be written to Google Cloud Storage, Google BigQuery or InfluxDB for realtime data visualisation.

I’m looking forward to InfluxDB becoming available on Google Cloud Platform. Since it can also run on-premises, it’s a versatile and flexible tool for a lot of IoT projects.

Get in touch if you want more info about IoT and ML on Google Cloud Platform. If enough people are interested, I can clean up the code for the Cloud Function and publish it on GitHub.