Streaming data via OPeNDAP

Last month I finally got myself a Raspberry Pi when I attended PyCon here in Santa Clara. This weekend I decided to play with it and with a concept that I've been working on for the new release of Pydap (available at my repo): streaming real-time data via OPeNDAP. While the Pydap server has always been capable of streaming infinite datasets, as far as I know there were no clients that could process the data stream in real time. This is a feature that I've wanted to implement for a long time, but it required a lot of refactoring, involving changes in both the HTTP library and the XDR unpacking. The current repo has a simpler implementation of the client, and I finally managed to get it working.

To put it to use I created a special Pydap server on my Raspberry Pi, streaming data from a temperature sensor. The sensor was connected to the Raspberry Pi following this tutorial, and measurements are read using the RPi.GPIO library. I then created a Pydap dataset with a Sequence variable, normally used to represent sequential data from a database or a CSV file. The major difference here is that the Sequence data is created on the fly by reading from the sensor; that is, the dataset is materialized when requested, instead of being stored in memory or on disk. The server will stream the data as fast as the client can consume it, since the whole process is based on Python generators.
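The point about generators can be seen in isolation. The sketch below is not Pydap code: `fake_sensor` and the `produced` list are made-up names used only to show that a generator computes a reading when the consumer asks for one, and not before.

```python
import itertools
import time

produced = []  # bookkeeping only, to record which readings were generated


def fake_sensor():
    """Hypothetical stand-in for a sensor: an endless stream of readings."""
    n = 0
    while True:
        produced.append(n)
        yield time.time(), 20.0 + 0.1 * n
        n += 1


stream = fake_sensor()  # nothing has been read yet
first_three = list(itertools.islice(stream, 3))  # pulls exactly three readings
```

A slow client simply pulls more slowly, so the producer never runs ahead of it. This is presumably the same mechanism that lets the server pace itself to the client.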

The code itself is pretty simple. All we need to do is create the SensorData class, which defines a generator yielding tuples with the values. The get_temperature() function can be derived from the tutorial.

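import time

from pydap.model import DatasetType, SequenceType, BaseType
from pydap.handlers.lib import IterData, BaseHandler

# get_temperature() is not defined here; it comes from the sensor tutorial
# and must return a (temperature, voltage) pair.


class SensorData(IterData):
    """Sensor data as a structured array-like object."""

    def gen(self):
        # Endless generator: one (time, voltage, temperature) tuple per reading.
        while True:
            timestamp = time.time()
            temperature, voltage = get_temperature()
            yield timestamp, voltage, temperature
            time.sleep(0.1)  # pause 0.1 s between readings


# Describe the dataset: a single Sequence with three variables.
dataset = DatasetType('roberrypi')
seq = dataset['sensor'] = SequenceType('sensor')
seq['time'] = BaseType('time', units='seconds since 1970-01-01')
seq['voltage'] = BaseType('voltage', units='mV')
seq['temperature'] = BaseType('temperature', units='deg C')
seq.data = SensorData('sensor', seq.keys())

if __name__ == '__main__':
    from werkzeug.serving import run_simple

    app = BaseHandler(dataset)
    # Listen on all interfaces, port 8080, handling each request in its own thread.
    run_simple('0.0.0.0', 8080, app, use_reloader=True, threaded=True)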
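To try the script without a sensor attached, a simulated get_temperature() can stand in for the real one. This is only a sketch. The voltage range is invented, and the linear conversion (500 mV offset, 10 mV per degree) is an assumption borrowed from common analog temperature sensors, not something taken from the tutorial. The return order (temperature, then voltage) matches how the function is unpacked in the script.

```python
import random


def get_temperature():
    """Simulated reading: returns (temperature in deg C, voltage in mV)."""
    # Assumption: the sensor output hovers between 700 and 760 mV.
    voltage = random.uniform(700.0, 760.0)
    # Assumption: linear sensor with a 500 mV offset and 10 mV per deg C.
    temperature = (voltage - 500.0) / 10.0
    return temperature, voltage
```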

Running the script will create a server on http://localhost:8080/, and you can check the typical OPeNDAP responses at http://localhost:8080/.{das,dds,dods}. If you want to look at the data itself you can check the ASCII response at http://localhost:8080/.asc, or use the development version of Pydap to access it like any other dataset.
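The ASCII response can also be consumed as a stream from plain Python, without Pydap on the client side. The sketch below assumes that each data row arrives as one line of three comma-separated numbers and that everything else (dataset description, separator, column names) can be skipped. The helper names `parse_rows` and `stream_rows` are made up for this example, and the exact layout of the .asc output should be checked against a real response.

```python
import urllib.request


def parse_rows(lines, width=3):
    """Yield tuples of floats from lines holding `width` comma-separated numbers."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("ascii", errors="replace")
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != width:
            continue  # dataset description, separator or blank line
        try:
            row = tuple(float(field) for field in fields)
        except ValueError:
            continue  # e.g. the line with the column names
        yield row


def stream_rows(url):
    """Yield rows as they arrive; the response object is read line by line."""
    with urllib.request.urlopen(url) as response:
        yield from parse_rows(response)
```

Looping over stream_rows('http://localhost:8080/.asc') should then produce one tuple per reading for as long as the server keeps sending, so the loop has to be interrupted by hand.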

I also created a simple Flask application that reads the data stream from the ASCII response and plots it on a real-time graph using Smoothie Charts. You can see it working here: http://69.181.252.12:5000/ (the data takes a while to load because the browser accumulates a certain number of bytes before generating events).

Update: I'm running a server at http://vps.dealmeida.net:5000/ (DDS|DAS) serving the CPU load and the number of bytes transferred. There's a static page on a different server showing the data at http://dealmeida.net/opendap-streaming/. The plot reads the data from the binary (dods) response using CORS-enabled XHR and parses it using some black magic. I'll post about this later.
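For readers curious about what parsing the binary response involves, here is a rough Python sketch. It is not the JavaScript used on the page. It assumes the usual DAP2 layout for a Sequence: after the `Data:` separator, every record is preceded by a 4-byte start-of-instance marker and the stream ends with a 4-byte end-of-sequence marker. It further assumes that each record holds exactly three Float64 values, encoded as big-endian doubles; the actual types depend on how the server declares the variables. The function name `decode_records` is made up for this example.

```python
import struct

START_OF_INSTANCE = b"\x5a\x00\x00\x00"
END_OF_SEQUENCE = b"\xa5\x00\x00\x00"
RECORD = struct.Struct(">3d")  # assumption: three Float64 fields per record


def decode_records(payload):
    """Yield one tuple of three floats per record found in `payload`."""
    offset = 0
    while offset + 4 <= len(payload):
        marker = payload[offset:offset + 4]
        offset += 4
        if marker == END_OF_SEQUENCE:
            return
        if marker != START_OF_INSTANCE:
            raise ValueError("unexpected marker: %r" % (marker,))
        if offset + RECORD.size > len(payload):
            return  # incomplete record; a streaming reader would wait for more bytes
        yield RECORD.unpack_from(payload, offset)
        offset += RECORD.size
```

A real-time client would keep a buffer, append each chunk as it arrives, and decode only the complete records, which is why an infinite Sequence never needs to reach its end-of-sequence marker to be useful.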

Photo by Vanessa Schott