DataStore - an abstraction layer for storing data and processing requests

Update 2012/01: I still haven't started working on this project, but I've heard a lot of interest from others. Some have offered to pick up the project. I will update this post once the project finds a home.

Table of contents

  1. Motivation
  2. Idea
  3. From past to present, what now?
  4. What does better mean?
  5. Why should you care?
  6. My scenario
  7. What worked for me
  8. Proposed features
  9. Help!

Motivation

I wrote an application which extensively used Ajax requests to communicate with the server. After a while I wanted to add localStorage. Though most actions were grouped together, there were still more than a handful of Ajax requests that had to be changed. Then I thought: what if I want to add IndexedDB later on...

... you can see where this is going: maintenance hell.

Then I thought: other people must be having similar problems. Hence...

Idea

A library to unify all the different data storage/retrieval/sending/receiving APIs such as XMLHttpRequest, WebSockets, localStorage, IndexedDB, and make it easier to use any number of them at once.

From past to present, what now?

Past: Before, all we had were AJAX requests. Really.

To present: With the new technologies coming up in the HTML5 era, we've got localStorage and IndexedDB, WebSockets, node.js, and more. Hectic.

What now? Don't you wish there was a better way to send and receive data in the browser?

What does better mean?

My general goals for this are:

  1. Simple key/value store common abstraction.
  2. Pluggable handlers for each type of send/receive.
  3. Access to the extra abstractions each handler defines (the library surfaces the handler's own API as well).
  4. Straightforward way to define flow of data. More on this later.
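
To make the first three goals a bit more concrete, here is a minimal sketch of what pluggable handlers behind one key/value interface might look like. Every name in it (Store, MemoryHandler, addHandler, handler) is made up for illustration and is not an existing API; the in-memory handler stands in for localStorage or IndexedDB so the sketch runs anywhere.

```javascript
// Hypothetical sketch: none of these names come from an existing library.

// A handler only has to implement get/set with callbacks.
function MemoryHandler() {
    this.items = {};
}
MemoryHandler.prototype.set = function (key, value, callback) {
    // store a JSON copy, the same way localStorage would hold a string
    this.items[key] = JSON.stringify(value);
    callback(null);
};
MemoryHandler.prototype.get = function (key, callback) {
    var found = Object.prototype.hasOwnProperty.call(this.items, key);
    callback(null, found ? JSON.parse(this.items[key]) : null);
};

// The store routes each call to a named handler.
function Store() {
    this.handlers = {};
}
Store.prototype.addHandler = function (name, handler) {
    this.handlers[name] = handler;
};
Store.prototype.set = function (name, key, value, callback) {
    this.handlers[name].set(key, value, callback);
};
Store.prototype.get = function (name, key, callback) {
    this.handlers[name].get(key, callback);
};
// Goal 3: hand back the handler itself so its own API stays reachable.
Store.prototype.handler = function (name) {
    return this.handlers[name];
};

var store = new Store();
store.addHandler('memory', new MemoryHandler());

var fetched = null;
store.set('memory', 'comment_0', {author: 'Ann', text: 'Nice post'}, function () {
    store.get('memory', 'comment_0', function (err, value) {
        fetched = value;
    });
});
```

The callbacks here fire synchronously, which keeps the example short; a handler for IndexedDB or the server would call them later, and the calling code would not need to change.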

Anything else you wish it could do?

Why should you care?

Short answer: maintenance, scalability, flexibility.

As these technologies become widely supported, you will start seeing a common problem for websites heavily relying on AJAX (or any kind of data transfer without page reloads): how do you take advantage of them without rewriting your entire codebase every time there's a new technology (API/storage engine/etc) coming out?

My scenario

The whole reason I got thinking about this was because it happened to me. And it was frustrating.

I had this client-side application using jQuery.ajax requests, and I wanted to take advantage of localStorage for some of them, for data that I didn't need to get from the server on every page load.

I considered:

  • Quick'n'dirty: Rewrite these pieces of the application to try localStorage first, with Ajax requests as a fallback.
  • Slightly better: A library that's flexible enough for my purposes.
  • Ideal: A library that would allow me to enable/disable localStorage as an intermediary step on a per-request basis, make it easy to add IndexedDB support later, etc.
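
For reference, the quick'n'dirty option boils down to a read-through cache: look in local storage first and only hit the server on a miss. A rough sketch follows, with the assumption that `cachedGet`, `FakeStorage` and `fakeServer` are hypothetical names; in a browser you would pass `window.localStorage` and a function wrapping jQuery.ajax instead of the stand-ins.

```javascript
// Hypothetical sketch of "localStorage first, Ajax as fallback".
// `storage` is anything with getItem/setItem (window.localStorage in a browser);
// `fetchFromServer` stands in for an Ajax call.
function cachedGet(storage, key, fetchFromServer, callback) {
    var cached = storage.getItem(key);
    if (cached !== null) {
        callback(JSON.parse(cached), 'cache');
        return;
    }
    fetchFromServer(key, function (value) {
        // remember the answer so the next page load can skip the request
        storage.setItem(key, JSON.stringify(value));
        callback(value, 'server');
    });
}

// Tiny in-memory stand-in so the sketch runs outside a browser.
function FakeStorage() {
    this.items = {};
}
FakeStorage.prototype.getItem = function (key) {
    var found = Object.prototype.hasOwnProperty.call(this.items, key);
    return found ? this.items[key] : null;
};
FakeStorage.prototype.setItem = function (key, value) {
    this.items[key] = String(value);
};

var storage = new FakeStorage();
var serverCalls = 0;
function fakeServer(key, done) {
    serverCalls += 1;
    done({key: key, loaded: true});
}

var sources = [];
function remember(value, source) {
    sources.push(source);
}
cachedGet(storage, 'profile', fakeServer, remember);  // miss: goes to the server
cachedGet(storage, 'profile', fakeServer, remember);  // hit: served from storage
```

This works, but every call site has to know about the cache, which is exactly the maintenance problem the ideal option is meant to avoid.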

What worked for me

The simpler thing I went with was a Data object with a couple of functions.

Example usage:

// main.js
window.data = new DataStore({
    url: '/fetch_new_data',
    // show a spinny thingy
    sync_before: function showSyncInProgress() { /* ... */ },
    // hide the spinny thingy, maybe show a fading notification
    sync_success: function showSyncDone() { /* ... */ },
    // hide the spinny thingy, definitely show some message
    sync_error: function showSyncFailed() { /* ... */ }
});

// example request
var i = 0;
window.data.process_request({
    ajax: {url: '/new_comment', type: 'POST',
           data: $('#comment-form').serialize()},
    key: 'comment_' + (i++),
    value: {'author': $('#comment-form .author').val(),
            'text': $('#comment-form .text').val()}
});

ajax.data and value are actually very similar, with an important exception in most applications (e.g. Django): the csrftoken. We don't need to store that in localStorage for every request. So I chose to keep the two completely separate. You could subclass DataStore and make it save you this extra work per request.
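
One way to save that extra work, sketched here as a standalone helper rather than a subclass: build the stored value from the same fields as the request, minus anything that should never be persisted. `splitFormData` is a hypothetical name, and I'm assuming the token travels in a form field called `csrfmiddlewaretoken`, which is the name Django's form tag uses; adjust the excluded list for your framework.

```javascript
// Hypothetical helper: derive both payloads from one set of form fields.
function splitFormData(fields, excluded) {
    var value = {};
    for (var name in fields) {
        if (Object.prototype.hasOwnProperty.call(fields, name) &&
                excluded.indexOf(name) === -1) {
            value[name] = fields[name];
        }
    }
    // ajaxData goes to the server untouched; value is what gets stored locally
    return {ajaxData: fields, value: value};
}

var parts = splitFormData(
    {csrfmiddlewaretoken: 'abc123', author: 'Ann', text: 'Nice post'},
    ['csrfmiddlewaretoken']);
// parts.ajaxData still carries the token; parts.value does not
```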

Below is an example implementation:

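/* This depends on Crockford's json2.js
 * from https://github.com/douglascrockford/JSON-js
 * and on jQuery (for $.ajax).
 * Options:
 *     - url: string, endpoint that returns data modified since last sync
 *     - sync_before: function()
 *     - sync_success: function(response, textStatus, request)
 *     - sync_error: function()
 */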
function DataStore(options) {
    window.data = this;
    this.storage = window.localStorage;
    // date of last time we synced
    this.last_sync = null;
    // queue of requests, populated if offline
    this.queue = [];

    /**
     * Gets data stored at `key`; `key` is a string
     */
    this.get_data = function (key) {
        var str_data = this.storage.getItem(key);
        return JSON.parse(str_data);
    }

    /**
     * Sets data at `key`; `key` is a string
     */
    this.set_data = function (key, data) {
        var str_data = JSON.stringify(data);
        this.storage.setItem(key, str_data);
    }

    /**
     * Syncs data between local storage and server, depending on
     * modifications and online status.
     */
    this.sync_data = function () {
        // must be online to sync
        if (!this.is_online()) {
            return false;
        }

        this.last_sync = this.get_data('last_sync');

        // have we never synced before in this browser?
        if (!this.last_sync) {
            // first-time setup
            // ...
            this.last_sync = {};
            this.last_sync.when = new Date().getTime();
            this.last_sync.is_modified = false;
        }

        if (this.last_sync.is_modified) {
            var request_options;
            // sync modified data
            // you can pass callbacks here too
            while (this.queue.length > 0) {
                // shift() replays requests oldest first; pop() would reverse them
                request_options = this.queue.shift();
                $.ajax(request_options.ajax);
            }
            this.set_data('queue', []);
            this.last_sync.is_modified = false;
        }
        // data is synced, update sync time
        this.set_data('last_sync', this.last_sync);

        // get modified data from the server here
        $.ajax({
            type: 'POST',
            url: options.url,
            dataType: 'json',
            // `when` is the timestamp field set during first-time setup
            data: {'last_sync': this.last_sync.when},
            beforeSend:
                // here you can show some "sync in progress" icon
                options.sync_before,
            error:
                // an error callback should be passed in to this Data
                // object and would be called here
                options.sync_error,
            success: function (response, textStatus, request) {
                // callback for success
                options.sync_success(
                    response, textStatus, request);
            }
        });
    };

    /**
     * Process a request. This is where all the magic happens.
     */
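    this.process_request = function (request_options) {
        // beforeSend and processed are optional per-request callbacks
        if (request_options.beforeSend) {
            request_options.beforeSend();
        }
        this.set_data(request_options.key, request_options.value);

        if (this.is_online()) {
            $.ajax(request_options.ajax);
        } else {
            this.queue.push(request_options);
            // last_sync is still null if sync_data() has never run
            this.last_sync = this.last_sync || {};
            this.last_sync.is_modified = true;
            this.set_data('last_sync', this.last_sync);
            // there are issues with this: JSON.stringify drops
            // function-valued properties, so queued callbacks are lost :)
            this.set_data('queue', this.queue);
        }

        if (request_options.processed) {
            request_options.processed();
        }
    };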

    /**
     * Return true if online, false otherwise.
     */
    this.is_online = function () {
        if (navigator && navigator.onLine !== undefined) {
            return navigator.onLine;
        }
        try {
            var request = new XMLHttpRequest();
            request.open('GET', '/', false);
            request.send(null);
            return (request.status === 200);
        }
        catch(e) {
            return false;
        }
    }
}
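
About the issue flagged in process_request with persisting the queue: one possible workaround is to store only the plain-data parts of each request and re-attach callbacks when the queue is replayed. The sketch below is just that, a sketch; `toStorable` is a hypothetical helper and it assumes `ajax.data` is already a string or plain object.

```javascript
// Hypothetical helper: keep only the parts of a request that survive
// a JSON.stringify/JSON.parse round trip.
function toStorable(request_options) {
    return {
        key: request_options.key,
        value: request_options.value,
        ajax: {
            url: request_options.ajax.url,
            type: request_options.ajax.type,
            data: request_options.ajax.data
        }
    };
}

var queued = toStorable({
    key: 'comment_0',
    value: {author: 'Ann', text: 'Nice post'},
    ajax: {
        url: '/new_comment',
        type: 'POST',
        data: 'author=Ann&text=Nice+post',
        success: function () {}   // dropped on purpose
    },
    processed: function () {}     // dropped on purpose
});

// what would come back out of localStorage after a page reload
var restored = JSON.parse(JSON.stringify([queued]));
```

The cost is that per-request callbacks no longer fire for requests replayed after a reload, so any success or error handling for those has to live in the sync callbacks instead.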

Proposed features

The example API isn't bad, but I think it could be better. Perhaps something along the lines of Lawnchair. As I'm writing this, I realize that writing an API is going to take longer than I'd like - therefore, this will serve as a teaser and food for thought. Feedback is welcome.

  • Add an .each method for iterating over retrieved objects (inspired by Lawnchair)
  • Standard DataStore.save, .get, .remove, etc.
  • Support for these "storage engines": localStorage, IndexedDB, send-to-server.
  • Support for these request types: XMLHttpRequest, WebSockets.
  • Store, at the very least, primitive values and JSON.
  • Include callbacks for various stages in the process of a request, similar to jQuery.ajax, e.g. beforeSend, complete, success, error. Figure out a good way to do this at each layer (minimize confusion).
  • For each request, specify which layers and in what order to go through. For example, if you want to store something in localStorage, IndexedDB, and send it to the server, you could do it in that order or the reverse.
  • Control whether to go to the next layer type depending on whether the previous succeeded or failed. Say, if you want to send the request to server but that fails, try localStorage as a fallback. Or the opposite.
  • Include a .get_then_store shortcut for getting the data from layer A and storing it in layer B?
  • Extensible: as easy as DataStore.addLayer(layerName, layerHandler), where layerHandler (obviously) implements some common API along with exposing some of its own, if necessary (e.g. ability to query or find, for IndexedDB).
  • As sending and getting data from the server means keeping two or more databases in sync, collisions may arise. Provide a collision callback or some smart defaults for handling collision. E.g. sometimes server data is always right (trusted more than user data), other times local data is king.
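
To give a feel for the layer ordering and fallback ideas, here is a rough sketch of how they could fit together. None of this exists yet: `addLayer` follows the signature proposed above, while `save`, its `mode` argument and the two toy layers are assumptions made for the example. It is synchronous to keep it short; real server and IndexedDB layers would need callbacks.

```javascript
// Hypothetical sketch of ordered layers with optional fallback.
var DataStore = {
    layers: {},
    addLayer: function (layerName, layerHandler) {
        this.layers[layerName] = layerHandler;
    },
    // Goes through `layerNames` in the given order.
    // mode 'all': write to every layer, whatever happens.
    // mode 'fallback': stop at the first layer that succeeds.
    save: function (key, value, layerNames, mode) {
        var results = {};
        for (var i = 0; i < layerNames.length; i++) {
            var name = layerNames[i];
            var ok;
            try {
                this.layers[name].save(key, value);
                ok = true;
            } catch (e) {
                ok = false;
            }
            results[name] = ok;
            if (mode === 'fallback' && ok) {
                break;
            }
        }
        return results;
    }
};

// Toy layers: a server that is unreachable and an in-memory store.
var memory = {};
DataStore.addLayer('server', {
    save: function () { throw new Error('offline'); }
});
DataStore.addLayer('memory', {
    save: function (key, value) { memory[key] = value; }
});

// Try the server first, fall back to memory when that fails.
var outcome = DataStore.save(
    'comment_0', {text: 'hi'}, ['server', 'memory'], 'fallback');
// outcome is {server: false, memory: true}
```

Reversing the array gives the opposite policy (local first, server only if local storage fails), which is the kind of per-request control I'd like the real API to have.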

Help!

Hopefully my rant has gotten you thinking about the right approach. What would you like to see? What would make this something you would use and be happy with?

If you are interested in getting involved with coding this, contact me at paul at craciunoiu {dot} net.