Friday, October 31, 2014

XmlRpcMessageService - A Google Apps Script Library

This post is part of a series: Trakt.TV & Subtitles - A Google Apps Script Project

Overview

As part of my project I worked on fetching subtitles availability from several providers. One of the providers is OpenSubtitles.org. This site has an API which is implemented based on XML-RPC. From Wikipedia on XML-RPC: "XML-RPC is a remote procedure call (RPC) protocol which uses XML to encode its calls and HTTP as a transport mechanism". This protocol is quite verbose and is not that JavaScript friendly. Because of these reasons I decided to implement a library to ease the use of XML-RPC in JavaScript. This library enables the user to describe the request and get the response in plain JSON. The bottom line is that using this library, executing a method call in JavaScript, for example the "LogIn" method to OpenSubtitles.org, can be as easy as:
var methodCall = {
   methodName : "LogIn",
   params : [ "", "", "en", "OS Test User Agent" ]
}
var url = "http://api.opensubtitles.org/xml-rpc";
var methodResponse;
var onSuccess = function(mr) { methodResponse = mr; };
XmlRpcMessageService.execMethodCall(url, methodCall, onSuccess);

The First Translation - Verbose JSON

First I defined a quite verbose JSON model which translates to and from an XML-RPC in a straight forward conversion. Examples are better than words-

A simple value is described in XML-RPC:
<value><string>David</string></value>
In JSON:
{ type : "string", value : "David" }

An array in XML-RPC:
<value><array>
   <data>
      <value><string>David</string></value>
      <value><string>John</string></value>
   </data>
</array></value>
In JSON:
{ type : "array", value : [
      { type : "string", value : "David" },
      { type : "string", value : "John" }
   ] }

A struct value in XML-RPC:
<value><struct>
   <member>
      <name>qwerty</name>
      <value><double>1234</double></value>
   </member>
   ...
</struct></value>
In JSON:
{ type : "struct", value : [
      name : "qwerty",
      value : { type : "double", value : "1234" }
   ] 
}
So basically, you can see that a similar schema is kept for the different value types.
A method call in XML-RCP contains the name of the method and the values for the parameter:
<methodCall>
   <methodName>foo</methodName>
   <params>
      <param>...value...</param>
      ...
   </params>
</methodCall>
In JSON:
{
   methodName : "foo",
   params : [
      { type : "...", value : "..." },
      ...
   ]
}
The result of this transformation is still verbose, more or less the same as the XML representation. The main advantage of the JSON representation is that it is much easier to use in JavaScript since there is no need for string manipulation, XML encoding, etc.

The Second Transformation - "Semantic" JSON

(Maybe the name "semantic" is not that good, but I'll stick with it for now)
After implementing the first translation, which can be used in all scenarios, I decided to go a little bit further. I thought - instead of persisting the name of the value type, persist the value in its designated type. So for the next transformation I decided on the following mapping:

JSON - Verbose
JSON - Semantic
double value object
{ type: "double", value: 7 }
number
7
string value object
{ type: "string", value: "Hello" }
string
"Hello"
boolean value object
{ type: "boolean", value: true }
boolean
true
array value object
{ type: "array", value: [ ... ] }
array
[ ... ]
struct value object
{ type: "struct", value: [ { name: "bar", value: ... } ] }
object
{ bar: ... }

The method call itself stays the same, the values become more simple:
{
   methodName : "foo",
   params : [ "Hello", 7, true, { bar : 100 } ]
}

Of course this transformation can be applied on the response as well, which is simple as:
{ params : [ ... ] }

Conclusion

The idea was to make the use of XML-RPC web services easier in JavaScript in general, and in Google Apps Script specifically. This implementation has two-step transformation. The first step is as verbose as the original XML-RPC, but can be used easily in JavaScript. The second step makes it much less verbose but does not support, for now, some of the value types defined in XML-RPC specification (e.g. integer, base64, etc).


The code is available in GitHub:
https://github.com/yinonavraham/GoogleAppsScripts/tree/master/XmlRpcMessageService

Full Example for Comparison


XML-RPC
<methodCall>
   <methodName>foo</methodName>
   <params>
      <param>
         <value><string>Hello</string></value>
      </param>
      <param>
         <value><double>7</double></value>
      </param>
      <param>
         <value><boolean>1</boolean></value>
      </param>
      <param>
         <value><array>
            <data>
               <value><string>a</string></value>
               <value><string>b</string></value>
            </data>
         </array></value>
      </param>
      <param>
         <value><struct>
            <member>
               <name>bar</name>
               <value><double>100</double></value>
            </member>
         </struct></value>
      </param>
   </params>
</methodCall>

JSON - Verbose
{ 
   methodName : "foo", 
   params : [ 
      { type: "string", value : "Hello" }, 
      { type: "double", value : 7 },
      { type: "boolean", value: true },
      { type: "array", value: [ 
            { type: "string", value: "a" }, 
            { type: "string", value: "b" } 
         ] },
      { type: "struct", value: [ 
            { name: "bar", value: { type: "double", value: 100 } } 
         ] } 
   ]  
}

JSON - Semantic
{ 
   methodName : "foo", 
   params : [ "Hello", 7, true, [ "a", "b" ], { bar : 100 } ] 
}

Tuesday, October 21, 2014

GmailSearchBuilder - A Google Apps Script Library

Preface

I recently wanted to write a simple Google Apps Script for automatically delete old email messages (threads). The reason was a lot of email threads that started to accumulate, which are not important enough for me to keep.

There are solutions out there. All those I had seen are based on assigning a label to the messages to be deleted (e.g. "DELETE ME"). Some are quite naive - iterating over all the threads returned from the GmailLabel object, which is not the best way to go, the performance of this solution is poor. The better solutions use GmailApp.search method with a query such as "label:delete-me older_than:30d" to fetch the threads with the "DELETE ME" label which are older than 30 days. (Good post: Create time-based Gmail filters with Google Apps Script)

Also, you need to assign the label for that to work. Manual assignment is tedious. Of course it can be solved by filters which assign this label automatically. But what if you want to assign other labels, and the "DELETE ME" label is only the secondary label to assign? (e.g. when you filter messages into different labels, but all of them can be deleted after a while). This can also be done by a scheduled job which assigns the "DELETE ME" label to all messages with other specified labels (which are not yet marked for deletion). This can be done similarly using the search query "label:my-other-label -label:delete-me".

After writing those queries, and trying to figure out how to deal with some issues, like what to do with spaces in the label name (one solution- replace all spaces with '-'), I decided to implement a service for building search queries. The goal of this service is to solve for you those questions (another one for example: how to format dates), while letting you all the flexibility of the query syntax.

The Query Builder

Well, it's quite simple. The idea was to have a fluent API using method chaining. The builder needs to give flexibility similar to writing hard-coded query strings, with simplicity and to handle for the developer the tedious technical stuff. Since it's a builder, it can be used for building queries dynamically in run-time. The builder also lets you call the search directly on the builder itself as a syntactic sugar.

Some examples
Assume:
var GSB = GmailSearchBuilder;

1- Building and running a simple query
var query = GSB.newQuery().label("My Other Label").exclude().label("DELETE ME").build();
var threads = GmailApp.search(query);
//The query: label:my-other-label -label:delete-me

2- Building and running directly from the builder
var threads = GSB.newQuery().label("DELETE ME").olderThan(30,DateType.DAY).search();
//The query: label:delete-me older_than:30d

3- Building a query with a sub-query (group)
var threads = GSB.newQuery().subQuery(GSB.newQuery().label("My Label 1").or().label("My Label 2")).exclude().label("DELETE ME").search();
//The query: (label:my-label-1 OR label:my-label-2) -label:delete-me

3- Search for threads between dates
var threads = GSB.newQuery().from("john@acme.com").before(new Date(2014,11,31)).after(new Date(2014,0,1)).larger(1,SizeType.MB).search();
//The query: from:john@acme.com before:2014/12/31 after:2014/01/01 larger:1048576


The source code is available in GitHub:
https://github.com/yinonavraham/GoogleAppsScripts/tree/master/GmailSearchBuilder

Friday, October 17, 2014

EnhancedCacheService - A Google Apps Script Library

This post is part of a series: Trakt.TV & Subtitles - A Google Apps Script Project

Overview

GAS's native CacheService is a very good caching service. It provides several levels of caching (user, script, ...) with very good basics for caching string values. But this is also part of it drawbacks:
  1. It can store only string values
  2. The values are limited to 128KB
  3. Some additional features are lacking, such as additional information on entries (e.g. when an entry was last updated)
With these limitations, and the requirements I had in the project I was working on, I decided to implement an enhancement to this service: EnhancedCacheService
The purpose of the new service was to wrap the existing service and add additional features while preserving the existing features and the flexibility of choosing the cache type. The additional features which are currently implemented:
  1. Support for native JavaScript types, such as: number, boolean, object
  2. Support for values larger than 128KB
  3. Additional information on cache entries - get the date an entry was last updated
In order to instantiate an enhanced cache service, use:
var cache = EnhancedCacheService.wrap(CacheService.getUserCache());
As you can see, this gives you the freedom to choose the type of cache you want to use.

Basics

In order to support the requirements I have decided to store a value descriptor instead of the value itself. This answers two main requirements:
  • Ability to get additional information on cache entries
  • Support in the simple native data types, such as boolean and number
This is done by creating a value descriptor object containing the following information:
  • The value, in its original form (number, string, boolean, null)
  • The name of the type of the value (e.g. 'string', 'boolean', etc.)
  • The time-to-live that was set for the entry
  • The time the entry was set (to be used for last updated)
This value descriptor is being stringified (using JSON.stringify) and stored as the value of the entry.
The method structure that was chosen to support the various type was (where <Type> is replaced by the specific type, e.g. "Boolean"):
cache.put<Type>(String key, Type value, Number ttl) : Void
cache.get<Type>(String key) : Type
In each put & get method the key and value are verified to be of the correct type. The ttl parameter (i.e. time-to-live in seconds) is optional, same as in the native cache service.

Support For JavaScript Objects

Objects are a little bit more complex. Since the objects are stored also as strings, there is a need to allow custom methods for stringifying and parsing the object value (although in most cases JSON's default methods are enough, there are some cases such as the Date object where it is not). For this reason, both getObject and putObject methods take an optional parse/stringify method. If not specified, JSON's methods are used.

So, for example, in order to store and get a simple object:
cache.putObject('p1', { name: 'John', age: 30 });
var p = cache.getObject('p1');
The returned value is an object, stringified and parsed using the default methods.
In order to store an object for which the default methods are not enough, the following can be done (in this example, for the Date object):
var stringifyDate = function(o) { return '' + o.getTime(); };
cache.putObject('d1', new Date(), undefined, stringifyDate);
var parseDate = function(s) { return new Date(+s); };
var d = cache.getObject('d1', parseDate);

Support For Large Values

In order to support larger values but still use the existing cache service I have decided to split large values between several entries. String values are the main candidates for being too long (number, boolean and null have no chance of reaching the max size). Since objects can also be too large (and for some other reasons), I have decided to store them as strings. This allows me to use the same implementation for all relevant value types.

When storing an entry, the value in the value descriptor is checked - if it is a string too long, if so it needs to be split. In such cases the value is split to smaller parts, each is stored in a separate entry. The keys of the split entries are collected and stored as part of the value descriptor instead of the value itself.

When getting a value, the opposite operation is done. If the value descriptor has keys instead of values, the values are taken and rejoined to the original value. Only then the value is returned (or parsed in case of an object).


The code is available in GitHub:
https://github.com/yinonavraham/GoogleAppsScripts/tree/master/EnhancedCacheService

Trakt.TV & Subtitles - A Google Apps Script Project

Before I Get to the Technical Stuff

My wife and I like watching TV series, but we like choosing what and when, not what the satellite/cables providers dictate. This is why we mostly use XBMC for that. XBMC is great - you get all the new episodes from a variety of providers, quickly and with good quality. Since we're not native English speakers we also like having subtitles. XBMC is great for that as well.

A while ago, before we started using XBMC, I used to go once in a while browsing for new episodes, subtitles, etc. It was frustrating... XBMC solved most of it. I also found Trakt.TV a great tool for keeping track of what we haven't watched yet.

The only part that was left is keeping track of the subtitles availability. For that I had two options: start watching the episode, then try to find matching subtitles. The other option was to look for the subtitles in the providers' websites before that. Both options aren't nice. There are some series that we don't really mind watching without subtitles, but some we wont watch until they have subtitles. In those cases it is frustrating to go through all that work just to find there are no subtitles available...

This is why I started this project. The goal was to give a view which consolidates my shows' progress (from Trakt.TV) with the availability of subtitles for each un-watched episode. I chose Google Apps Script because of two reasons: 1- it is quite simple, it gives a variety of built-in services and it's free. 2- I did not know it and wanted to see what it has to offer.

Technical Overview

OK, so as I mentioned before, the project is based mainly on Google Apps Script. Although it has quite a lot native services, I had some gaps of missing functionality I needed to fill. The project currently includes the following modules/services:
  1. TrakTV & Subtitles-
    This script is published as a webapp. It is the main module, it implements the model generation and the UI rendering of the service.
  2. EnahancedCacheService- (post)
    This is a service for adding more features to GAS's native CacheService. It mainly supports various value types (not just string: e.g. number, boolean, object) and supports values larger than 128KB.
  3. TraktTVClient-
    This is a client for the Trakt.TV REST API. It currently implements only what is really needed for my needs, but it can be enhanced with much more quite easily if needed.
  4. SubsCenterOrgClient-
    This is a client for the SubsCenter.org subtitles provider. Since this provider has no API, so instead it parses the website - only what is currently required.
  5. OpenSubtitlesOrgClient-
    This is a client for the OpenSubtitles.org subtitles provider. This provider has an API based on XML-RPC.
  6. XmlRpcMessageService- (post)
    This is a service for creating and handling XML-RPC messages. It lets the user work with simple JavaScript objects instead of the verbose XML. Not all data types defined in the specification are supported by this implementation.
In the next posts I will elaborate on some of the services mentioned above.