Wednesday, June 11, 2014

"SQL like" Java Collections

In my daily work I do a lot of Java coding, but I also use SQL quite a lot.
I like SQL for its declarative nature. I think it is quite intuitive to spell out queries. In .NET there is LINQ which gives nice query capabilities. So I thought why not have similar API in Java - for querying collections. I tried to look for some existing libraries and found some. The closest solution I found was JoSQL. As stated in its website:
"JoSQL (SQL for Java Objects) provides the ability for a developer to apply a SQL statement to a collection of Java Objects. JoSQL provides the ability to search, order and group ANY Java objects and should be applied when you want to perform SQL-like queries on a collection of Java Objects."
JoSQL uses a Query object which gets a string with the SQL query. When it is called, the query is parsed and executed. I guess, but did not check, that reflection is used extensively there. It looks quite nice, but the main issue that I find with it is that it is error prone. The query validity is not enforced in design-time. If you have an error you will know it only in runtime, and there it is a matter of how much the error message is useful, and hopefully you caught it during development/QA.

With this issue I have decided to implement a different API which will give some query capabilities over collections and its validity will be evaluated in design-time. The main characteristics I chose for my API were:
  • Generics
  • Fluent API (i.e. method chaining)
  • Functional Programming
The library it-self is not complete, I did it only as a POC for myself. It is hosted on Google Code:
https://code.google.com/p/ynn-util-collections/wiki/SQLlikeCollections
Let's see some examples.
First, let's assume we have the following import statements:
import static ynn.util.collections.Predicates.*;
import static ynn.util.collections.SqlLikeCollections.*;
The first import statement imports some predefined static predicate methods such as "and", "or", "not", "lessThan", etc. The second import statement imports the entry point static methods for our queries.
Now that we have our basics imported we can write queries.

Examples


Example 1 -

Select all strings in a collection which are not null:
Collection<String> data = Arrays.asList(
    null, "A", "B", "C", null, "D", "E", "F", "G", "H", "I", null);
Collection<String> result = selectFrom(data).where(not(isNull()));
As you can see, this is quite a simple query. You can also notice that the result is typed and keeps the same type as the source (i.e. String). The syntax is quite similar to SQL, which would probably be something like:
SELECT * FROM "DATA" WHERE NOT "DATA"."VALUE" IS NULL

Let's examine some other more complex examples.

Example 2 -

In this example we will have a slightly more complex where clause and we'll also add some sorting. The API has some predefined predicates and utilities for Strings, so we'll import them as well:
import static ynn.util.collections.Predicates.StringPredicates.*;
Now for the example:
Collection<String> result = selectFrom(data)
    .where
        and(
            not(isNull()),
            stringStartsWith("A"),
            stringContains("a")
        )
    )
    .orderBy(LEXICOGRAPHIC_ORDER);
I this example you can see we select all strings which are not null, and it starts with "A", and it contains "a". The result is ordered by lexicographic order. Quite readable, right?

Of course the API is not limited to Strings or just simple objects. Let's see a more complex example.

Example 3 -

In this example we first need to define our Person class:
public class Person {

    private String name;
    private int age;
    private double weight;

    public Person() { }

    public Person(String name, int age, double weight) { 
        this.name = name;
        this.age = age;
        this.weight = weight;
    }

    // Getter & setter methods go here, I omitted them for simplicity
    // Also any other specific implementation (toString, ...)

    /*
     * VALUE PROVIDERS
     */

    public static ElementValueProvider<Person, String> name() {
        return new ElementValueProvider<Person, String>() {
            @Override
            public String getValue(Person element) {
                return element.getName();
            }
        };
    }
        
    public static ElementValueProvider<Person, Integer> age() {
        return new ElementValueProvider<Person, Integer>() {
            @Override
            public Integer getValue(Person element) {
                return element.getAge();
            }
        };
    }
	
}
You probably noticed the static methods under the "VALUE PROVIDERS" comment. Their purpose will be clearer in the example:
List<Person> data = Arrays.asList(
    new Person("John", 2, 2.2),
    new Person("David", 1, 1.1),
    new Person("Bob", 4, 4.4),
    new Person(null, 2, 7.7),
    new Person("Robin", 1, 3.3),
    new Person("Danny", 2, 5.5),
    new Person("Alice", 3, 6.6)
);

Collection<Person> result = 
    selectFrom(data)
    .where(
        not(isNull(Person.name())))
    .orderBy(
        desc(Person.age()), 
        asc(Person.name()));
As you can see above, we wrote a query for selecting all persons whom their name is not NULL. We then sorted the collection, first descending by age, then ascending by name. The corresponding SQL query would probably look something like:
SELECT * FROM "DATA"
WHERE NOT "PERSON"."NAME" IS NULL
ORDER BY "PERSON"."AGE" DESC, "PERSON"."NAME" ASC
Quite similar, right? And it's type safe as well!

One last example - I also started to play with the idea to support mass updates. See the example below.

Example 4 -

In this example we will update all persons with NULL name to have empty string for a name and their age to zero. For the example we need to add the following methods to our Person class:
    /*
     * UPDATERS
     */

    public static Updater<Person> nameTo(final String newName) {
        return new Updater<Person>() {
            @Override
            public void update(Person element) {
                element.setName(newName);
            }
        };
    }

    public static Updater<Person> ageTo(final int newAge) {
        return new Updater<Person>() {
            @Override
            public void update(Person element) {
                element.setAge(newAge);
            }
        };
    }
Now we can write our mass update statement:
update(data)
    .where(
        isNull(Person.name()))
    .set(
        Person.nameTo(""), 
        Person.ageTo(0));
See? nice, right? 

Conclusion

Of course there is a lot to extend, many capabilities missing, the performance of the current implementation can be easily improved, etc. This was a proof of concept. I think having such API can make developers' life easier, especially if they also write real SQL...
By-the-way - now that Java 1.8 is out, with all its new capabilities such as lambda expressions, method reference, functional interfaces, aggregate methods, streams, etc. It gives exactly what I was trying to accomplish here. The main difference is the syntax, but the usage is very similar.