Working with Large JSON Data Sets in Android

By Rob Gravelle

In my recent Working with Android's In-memory JSONObject article, I described how to utilize Android's excellent JSONObject and JSONArray classes to load a JSON data set into memory and work with its contents. While they can easily manage small feeds, things get a little more complicated when you're dealing with very large data sets because mobile devices only allocate so much memory to your applications. Therefore it is often prudent to employ a third-party library to parse a JSON-formatted stream so that you don't have to load the entire JSON structure into memory. In today's article, we'll be taking a look at one such library called Gson.

More about Gson and Where to Get It

Gson is a Java library that can convert Java Objects into their JSON representation, as well as convert a JSON string to an equivalent Java object. The G in Gson stands for Google, the makers of that ubiquitous search engine we all know and love. The Gson library was originally developed by Google for internal use, but they eventually released it to the public in May of 2008 under the terms of Apache License 2.0. The latest version, 2.7, was released in June of 2016.

Now, the interesting part is that Gson can read and write JSON as an object model or stream.

Object model access is available via the JsonElement class hierarchy. These classes operate on a JSON document as a navigable object tree. This model is easy to use and capable because it permits random read and write access to the entire JSON document.
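As a rough illustration of the object model approach, the sketch below loads a trimmed-down version of the sample person into a JsonObject tree, reads a nested value, and changes a property. The TreeModelDemo class and its parse() helper are hypothetical names made up for this example; only the Gson classes themselves come from the library.

```java
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

// Hypothetical demo class; the JSON is a trimmed-down version of the sample person.
class TreeModelDemo {

  static final String JSON =
      "{\"age\":36,\"name\":{\"first\":\"Trujillo\",\"last\":\"Hahn\"},"
      + "\"friends\":[{\"id\":0,\"name\":\"Bean Alford\"}]}";

  // Parses the whole document into an in-memory tree of JsonElement nodes.
  static JsonObject parse(String json) {
    return new Gson().fromJson(json, JsonObject.class);
  }

  public static void main(String[] args) {
    JsonObject person = parse(JSON);

    // Random read access: jump straight to a nested value.
    String first = person.getAsJsonObject("name").get("first").getAsString();
    JsonArray friends = person.getAsJsonArray("friends");
    System.out.println(first + " has " + friends.size() + " friend(s)");

    // Random write access: change a value, then serialize the tree again.
    person.addProperty("age", 37);
    System.out.println(person);
  }
}
```

The convenience comes at a price: the entire document lives in memory as a tree, which is exactly what we want to avoid with very large feeds.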

Streaming access to JSON documents is available via the JsonReader and JsonWriter classes. These classes operate on a JSON document as a sequence of tokens that are traversed in depth-first order. Because the streams operate on one token at a time, they minimize memory consumption.
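To make the token-by-token idea more concrete, here is a minimal sketch that writes a tiny array with JsonWriter and then reads it back with JsonReader, adding up the "age" values and skipping everything else. The TokenStreamDemo class, its method names, and the two sample records are invented for illustration.

```java
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical demo class showing token-level writing and reading.
class TokenStreamDemo {

  // Writes a two-element array of objects, one token at a time.
  static void writeSample(Writer out) throws IOException {
    try (JsonWriter writer = new JsonWriter(out)) {
      writer.beginArray();
      writer.beginObject().name("name").value("Ann").name("age").value(30).endObject();
      writer.beginObject().name("name").value("Bob").name("age").value(40).endObject();
      writer.endArray();
    }
  }

  // Adds up every "age" value; all other properties are skipped, not stored.
  static int sumAges(Reader in) throws IOException {
    int total = 0;
    try (JsonReader reader = new JsonReader(in)) {
      reader.beginArray();
      while (reader.hasNext()) {
        reader.beginObject();
        while (reader.hasNext()) {
          if ("age".equals(reader.nextName())) {
            total += reader.nextInt();
          } else {
            reader.skipValue();
          }
        }
        reader.endObject();
      }
      reader.endArray();
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    StringWriter buffer = new StringWriter();
    writeSample(buffer);
    System.out.println(sumAges(new StringReader(buffer.toString()))); // prints 70
  }
}
```

At no point does the reader hold more than the current token, so memory use stays flat regardless of how long the array is.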

To get your hands on the jar file, don't bother downloading it from the GitHub repository; that'll only get you a zip archive with a lot of source files. The best way to obtain the latest version is to visit the Apache Maven Project site and do a search for "Gson" using the textbox in the lower-left corner of the page (you'll have to scroll down to see it). My search led me to version 2.6.

The Test Data

For this tutorial, my code reads from a file that I created using the excellent online JSON Generator tool. I removed a couple of attributes, but basically, the tool generates a JSON array of people. There are a number of stats about each person, including their age and eye color, as well as a list of their friends. Here is the output for one person:

{
  "_id": "57e2d27e530017619934e45c",
  "index": 0,
  "guid": "927e4976-72ea-4b4e-b0d7-32ed6eeaa5b4",
  "isActive": false,
  "balance": "$2,937.62",
  "picture": "http://placehold.it/32x32",
  "age": 36,
  "eyeColor": "blue",
  "name": {
    "first": "Trujillo",
    "last": "Hahn"
  },
  "company": "SUPREMIA",
  "email": "trujillo.hahn@supremia.io",
  "phone": "+1 (858) 467-3082",
  "address": "667 Atlantic Avenue, Sperryville, Florida, 9518",
  "about": "Ullamco do eu excepteur Lorem velit. Magna pariatur minim cillum qui dolor adipisicing Lorem Lorem. Adipisicing cupidatat aute labore aliquip ex mollit minim duis sunt proident nostrud sunt. In magna voluptate ad ut proident magna in sit velit amet elit dolore enim.",
  "registered": "Wednesday, November 18, 2015 6:15 PM",
  "latitude": "-53.438592",
  "longitude": "42.045409",
  "friends": [
    {
      "id": 0,
      "name": "Bean Alford"
    },
    {
      "id": 1,
      "name": "Etta Crawford"
    },
    {
      "id": 2,
      "name": "Thomas Randall"
    }
  ],
  "greeting": "Hello, I am Trujillo Hahn.",
  "favoriteFruit": "strawberry"
}

Reading Our Data in Stream Mode

I find that Gson is best suited for iterating through a JSON Array because that is easily done using streaming.

You'll need to instantiate two classes in order to process a stream:

  1. A Gson object, which provides instance methods such as toJson() and fromJson() for conversions.
  2. A JsonReader to read from the stream. I like to put it in a try-with-resources block so that the stream will be closed automatically. A try/catch can also be used to handle exceptions such as JsonIOException, JsonSyntaxException, JsonParseException, and IOException. Of these, only IOException is a checked exception; the Gson exceptions are unchecked.

Once you've instantiated your objects, you have to call the reader's beginArray() method so that it knows to expect an array of JSON objects. The JsonReader offers the hasNext() instance method, which you can use within a while loop to iterate over the array.

So far, nothing has been converted into a Java object, which means minimal memory usage. It is not until we are inside the loop that we convert each person object into a Java object. In order to do that, we call the Gson fromJson() instance method, passing in the reader and the class literal of the target Java class (Person.class). A class literal evaluates to the java.lang.Class object that describes the class; Gson inspects it via reflection to work out which fields to populate.

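import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonIOException;
import com.google.gson.JsonParseException;
import com.google.gson.JsonSyntaxException;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.Reader;

public static void listPeople(Reader in)
    throws JsonIOException, JsonSyntaxException,
           JsonParseException, IOException {
  Gson gson = new GsonBuilder().create();

  // Read file in stream mode; try-with-resources closes the reader
  try (JsonReader reader = new JsonReader(in)) {
    reader.beginArray();   // consume the opening '['
    while (reader.hasNext()) {
      // Convert only the current array element into a Person
      Person person = gson.fromJson(reader, Person.class);
      System.out.println(person);
    }
    reader.endArray();     // consume the closing ']'
  }
}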
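Printing every person is fine for a demo, but the same loop can just as easily aggregate. The following sketch, which is only one possible variation, wraps an InputStream (the kind of object you would typically get from a file or a network response) and counts the active people while holding just one Person at a time. The ActiveCounter class, its cut-down Person model, and the inline test data are assumptions made for the example.

```java
import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that aggregates over a streamed JSON array.
class ActiveCounter {

  // Hypothetical cut-down model: only the field this example needs.
  static class Person {
    Boolean isActive;
  }

  // Streams the array and keeps a running count; each Person becomes
  // eligible for garbage collection as soon as the loop moves on.
  static int countActive(InputStream source) throws IOException {
    Gson gson = new Gson();
    int active = 0;
    try (JsonReader reader = new JsonReader(
        new InputStreamReader(source, StandardCharsets.UTF_8))) {
      reader.beginArray();
      while (reader.hasNext()) {
        Person person = gson.fromJson(reader, Person.class);
        if (Boolean.TRUE.equals(person.isActive)) {
          active++;
        }
      }
      reader.endArray();
    }
    return active;
  }

  public static void main(String[] args) throws IOException {
    byte[] json = "[{\"isActive\":true},{\"isActive\":false},{\"isActive\":true}]"
        .getBytes(StandardCharsets.UTF_8);
    System.out.println(countActive(new ByteArrayInputStream(json))); // prints 2
  }
}
```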

The Person Class

Gson maps JSON objects to their equivalent Java class using the fromJson() method's second parameter, the class literal. By default, Gson uses reflection to populate the class's fields directly, matching each JSON property name to the field of the same name; it does not call setters. Note that our Java class does not have to contain all of the properties; we can include only the ones that we are interested in, which further limits memory consumption.
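Here is a small sketch of such a partial mapping. The PersonSummary class is a hypothetical, slimmed-down model that declares only three fields and no setters at all; properties in the JSON that have no matching field (such as "company") are simply ignored. Because the feed uses the key "_id", the example relies on Gson's @SerializedName annotation to bind it to a field named id.

```java
import com.google.gson.Gson;
import com.google.gson.annotations.SerializedName;

// Hypothetical demo of mapping only a subset of the JSON properties.
class PartialMappingDemo {

  // Hypothetical slimmed-down model: three fields out of the many in the feed.
  static class PersonSummary {
    @SerializedName("_id") // the JSON key differs from the Java field name
    String id;
    Integer age;
    String eyeColor;
  }

  static PersonSummary parse(String json) {
    return new Gson().fromJson(json, PersonSummary.class);
  }

  public static void main(String[] args) {
    String json = "{\"_id\":\"57e2d27e530017619934e45c\",\"age\":36,"
        + "\"eyeColor\":\"blue\",\"company\":\"SUPREMIA\"}";
    PersonSummary p = parse(json);
    System.out.println(p.id + " " + p.age + " " + p.eyeColor);
  }
}
```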

My Person class does contain every attribute because I used another online tool called jsonschema2pojo to generate the Person class directly from the JSON code. You just take the person in the previous snippet, paste it into the textbox, and select your options.

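import com.google.gson.annotations.SerializedName;
import java.util.ArrayList;
import java.util.List;

public class Person {
  // The JSON key is "_id"; Gson matches names exactly by default, so map it explicitly
  @SerializedName("_id")
  private String id;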
  private Integer index;
  private String guid;
  private Boolean isActive;
  private String balance;
  private String picture;
  private Integer age;
  private String eyeColor;
  private Name name;
  private String company;
  private String email;
  private String phone;
  private String address;
  private String about;
  private String registered;
  private String latitude;
  private String longitude;
  private List<Friend> friends = new ArrayList<Friend>();
  private String greeting;
  private String favoriteFruit;
 
  //getters and setters follow...

I used jsonschema2pojo to create the referenced Friend and Name classes as well.
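The generated Friend and Name classes are not reproduced in this article, so the following is only a guess at their general shape, inferred from the sample JSON above (getters and setters are left out for brevity). The NestedModelDemo wrapper and its reduced Person class are hypothetical and exist only so the sketch can be run on its own.

```java
import com.google.gson.Gson;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of how nested objects and arrays map onto Java classes.
class NestedModelDemo {

  // Mirrors the "name" object: { "first": ..., "last": ... }
  static class Name {
    String first;
    String last;
  }

  // Mirrors one entry of the "friends" array: { "id": ..., "name": ... }
  static class Friend {
    Integer id;
    String name;
  }

  // Reduced Person holding only the two nested properties.
  static class Person {
    Name name;
    List<Friend> friends = new ArrayList<>();
  }

  static Person parse(String json) {
    return new Gson().fromJson(json, Person.class);
  }

  public static void main(String[] args) {
    String json = "{\"name\":{\"first\":\"Trujillo\",\"last\":\"Hahn\"},"
        + "\"friends\":[{\"id\":0,\"name\":\"Bean Alford\"},"
        + "{\"id\":1,\"name\":\"Etta Crawford\"}]}";
    Person p = parse(json);
    System.out.println(p.name.first + " " + p.name.last
        + " has " + p.friends.size() + " friends");
  }
}
```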

Here is a zip file containing source files for the demo.

Conclusion

In today's article we covered how to use Gson to read large JSON data sets in stream mode to minimize the consumption of memory. This technique is especially pertinent to Android devices, which limit memory allocation to applications. In an upcoming article, we'll learn how to get at nested JSON attributes without having to read the entire outer object into memory.



Rob Gravelle

Rob Gravelle resides in Ottawa, Canada, and has built web applications for numerous businesses. Recently, he developed his own jquery-tables library.

Rob's alter-ego, "Blackjacques", is an accomplished guitar player who has released several CDs. His band, Ivory Knight, was rated as one of Canada's top hard rock and metal groups by Brave Words magazine (issue #92) and reached the #1 spot in the National Heavy Metal charts on ReverbNation.com.


