Parse and Transform JSON

In the previous exercise, we used the IonLoader, a high-level API for parsing JSON and loading it into Ion objects. This API is easy to use and is perfect for most needs. Any required data transformations can be done after the entire JSON object is converted to Ion.

However, Ion also offers a low-level, high-speed parsing API that may be useful for cases where the conversion of JSON to Ion must be more tightly controlled or where building an Ion document in memory may not be desirable.

In this exercise, we’ll perform the same work we did in the previous exercise (loading a JSON object into Ion and reformatting the date of birth), but we’ll use the low-level parsing APIs to do it.

We’ll start where we left off in the last section. The App.java file should be open in the Cloud9 IDE.

Create a new method for this exercise in the App.java file by adding the code below.

private static void parseFromJson() throws Exception {
}

Modify the main() method to call our new method for this exercise instead of the method for the last exercise. The main() method should now look like this:

public static void main(String[] args) throws Exception {
    //buildAndWriteIon();
    //readAndUpdateIon();
    //readFromJsonTargetedReplacement();
    parseFromJson();
}

Add the following import to the App.java file.

import java.util.ArrayList;

Add the line below to the parseFromJson() method. This declares a string variable that contains our JSON data, just as we did in the previous exercise.

String jsonInput = "{\"PersonId\":\"987654321\",\"FirstName\":\"Mary\",\"LastName\":\"Smith\",\"MoneyInWallet\":143.39,\"DateOfBirth\":\"1979-10-15\",\"NumberOfLegs\":2,\"LikesGreenBeans\":true,\"ThingsInPocket\":[\"phone\", \"lipstick\"],\"HomeAddress\":{\"Street1\":\"400 N. Broadway\",\"City\":\"Yonkers\",\"State\":\"NY\",\"Zip\":\"10705\"}}";

Next we’ll initialize the IonSystem as before. We’ll also create a java.util.ArrayList to store parsed documents in.

IonSystem ionSys = IonSystemBuilder.standard().build();
ArrayList<IonValue> values = new ArrayList<IonValue>();

Next we’ll build an IonReaderBuilder. We’ll use this to build an IonReader that will help us parse our JSON.

IonReaderBuilder readerBuilder = IonReaderBuilder.standard();

Now read and parse the JSON input with the code below. The code uses the IonReaderBuilder to build an appropriate IonReader for our JSON input. IonReaders can be built from a number of data sources, including java.io.InputStream, java.io.Reader, IonValue objects, strings, and byte arrays.

The IonReader is a low-level API. We use the reader to parse each individual component and to “step in” and out of container types. Here, we call reader.next() in a loop to read the item(s) at the top level of our JSON input. On each iteration, we’ll call a parseElement() method that we have not yet written and add the value it returns to our ArrayList. Finally, we’ll pretty-print all of the Ion documents in our ArrayList.

The real magic happens in the parseElement() method, which we’ll tackle next.

try (IonReader reader = readerBuilder.build(jsonInput)) {
    while (reader.next() != null) {
        values.add(parseElement(ionSys, reader));
    }
}

for (IonValue value: values) {
    System.out.println(value.toPrettyString());
}

Add a new method to our App.java file by inserting the code below.

private static IonValue parseElement(IonSystem ionSys, IonReader reader) throws Exception {
}

The parseElement() method will extract IonValue objects from the parsed JSON object using the reader and will recursively call itself for container types such as IonStruct and IonList. This method is where we’ll do our data-type transformation for the date of birth field.

Modify the parseElement() method as listed below. This will serve as a skeleton for the method that we’ll fill in. We start by declaring the IonValue value that we’ll return to the method’s caller. Next, we use a switch to process parsed values based on their type as identified by our reader. Finally, we return the parsed value.

private static IonValue parseElement(IonSystem ionSys, IonReader reader) throws Exception {

    IonValue value = null;

    switch (reader.getType()) {
    }

    return value;
}

Add the code below to the switch statement. Each case creates a new IonValue of the matching type using the IonSystem and the data parsed by the IonReader.

case BOOL:
    value = ionSys.newBool(reader.booleanValue());
    break;
case DECIMAL:
    value = ionSys.newDecimal(reader.bigDecimalValue());
    break;
case FLOAT:
    value = ionSys.newFloat(reader.doubleValue());
    break;
case INT:
    value = ionSys.newInt(reader.intValue());
    break;
case TIMESTAMP:
    value = ionSys.newTimestamp(reader.timestampValue());
    break;

Next we need to handle parsed string values. We want to transform our date of birth string value into an IonTimestamp. Add the code below to the switch statement. For strings, we identify whether or not we’re dealing with a date by matching the string against a regular expression. If the string does not match the date pattern, we simply create and populate a new IonString value as we did for the other data types. However, if the string does match the date pattern, we’ll convert it to an IonTimestamp as we did in our previous exercise.

case STRING:
    String str = reader.stringValue();
    if (str.matches("^\\d{1,4}-\\d{1,2}-\\d{1,2}$")) {

        String[] parts = str.split("-");

        int year = Integer.parseInt(parts[0]);
        int month = Integer.parseInt(parts[1]);
        int day = Integer.parseInt(parts[2]);

        value = ionSys.newTimestamp(Timestamp.forDay(year, month, day));
    } else {
        value = ionSys.newString(str);
    }
    break;
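
The date check can also be tried in isolation from the Ion API. The sketch below uses only the JDK; the class name DateStringCheck and its helper methods are hypothetical and are not part of the workshop code. It applies the same regular expression and the same split-and-parse steps, and returns the three integers that the STRING case passes to Timestamp.forDay().

```java
import java.util.Arrays;

class DateStringCheck {

    // Same pattern as the STRING case. String.matches() has to match the
    // whole input, so the ^ and $ anchors are redundant but harmless.
    static final String DATE_PATTERN = "^\\d{1,4}-\\d{1,2}-\\d{1,2}$";

    // Returns true when the string has the shape digits-digits-digits.
    static boolean looksLikeDate(String str) {
        return str.matches(DATE_PATTERN);
    }

    // Splits a date-shaped string into {year, month, day}.
    static int[] splitDate(String str) {
        String[] parts = str.split("-");
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    public static void main(String[] args) {
        String[] samples = {"1979-10-15", "987654321", "400 N. Broadway", "1979-13-45"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeDate(s));
        }
        System.out.println(Arrays.toString(splitDate("1979-10-15")));
    }
}
```

Note that "1979-13-45" passes the pattern check even though it is not a real date. The pattern tests only the shape of the string, so a more robust converter would also need to check the month and day ranges.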

Next we’ll add logic to handle structure types, such as the root-level document and nested documents. Add the code below to the switch statement.

For container types, we need the reader to “step in” and out of them to read their values. We start by calling stepIn() on the reader. We’ll create an empty IonStruct to serve as our container. Next we iterate over all of the values in the nested structure and add each one to our IonStruct under its field name. For each value, we make a recursive call to parseElement(). This enables us to handle multiple layers of nesting. When we’ve read all of the elements at the current level of the JSON document, we call stepOut() on our reader to return to the level above.

case STRUCT:
    reader.stepIn();
    IonStruct struct = ionSys.newEmptyStruct();
    while (reader.next() != null) {
        struct.put(reader.getFieldName(), parseElement(ionSys, reader));
    }
    value = struct;
    reader.stepOut();
    break;

Next we’ll add logic to handle list types. Add the code below to the switch statement.

The logic for parsing list types is similar to the logic for struct types. Step into the list with the reader, read all values at this level, parse them using parseElement(), and add them to the list. Step out of the level when all elements have been read.

case LIST:
    reader.stepIn();
    IonList list = ionSys.newEmptyList();
    while (reader.next() != null) {
        list.add(parseElement(ionSys, reader));
    }
    value = list;
    reader.stepOut();
    break;
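
The recursive shape shared by the STRUCT and LIST cases can be seen without the Ion API. The sketch below is an analogy only: it uses java.util.Map and java.util.List as stand-ins for IonStruct and IonList, and java.time.LocalDate as a stand-in for an Ion timestamp. The class name NestedTransform, the transform() method, and the MovedIn field are hypothetical and exist only for illustration. The date pattern here is stricter than the workshop's (exactly four, two, and two digits) because LocalDate.parse() expects that form.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedTransform {

    // Rebuilds a nested value, converting date-shaped strings along the way.
    static Object transform(Object node) {
        if (node instanceof Map) {
            // Struct-like container: recurse into every field value.
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                out.put(String.valueOf(e.getKey()), transform(e.getValue()));
            }
            return out;
        }
        if (node instanceof List) {
            // List-like container: recurse into every element.
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(transform(item));
            }
            return out;
        }
        if (node instanceof String && ((String) node).matches("\\d{4}-\\d{2}-\\d{2}")) {
            // Scalar that looks like a date: convert it inline.
            return LocalDate.parse((String) node);
        }
        // Any other scalar is passed through unchanged.
        return node;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("City", "Yonkers");
        address.put("MovedIn", "2001-05-20"); // hypothetical field for illustration

        Map<String, Object> person = new LinkedHashMap<>();
        person.put("FirstName", "Mary");
        person.put("DateOfBirth", "1979-10-15");
        person.put("ThingsInPocket", Arrays.asList("phone", "lipstick"));
        person.put("HomeAddress", address);

        System.out.println(transform(person));
    }
}
```

Because each container case calls transform() on its children, a date string is converted no matter how deeply it is nested, which is the same property the recursive parseElement() calls give us.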

The parseFromJson() and parseElement() methods should now look like this:

private static void parseFromJson() throws Exception {
    String jsonInput = "{\"PersonId\":\"987654321\",\"FirstName\":\"Mary\",\"LastName\":\"Smith\",\"MoneyInWallet\":143.39,\"DateOfBirth\":\"1979-10-15\",\"NumberOfLegs\":2,\"LikesGreenBeans\":true,\"ThingsInPocket\":[\"phone\", \"lipstick\"],\"HomeAddress\":{\"Street1\":\"400 N. Broadway\",\"City\":\"Yonkers\",\"State\":\"NY\",\"Zip\":\"10705\"}}";

    IonSystem ionSys = IonSystemBuilder.standard().build();
    ArrayList<IonValue> values = new ArrayList<IonValue>();

    IonReaderBuilder readerBuilder = IonReaderBuilder.standard();    
    try (IonReader reader = readerBuilder.build(jsonInput)) {
        while (reader.next() != null) {
            values.add(parseElement(ionSys, reader));
        }
    }

    for (IonValue value: values) {
        System.out.println(value.toPrettyString());
    }        
}


private static IonValue parseElement(IonSystem ionSys, IonReader reader) throws Exception {

    IonValue value = null;

    switch (reader.getType()) {
        case BOOL:
            value = ionSys.newBool(reader.booleanValue());
            break;
        case DECIMAL:
            value = ionSys.newDecimal(reader.bigDecimalValue());
            break;
        case FLOAT:
            value = ionSys.newFloat(reader.doubleValue());
            break;
        case INT:
            value = ionSys.newInt(reader.intValue());
            break;
        case TIMESTAMP:
            value = ionSys.newTimestamp(reader.timestampValue());
            break;             
        case STRING:
            String str = reader.stringValue();
            if (str.matches("^\\d{1,4}-\\d{1,2}-\\d{1,2}$")) {

                String[] parts = str.split("-");

                int year = Integer.parseInt(parts[0]);
                int month = Integer.parseInt(parts[1]);
                int day = Integer.parseInt(parts[2]);

                value = ionSys.newTimestamp(Timestamp.forDay(year, month, day));
            } else {
                value = ionSys.newString(str);
            }
            break;
        case STRUCT:
            reader.stepIn();
            IonStruct struct = ionSys.newEmptyStruct();
            while (reader.next() != null) {
                struct.put(reader.getFieldName(), parseElement(ionSys, reader));
            }
            value = struct;
            reader.stepOut();
            break;
        case LIST:
            reader.stepIn();
            IonList list = ionSys.newEmptyList();
            while (reader.next() != null) {
                list.add(parseElement(ionSys, reader));
            }
            value = list;
            reader.stepOut();
            break;                
    }

    return value;
}            

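Note that this code is not a complete parsing solution and is not intended for use in your applications. For example, the switch statement has no case for null values, and the date pattern does not check that the month and day are in range.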

Run the code. The output should look like this:

{
  PersonId:"987654321",
  FirstName:"Mary",
  LastName:"Smith",
  MoneyInWallet:143.39,
  DateOfBirth:1979-10-15,
  NumberOfLegs:2,
  LikesGreenBeans:true,
  ThingsInPocket:[
    "phone",
    "lipstick"
  ],
  HomeAddress:{
    Street1:"400 N. Broadway",
    City:"Yonkers",
    State:"NY",
    Zip:"10705"
  }
}

Note that our date of birth has been converted from a string to an Ion timestamp as before. Since we performed the conversion inline as we parsed, the “DateOfBirth” field has not been moved to the bottom of the document as it was in the previous exercise.

This method of parsing the JSON input and building an Ion document from it is clearly more work than the previous method that used the IonLoader. However, it provides much more control over the conversion process.

If you are having trouble getting your program to run, click here to download the complete App.java file for this lab.