Custom JSON Loader

An in-depth breakdown of how I created my own custom JSON loader

Introduction

Welcome to this blog post! This is going to be the first one not strictly on graphics programming or rendering code, but it is related in a bit of a roundabout way. If you haven’t read any of my other posts so far, then you can do so here - and I’ll also give a bit of a breakdown as to what this series is about. My name is Brandon, and I have been programming using C++ for many years now. I focus all of my programming efforts on gaming-related things, and my latest quest is to make my own custom game engine - which is also what this series of blog posts is about.

JSON

So, JSON, huh. What’s that? JSON is short for JavaScript Object Notation, and is a very useful text format for storing structured data. It is very often compared to XML in terms of usefulness. Below is an example of a JSON file, in case you have seen the format somewhere before:

{
    "scene" : 0,
    "scenes" : [
        {
            "name" : "Scene",
            "nodes" : [
                0,
                1,
                2
            ]
        }
    ],
    "meshes" : [
        {
            "name" : "Cube.001",
            "primitives" : [
                {
                    "attributes" : {
                        "POSITION" : 0,
                        "NORMAL" : 1,
                        "TANGENT" : 2,
                        "TEXCOORD_0" : 3
                    },
                    "indices" : 4
                }
            ]
        }
    ],
    "accessors" : [
        {
            "bufferView" : 0,
            "componentType" : 5126,
            "count" : 24,
            "max" : [
                1,
                1,
                1
            ],
            "min" : [
                -1,
                -1,
                -1
            ],
            "type" : "VEC3"
        }
    ],
    "bufferViews" : [
        {
            "buffer" : 0,
            "byteLength" : 288,
            "byteOffset" : 0
        }
    ],
    "buffers" : [
        {
            "byteLength" : 1224,
            "uri" : "cubetest.bin"
        }
    ]
}

(The actual data is from a simplified GLTF model file for a cube, and yes the bits I have removed make it invalid GLTF :( )

This format has lots of different uses, but in my context of creating a game engine my main ones are:

  • Storing model data using GLTF.

  • Holding lists of internal data.

  • Creating container types, such as a .skybox format (example below).

      {
          "Up": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_up.png",
          "Down": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_down.png",
          "Right": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_right.png",
          "Left": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_left.png",
          "Forward": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_front.png",
          "Back": "Assets\Textures\Skyboxes\Apocalypse\apocalypse_back.png"    
      }
    

Structure of a JSON file

Now that you have seen a couple of file examples, I’m going to break down what goes into making up the format.

Specification
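If you are interested in seeing the official specification for JSON, then it can be found here: https://www.json.org/json-en.html. The related JSON Schema specification, which covers describing and validating the structure of JSON documents rather than the format itself, is here: https://json-schema.org/specification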

The file format itself has three main components:

  • Objects.

  • Variables.

  • Lists.

Each one has its own specific syntax denoting that it is of that type. Here is a breakdown of each:

Variables:

These are the simplest of the three and consist of a name, a colon, and a bit of data. This data can be a string, a number, a boolean (true or false), or null. Here are some examples of valid variables:

"MyMagicVariableName": "My cool variable string",
"MyIntegralVariableOfMagic": 1.0,
"FunBool": true

Lists:

Being very similar to variables, lists consist of a name, a colon, and then a scope of square brackets […]. Within these square brackets is a comma-separated list of data. Each element can theoretically be any type, but in practice a list mostly stores a collection of the same type, such as a list of objects or a list of strings. Below is an example:

"StringListExample":[
    "I have done blogs on other topics",
    "such as order independent transparency",
    "and deferred rendering",
    "here is the link: https://indiegamescreation.hashnode.dev/"
],

"ObjectListExample":[
    {
        "name": "ObjectOne"
    },

    {
        "name": "ObjectTwo"
    }
]

Objects:

Objects are the most complex of the three. They are made up of a scope of curly brackets {…}.

They can have a name associated with them, given before the curly brackets, but don’t have to have one. They can also contain a bunch of different types of data, such as some variables, lists, and even other objects. Below is an example:

{
        "name" : "TestDataObject",
        "rotation" : [
             0.16907575726509094,
             0.7558803558349609,
            -0.27217137813568115,
             0.570947527885437
        ],
        "translation" : [
            4.076245307922363,
            5.903861999511719,
            -1.0054539442062378
        ],
        "Sub-object":{
        }
},

Loading Logic

Now that each component has been covered, how do you go about loading this data into code? That is actually a really complex question due to the amount of variety that the format allows for. This is how I ended up going about it:

  • Create a code definition for each component.

  • Make a loop that goes through the file step by step and breaks it down into which component is currently being interpreted.

  • Based on the component determined, add to the current scope, which then changes the functionality of the loop.

  • When a full component is found, add it to the store and pop back the scope by one.

Code side definitions:

Variables have a name and some arbitrary data. So the data type below fully handles their use case:

std::pair<std::string, std::string>
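
Because a variable's value is kept as a string, any number or boolean has to be converted by whoever reads it. Here is a minimal sketch of what that could look like, assuming C++17. The names `JSONVariable`, `VariableAsDouble` and `VariableAsBool` are hypothetical helpers for illustration and are not part of the loader described in this post.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Alias for the variable type described above (the alias name is hypothetical)
using JSONVariable = std::pair<std::string, std::string>;

// Hypothetical helper: tries to interpret the stored text as a number.
// Returns std::nullopt if the text is not a number or has trailing characters.
std::optional<double> VariableAsDouble(const JSONVariable& variable)
{
    try
    {
        std::size_t charactersRead = 0;
        const double value = std::stod(variable.second, &charactersRead);

        // Reject values with trailing junk, such as "1.0abc"
        if (charactersRead != variable.second.size())
            return std::nullopt;

        return value;
    }
    catch (const std::exception&)
    {
        // std::stod throws if no conversion is possible or the value is out of range
        return std::nullopt;
    }
}

// Hypothetical helper: tries to interpret the stored text as a JSON boolean.
std::optional<bool> VariableAsBool(const JSONVariable& variable)
{
    if (variable.second == "true")  return true;
    if (variable.second == "false") return false;
    return std::nullopt;
}
```

Returning `std::optional` keeps the "was this actually a number?" check at the call site, which fits with the bool-returning getters used elsewhere in this post.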

Lists have a name, can contain anything, and have an unknown length, so this type covers them:

std::pair<std::string, std::vector<std::any>>
std::any
For anyone unfamiliar with std::any, I recommend reading up on it at least a little bit. It is a very useful concept for low level engine development: https://en.cppreference.com/w/cpp/utility/any
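
To show how elements can be pulled back out of a `std::vector<std::any>`, here is a small sketch using `std::any_cast`. The helper name `CollectStrings` is hypothetical, and it assumes that string elements were stored as `std::string` (the post does not say which types the loader actually puts into its lists).

```cpp
#include <any>
#include <string>
#include <vector>

// Hypothetical helper: collects every element of a list that holds a std::string,
// skipping elements of any other type.
std::vector<std::string> CollectStrings(const std::vector<std::any>& list)
{
    std::vector<std::string> result;

    for (const std::any& element : list)
    {
        // The pointer form of std::any_cast returns nullptr on a type mismatch
        // instead of throwing std::bad_any_cast
        if (const std::string* text = std::any_cast<std::string>(&element))
            result.push_back(*text);
    }

    return result;
}
```

The type asked for has to match the stored type exactly: an element stored as a `const char*` would not be found by `std::any_cast<std::string>`.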

Objects are the most complex of the three and required a full ‘class’ definition:

struct JSONObject
{
    JSONObject();
    JSONObject(std::string name, JSONObject* parent);
    ~JSONObject();

    // ------------------------------------------------------------------------------------------------- //

    bool AddVariable(const std::string& variableName, const std::string& startValue);
    bool GetVariableValue(const std::string& variableName, std::string& returnValue);
    bool GetVariableValue(const std::string& variableName, std::string*& outValueAbleToBeChangedByCaller);

    // ------------------------------------------------------------------------------------------------- //

    bool AddList(const std::string& listName, const std::vector<std::any>& startData);
    bool GetList(const std::string& listName, std::vector<std::any>& outVector);
    bool GetList(const std::string& listName, std::vector<std::any>*& outValueAbleToBeChangedByCaller);

    // ------------------------------------------------------------------------------------------------- //

    bool AddSubObject(const JSONObject& newObjectData);
    bool GetSubObject(const std::string& objectName, JSONObject& outObject);
    bool GetSubObject(const std::string& objectName, JSONObject*& outValueAbleToBeChangedByCaller);

    // ------------------------------------------------------------------------------------------------- //    

    void OutputDataToFile(std::ofstream& openFile, unsigned int tabCount = 0, bool outputname = true);

    // ------------------------------------------------------------------------------------------------- //    

    void Clear();

    // ------------------------------------------------------------------------------------------------- //    

    JSONObject*                                                mParent;
    std::string                                                mName;
    std::vector<JSONObject>                                    mSubObjects;
    std::vector<std::pair<std::string, std::string>>           mVariables;
    std::vector<std::pair<std::string, std::vector<std::any>>> mLists;
};

This object contains lists for each of its content types: variables, lists, and sub-objects. It also has a name, and a pointer to a parent for the case where this object is a sub-object itself. All of the functions stated are for either adding or getting data from this object, with two exceptions: Clear(), which clears everything stored, and OutputDataToFile(), which writes out the data to a given file.
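
The two flavours of getter declared above differ in what the caller receives. Here is a sketch of how such a pair could be written, assuming a simple linear search by name. `MiniObject` is a cut-down, hypothetical stand-in for JSONObject, and the function bodies are a guess at the behaviour rather than the real implementation.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical cut-down stand-in for JSONObject, holding only the variable store
struct MiniObject
{
    std::vector<std::pair<std::string, std::string>> mVariables;

    // Copying getter: the caller gets its own copy of the value
    bool GetVariableValue(const std::string& variableName, std::string& returnValue) const
    {
        for (const auto& variable : mVariables)
        {
            if (variable.first == variableName)
            {
                returnValue = variable.second;
                return true;
            }
        }

        return false;
    }

    // Pointer getter: the caller gets the address of the stored value,
    // so writing through the pointer changes the object's data
    bool GetVariableValue(const std::string& variableName, std::string*& outValue)
    {
        for (auto& variable : mVariables)
        {
            if (variable.first == variableName)
            {
                outValue = &variable.second;
                return true;
            }
        }

        outValue = nullptr;
        return false;
    }
};
```

One thing to keep in mind with this layout is that a pointer into a std::vector stops being valid if the vector reallocates, so a pointer handed out like this should be used before any more variables are added.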

JSON File

Now that there is a code definition for the format’s elements, a definition for a whole ‘file’ can be written. This definition looks like so:

class JSONFile
{
public:
    JSONFile(std::string filePath);
    JSONFile(char* data, unsigned int length);
    ~JSONFile();

    [[nodiscard]] JSONObject& GetRootNode() { return mFile; }

    void OutputData(); // Writes data out to the console for debug purposes
    void WriteDataBackToFile();
    void Clear();

private:
    void LoadFile(std::string filePath);
    void LoadFile(char* data, unsigned int length);

    void HandleDefiningObject  (char* data, unsigned int length, unsigned int& currentIndex, std::vector<JSONLoadingState>& fileLoadingStateStack, JSONObject*& currentObjectDefinition);
    void HandleDefiningVariable(char* data, unsigned int length, unsigned int& currentIndex, std::vector<JSONLoadingState>& fileLoadingStateStack, JSONObject*& currentObjectDefinition);
    void HandleDefiningList    (char* data, unsigned int length, unsigned int& currentIndex, std::vector<JSONLoadingState>& fileLoadingStateStack, JSONObject*& currentObjectDefinition);
    void LoopForOpeningOfFile  (char* data, unsigned int length, unsigned int& currentIndex, std::vector<JSONLoadingState>& fileLoadingStateStack, bool& startedReadingData);

    std::string GrabName(char* data, unsigned int& currentIndex);

    void        OutputDataForObject(JSONObject* object, unsigned int recursionDepth);

    JSONObject  mFile;
    std::string mFilePath;
};

This class takes advantage of the fact that the JSON files it loads will always have a ‘root’ node - which is just an object that wraps around the entire scope of the file. Using this knowledge, the file can hold a root node, and all of the sub-data is held within it.

[[nodiscard]]
This is something that many people may not have seen before, and it’s not something that I use often. What it does is encourage the compiler to emit a warning if the caller discards the returned value (storing it in a variable counts as using it). https://en.cppreference.com/w/cpp/language/attributes/nodiscard
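
As a small illustration of the attribute, here is a sketch with a made-up function. `TryParseFlag` is hypothetical and has nothing to do with the loader itself.

```cpp
#include <string>

// Hypothetical function: the return value says whether parsing worked,
// so ignoring it would almost certainly be a mistake
[[nodiscard]] bool TryParseFlag(const std::string& text, bool& outValue)
{
    if (text == "true")  { outValue = true;  return true; }
    if (text == "false") { outValue = false; return true; }

    return false;
}
```

Calling `TryParseFlag(text, flag);` as a statement on its own is the case the attribute is aimed at, and compilers will typically warn about it. Using the result in an `if`, or storing it, keeps things quiet.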

Reading in the File

Now that each element of the process has a wrapper around it, I can get to the actual loading functionality. The logic for this part is to set up a while loop that runs until either the end of the file has been reached, or some fail-safe loop count has been hit - as an infinite loop is no fun.

Eagle-eyed readers may have noticed that there were two different LoadFile functions defined before, one taking a file path and another taking an array of chars and a length. The one with a file path loads in the file’s raw data, and then passes it into the other function to keep the loading flows the same.
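
The file-path version is not shown in this post, but the first half of its job (getting the raw bytes into memory) could look something like the sketch below. `ReadWholeFile` is a hypothetical helper name, and returning an empty buffer on failure is an assumption made for the example.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads an entire file into memory.
// Returns an empty buffer if the file could not be opened.
std::vector<char> ReadWholeFile(const std::string& filePath)
{
    // Binary mode so that the bytes arrive exactly as they are on disk
    std::ifstream file(filePath, std::ios::binary);
    if (!file.is_open())
        return {};

    // Copy everything from the current position to the end of the stream
    return std::vector<char>(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
}
```

The resulting buffer's `data()` and `size()` can then be handed to a function taking a `char*` and a length. Note that binary mode means Windows line endings arrive as "\r\n", so the character handling further down needs to cope with '\r'.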

The LoadFile(char* data, unsigned int length) version is actually fairly straightforward (it wasn’t when coming up with it, but is now :D)

Here is the definition:

void JSONFile::LoadFile(char* data, unsigned int length)
{
    JSONObject*                   currentObjectDefinition = &mFile;
    std::vector<JSONLoadingState> fileLoadingStateStack   = {JSONLoadingState::None};
    unsigned int                  currentIndex            = 0;
    bool                          startedReadingData      = false;

    while(currentIndex < length)
    {
        switch (fileLoadingStateStack.back())
        {
        default:
            ASSERTFAIL("Invalid JSON formatting of file!");
        break;

        case JSONLoadingState::None:
            // If we have read any data so far and gotten back to this point then we have finished the file
            if (startedReadingData)
                return;

            LoopForOpeningOfFile(data, length, currentIndex, fileLoadingStateStack, startedReadingData);
        break;

        case JSONLoadingState::DefiningObject:
            HandleDefiningObject(data, length, currentIndex, fileLoadingStateStack, currentObjectDefinition);
        break;

        case JSONLoadingState::DefiningVariable:
            HandleDefiningVariable(data, length, currentIndex, fileLoadingStateStack, currentObjectDefinition);
        break;

        case JSONLoadingState::StatingList:
            HandleDefiningList(data, length, currentIndex, fileLoadingStateStack, currentObjectDefinition);
        break;
        }
    }

    // Now we have handled the whole file, so only the starting 'None' state should be left on the stack
    ASSERTMSG(fileLoadingStateStack.size() == 1 && fileLoadingStateStack.back() == JSONLoadingState::None, "JSON File is an invalid format!");
}

The logic behind it is to have a ‘stack‘ of current scope, where the scope is added to/removed from based on the current state - be that defining a specific concept, or looping for the opening of the file. The JSON states enum looks like this:

enum class JSONLoadingState : char
{
    None, // Starting state
    StatingList,
    DefiningObject,
    DefiningVariable
};
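
To see the stack idea in isolation, here is a much smaller sketch that uses the same push-on-open, pop-on-close approach just to check that the brackets in some JSON text match up. The `Scope` enum and `ScopesAreBalanced` function are hypothetical and are not part of the loader. Unlike the loader, this sketch also skips over the contents of strings so that a bracket inside a string is not counted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scope type for this sketch only
enum class Scope : char
{
    Object,
    List
};

// Hypothetical helper: returns true if every '{' and '[' outside of a string
// is closed by the matching bracket, in the right order.
bool ScopesAreBalanced(const std::string& text)
{
    std::vector<Scope> scopeStack;
    bool               insideString = false;

    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];

        if (insideString)
        {
            if (c == '\\')     ++i;                  // Skip the escaped character
            else if (c == '"') insideString = false; // Closing speech mark
            continue;
        }

        switch (c)
        {
        case '"': insideString = true;                 break;
        case '{': scopeStack.push_back(Scope::Object); break;
        case '[': scopeStack.push_back(Scope::List);   break;

        case '}':
            if (scopeStack.empty() || scopeStack.back() != Scope::Object) return false;
            scopeStack.pop_back();
        break;

        case ']':
            if (scopeStack.empty() || scopeStack.back() != Scope::List) return false;
            scopeStack.pop_back();
        break;

        default: break;
        }
    }

    // Everything opened must have been closed again
    return scopeStack.empty() && !insideString;
}
```

The back of the stack always says which scope is currently open, which is the same piece of information the loader's switch statement uses to decide which handler to call.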

Each sub-function gets fairly complex, but all work on the same idea: loop through the data from the index that was passed in, and depending on what is read, define new objects or break out. For the sake of completeness (and because this blog has ended up going into a lot more detail than I had originally planned) I’m going to give the function definitions for finding the start of the file (as it’s by far the simplest), and then give the one for defining an object (as it’s the most complex).

void JSONFile::LoopForOpeningOfFile(char* data, unsigned int length, unsigned int& currentIndex, std::vector<JSONLoadingState>& fileLoadingStateStack, bool& startedReadingData)
{
    bool looping = true;

    while (looping && currentIndex < length)
    {
        switch (data[currentIndex])
        {
        case '{':
            fileLoadingStateStack.push_back(JSONLoadingState::DefiningObject);
            looping            = false;
            startedReadingData = true;
        break;

        default: break;
        }

        ++currentIndex;
    }
}

What this is doing is looking for the opening ‘{’ at the start of the file. If it doesn’t find it then it increments the index and moves on to the next character. Now for the complex one (I have compressed it a bit to make it take up less space but not removed any actual code here):

void JSONFile::HandleDefiningObject(char*                          data, 
                                    unsigned int                   length, 
                                    unsigned int&                  currentIndex, 
                                    std::vector<JSONLoadingState>& fileLoadingStateStack, 
                                    JSONObject*&                   currentObjectDefinition)
{
    bool        looping = true;
    std::string name    = "";

    while (looping && currentIndex < length)
    {
        switch (data[currentIndex])
        {
        // Speech marks means that we are defining either a new object within this object, variable or list
        case '"':
        {
            name = GrabName(data, currentIndex); // Get the name of this object            
            bool searching = true;               // Find the next valid character

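            // Skip whitespace and the ':' separator, without reading past the end of the data
            while (searching && currentIndex < length)
            {
                switch (data[currentIndex])
                {
                default:
                    searching = false;
                break;

                case ' ':
                case '\t':
                case '\r':
                case '\n':
                case ':':
                    currentIndex++;
                break;
                }
            }

            // A name with nothing after it means the data has been cut short
            if (currentIndex >= length)
            {
                ASSERTFAIL("JSON file format is invalid!");
                return;
            }

            // See what the object type is next
            switch (data[currentIndex])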
            {
            case '[':
                fileLoadingStateStack.push_back(JSONLoadingState::StatingList);

                // Create the new list in the object - giving it its name
                currentObjectDefinition->mLists.push_back(std::pair(name, std::vector<std::any>()));
                currentIndex++;
            return;

            case '{':
                fileLoadingStateStack.push_back(JSONLoadingState::DefiningObject);

                // Create the new sub-object in the current object - giving it its name and parent
                currentObjectDefinition->mSubObjects.push_back(JSONObject(name, currentObjectDefinition));

                // Move the object being pointed to down one
                currentObjectDefinition = &currentObjectDefinition->mSubObjects.back();
                currentIndex++;
            return;

            default:
                fileLoadingStateStack.push_back(JSONLoadingState::DefiningVariable);

                // Create the new variable, giving it its name
                currentObjectDefinition->mVariables.push_back(std::pair(name, ""));
            return;
            }
        }
        break;

        // Ending of an object definition
        case '}':

            // Remove the defining state from the stack
            fileLoadingStateStack.pop_back();

            // Move the object definition stack back up one to the parent
            currentObjectDefinition = currentObjectDefinition->mParent;

            // Move past the bracket
            currentIndex++;
        return;

        case ' ':  // Empty space
        case '\t': // Tab
        case '\r': // Carriage return
        case '\n': // New line
        case ',':  // Comma
        case '\0': // Null terminator
            ++currentIndex;
        break;

        // If we don't know how to handle this state then we need to throw an error
        default: ASSERTFAIL("JSON file format is invalid!");
        }
    }
}

The other two functions follow the same idea as the two above, and I invite you to try writing them just to see how finicky they can be to get right.
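
If you want a starting point, the name-grabbing helper is the smallest piece to try first. GrabName is only declared in this post, so the sketch below is a guess at its behaviour based on how it is called: it starts on the opening speech mark and leaves the index just past the closing one. The name `GrabNameSketch` and the extra `length` parameter are additions for this example, and escape sequences such as \" are deliberately not handled.

```cpp
#include <string>

// Hypothetical sketch of a name-grabbing function.
// Expects data[currentIndex] to be the opening speech mark, and leaves
// currentIndex one past the closing speech mark (or at length if it is missing).
std::string GrabNameSketch(const char* data, unsigned int length, unsigned int& currentIndex)
{
    std::string name;

    ++currentIndex; // Step over the opening speech mark

    // Copy characters until the closing speech mark or the end of the data
    while (currentIndex < length && data[currentIndex] != '"')
    {
        name += data[currentIndex];
        ++currentIndex;
    }

    if (currentIndex < length)
        ++currentIndex; // Step over the closing speech mark

    return name;
}
```

Passing the length in means a file with a missing closing speech mark runs out of data cleanly instead of reading past the end of the buffer.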

That’s it for the loading logic. Now I’ll give an example of the classes being used so that you can see how useful this flow can be.

Using the Code:

The whole reason I originally wrote a JSON loader was to facilitate loading GLTF models into my engine (as they are stored using JSON). Yes, I could have used an existing library for this, but where is the fun in that? Showing how to load GLTF model data into code would take up an entire article by itself, so I’m going to give a more basic example here.

Here is an example of loading in JSON data from file, modifying the contents, and then writing it back out:

    JSONFile exampleFile(filePath);
    JSONObject& root = exampleFile.GetRootNode();

    // Use the pointer overloads so that the changes are made to the data held by the file, not to a copy
    std::string* exampleVariable = nullptr;
    if (root.GetVariableValue("testVariable", exampleVariable))
    {
        *exampleVariable = "New value being set";
    }

    std::vector<std::any>* exampleList = nullptr;
    if (root.GetList("testList", exampleList))
    {
        exampleList->clear();
    }
    exampleFile.WriteDataBackToFile();

Conclusion

To conclude, this post has covered what JSON is, common uses for the file format, and how I went about creating my own loader using C++. This has ended up being one of my longer blogs, but I felt that completeness in the code for this one was important, just in case someone was actually following along trying to write their own using this as a base idea.

Thanks for reading! If you are interested in this topic, or in the area of graphics programming, then I have other blog posts available here: https://indiegamescreation.hashnode.dev/. Or you could sign up for the newsletter so that you get notified whenever I post a new one. Oh, and if you liked the blog then there is a like button at the bottom of the page. I have been trying to understand this website’s algorithm recently and have an idea that it is entirely based on the number of likes a post has.

Next week’s blog is going to be on how I have used scripting to enhance my engine development flows.