JSON Schema Validator, Generator & Editor Guide

How the JSON standard is defined and how to put it to use in your code and in your APIs

      

Two decades after its introduction, JSON is a widely used data interchange format. Flexible and language-agnostic, JSON can represent simple to complex data in a way that’s easy for humans and machines to interpret. Due to the rise of mobile and APIs, JSON has become widespread throughout the industry.

In this guide, we’ll see how the JSON standard is defined and how you can put it to use the right way—in your code and in your APIs.

🔗 🔗 What is JSON?

JSON is JavaScript Object Notation, a language-independent data format commonly used by APIs to communicate requests and responses.

While based on JavaScript, JSON can be used in any modern language. The simple, flexible JSON syntax can express most data structures. JSON is easy to read and write, for both humans and machines. Compared to more verbose data formats, such as XML, JSON is lightweight, with little syntactical overhead.

You can format JSON with just a few characters on a standard keyboard. For example:

{
  "id": 246,
  "name": "Jason Harmon"
}

As you’ll see later in this guide, JSON is capable of much more complex data. The individual elements within even the most advanced JSON structure are based on a few simple rules. These rules, the notation of JSON itself, are derived from the language that inspired the format, JavaScript.

🔗 🔗 JSON vs JavaScript

JSON is not only named after JavaScript, it was created based upon the language. Specifically, JSON syntax is based on a subset of JavaScript syntax: JavaScript is standardized as ECMA-262, while JSON has its own standard, ECMA-404. However, JSON is not the same as JavaScript. JSON is a data format, while JavaScript is a scripting language.

In the early 2000s, engineer and architect Douglas Crockford defined JSON as a lightweight data interchange format. At the time, XML was a popular data format, though it posed difficulties for some languages, most notably JavaScript. Parsing XML into a data structure is computationally intensive for larger files. In addition, XML’s tag-based syntax makes for bulky and redundant data.

JSON, on the other hand, can be evaluated directly by JavaScript (though for security reasons, it’s best to parse the data). Other languages are also able to easily consume JSON data.
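The same advice carries over to other languages. As a minimal sketch in Python (the parse_untrusted helper is a hypothetical name, not part of any library), the standard json module treats its input purely as data, so text that is not valid JSON is rejected rather than executed:

```python
import json

def parse_untrusted(text):
    """Parse text as JSON data; return None if it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# Valid JSON is turned into plain data structures.
user = parse_untrusted('{"id": 246, "name": "Jason Harmon"}')

# Anything that is not JSON, such as code, is rejected instead of being run.
rejected = parse_untrusted('__import__("os").getcwd()')
```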

🔗 🔗 JSON Structure

Perhaps the most recognizable element of JSON is the curly brackets { and } that typically wrap JSON data or files. Indeed, those brackets define an object, an important part of the JSON structure. However, objects are only one element of a larger definition of JSON.

The JSON format includes four core data types and two data structures that can hold multiple values. In combination, these six types allow JSON to describe data of various shapes, from simple records to complex, nested documents.

JSON structure includes these six data types:

  • Object
  • Array
  • String
  • Number
  • Boolean
  • Null

An object contains zero or more values, each attached to a key. Also called a hashtable or dictionary in some programming languages, this JSON data structure uses curly brackets to surround the object, a colon to separate each key from its value, and commas to delimit each key-value pair.

An array contains zero or more values in an ordered list. Unlike objects, arrays have no keys, only values. Arrays use square brackets [ and ], with commas to delimit each value.

Both objects and arrays may contain any data type as values, including other objects and arrays. The structure is much easier to understand by example, many of which are provided in the next section.
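To make the six types concrete, here is a small sketch using Python’s standard json module. The sample document and its key names are made up for illustration; the point is only to show which native type a parser produces for each JSON type:

```python
import json

# One JSON document that uses all six types (keys are illustrative only).
text = (
    '{"object": {"k": "v"}, "array": [1, 2], "string": "hi", '
    '"number": 1.5, "boolean": true, "null": null}'
)

data = json.loads(text)

# Map each key to the name of the Python type the parser produced.
python_types = {key: type(value).__name__ for key, value in data.items()}
print(python_types)
```

Other languages make similar choices, though the exact native types differ from parser to parser.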

🔗 🔗 JSON Examples

Just as a picture is worth 1,000 words, you can better understand this data format through some JSON file examples. What is a JSON file and how does the structure play out in practice? We have several sample JSON objects in the sections that follow.

🔗 🔗 Basic JSON Object

You’ve already seen a basic JSON object in the previous section. It uses curly brackets to encapsulate related pieces of data. Let’s add a few more fields to this JSON sample to represent a todo list task.

{
  "id": 12345,
  "name": "Do a thing",
  "completed": false,
  "completed_at": null
}

The object data type is the most important JSON building block. Objects contain keys, which translate to field names in API requests, necessary to deliver data between systems.

🔗 🔗 JSON Array Example

The second data structure within JSON is the array, which holds a list of values. While the order of properties within objects is often not maintained, position matters in an array.

[12345, "Do a thing", false, null]

Arrays use square brackets [ and ], rather than the curly style employed by objects. In the example array above we’ve represented the values from the todo JSON object, primarily to show that each element of an array does not need to be the same type.

In practice, arrays will most often include similar data:

[1, 1, 2, 3, 5, 8, 13, 21]

Like objects, arrays will show up in most API responses, as APIs often need to show multiple results (often referred to as “collections”).
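The difference in ordering between arrays and objects can be checked with a short sketch. It assumes Python’s standard json module, where arrays become lists and objects become dictionaries:

```python
import json

# Arrays are ordered: the same values in a different order are a different array.
a = json.loads('[1, 2, 3]')
b = json.loads('[3, 2, 1]')
arrays_equal = (a == b)

# Objects are keyed: the same key/value pairs in a different order are the same object.
x = json.loads('{"id": 12345, "completed": false}')
y = json.loads('{"completed": false, "id": 12345}')
objects_equal = (x == y)
```

Here arrays_equal is False while objects_equal is True, which is why consumers should rely on position only within arrays.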

🔗 🔗 Complete JSON Example

We can expand the todo example to show a list of tasks. In the process, we’ll use all six data types JSON has to offer:

[
  {
    "id": 12345,
    "name": "Do a thing",
    "completed": false,
    "completed_at": null
  },
  {
    "id": 67890,
    "name": "Do another thing",
    "completed": false,
    "completed_at": null
  }
]

The two todo list objects are held within a single array. Each todo object contains an id number, a name string, a completed boolean, and a completed_at field that holds a null value.

🔗 🔗 Nested JSON Example

One important distinction of JSON data is its flexibility. Both objects and arrays can also be values, which allows for hierarchy and nesting.

This sample JSON expands the todo list examples used previously to show an example of nested values:

{
  "two_week_task_counts": [
    [11, 4, 2, 23, 6, 14, 22],
    [14, 1, 3, 26, 11, 24, 9]
  ],
  "task_list": [
    {
      "id":12345,
      "name":"Do a thing",
      "completed":false,
      "completed_at":null,
      "next_task_ids": [67890],
      "user":{
        "id":246,
        "name":"Jason Harmon"
      }
    },
    {
      "id":67890,
      "name":"Do another thing",
      "completed":false,
      "completed_at":null,
      "next_task_ids": [],
      "user":{
        "id":246,
        "name":"Jason Harmon"
      }
    }
  ]
}

Though a large block of text, the primary JSON object only includes two fields: two_week_task_counts and task_list.

Within the first is an array holding two values that each represent a week. Each of those values is, itself, an array with seven other values (all integers, in this case). Just in the first couple lines of the JSON file, we have three levels of data: the object key, the array of weeks, and each week’s days.

The second field of the primary JSON object holds the task list we used in previous examples. That array of two tasks is now another level deeper, inside the object. Each todo object has a couple more fields, as well. The next_task_ids field is an array of task IDs (though the example data shows arrays of one and zero items). There is also a user object, with its own set of fields.

Data is often even more complex than this example, but here we see how nested JSON works. It’s flexible enough to represent most data.
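Reading nested values is a matter of chaining object keys and array indexes. The sketch below uses a shortened copy of the nested sample and Python’s standard json module; the variable names are illustrative:

```python
import json

text = """
{
  "two_week_task_counts": [
    [11, 4, 2, 23, 6, 14, 22],
    [14, 1, 3, 26, 11, 24, 9]
  ],
  "task_list": [
    {
      "id": 12345,
      "name": "Do a thing",
      "next_task_ids": [67890],
      "user": {"id": 246, "name": "Jason Harmon"}
    }
  ]
}
"""
data = json.loads(text)

# Level 1 is the object key, level 2 the week, level 3 the day.
first_week_total = sum(data["two_week_task_counts"][0])
second_week_third_day = data["two_week_task_counts"][1][2]

# Object keys and array indexes chain together to reach nested values.
first_task_owner = data["task_list"][0]["user"]["name"]
```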

🔗 🔗 JSON Date Format

In our example task list, we never saw an example of a completed task. In that situation, the completed_at field would need to represent a date and time. JSON has no explicit date format, so dates are typically defined as strings.

There are several date standards, with ISO 8601 being the most widely used. ISO 8601 includes several formats, depending on what you want to communicate, and RFC 3339 defines a profile of it for use in internet protocols. Here’s how we might express a date and time in our todo JSON object:

{
  "id":12345,
  "name":"Do a thing",
  "completed":true,
  "completed_at":"2021-04-01T00:01:30.237Z"
}

According to this data, the todo with ID 12345 was completed a little after midnight UTC on April 1, 2021.

This format is further supported by OpenAPI data types, which include two format types for dates: date and date-time.
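Because the parser hands the timestamp back as an ordinary string, converting it to a native date type is a separate step. Here is one way that might look in Python, using only the standard library:

```python
import json
from datetime import datetime, timezone

text = '{"id": 12345, "completed": true, "completed_at": "2021-04-01T00:01:30.237Z"}'
todo = json.loads(text)

# The JSON parser leaves the timestamp as a plain string.
raw = todo["completed_at"]

# Older Python versions do not accept the trailing "Z" in fromisoformat,
# so swap it for the equivalent "+00:00" offset before parsing.
completed_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))

# Going the other way: format the timezone-aware datetime back into the same string.
as_string = completed_at.astimezone(timezone.utc).isoformat(timespec="milliseconds")
as_string = as_string.replace("+00:00", "Z")
```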

🔗 🔗 Create JSON From Your Data

Most JSON is output from code, not written by hand. In your programming language of choice, you can convert an object to JSON, using whichever data structures are supported by your language.

Below we’ll show some examples in popular languages.

🔗 🔗 PHP Object to JSON

<?php
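// Build a generic object and set each field of the task.
$oneTask = new stdClass();
$oneTask->id = 12345;
$oneTask->name = "Do a thing";
$oneTask->completed = false;
$oneTask->completed_at = null;

// Serialize the object to a JSON string and print it.
echo json_encode($oneTask);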
?>

🔗 🔗 Python Dictionary to JSON

import json

# A dictionary maps directly onto a JSON object.
oneTask = {
  'id': 12345,
  'name': 'Do a thing',
  'completed': False,
  'completed_at': None
}

# Serialize the dictionary to a JSON string and print it.
print(json.dumps(oneTask))

🔗 🔗 JavaScript JSON Stringify

const oneTask = {
  id: 12345,
  name: 'Do a thing',
  completed: false,
  completed_at: null,
};
console.log(JSON.stringify(oneTask));

🔗 🔗 Use JSON in APIs

JSON is the most popular data format for public APIs, outpacing XML by more than 10% all time. However, looking only at the recent data, the story is even clearer: there are more than 5X the number of JSON APIs as the nearest format, and APIs are twice as likely to use the JSON content type as any other format.

Alongside JSON, most APIs use REST or otherwise HTTP-driven interfaces. To build your own, you can learn from many public APIs, examples available on GitHub, and API description formats.

In this section, we’ll cover some REST API best practices with JSON data, as well as some JSON standards: the OpenAPI Specification, JSON Schema, and JSON:API.

🔗 🔗 Use JSON as Request or Response Data

In the previous section, we described creating JSON from your data. When a developer makes a request to your API, they will receive your JSON text as a response. What had previously been a data structure in your programming language is serialized into a JSON string. The API consumer then reads this JSON into their code, likely converting it into a native object of their language. Notably, your server and the client do not need to use the same language or toolset, one of the advantages of the JSON format.

For example, you might call the endpoint /todos and get the following response:

[
  {
    "id": 12345,
    "name": "Do a thing",
    "completed": false,
    "completed_at": null
  },
  {
    "id": 67890,
    "name": "Do another thing",
    "completed": false,
    "completed_at": null
  }
]

Here we see a JSON array containing two objects, each with details on a particular “todo,” or task. The content type JSON uses is application/json, but it will be sent as plain text. You can see a similar result right in your browser by visiting todos.stoplight.io/todos.

An API is likely to use multiple HTTP methods on the same endpoint. While the above example uses GET to retrieve the list of todos, a REST API might use a POST to the same URL to create a new todo. In this case, JSON will be included in the request data to describe the new todo:

{
  "name": "Do one more thing",
  "completed": false
}

This JSON data is a subset of the full data, but it’s all that’s required to create a new todo. Often, a POST request will then respond with more JSON, this time just the object created by the API call:

{
  "id": 67891,
  "name": "Do one more thing",
  "completed": false,
  "completed_at": null
}

Here we get the full todo object, including the auto-generated ID. But there’s only one item, because that’s all that’s needed. If you GET the endpoint again, you’d expect to receive an array with three todos, rather than two.
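The request and response cycle described above can be sketched without a real server. In the following Python example, the get_todos and post_todo functions, the in-memory list, and the ID scheme are all hypothetical stand-ins for whatever your API framework and database provide:

```python
import json

# Hypothetical in-memory "database" standing in for a real server's storage.
todos = [
    {"id": 12345, "name": "Do a thing", "completed": False, "completed_at": None},
    {"id": 67890, "name": "Do another thing", "completed": False, "completed_at": None},
]

def get_todos():
    """Handle GET /todos: serialize the full list to a JSON string."""
    return json.dumps(todos)

def post_todo(request_body):
    """Handle POST /todos: parse the JSON request and return the created todo as JSON."""
    fields = json.loads(request_body)
    new_todo = {
        "id": max(t["id"] for t in todos) + 1,  # assumed ID scheme, for illustration only
        "name": fields["name"],
        "completed": fields["completed"],
        "completed_at": None,
    }
    todos.append(new_todo)
    return json.dumps(new_todo)

# The client side: send JSON text, receive JSON text, parse it into native objects.
created = json.loads(post_todo('{"name": "Do one more thing", "completed": false}'))
all_todos = json.loads(get_todos())
```

Only JSON strings cross the boundary between the two sides, which is why the client and server are free to use different languages.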

These simple examples show the bulk of how JSON is used in a typical API. However, more important are the standards used to support these common JSON use cases. Most notable among them is OpenAPI, covered in the next section.

🔗 🔗 OpenAPI Specification

Most developers can easily build an API that returns JSON data. In fact, it’s ordinary and expected of a modern developer. What’s much less common is for a developer to build a robust, documented API. However, that’s much easier than it used to be, thanks to the OpenAPI specification.

OpenAPI is a description format that defines an API’s servers, endpoints, and data objects. It can be used in the earliest planning stages of an API to collaborate on the API design. The API description, a file stored as either JSON or YAML, can be used throughout the API lifecycle. You can generate documentation, build mock servers, and validate that you build an API that matches your design.

Here is an API example, described with OpenAPI:

{
  "openapi": "3.1.0",
  "info": {
    "title": "todos",
    "version": "1.0"
  },
  "servers": [
    {
      "url": "https://todos.stoplight.io"
    }
  ],
  "paths": {
    "/todos": {
      "get": {
        "summary": "Your GET endpoint",
        "tags": [],
        "responses": {
          "200": {
            "description": "OK",
            "content": {
              "application/json": {
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/components/schemas/todo-full"
                  }
                }
              }
            }
          }
        },
        "operationId": "get-todos",
        "description": "List todos"
      },
      "post": {
        "summary": "",
        "operationId": "post-todos",
        "responses": {
          "200": {
            "description": "OK",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/todo-full"
                }
              }
            }
          }
        },
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/todo-partial"
              }
            }
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "todo-full": {
        "title": "todo-full",
        "type": "object",
        "properties": {
          "id": {
            "type": "number"
          },
          "name": {
            "type": "string"
          },
          "completed": {
            "type": "boolean"
          },
          "completed_at": {
            "type": ["string", "null"],
            "format": "date-time"
          }
        },
        "required": ["id", "name", "completed"]
      },
      "todo-partial": {
        "title": "todo-partial",
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "completed": {
            "type": "boolean"
          }
        },
        "required": ["name", "completed"]
      }
    }
  }
}

This OpenAPI document is written as JSON, but you can see an example OpenAPI YAML here. The format used to describe an API is not tied to the format used for requests and responses. Since most APIs use JSON, most OpenAPI documents describe JSON—note the application/json within the content object above.

While JSON is a compact, flexible data format, consumers of a JSON API want to expect certain fields in the responses. OpenAPI provides that expectation, which some refer to as an “API contract.” In the above example, we know that every todo will always include id, name, and completed. Optionally, it may also include a completed_at timestamp.
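Tools read this contract by following the $ref pointers into the components section. A rough sketch of that lookup in Python follows; the resolve_ref helper is hypothetical, handles only local references, and works on an abbreviated copy of the document:

```python
# Abbreviated copy of an OpenAPI document, as a Python dictionary.
spec = {
    "paths": {
        "/todos": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/todo-full"},
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "todo-full": {
                "type": "object",
                "properties": {
                    "id": {"type": "number"},
                    "name": {"type": "string"},
                    "completed": {"type": "boolean"},
                },
                "required": ["id", "name", "completed"],
            }
        }
    },
}

def resolve_ref(document, ref):
    """Follow a local reference such as '#/components/schemas/todo-full'.

    Simplified: no support for external files or escaped characters.
    """
    if not ref.startswith("#/"):
        raise ValueError("only local references are supported in this sketch")
    node = document
    for part in ref[2:].split("/"):
        node = node[part]
    return node

response = spec["paths"]["/todos"]["get"]["responses"]["200"]
schema = response["content"]["application/json"]["schema"]
todo_schema = resolve_ref(spec, schema["items"]["$ref"])
required_fields = todo_schema["required"]
```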

OpenAPI helps you describe all aspects of your APIs. Another related format, JSON Schema, is dedicated to the objects within an API or any other place JSON is used.

🔗 🔗 JSON Schema Specification

Before JSON became the popular data format, XML was most commonly used for data interchange. Almost every API at that time would include XML responses, regardless of the type of API (often REST or SOAP). While more verbose than JSON, XML strictly adheres to schemas, which determine the expected elements within the data. Eventually, JSON needed a similar approach, which led to the creation of the JSON Schema specification.

JSON Schema is a standard to describe JSON documents. API request and response data is one of the common uses of JSON Schema, but any JSON file can be described. For example, configuration files are another common usage of JSON Schema, where both parties need to validate that a file meets the schema.

So what does a JSON Schema file look like? Well, it’s written in JSON, so it gets all the benefits of the format, such as strong readability and simple writability. That also means that everything we’ve touched on so far, such as data types, also applies to JSON Schema files.

JSON Schema files contain information as keywords. Think of a JSON Schema file as following the general JSON format, but the schema itself constrains the availability of keywords. Let’s look at what keywords can make up a JSON Schema file.

🔗 🔗 JSON Schema Metadata

JSON Schema files often start with the following metadata keywords:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "http://example.com/product.schema.json",
  "title": "Product",
  "description": "A product in the catalog",
  "type": "object"
}

The first keyword listed is the $schema keyword. JSON Schema is continually updated, so it’s good practice to include the version of the standard.

Note that title and description are descriptive keywords (also known as annotations). JSON files won’t be validated against these, but they are still useful for understanding the schema’s purpose.

You’ll notice that we set the type of the entire schema to object. As we’ll see soon, JSON Schema can also specify the data type of each property. The type keyword is optional, but setting it at the top level ensures that only JSON objects are valid against the schema.

🔗 🔗 JSON Schema Properties

For validation purposes, we must turn to the properties keyword. Within properties, a JSON Schema file lists keys, each described with validation keywords. A key can describe any field that will show up in the relevant JSON file. Let’s look at an example:

{
  "type": "object",
  "properties": {
    "id": {
      "type": "number"
    }
  }
}

With this JSON Schema file, we’ve established that we can expect, but not require, JSON files with the id key. The power of JSON Schema files lies in imposing conditions on those keys for validation.

JSON Schema Data Types

In this example, the id key has the validation keyword type set to number. Therefore, any id in a validated JSON file must be of the number data type, as in the following file:

{
  "id": 123
}

If you were to validate a JSON file using this schema, an id of another data type (a string, for example) wouldn’t pass validation.

JSON Schema uses the same data types as JSON, with the addition of integer. If decimals are required, you can use a number instead.
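To illustrate how a type check behaves, here is a hand-written sketch in Python. The matches_type function is hypothetical and covers only the type keyword; a real validator library does far more:

```python
def matches_type(value, type_name):
    """Return True if a parsed JSON value matches a JSON Schema type name."""
    if type_name == "null":
        return value is None
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    # Python treats bool as a kind of int, so exclude it from the numeric types.
    is_numeric = isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "number":
        return is_numeric
    if type_name == "integer":
        # In recent drafts, a number with no fractional part (such as 1.0) is an integer.
        return is_numeric and (isinstance(value, int) or value.is_integer())
    raise ValueError(f"unknown type: {type_name}")

schema = {"type": "object", "properties": {"id": {"type": "number"}}}
id_type = schema["properties"]["id"]["type"]

number_ok = matches_type(123, id_type)    # as in {"id": 123}
string_ok = matches_type("123", id_type)  # as in {"id": "123"}
```

With this schema, number_ok is True and string_ok is False. Switching the type to integer would additionally reject values such as 1.5.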

JSON Schema Data Formats

This ability to further specify what is acceptable in the file can be honed even more. JSON Schema contains many validation keywords for all data types. That being said, there are often times where you’ll need additional, or even custom, keywords. JSON Schema has a way to address that.

JSON Schema does so using the format keyword. This can be used to specify strings with a particular semantic structure. The built-in formats cover things such as date formats and email addresses.

{
  "type": "object",
  "properties": {
    "id": {
      "type": "number"
    },
    "email-address": {
      "type": "string",
      "format": "email"
    }
  }
}

In this example, we’re looking for a string that takes the form of an email address: it contains an “@”, among other criteria. The format can’t confirm that the address really exists, only that the string at least takes the form of one. If the string doesn’t take that form, it won’t be valid, at least in theory.

There’s a catch, though. Validation with format keywords isn’t a required feature for validators. This means that you’ll have to look into your own validator to see how it handles specific format keywords.

Why is this the case? Well, it’s actually to provide more flexibility for JSON files. This same technique can be used to specify more niche keywords. Perhaps your field of work has a commonly used format. As long as you use a validator that supports custom format keywords, you can streamline your JSON files.
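The optional nature of format can be sketched as follows. Everything here is an assumption made for illustration: the FORMAT_CHECKERS registry, the check_format function, the enforce_formats switch, and the deliberately loose email pattern are not taken from any particular validator:

```python
import re

# Hypothetical registry of format checkers. A deliberately loose email check is the
# only entry; real validators use their own, usually stricter, rules.
FORMAT_CHECKERS = {
    "email": lambda s: re.fullmatch(r"[^@\s]+@[^@\s]+", s) is not None,
}

def check_format(value, schema, enforce_formats):
    """Return True if value satisfies the schema's format keyword, when enforced."""
    fmt = schema.get("format")
    if fmt is None or not enforce_formats or not isinstance(value, str):
        return True  # no format, format treated as annotation only, or not a string
    checker = FORMAT_CHECKERS.get(fmt)
    if checker is None:
        return True  # unknown formats are ignored
    return checker(value)

email_schema = {"type": "string", "format": "email"}
lenient = check_format("not-an-email", email_schema, enforce_formats=False)
strict = check_format("not-an-email", email_schema, enforce_formats=True)
valid = check_format("jason@example.com", email_schema, enforce_formats=True)
```

The same string passes when formats are treated as annotations (lenient is True) and fails when they are enforced (strict is False), so it is worth checking how your validator is configured.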

🔗 🔗 Required Fields in JSON Schema

In our original example, JSON files are only valid if their id key is set to a number. That’s a simplification, however. It would be more accurate to say our JSON Schema invalidates files if their id key is not a number. What’s the difference? Well, a JSON file would still be validated if there was no id key at all. Keys are only required by a JSON Schema if explicitly set to be so.

Fortunately, JSON Schema files can specify which keys must be present with the required validation keyword:

{
  "type": "object",
  "properties": {
    "id": {
      "type": "number"
    }
  },
  "required": [
    "id"
  ]
}

The required validation keyword is an array containing all the keys that JSON files must have to be considered valid. In the above example, the id key is required. Now, a file will only be validated if the following conditions are met:

  • There is a key named id
  • The value of id is a number

This allows for straightforward requirements. That being said, JSON Schema also supports the ability to implement conditional requirements.
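The presence check itself is simple. A minimal sketch in Python, with missing_required as a hypothetical helper name, might look like this:

```python
def missing_required(instance, schema):
    """Return the required keys that are absent from a parsed JSON object."""
    return [key for key in schema.get("required", []) if key not in instance]

schema = {
    "type": "object",
    "properties": {"id": {"type": "number"}},
    "required": ["id"],
}

with_id = missing_required({"id": 123}, schema)
without_id = missing_required({"name": "Do a thing"}, schema)
```

An empty list means every required key is present. Note that this covers only the first condition; checking that the value is a number is a separate type check.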

🔗 🔗 Dependencies

Dependencies allow the schema to modify itself if certain conditions are met. Let’s look at an example using property dependencies. Imagine we have a JSON file that contains a user-id, an order-id, and an address for that user. If someone has placed an order, then the JSON Schema should require an address to go along with it. This conditional requirement can be done with dependencies:

{
  "type": "object",
  "properties": {
    "user-id": {
      "type": "number"
    },
    "order-id": {
      "type": "number"
    },
    "address": {
      "type": "string"
    }
  },
  "dependencies": {
    "order-id": [
      "address"
    ]
  }
}

That’s not the only way to implement dependencies; there are schema dependencies as well.
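For property dependencies, the rule is that the listed keys become required only when the trigger key is present. A sketch of that check in Python follows, using order-id as the trigger key; unmet_dependencies is a hypothetical helper that skips schema dependencies:

```python
def unmet_dependencies(instance, schema):
    """Return (trigger, missing_key) pairs for property dependencies that are not satisfied."""
    problems = []
    for trigger, needed in schema.get("dependencies", {}).items():
        if trigger not in instance:
            continue  # the dependency only applies when the trigger key is present
        if not isinstance(needed, list):
            continue  # schema dependencies are out of scope for this sketch
        for key in needed:
            if key not in instance:
                problems.append((trigger, key))
    return problems

schema = {"dependencies": {"order-id": ["address"]}}

no_order = unmet_dependencies({"user-id": 1}, schema)
order_without_address = unmet_dependencies({"user-id": 1, "order-id": 7}, schema)
order_with_address = unmet_dependencies(
    {"user-id": 1, "order-id": 7, "address": "1 Main St"}, schema
)
```

Only the second object fails: it has an order-id but no address.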

Of course, this is a high-level overview of the attributes that make up JSON Schema. There’s a lot more you can do with it.

🔗 🔗 JSON Schema vs OpenAPI

Here’s another example of how a simple JSON Schema file might look:

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "https://todos.stoplight.io/schema/full-todo",
    "title": "Todo Item",
    "type": "object",
    "properties": {
      "id": {
        "type": "number"
      },
      "name": {
        "type": "string"
      },
      "completed": {
        "type": "boolean"
      },
      "completed_at": {
        "type": [
          "string",
          "null"
        ],
        "format": "date-time"
      }
    }
}

If this looks familiar, that’s because it describes the format used within the example API shown in the OpenAPI section of this guide. In fact, the object within the properties section is identical to the properties of the todo-full schema from the OpenAPI example document.

JSON Schema predates OpenAPI and the newer format took inspiration from JSON Schema. However, both formats evolved separately, so they’re slightly different. The OpenAPI community is working to remove the differences in future versions. In the meantime, you’ll want to understand where OpenAPI and JSON Schema differ if you plan to implement both.

And there might be a reason to use JSON Schema and OpenAPI for different purposes within your organization. The biggest difference between the use cases for the two formats is that JSON Schema can be used to describe non-APIs, as well. The best example may be OpenAPI itself, which uses a JSON Schema to define the format of OpenAPI documents.

🔗 🔗 JSON:API Specification

So far, the JSON-related API specifications explained in this guide are meant to document your own decisions when building an API. This final standard in the trio helps you make decisions consistent with industry best practices.

When designing APIs, you will face a lot of decisions. You will need to name fields, create error messages, and determine how to communicate pagination, among others. The JSON:API specification makes many of these decisions for you, so you can get to designing the part of your API that is unique.

Another advantage to using the JSON:API recommendations is that developers will know what to expect. At a minimum, you want to be consistent between your own APIs. Even better is to use the same conventions that many others have agreed upon. It saves internal battles over which style is best and ensures that you’ll be using a format that others will recognize.

For example, here’s a basic pagination response that conforms to the JSON:API specification:

{
  "meta": {
    "totalPages": 13
  },
  "data": [
    {
      "type": "articles",
      "id": "3",
      "attributes": {
        "title": "JSON:API paints my bikeshed!",
        "body": "The shortest article. Ever.",
        "created": "2015-05-22T14:56:29.000Z",
        "updated": "2015-05-22T14:56:28.000Z"
      }
    }
  ],
  "links": {
    "self": "http://example.com/articles?page[number]=3&page[size]=1",
    "first": "http://example.com/articles?page[number]=1&page[size]=1",
    "prev": "http://example.com/articles?page[number]=2&page[size]=1",
    "next": "http://example.com/articles?page[number]=4&page[size]=1",
    "last": "http://example.com/articles?page[number]=13&page[size]=1"
  }
}

The links object, for example, displays the URLs to access specific pages of results: the current page, first, previous, next, and last. Adopt this format for your own results and nobody will have to wonder whether it’s first_page, first_pg, or first.
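On the server side, producing such a links object can be a small helper. The sketch below is one possible approach in Python; the pagination_links function and its parameters are hypothetical, and it assumes page-number pagination with the same query parameters as the example:

```python
def pagination_links(base_url, page, size, total_pages):
    """Build a JSON:API-style links object for one page of results."""
    def url(number):
        return f"{base_url}?page[number]={number}&page[size]={size}"

    links = {"self": url(page), "first": url(1), "last": url(total_pages)}
    # Leave out prev/next when there is no such page.
    if page > 1:
        links["prev"] = url(page - 1)
    if page < total_pages:
        links["next"] = url(page + 1)
    return links

links = pagination_links("http://example.com/articles", page=3, size=1, total_pages=13)
first_page = pagination_links("http://example.com/articles", page=1, size=1, total_pages=13)
```

For page 3 of 13 this reproduces the five URLs in the sample response, while the first page simply has no prev link.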

🔗 🔗 Get Visibility Into the JSON Objects in Your APIs

You may already have a lot of JSON running through your organization. Chances are you have APIs that produce it and may even have some API descriptions that define your schemas. The Stoplight Platform helps you efficiently build on your existing assets, so you can build higher quality APIs in a shorter time.

    
      