BlogsDope

Handling JSON with Python

Nov. 16, 2020 PYTHON JSON

JSON stands for JavaScript Object Notation. It is a widely used data format for exchanging information. In this tutorial, we are going to learn how to use JSON in Python. In Python, JSON data can be held either in a variable (as a string) or in a file. We will cover both ways of storing data. So, without further ado, let’s get started!

The first step is to import the built-in json package at the top of your program. This package provides the functions needed to work with JSON data.

import json

Before moving ahead, let’s see an example of data in JSON format.

{
	"students" : [
		{
			"name": "stevin",
			"age":18,
			"marks":90,
			"marksImproved": true
		},
		{
			"name": "kevin",
			"age":null,
			"marks":70,
			"marksImproved": false
		},
		{
			"name": "karen",
			"age":19,
			"marks":94,
			"marksImproved": true
		}
	]
}

The above example contains a key “students”, whose value is an array of objects. Each object represents a student with a name, an age, marks, and a Boolean value showing whether the marks have improved. As you can observe, this looks a lot like a Python dictionary and is easy to understand.

Convert from Python to JSON


Here we are going to learn how to encode Python data as JSON stored as a Python string. The json package provides a dumps() method to encode data. Consider the following example to understand it.


import json

data = {
    "students": [
        {"name": "stevin", "age": 18, "marks": 90, "marksImproved": True},
        {"name": "kevin", "age": None, "marks": 70, "marksImproved": False},
    ]
}

data_json = json.dumps(data)
print(data_json)
print(type(data))
print(type(data_json))

Output

{"students": [{"name": "stevin", "age": 18, "marks": 90, "marksImproved": true}, {"name": "kevin", "age": null, "marks": 70, "marksImproved": false}]}
<class 'dict'>
<class 'str'>

In the above example, the variable data contains a dictionary with the key “students”, whose value is a list of dictionaries having the keys name, age, marks, and marksImproved. The json.dumps() method converts the Python dictionary to a string containing the information in JSON format. The data types get converted into their equivalent JSON (JavaScript) types given in the table below.

Python    JSON
dict      object
list      array
tuple     array
str       string
int       number
float     number
True      true
False     false
None      null
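
As a quick check of the table, the sketch below encodes one value of each Python type with json.dumps(). The sample values are arbitrary and chosen only for illustration.

```python
import json

# One sample value per row of the table (values are illustrative only).
samples = [
    {"a": 1},   # dict  -> object
    [1, 2],     # list  -> array
    (1, 2),     # tuple -> array
    "text",     # str   -> string
    7,          # int   -> number
    2.5,        # float -> number
    True,       # True  -> true
    False,      # False -> false
    None,       # None  -> null
]

# Encode each value on its own and show the resulting JSON text.
encoded = [json.dumps(value) for value in samples]
for value, text in zip(samples, encoded):
    print(type(value).__name__, "->", text)
```

Note that the list and the tuple produce the same JSON text, which matters when the data is decoded again.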

As the example above shows, None is converted to its JSON equivalent null, True to true, and so on. By default, dumps() returns the result on a single line, which is compact but hard to read. To make the JSON string human-readable, we can pass a few arguments to dumps(). The indent argument takes a non-negative integer (the number of spaces per level) or a string (such as "\t") that is used to indent each level. Let’s see the output of the same example with indent equal to 2.

data_json = json.dumps(data, indent = 2)
print(data_json)

Output

{
  "students": [
    {
      "name": "stevin",
      "age": 18,
      "marks": 90,
      "marksImproved": true
    },
    {
      "name": "kevin",
      "age": null,
      "marks": 70,
      "marksImproved": false
    }
  ]
}

As you can see, this is more human-readable.

The separators argument takes a tuple (item_separator, key_separator). The first element is the string placed between items, and the second is the string placed between each key and its value. If the indent argument is None (the default), separators defaults to (", ", ": "); otherwise it defaults to (",", ": ").
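
To make the effect of the separators visible, here is a minimal sketch that encodes the same dictionary twice, once with the defaults and once with separators that contain no spaces. The sample dictionary is made up for illustration.

```python
import json

data = {"name": "stevin", "marks": [90, 80]}

# Default separators when indent is None: (", ", ": ")
default_json = json.dumps(data)

# No spaces after "," or ":" gives the smallest output.
compact_json = json.dumps(data, separators=(",", ":"))

print(default_json)  # {"name": "stevin", "marks": [90, 80]}
print(compact_json)  # {"name":"stevin","marks":[90,80]}
```

Dropping the spaces can be useful when the JSON is sent over a network and size matters more than readability.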

To sort the data by keys, set the sort_keys argument to True. By default, it is False. Let’s see.

data_json = json.dumps(data, indent = "\t", sort_keys=True)
print(data_json)

Output

{
	"students": [
		{
			"age": 18,
			"marks": 90,
			"marksImproved": true,
			"name": "stevin"
		},
		{
			"age": null,
			"marks": 70,
			"marksImproved": false,
			"name": "kevin"
		}
	]
}

While the dumps() method converts the Python data to a JSON string, the dump() method dumps the Python data to a JSON file. The arguments to the json.dump() method are the same, except we have to provide an additional file pointer argument. Let’s see the following example. 

import json

data = {
    "students": [
        {"name": "stevin", "age": 18, "marks": 90, "marksImproved": True},
        {"name": "kevin", "age": None, "marks": 70, "marksImproved": False},
    ]
}

with open("json_data.json", "w") as fp:
    json.dump(data, fp, indent="\t")

In the above example, we open the file “json_data.json” in the write mode and write the data to it using the json.dump() method.
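
If you want to try this without leaving a file behind, the following sketch writes to a temporary directory and then reads the raw text back to inspect what json.dump() produced. The use of tempfile and the shortened data are assumptions made for this illustration, not part of the example above.

```python
import json
import os
import tempfile

data = {"students": [{"name": "stevin", "age": 18}]}

# Write into a temporary directory so the sketch leaves no file behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "json_data.json")
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(data, fp, indent="\t")

    # Read the raw text back to see exactly what was written.
    with open(path, "r", encoding="utf-8") as fp:
        written_text = fp.read()

print(written_text)
```

The text written by json.dump() is the same as the string json.dumps() would return for the same arguments.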

We have seen how to encode data to JSON strings and files. Let’s now see how to do the opposite, i.e., decode JSON data to Python. The processes of encoding and decoding are also known as serialization and deserialization, respectively.

Convert from JSON to Python


The json.loads() method converts the JSON string to a Python object according to the conversion table given below.

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None

Let’s go through an example and see how this works.

import json

json_string = """{
   "students":[
      { "name": "stevin",
         "age":18,
         "marks":90,
         "marksImproved": true
      },
      {
         "name": "kevin",
         "age":null,
         "marks":70,
         "marksImproved": false
      },
      {
         "name": "karen",
         "age":19,
         "marks":94,
         "marksImproved": true
      }
      ]
}
"""

parsed_data = json.loads(json_string)
print(type(parsed_data))
print(type(parsed_data["students"]))
for student in parsed_data["students"]:
    print(student)

Output

<class 'dict'>
<class 'list'>
{'name': 'stevin', 'age': 18, 'marks': 90, 'marksImproved': True}
{'name': 'kevin', 'age': None, 'marks': 70, 'marksImproved': False}
{'name': 'karen', 'age': 19, 'marks': 94, 'marksImproved': True}

The json_string variable contains a multi-line string that is a valid JSON. The json.loads() method converts that string to its equivalent Python data type, i.e., a dict. The key “students” contains an array of objects, and we know that an array gets converted to a list. We iterate through the list and display each object, which gets converted to a dict as well.
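
The remaining rows of the conversion table can be checked in the same way. The sketch below parses a small, made-up JSON object and prints the Python type of each value.

```python
import json

# Arbitrary JSON text with one value for several rows of the table.
json_text = '{"count": 3, "ratio": 0.5, "ok": true, "missing": null, "tags": ["a", "b"]}'
parsed = json.loads(json_text)

# Show which Python type each JSON value was converted to.
for key, value in parsed.items():
    print(key, type(value).__name__)
```

A JSON number without a fractional part becomes an int, while one with a fractional part becomes a float.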

The json package provides the load() method to load the data from a JSON file to a Python object. Let's see an example to understand this.

import json

with open("data.json", "r") as fp:
    data = json.load(fp)

print(type(data))
print(type(data["students"]))
for student in data["students"]:
    print(student)

Output

<class 'dict'>
<class 'list'>
{'name': 'stevin', 'age': 18, 'marks': 90, 'marksImproved': True}
{'name': 'kevin', 'age': None, 'marks': 70, 'marksImproved': False}
{'name': 'karen', 'age': 19, 'marks': 94, 'marksImproved': True}
{'name': 'sydney', 'age': 20, 'marks': 80, 'marksImproved': False}
{'name': 'jake', 'age': 18, 'marks': 75, 'marksImproved': True}

We open the file “data.json”, stored in the same directory as the program file, in read mode. The json.load() method reads the file and returns a dict, which we store in the data variable. We iterate through the students and display their details. The contents of the “data.json” file are given below.

{
	"students": [
		{
			"age": 18,
			"marks": 90,
			"marksImproved": true,
			"name": "stevin"
		},
		{
			"age": null,
			"marks": 70,
			"marksImproved": false,
			"name": "kevin"
		},
		{
			"age": 19,
			"marks": 94,
			"marksImproved": true,
			"name": "karen"
		},
		{
			"age": 20,
			"marks": 80,
			"marksImproved": false,
			"name": "sydney"
		},
		{
			"age": 18,
			"marks": 75,
			"marksImproved": true,
			"name": "jake"
		}
	]
}
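
The example above depends on a “data.json” file already existing on disk. A self-contained variant may be easier to try: the sketch below first writes a small file to a temporary directory and then reads it back with json.load(). The temporary directory and the shortened sample data are assumptions made for this illustration.

```python
import json
import os
import tempfile

sample = {"students": [{"name": "stevin", "age": 18}, {"name": "kevin", "age": None}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.json")

    # Create the file that we are going to load.
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(sample, fp)

    # Load the file back into a Python object.
    with open(path, "r", encoding="utf-8") as fp:
        loaded = json.load(fp)

print(type(loaded["students"]))
print(loaded == sample)
```

Here the loaded data is equal to the sample, because the sample only uses string keys, lists, and values that map one-to-one between Python and JSON.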

Note that if a Python object is encoded as JSON and later decoded, the result may not necessarily be equal to the original object. Let’s see an example.

import json

data = {"students": [{"name": "stevin",
                      "age": 18,
                      "marks": (90,80),
                      "marksImproved": True
                      },
                     {
                         "name": "kevin",
                         "age": None,
                         "marks": (70,60),
                         "marksImproved": False
                     }]}

encoded = json.dumps(data)
decoded = json.loads(encoded)
print(data == decoded)

Output

False

Can you guess why they are not equal?

As we know, both list and tuple are converted to a JSON array, but an array is always converted back to a list. In the original data, the key “marks” holds a tuple of scores obtained in two subjects. When we encode it, dumps() converts the tuple to an array, and loads() later converts that array to a list according to the conversion table. Therefore, the decoded data is not equal to the original.
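
The difference can be seen directly by comparing the types before and after the round trip. This is a minimal sketch with a single made-up record; converting the list back to a tuple by hand is one possible way to restore equality, assuming you know which values were tuples originally.

```python
import json

original = {"marks": (90, 80)}
decoded = json.loads(json.dumps(original))

print(type(original["marks"]).__name__)  # tuple
print(type(decoded["marks"]).__name__)   # list

# Converting the list back to a tuple restores equality for this value.
print(tuple(decoded["marks"]) == original["marks"])  # True
```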

Let’s see another example.

import json

data = {1: "One",
        2: "Two",
        3: "Three",
        4: "Four"}

encoded = json.dumps(data)
decoded = json.loads(encoded)
print(data == decoded)

Output

False

When encoding data, keys of a Python dict that are not already strings (int, float, bool, or None keys) are coerced to strings, because the keys of a JSON object must be strings. Therefore, after a round trip from Python to JSON and back, the decoded data has only string keys, and it will not be equal to the original data if the original had any non-string keys.
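
The sketch below shows the coerced keys and one possible way to get the original keys back. It assumes that every key was an int before encoding; the dictionary comprehension is a workaround for this illustration, not something the json package does for you.

```python
import json

original = {1: "One", 2: "Two"}
decoded = json.loads(json.dumps(original))

print(decoded)  # {'1': 'One', '2': 'Two'}

# Assumption: all keys were ints originally, so int() can restore them.
restored = {int(key): value for key, value in decoded.items()}
print(restored == original)  # True
```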

