
How to use the Indian Government's open-source API to fetch real-time Air Quality Index data from various locations with Python.

If data is open source, it can be used by anyone under certain terms and conditions without violating any policy.
Warning! Don't use this data in ways that violate those terms.

In our day-to-day lives, data poses a big challenge. Governments collect and process data in huge volumes every day, and some of it is made accessible to citizens and civil society: non-sensitive data that the public can use for social, economic, and developmental purposes. To make use of this open data, however, you need some programming skills.


Hi, my name is Santosh Kumar, aka littleboy8506.

Before starting, you should first know what an API is.

API stands for Application Programming Interface. It acts like a broker or intermediary between pieces of software, creating a chain of call requests so that one program can fetch data from another. An API typically requires an API key to authorize such requests.
Today we are using an API that provides information about air quality and humidity for states/cities of India; you can fetch data from a different country's API in the same way.




Open-source real-time data in JSON format:



In the figure above you can see a dictionary-type entry named "records" in the JSON data; each record contains the following fields or labels:

id: unique id identifying the data entry.
country: the country the data belongs to.
state: the state within the country.
city: the city whose stations report the real-time data.
station: the monitoring station that collects the real-time data.
last_update: timestamp of the last update.
pollutant_id: the pollutant measured; it can be NO2, SO2, PM2.5, or PM10.
pollutant_min: minimum value of the current pollutant reading.
pollutant_max: maximum value of the current pollutant reading.
pollutant_avg: average value of the current pollutant reading.
pollutant_unit: usually not provided.
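To make the shape of one "records" entry concrete, here is an illustrative example as a Python dict. The field names follow the list above; the values are made up for demonstration and are not real readings.

```python
# One illustrative "records" entry -- field names as listed above,
# values invented purely for demonstration.
sample_record = {
    "id": "1",
    "country": "India",
    "state": "Delhi",
    "city": "Delhi",
    "station": "Anand Vihar, Delhi",
    "last_update": "01-01-2024 09:00:00",
    "pollutant_id": "PM2.5",
    "pollutant_min": "60",
    "pollutant_max": "180",
    "pollutant_avg": "110",
    "pollutant_unit": "NA",
}
print(sample_record["station"], sample_record["pollutant_id"])
```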




Import the required libraries for fetching the JSON data from the API:

Now we need to import the libraries used in our code. The first is the json module, which helps us work with JSON; the second is the requests module, which lets us send HTTP requests and returns a response object; and the last is pandas, an open-source data-analysis and manipulation library that we use to build a structured table.
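The three imports described above look like this (requests and pandas are third-party packages, installable with pip if you don't already have them):

```python
import json      # parse and manage JSON text
import requests  # send HTTP requests to the API
import pandas as pd  # build and export the structured data table
```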

Set the offset and the limit of data entries; you can find these values in the JSON file, as shown in the figure below:

To fetch all the data we need to know the total number of records the API exposes, which determines the data offset; in our example it is around 1387. The records are accessible via the API key, `limit` defines the number of entries returned per request, and `offset` is the position at which a request starts reading.
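A small sketch of that offset/limit arithmetic, using the 1387 total from the example above (in practice you would read the total from the API's own "total" field, and the page size is your choice):

```python
total_records = 1387  # total records reported by the API (example value)
limit = 10            # records returned per request
# starting offset of each request: 0, 10, 20, ...
offsets = range(0, total_records, limit)
print(len(offsets))  # → 139 requests needed to cover all records
```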


Create an empty data frame matching the data you want to fetch:

Now create an empty data frame whose column names match the fields of the JSON records, as shown below, plus one extra column named "offset", which will be used later to trace rows in the exported file back to the request that produced them.
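A minimal sketch of that empty data frame, with one column per JSON field from the list earlier plus the extra "offset" column:

```python
import pandas as pd

# Columns mirror the fields of each JSON record, plus "offset"
# to record which request each row came from.
columns = ["id", "country", "state", "city", "station", "last_update",
           "pollutant_id", "pollutant_min", "pollutant_max",
           "pollutant_avg", "pollutant_unit", "offset"]
dataframe = pd.DataFrame(columns=columns)
print(dataframe.shape)  # → (0, 12): no rows yet, 12 columns
```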




The for loop that calls the API in order to fetch all the data you want:

Loop up to the data offset, fetching each page of JSON and appending all the data into a single dataframe. There are three steps, given below:
1. Get the data (using the requests module).
2. Count the number of records returned.
3. Fetch and store the data in the dataframe.
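The three steps above can be sketched as follows. The base URL and API key here are placeholders, as are the `api-key`, `format`, `offset`, and `limit` query parameters; substitute the resource URL and key from your own data.gov.in account.

```python
import pandas as pd
import requests

# Placeholders -- replace with your resource URL and API key.
BASE_URL = "https://api.data.gov.in/resource/<resource-id>"
API_KEY = "<your-api-key>"

def records_to_rows(payload, offset):
    """Step 2 and 3: walk the "records" list in one JSON response and
    turn it into row dicts, tagging each row with its request offset."""
    rows = []
    for rec in payload.get("records", []):
        row = dict(rec)
        row["offset"] = offset
        rows.append(row)
    return rows

def fetch_all(total=1387, limit=10):
    """Step 1: request each page, then collect every row into one frame."""
    all_rows = []
    for offset in range(0, total, limit):
        resp = requests.get(BASE_URL, params={
            "api-key": API_KEY, "format": "json",
            "offset": offset, "limit": limit})
        all_rows.extend(records_to_rows(resp.json(), offset))
    return pd.DataFrame(all_rows)
```

Keeping the parsing in its own function (`records_to_rows`) makes it easy to test without hitting the network.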

Export the fetched data to an Excel file:

All the data is stored in the dataframe object; to export it to an Excel file we use the .to_excel() function as shown below:

dataframe.to_excel("output.xlsx",index=False)
print("done")




Output:

The exported data is shown below in spreadsheet format: