RBI Bulletin


Inflation and Inflation Expectations: A Distributional Mapping
Date : Sep 18, 2023

by R. K. Sinha^

The article analyses statistical characteristics of the Consumer Price Index-Combined (CPI-C) based inflation and inflation expectations datasets and identifies suitable statistical distributions for these. Identifying appropriate distributions makes it possible to establish a one-to-one mapping between them, which converts a data point in one dataset into the corresponding data point in another. These models have the potential to forecast inflation and are also potentially useful to measure Inflation-at-Risk (IaR).

Introduction

The CPI-C based inflation data is published by the Ministry of Statistics and Programme Implementation (MoSPI) together with granular-level data. One type of granularity is by product item at the all-India level. Another is by product group and sub-group for States/Union Territories (UTs) and Regions (Rural/Urban). The Reserve Bank conducts the Inflation Expectations Survey of Households (IESH), which provides the expectations of respondents (surveyed households) on inflation for the near term. Internationally, such surveys are known to be biased, and accordingly, the levels of inflation expectations often differ from realised inflation. Nevertheless, they have proved very useful for tracking directional changes. Several recent studies (Das et al., 2019; Shaw, 2019; Muduli et al., 2022) have attempted to assess the inherent biases in such surveys and remove them to establish a meaningful comparison between inflation and inflation expectations.

In this article, we carry out a comparative study of the statistical characteristics of the entire distributions of actual inflation (MoSPI) and of the inflation expectations1 of surveyed households, rather than just modelling and mapping the central tendencies of the two datasets. Comparing and modelling only aggregate inflation or inflation expectation numbers often loses information inherent in the datasets, as the aggregates are derived numbers.

The article is divided into five sections. After the introductory section, the datasets of inflation and inflation expectations are described in the second and third sections, respectively. The fourth section connects the findings of these two sections through suitable mappings and suggests possible uses of the mappings. The last section concludes the article.

II. Statistical Characteristics of the CPI-C based Inflation Dataset

The data on CPI-C based inflation (aggregate as well as granular level) is published by the MoSPI at a monthly frequency. Statistically, the means of the aggregate and granular-level inflation datasets for the same period should match closely, while the standard deviation (SD) of the granular data can be expected to be higher than that of the aggregate data, as the aggregate data is a distribution of the means of the granular data. For the period January 2014 to June 2023, the modal inflation of the aggregate data falls in the band of 5 per cent to 6 per cent, while that of the disaggregate data falls in the band of 4 per cent to 5 per cent. The greater variability in the granular data represents individual product-level shocks, which can be favourable (bringing aggregate inflation towards the target) or adverse (moving aggregate inflation away from the target). During January 2014 to June 2023, the lowest and highest inflation in the aggregate data stand at 1.46 per cent (recorded in June 2017) and 8.60 per cent (recorded in January 2014), respectively (Chart 1).

Chart 1: Probability Density of CPI-C Granular and Aggregate Data

The distribution of inflation at the granular level has varied significantly across months, driven by the relative presence of extreme values. We analyse the statistical properties of the granular dataset2 over the period January 2014 to June 2023.

The disaggregated dataset of CPI-C may initially appear to have some characteristics of a normal distribution (bell curve).3 However, the dataset is found to be very leptokurtic, i.e., it has a higher peak than the normal distribution, with a kurtosis of 15.856. The distribution visually appears to be more or less symmetric, although it has a mild positive skewness of 0.869. A best-fit Normal distribution, viz., N(5.0430, 7.1185), is also plotted; it fits poorly, under-estimating the density around the centre and at the extremes and over-estimating it in between (Chart 2). The underlying leptokurtic dataset has fat tails, with around 2.5 per cent of observations in each extreme, i.e., inflation lower than -10 per cent in the left tail and higher than 20 per cent in the right tail, representing severe shocks (Chart 1).

As the normal distribution fails to explain characteristics of the dataset, we explore and search for other suitable statistical distributions, which may potentially explain the nature of this dataset. It is observed that no single statistical distribution explains the dataset adequately. Two best-fit distributions were identified as Cauchy (μ = 4.7930 and σ = 2.4758) and Laplace (μ = 5.0430 and λ = 0.1987) though they also do not fit the dataset appropriately (Chart 2 and Table 1).
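As a rough illustration of why each single-distribution candidate struggles, the tail probabilities implied by the fitted parameters can be compared with the roughly 2.5 per cent of observations reported in each tail (below -10 per cent and above 20 per cent). The sketch below is not the author's code. It assumes that σ is the standard deviation for the Normal and the scale for the Cauchy, and that λ is a Laplace rate (inverse scale); the article does not state these conventions.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def cauchy_cdf(x, mu, sigma):
    return 0.5 + math.atan((x - mu) / sigma) / math.pi

def laplace_cdf(x, mu, lam):
    # Assumption: lam is a rate, i.e. density = (lam / 2) * exp(-lam * |x - mu|)
    if x < mu:
        return 0.5 * math.exp(-lam * (mu - x))
    return 1.0 - 0.5 * math.exp(-lam * (x - mu))

# Fitted parameters as reported in the article
FITTED = {
    "Normal": lambda x: normal_cdf(x, 5.0430, 7.1185),
    "Cauchy": lambda x: cauchy_cdf(x, 4.7930, 2.4758),
    "Laplace": lambda x: laplace_cdf(x, 5.0430, 0.1987),
}

def tail_probabilities(cdf, lower=-10.0, upper=20.0):
    """Return (P(X < lower), P(X > upper)) under the given CDF."""
    return cdf(lower), 1.0 - cdf(upper)

if __name__ == "__main__":
    for name, cdf in FITTED.items():
        left, right = tail_probabilities(cdf)
        print(f"{name:8s} P(X < -10) = {left:.4f}   P(X > 20) = {right:.4f}")
```

Under these assumptions, the Normal places less than 2 per cent of the mass in each tail and the Cauchy roughly 5 per cent, which is in line with the under- and over-estimation of the extremes described above.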

Chart 2: Probability Density of CPI-C Disaggregate Data

The granular-level dataset is composed of various product/sub-product groups across Regions (Rural/Urban) and States/UTs, leading to wide heterogeneity. Fitting sub-sets of the dataset by product category, which shows larger heterogeneity than regions and States/UTs, indicates more precise modelling for some of the products. We also observe larger variations in the descriptive statistics of these sub-sets. For example, inflation of ‘cloth and footwear’ appeared to be closest to a bell curve (normal), while inflation of ‘housing’ hovered in a narrow range over time (Annex - Chart A1 and Table A1).

Table 1: Fitting of CPI-C Disaggregate Dataset through Suitable Statistical Distributions

| Name of the Distribution | Estimated Parameters | Kolmogorov-Smirnov Test: K-S Statistic (Critical Value at 5 per cent = 0.01944) | Rejection at 5 per cent level of significance |
|---|---|---|---|
| Normal | µ = 5.0430, σ = 7.1185 | 0.12413 | Yes |
| Cauchy | µ = 4.7930, σ = 2.4758 | 0.03687 | Yes |
| Laplace | µ = 5.0430, λ = 0.1987 | 0.06548 | Yes |

Source: Author’s calculations.
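The Kolmogorov-Smirnov (K-S) comparison used in the table can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and the critical value uses the large-sample approximation c(α)/√n, which may differ from the exact or software-generated critical values reported in the article.

```python
import math

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a fitted `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_critical_value(n, alpha=0.05):
    """Asymptotic critical value: sqrt(-0.5 * ln(alpha / 2)) / sqrt(n)."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)

def rejects(sample, cdf, alpha=0.05):
    """True when the fitted distribution is rejected at level alpha."""
    return ks_statistic(sample, cdf) > ks_critical_value(len(sample), alpha)
```

As a sanity check, `ks_critical_value(112)` is very close to the 5 per cent critical value of 0.12832 reported for the aggregate series with N = 112 (Table 3). A fit is rejected when its K-S statistic exceeds the critical value, as happens for all three candidates above.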

If a single distribution fails to fit the underlying dataset appropriately, various studies have explored and demonstrated the use of mixture distributions, having potential to capture the characteristics of the dataset more appropriately. These mixture distributions can be constructed with or without the identification of a threshold, a particular value of the random variable. The threshold approach partitions the dataset into two parts, and the parts are modelled separately using different statistical distributions.

Several studies have mixed two distributions with a threshold: Cooray and Ananda (2005) and Scollnik (2007) used the Lognormal-Pareto model; Ciumara (2006) and Scollnik and Sun (2012) applied the Weibull-Pareto model; and Nadarajah and Bakar (2014) suggested the Lognormal-Burr model. Other approaches mix two statistical distributions across the entire range without any threshold, using fixed or dynamic weights (mixing parameters). Frigessi et al. (2002) demonstrated a dynamic mixture model for unsupervised tail estimation without estimating the threshold. The study used a Weibull-Pareto pair, in which the weight assigned to the Weibull starts from one at the left part of the distribution and gradually reduces, tending to zero at the right tail. All these studies demonstrated their models on the same dataset, viz., the Danish fire loss data, a famous insurance dataset known for its heavy right tail.

Unlike the above dataset, which has only one possible heavy tail because loss values are bounded at zero, the underlying inflation dataset has two clear tails, which have varied significantly across the months. We split the data into two parts, using inflation at 4 per cent as the threshold. These two parts of the dataset are modelled separately. However, we transform the data before the modelling, as detailed below:

Let {xL} and {xR} be the data points of the initial granular dataset of inflation covering the distinct ranges (-∞, 4 per cent) and [4 per cent, ∞), respectively. We define:

$$X_L = 4 - x_L, \qquad X_R = x_R - 4$$

Both XL and XR now range from 0 to ∞.
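The split-and-shift step can be sketched in a few lines. The transformation used here (distance from the 4 per cent threshold on either side) is inferred from Table 2, where a mean of 4.0977 for XL corresponds to -0.0977 for X and a mean of 4.6112 for XR corresponds to 8.6112 for X; the function names are hypothetical.

```python
THRESHOLD = 4.0  # per cent, the split point used in the article

def split_and_shift(inflation, threshold=THRESHOLD):
    """Return (X_L, X_R): non-negative distances from the threshold."""
    left = [threshold - x for x in inflation if x < threshold]
    right = [x - threshold for x in inflation if x >= threshold]
    return left, right

def back_transform(value, side, threshold=THRESHOLD):
    """Map a transformed value back to the original inflation scale."""
    if side == "left":
        return threshold - value
    if side == "right":
        return threshold + value
    raise ValueError("side must be 'left' or 'right'")

if __name__ == "__main__":
    x_left, x_right = split_and_shift([-42.90, 3.0, 4.0, 93.19])
    print(x_left, x_right)
```

Both transformed parts are non-negative, which is what allows a distribution supported on [0, ∞) to be fitted to each side.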

We now fit the data and identify the 3-parameter Burr and the 3-parameter Dagum4 as the two distributions that could explain the characteristics of the data appropriately for both parts. We use the Burr distribution in our case, and the rest of the analysis is centred around it. The Burr distribution is versatile and has been found to be suitable for many insurance datasets. Sastry and Sinha (2010) used a 4-parameter Burr distribution to describe the Danish fire loss data and found it to be competitive with several mixture distributions proposed for this dataset by the studies mentioned earlier in this section.

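The probability density function (pdf) of a 3-parameter Burr distribution is defined as:

$$f(x) = \frac{\alpha k \left(\frac{x}{\beta}\right)^{\alpha - 1}}{\beta \left[1 + \left(\frac{x}{\beta}\right)^{\alpha}\right]^{k + 1}}, \qquad x \ge 0$$

where k (>0) and α (>0) are the first and second shape parameters, respectively, and β (>0) is the scale parameter.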

The distribution function (df) of a 3-parameter Burr distribution is defined as:

$$F(x) = 1 - \left[1 + \left(\frac{x}{\beta}\right)^{\alpha}\right]^{-k}, \qquad x \ge 0$$

The descriptive statistics (DS) of each part of the distribution indicate that the Burr fit resembles the observed data (Table 2).
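The 3-parameter Burr functions can be sketched as below. The sketch assumes the standard Burr Type XII form, F(x) = 1 - [1 + (x/β)^α]^(-k), with k and α as shape parameters and β as the scale; this is the usual convention for a 3-parameter Burr, but it is an assumption as far as the article's fitting software is concerned. The parameter values are those reported for the two parts in Table 2.

```python
def burr_cdf(x, k, alpha, beta):
    """Distribution function of the (assumed) Burr Type XII form."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / beta) ** alpha) ** (-k)

def burr_pdf(x, k, alpha, beta):
    """Density corresponding to burr_cdf."""
    if x <= 0:
        return 0.0
    z = x / beta
    return alpha * k * z ** (alpha - 1.0) / (beta * (1.0 + z ** alpha) ** (k + 1.0))

def burr_quantile(p, k, alpha, beta):
    """Inverse of burr_cdf for 0 <= p < 1."""
    return beta * ((1.0 - p) ** (-1.0 / k) - 1.0) ** (1.0 / alpha)

# Fitted parameters reported for the left (X_L) and right (X_R) parts
LEFT = dict(k=2.2685, alpha=1.1129, beta=5.6631)
RIGHT = dict(k=2.7037, alpha=1.2135, beta=7.9923)

if __name__ == "__main__":
    for name, params in (("left", LEFT), ("right", RIGHT)):
        print(name, "median of transformed variable:", burr_quantile(0.5, **params))
```

The closed-form quantile function is one practical advantage of the Burr family: percentiles of each part can be read off without numerical root-finding.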

The fitting by using mixture distribution indicates stark improvement over the initial approach of using single distribution. Now the derived variables (XL and XR) are transformed back to the original variable (X) and the modelled probability density functions are proportioned into their respective weights and stitched together. This way, the derived single pdf from the synthetic pair of Burr (2.2685, 1.1129, 5.6631) and Burr (2.7037, 1.2135, 7.9923) adds to unity with appropriate weights5 and explains the data in a much better way (Chart 3).
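One way to read the stitching step is sketched below. It is an interpretation rather than the author's exact procedure: each Burr component is evaluated on its transformed variable (4 - x on the left, x - 4 on the right) and scaled by the share of observations on that side, taken from the "Weight in Full Data" row of Table 2. The Burr Type XII functional form is likewise an assumption.

```python
W_LEFT, W_RIGHT = 0.41015, 0.58985   # weights in full data (Table 2)
THRESHOLD = 4.0                      # per cent
LEFT = (2.2685, 1.1129, 5.6631)      # (k, alpha, beta) for X_L
RIGHT = (2.7037, 1.2135, 7.9923)     # (k, alpha, beta) for X_R

def burr_cdf(x, k, alpha, beta):
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / beta) ** alpha) ** (-k)

def burr_pdf(x, k, alpha, beta):
    if x <= 0:
        return 0.0
    z = x / beta
    return alpha * k * z ** (alpha - 1.0) / (beta * (1.0 + z ** alpha) ** (k + 1.0))

def synthetic_pdf(x):
    """Weighted Burr density on the original inflation scale."""
    if x < THRESHOLD:
        return W_LEFT * burr_pdf(THRESHOLD - x, *LEFT)
    return W_RIGHT * burr_pdf(x - THRESHOLD, *RIGHT)

def synthetic_cdf(x):
    """P(X <= x) under the stitched distribution."""
    if x < THRESHOLD:
        # Large distances below the threshold correspond to small x.
        return W_LEFT * (1.0 - burr_cdf(THRESHOLD - x, *LEFT))
    return W_LEFT + W_RIGHT * burr_cdf(x - THRESHOLD, *RIGHT)

if __name__ == "__main__":
    print("P(X < -10):", synthetic_cdf(-10.0))
    print("P(X > 20): ", 1.0 - synthetic_cdf(20.0))
```

Because the two weights sum to one and each Burr component integrates to one, the stitched density integrates to unity. Under these assumptions, both tail probabilities come out at a little over 2 per cent, broadly in line with the roughly 2.5 per cent of observations reported in each tail.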

Table 2: Descriptive Statistics of two Parts of the Distribution

| DS | Left Part (XL) | Right Part (XR) |
|---|---|---|
| Weight in Full Data | 0.41015 | 0.58985 |
| Mean | 4.0977 (-0.0977 for X) | 4.6112 (8.6112 for X) |
| Range | 0 to 46.90 (-42.90 to +4.00 for X) | 0 to 89.19 (+4 to +93.19 for X) |
| SD | 5.4349 | 5.8530 |
| Skewness | 2.7690 | 4.1868 |
| Kurtosis | 11.933 | 31.462 |
| Fitting of Burr Distribution: Parameters | k = 2.2685, α = 1.1129, β = 5.6631 | k = 2.7037, α = 1.2135, β = 7.9923 |
| K-S Statistics (Critical value at 5 per cent) | 0.01839 (0.02611) | 0.01381 (0.02384) |

Source: Author’s calculations.

Chart 3: Fitting of CPI-C Disaggregate Data - Synthetic Burr

If a single distribution, whether derived on a standalone basis or through the mixing of distributions, fails to capture the characteristics of extreme tails adequately and precisely, its estimates of probabilities in the extreme tails are neither reliable nor usable, as it may be under or over-estimating these consistently. In such cases, the alternative solution is to model the extreme tails exclusively through the Extreme Value Theory (EVT) tools. In the current case, the synthetic Burr appears to fit well the entire curve including the tails (Chart 3). We examine the same statistically and explore if EVT tools would be a valuable addition in this context.

The distribution of excesses over a high threshold, say u, in the right tail of inflation is defined as:

$$F_u(x_R) = P(X - u \le x_R \mid X > u), \qquad x_R \ge 0$$

The distribution of excesses represents the probability that inflation (X) exceeds the threshold inflation u by at most an amount xR, where xR = x - u, given that X exceeds the threshold u. In terms of the underlying distribution function, the same is as below:

$$F_u(x_R) = \frac{F(x_R + u) - F(u)}{1 - F(u)}$$

The functions Fu(xR) and F(xR + u) are the conditional and unconditional distribution functions, respectively. The function F(xR + u) is equivalent to F(x), as xR = x - u. F(u) is the cumulative probability at the threshold u. The underlying distribution function may have an infinite right endpoint, i.e., it allows for arbitrarily large inflation values with a very small probability.
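The conditional excess distribution can be sketched as a small helper that works with any unconditional CDF, using the relation F_u(x_R) = [F(x_R + u) - F(u)] / [1 - F(u)]. For illustration only, the example CDF below is the right part of the fitted Burr pair with the Table 2 weights, under the assumption of a Burr Type XII form; the helper names are hypothetical.

```python
def excess_cdf(cdf, u, x_r):
    """F_u(x_r) = (F(x_r + u) - F(u)) / (1 - F(u)) for x_r >= 0."""
    f_u = cdf(u)
    if f_u >= 1.0:
        raise ValueError("threshold lies beyond the support of the distribution")
    return (cdf(x_r + u) - f_u) / (1.0 - f_u)

def prob_exceed_given(cdf, level, u):
    """P(X > level | X > u) for level >= u."""
    return 1.0 - excess_cdf(cdf, u, level - u)

def right_part_cdf(x, w_left=0.41015, k=2.7037, alpha=1.2135, beta=7.9923):
    """Illustrative unconditional CDF for x >= 4 (assumed Burr Type XII right part)."""
    if x < 4.0:
        raise ValueError("this illustrative CDF only covers x >= 4")
    y = x - 4.0
    burr = 0.0 if y == 0 else 1.0 - (1.0 + (y / beta) ** alpha) ** (-k)
    return w_left + (1.0 - w_left) * burr

if __name__ == "__main__":
    p = prob_exceed_given(right_part_cdf, level=25.0, u=10.0)
    print("P(inflation > 25 | inflation > 10) =", round(p, 4))
```

Note that the left-part weight cancels in the conditional probability, so only the shape of the right tail matters for questions of this kind.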

Similarly, the distribution of shortfall below a lower threshold u in the left tail of inflation is defined as:

$$F_u(x_L) = P(u - X \le x_L \mid X < u), \qquad 0 \le x_L \le u - x_{min}$$

where xmin is the smallest observation.

The distribution of shortfall represents the probability that inflation (X) falls short of the threshold inflation u by at most an amount xL, where xL = u - x, given that X falls short of the threshold u. In terms of the underlying distribution function, the same is as below:

$$F_u(x_L) = \frac{F(u) - F(u - x_L)}{F(u)}$$

The EVT essentially considers only the few largest/smallest observations at the extreme ends of the dataset and not the complete dataset. The EVT deals with conditional probabilities, for example, the probability of inflation exceeding 25 per cent given that it is more than 10 per cent. The challenge with EVT is determining the threshold level. Ideally, a higher threshold should be preferred. However, as the threshold increases, the modeller is left with very few observations, which raises questions about the reliability of the probability estimates. Accordingly, an optimum threshold has to be found. There are a few standard techniques to determine the threshold statistically. One such technique, plotting the ‘Mean Excess Function’, is described below:

If u is the threshold, the mean excess function e(u) can be estimated as the sum of exceedances (or shortfalls, in the case of the left tail of the distribution) over the threshold u divided by the number of data points exceeding (or falling short of) the threshold u. In other words, the mean excess function indicates the expected overshoot of the threshold, given that an observation exceeds it.
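The empirical mean excess function described above can be sketched directly from its definition. The function names are hypothetical, and the sample values in the example are made up purely to show the mechanics.

```python
def mean_excess(data, u):
    """Average overshoot of threshold u among observations above u (right tail)."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)

def mean_shortfall(data, u):
    """Average shortfall below threshold u among observations below u (left tail)."""
    shortfalls = [u - x for x in data if x < u]
    if not shortfalls:
        raise ValueError("no observations fall below the threshold")
    return sum(shortfalls) / len(shortfalls)

def mean_excess_curve(data, thresholds):
    """(u, e(u)) pairs for plotting; thresholds with no exceedances are skipped."""
    top = max(data)
    return [(u, mean_excess(data, u)) for u in thresholds if u < top]

if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]  # made-up numbers
    print(mean_excess_curve(sample, [0.0, 2.0, 4.0, 20.0]))
```

Plotting the pairs returned by `mean_excess_curve` against u gives the mean excess plot used to judge tail behaviour and to pick a threshold.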

For the right tail, an upward trend of the mean excess function may indicate heavy-tailed behaviour of the data; a horizontal line may suggest exponentially distributed data; and a downward trend may indicate short-tailed data. The underlying data may follow a Generalised Pareto Distribution (GPD) if the empirical mean excess function shows an upward trend, in particular, a straight line with a positive gradient (McNeil, 1997). In our context, the mean excess plot of the data (Chart 4) is a clear upward-sloping line, exhibiting suitability for a GPD at various possible threshold points.

A threshold is chosen from inspecting the plot of mean excess. Accordingly, the mean excess function is computed for our dataset6. It is observed that the mean excess function more-or-less maintains linearity and does not diverge across the board. This indicates that GPD may be potentially an appropriate choice for fitting the exceedances (Chart 4).

We examine the appropriateness of the GPD in our context and compare it with our fitted synthetic Burr distribution. We find that the GPD, which fits poorly at many thresholds (especially at lower values of u) and has higher K-S statistics, is inferior to the synthetic Burr. The GPD fit appears to improve as the threshold increases and converges with the Burr, though it does not improve on it. We demonstrate this for the right tail of the data (Chart 5).

Chart 4: Mean Excess Plot

Chart 5: Burr and GPD Comparison

We now move to the distribution of inflation in the aggregate data. As observed earlier (Chart 1), the distribution of CPI-C based aggregate inflation has much shorter tails than the granular-level inflation distribution. The distribution is found to be almost symmetric and platykurtic (less peaked than normal). The Log-Pearson Type III distribution is identified as the closest representation of the aggregate-level inflation data (Table 3 and Chart 6).

Table 3: Descriptive Statistics and Fitting of CPI-C Aggregate Inflation

| DS | Estimates |
|---|---|
| Sample Size (N) | 112 |
| Mean | 5.1039 |
| Range | 7.14 (1.46 to 8.60) |
| SD | 1.5925 |
| Skewness | 0.0653 |
| Kurtosis | 2.4621 |

Fitting: 3-parameter Log-Pearson Type III (α = 6.4118, β = -0.1371, γ = 2.4545); K-S Statistics = 0.03671 (Critical value at 5% = 0.12832)

Note: The data period is from January 2014 to June 2023. The data of April 2020 and May 2020, which were imputed and published by MoSPI subsequently, are not included.
Source: Author’s calculations.

Chart 6: Fitting of CPI-C Aggregate Inflation

The quantile-quantile (QQ) plot exhibits a straight line, highlighting the appropriateness of the 3-parameter Log-Pearson Type III distribution for the CPI-C headline distribution (Chart 7).

Chart 7: QQ Plot

Chart 8: Evolution of Mean and SD

We also analyse how the inflation distribution evolves as each new data point is added to the CPI-C aggregate data.7

Evolution and Stabilisation of Statistical Moments of Inflation (January 2014 to June 2023)

The mean inflation8 fell more or less consistently from January 2014 until September 2019, touching a trough of 4.54 per cent, and then rose gradually to 5.10 per cent in June 2023. The standard deviation (SD) of the distribution appears to be settling at around 1.6 per cent (Chart 8).

The skewness of the distribution dipped gradually towards zero in recent years, leading to a symmetric distribution. The distribution, which had remained leptokurtic (more peaked than normal) from December 2017 to January 2020, turned platykurtic (less peaked than normal) again in February 2020, just prior to the emergence of COVID. The unstable values of skewness and kurtosis during 2014 are due to the small sample size. Further, these do not appear to have converged precisely at the current sample size (Chart 9).
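The evolution of the moments as data accumulate can be sketched with an expanding window. The sketch assumes population (divide-by-n) moments and non-excess kurtosis, so that a normal sample gives a kurtosis of about 3, consistent with the article describing a kurtosis of 2.4621 as platykurtic; whether the author used sample or population formulas is not stated, and the function names are hypothetical.

```python
import math

def moments(values):
    """Return (mean, SD, skewness, kurtosis) using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Kurtosis is not reduced by 3: values above 3 are leptokurtic, below 3 platykurtic.
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2

def expanding_moments(series, min_obs=4):
    """Moments of series[:t] for t = min_obs, ..., len(series)."""
    return [moments(series[:t]) for t in range(min_obs, len(series) + 1)]

if __name__ == "__main__":
    toy = [8.6, 7.9, 8.3, 8.5, 8.3, 7.5, 8.0]  # made-up monthly inflation values
    for mean, sd, skew, kurt in expanding_moments(toy):
        print(f"mean={mean:.2f} sd={sd:.2f} skew={skew:.2f} kurt={kurt:.2f}")
```

With only a handful of observations the third and fourth moments swing widely, which is the small-sample instability noted for 2014.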

In the following section, we explore the statistical properties of inflation expectations, sourced from the Inflation Expectations Survey of Households (IESH) conducted by the Reserve Bank. There are other sources of inflation expectations/forecasts, such as the Survey of Professional Forecasters (SPF), which is also conducted by the Reserve Bank. We restrict the analysis to IESH in the current context, as mentioned in the introductory section.

III. Statistical Characteristics of the Inflation Expectations Dataset

The inflation expectations survey of households is a bimonthly survey, wherein qualitative and quantitative expectations on inflation are sought from around 6,000 households9 in select cities in the urban areas. Here, we analyse only the quantitative inflation expectations of households, which are captured for three time points: the current period, the 3-month ahead period and the 12-month ahead period. We consider the IESH dataset from March 2014 (Round 35) to May 2023 (Round 71B), including the two bimonthly surveys conducted every year in addition to the four quarterly surveys. The Reserve Bank releases the granular (unit) level data on inflation expectations in addition to publishing the summary (aggregate) data on its website. Accordingly, we analyse both datasets, as was done for CPI-C based inflation in the previous section.

Chart 9: Evolution of Skewness and Kurtosis

Households generally report higher inflation than actual inflation. Further, they tend to report higher inflation for the 3-month ahead period, and higher still for the 12-month ahead period, than for the current period (Table 4). We identify that the 4-parameter Burr distribution explains the IESH unit-level data appropriately. In the previous section, we identified that a mixture of two 3-parameter Burr distributions explains the distribution of CPI-C granular inflation well. From the IESH granular dataset, we observe that households prefer to report inflation in round numbers. This preference leads to bunching of frequencies at round numbers and distorts the distribution. Further, unlike CPI-C granular inflation, the IESH granular inflation expectations have only one tail, viz., the right tail. The lowest band (inflation less than one per cent) does not produce an extreme left tail: although it is theoretically unbounded, respondents are likely to treat it as between zero and one per cent, and it indeed contains a low frequency, barring the 12-month ahead inflation expectations dataset. The descriptive statistics and fitting of distributions are provided in Table 4 and Chart 10, respectively.

Table 4: Descriptive Statistics and Fitting of IESH Granular Data

| DS | Current Period | 3-month ahead | 12-month ahead |
|---|---|---|---|
| Mean | 11.765 | 13.150 | 13.849 |
| Median | 8.50 | 9.50 | 10.50 |
| Range | 99.50 (0.50 to 100.00) | 98.50 (0.50 to 99.0) | 99.50 (0.50 to 100.0) |
| SD | 11.241 | 12.689 | 14.131 |
| Skewness | 3.1441 | 2.9822 | 2.7168 |
| Kurtosis | 15.916 | 14.001 | 12.163 |
| Fitting | 4-parameter Burr | 4-parameter Burr | 4-parameter Burr |
| k | 0.2894 | 0.3447 | 0.9789 |
| α | 8.0122 | 6.3927 | 2.4867 |
| β | 9.8506 | 10.1440 | 11.8570 |
| γ | -4.7867 | -4.2926 | -2.0122 |
| K-S Statistics | 0.09987 | 0.07008 | 0.08236 |
| Critical value at 5% | 0.13675 | 0.13446 | 0.13446 |

Source: Author’s calculations.

Chart 10 exhibits the respondents' preference for round numbers, as mentioned, with bunching at multiples of 5, viz., 5, 10, 15, 20 and so on. The distribution is found to be very positively skewed as well as very leptokurtic for all three datasets (current inflation, 3-month ahead inflation and 12-month ahead inflation). Although the 4-parameter Burr passes the goodness-of-fit test at 5 per cent, a superior fit may still be feasible, possibly through a mixture distribution, given the round-number preference inherent in survey responses. This is not attempted in the current context.

Chart 10: Fitting of IESH Granular Data

Now, we move to the fitting of IESH aggregate data. The summary statistics of the IESH data are released for two measures of central tendency (mean10 and median). We identify that the 4-parameter Johnson SB distribution11 tracks the distribution of mean inflation of IESH well. The descriptive statistics and fitting are exhibited in Table 5 and Chart 11, respectively.

Chart 11: Fitting of IESH Aggregate Datasets (Perception and Expectations)

Table 5: Descriptive Statistics and Fitting of IESH Aggregate Data (Mean)

| DS | Current Period | 3-month ahead | 12-month ahead |
|---|---|---|---|
| N | 51 | 51 | 51 |
| Mean | 9.284 | 9.961 | 10.051 |
| Range | 5.40 (7.30 to 12.70) | 4.80 (7.90 to 12.70) | 5.20 (8.30 to 13.50) |
| SD | 1.246 | 1.258 | 1.171 |
| Skewness | 0.7234 | 0.2401 | 0.9043 |
| Kurtosis | 3.7781 | 2.2039 | 4.0463 |
| Fitting | 4-parameter Johnson SB | 4-parameter Johnson SB | 4-parameter Johnson SB |
| γ | 5.0416 | 0.3289 | 3.0075 |
| δ | 2.8226 | 0.9778 | 1.8969 |
| λ | 27.7870 | 6.0331 | 15.1540 |
| ξ | 5.1452 | 7.3584 | 7.2878 |
| K-S Statistics | 0.08065 | 0.06640 | 0.08959 |

Critical value at 5% = 0.18659 (same for all the datasets, as these are ungrouped datasets with the same sample size)
Source: Author’s calculations.

The distribution of mean inflation of IESH is found to be (positively) skewed unlike the CPI-C aggregate inflation, which was found to be almost symmetric.

After studying and analysing the statistical properties of inflation and inflation expectations at the granular and aggregate level, we attempt to map these in the following section.

IV. Mapping of Distributions

The findings of Sections II and III reveal that the statistical moments of the distributions of the various datasets analysed differ significantly from each other. The findings are summarised in Chart 12.

The mappings, as collated in Chart 12, provide an equivalence between one distribution and another. For example, a data point of IESH (aggregate) for current inflation following the 4-parameter Johnson SB (γ = 5.0416, δ = 2.8226, λ = 27.7870, ξ = 5.1452) corresponds to a data point of CPI-C (aggregate) following Log-Pearson Type III (α = 6.4118, β = -0.1371 and γ = 2.4545). The functional relationship between two datasets can be used in many ways. A simple approach is to map through the cumulative distribution function (CDF). Such mappings are potentially useful because IESH is forward-looking, whereas realised CPI-C inflation is available only after the fact. Accordingly, the mapping has the potential to forecast inflation. A forecast of CPI-C inflation using IESH data is possible under two mappings, direct and indirect, as below:

Direct mapping

It is based on aggregate numbers and does not use granular-level information. Let X and Y be random variables representing realised inflation and 3-month ahead inflation expectations of IESH, respectively, both at an aggregate level. A particular value y of the 3-month ahead inflation expectation can be mapped to a value x of realised inflation by equating $F_{\text{Johnson SB}}(y)$ with $F_{\text{log-Pearson III}}(x)$. The mapping involves two steps: first, compute the cumulative probability $p = F_{\text{Johnson SB}}(y)$; second, solve $x = F^{-1}_{\text{log-Pearson III}}(p)$, where $F^{-1}$ denotes the inverse CDF.


Chart 12: Mapping of Distributions

For example, the result of the 72nd round of IESH, which was conducted during July 1-10, 2023, indicated a mean of 10.20 per cent for the 3-month ahead inflation (i.e., forecast for October 2023). Using the 4-parameter Johnson SB (γ = 0.3289, δ = 0.9778, λ = 6.0331, ξ = 7.3584) distribution of 3-month ahead inflation in IESH, we compute $F_{\text{Johnson SB}}(10.20) = 0.58525$. We then compute x by solving $x = F^{-1}_{\text{log-Pearson III}}(0.58525)$, where $F^{-1}$ is the inverse CDF. This provides an estimate of x = 5.43 per cent.
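The worked example can be reproduced approximately with the sketch below, which uses only the Python standard library. The parameterisations are assumptions rather than the author's stated formulas: the Johnson SB CDF is taken as Φ(γ + δ ln(z/(1 - z))) with z = (y - ξ)/λ, and the Log-Pearson Type III is taken as ln X = γ + βG with G a gamma variate of shape α. The helper names are hypothetical.

```python
import math

JSB_3M = dict(gamma=0.3289, delta=0.9778, lam=6.0331, xi=7.3584)   # IESH 3-month ahead
LP3_CPI = dict(alpha=6.4118, beta=-0.1371, gamma=2.4545)           # CPI-C aggregate

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def johnson_sb_cdf(y, gamma, delta, lam, xi):
    z = (y - xi) / lam
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    return normal_cdf(gamma + delta * math.log(z / (1.0 - z)))

def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gamma_quantile(p, a):
    """Solve P(a, t) = p for t by bisection (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, max(a, 1.0)
    while reg_lower_gamma(a, hi) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_pearson3_quantile(p, alpha, beta, gamma):
    # With beta < 0, ln X decreases in the gamma variate, so the lower tail
    # of X corresponds to the upper tail of the gamma distribution.
    q = 1.0 - p if beta < 0 else p
    return math.exp(gamma + beta * gamma_quantile(q, alpha))

def map_expectation_to_inflation(y):
    """Direct mapping: p = F_JohnsonSB(y), then x = F^-1_LP3(p)."""
    p = johnson_sb_cdf(y, **JSB_3M)
    return p, log_pearson3_quantile(p, **LP3_CPI)

if __name__ == "__main__":
    p, x = map_expectation_to_inflation(10.20)
    print(f"p = {p:.5f}, mapped inflation = {x:.2f} per cent")
```

Under these assumed parameterisations, the cumulative probability agrees with the 0.58525 reported above and the mapped value lands close to the reported 5.43 per cent; small differences can arise from rounding of the published parameters.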

The above mapping could be an alternative to the traditional econometric models, which are commonly used to forecast inflation through forward-looking inflation expectations. Based on the above approach, using the identified distributions with estimated parameters, we estimate inflation for the months since March 2014, barring those months for which either inflation or inflation expectations are not available. We compare realised inflation with the estimates from this approach and from a simple regression-based approach (an econometric tool), and observe that the proposed approach is quite competitive, while additionally providing valuable insights into the detailed profile of the datasets (Chart 13).

More complex mapping of distributions could be done using Copula12 functions, although they are not attempted in this article.

Indirect mapping

In direct mapping, one data point of one distribution is mapped with one data point of the other distribution, which is suitable for the aggregate dataset, as we have only one (aggregate) number per month for inflation expectations and one for the realised inflation. Instead of aggregating single numbers, we can establish a mapping between granular datasets of inflation expectations and realised inflation through many one-to-one mappings. These mappings could include the mapping of inflation at disaggregate level such as urban city (centre) of IESH versus corresponding State of CPI-C, etc. The indirect mapping could be complex and could be done in two stages, as below:


Chart 13: Scatter Plot – Estimated Inflation and Realised Inflation

In the indirect mapping, it is feasible to analyse the distribution of the incoming (new) data of a month at a granular level and compare it with the historical distribution (covering several months) to identify changes in the shape of the distribution, in terms of changes in moments such as skewness and kurtosis. This is not possible with the aggregate, which is a single number. We also fit the 4-parameter Burr distribution to each round under study individually to obtain round-wise parameter estimates and descriptive statistics (Annex Table A2).

We find that the mean and standard deviation of the granular level IESH data are positively correlated with the realised CPI-C inflation. The skewness and kurtosis are negatively correlated. Similarly, two parameters (α and β) appeared to be linked positively with the realised inflation, while the other two (viz., k and γ) parameters are inversely correlated. These indicators may play a useful role in econometric models as input variables aiding in forecasting inflation (Annex Table A3).

Identifying the one-to-one mappings in stage 1 that exhibit closer co-movement would be a useful exercise. This is not explored in the current article. The second stage of the indirect mapping is expected to be stronger, as both sides of it are part of the same dataset.

Inflation-at-Risk (IaR)

The above mappings may also be useful in assessing Inflation-at-Risk (IaR)13. As lower inflation (left tail) has not been a concern in the Indian context, we compute IaR at 95 per cent and 99 per cent for the CPI-C aggregate data using historical (observed) inflation and using inflation expectations from IESH.

For the estimated distribution of CPI-C inflation, i.e., Log-Pearson Type III (α = 6.4118, β = -0.1371 and γ = 2.4545), the inverse CDFs $F^{-1}(0.95)$ and $F^{-1}(0.99)$ give the Inflation-at-Risk values $IaR_{0.95}$ = 7.84 per cent and $IaR_{0.99}$ = 8.84 per cent, respectively. Similarly, for the estimated distribution of 3-month ahead inflation in IESH, i.e., 4-parameter Johnson SB (γ = 0.3289, δ = 0.9778, λ = 6.0331, ξ = 7.3584), the inverse CDFs give $IaR_{0.95}$ = 12.15 per cent and $IaR_{0.99}$ = 12.70 per cent, respectively. Thus, $IaR_{0.95}$ = 12.15 per cent and $IaR_{0.99}$ = 12.70 per cent of 3-month ahead inflation expectations are equivalent to $IaR_{0.95}$ = 7.84 per cent and $IaR_{0.99}$ = 8.84 per cent of realised inflation, respectively. The detailed quantile mapping of IESH current and 3-month ahead inflation along with CPI-C inflation is provided for completeness (Table 6). Similar computations could be carried out using granular-level data under the indirect mapping approach, as discussed earlier.
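The survey-side IaR figures can be checked with a short sketch of the Johnson SB quantile function. It assumes the parameterisation in which Φ⁻¹(p) = γ + δ ln(z/(1 - z)) with z = (x - ξ)/λ, which is an assumption about the fitting convention rather than something the article states; the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Fitted IESH aggregate (mean) parameters as reported in the article
IESH_CURRENT = dict(gamma=5.0416, delta=2.8226, lam=27.7870, xi=5.1452)
IESH_3M = dict(gamma=0.3289, delta=0.9778, lam=6.0331, xi=7.3584)

def johnson_sb_quantile(p, gamma, delta, lam, xi):
    """Inverse CDF of the (assumed) Johnson SB form for 0 < p < 1."""
    z = NormalDist().inv_cdf(p)
    # Invert z = gamma + delta * ln(w / (1 - w)), where w = (x - xi) / lam
    w = 1.0 / (1.0 + math.exp(-(z - gamma) / delta))
    return xi + lam * w

def inflation_at_risk(params, levels=(0.95, 0.99)):
    """Upper-tail quantiles of the fitted distribution at the given levels."""
    return {p: johnson_sb_quantile(p, **params) for p in levels}

if __name__ == "__main__":
    for name, params in (("current", IESH_CURRENT), ("3-month ahead", IESH_3M)):
        iar = inflation_at_risk(params)
        print(name, {p: round(v, 2) for p, v in iar.items()})
```

Under this assumed form, the quantiles for both the current and 3-month ahead series agree with the corresponding 0.95 and 0.99 rows of Table 6 to within rounding.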

Table 6: Quantile Mapping of IESH Inflation Expectation and CPI-C Aggregate Inflation

| Quantiles | CPI-C Aggregate Inflation | IESH Current Inflation Perception | IESH 3-Month Ahead Inflation Expectation |
|---|---|---|---|
| 0.01 | 1.78 | 7.05 | 7.73 |
| 0.05 | 2.55 | 7.52 | 8.07 |
| 0.10 | 3.04 | 7.82 | 8.33 |
| 0.15 | 3.39 | 8.04 | 8.56 |
| 0.20 | 3.69 | 8.22 | 8.76 |
| 0.25 | 3.95 | 8.38 | 8.95 |
| 0.30 | 4.18 | 8.54 | 9.14 |
| 0.35 | 4.41 | 8.69 | 9.32 |
| 0.40 | 4.63 | 8.84 | 9.50 |
| 0.45 | 4.84 | 8.98 | 9.69 |
| 0.50 | 5.06 | 9.13 | 9.87 |
| 0.55 | 5.27 | 9.29 | 10.06 |
| 0.60 | 5.49 | 9.45 | 10.26 |
| 0.65 | 5.72 | 9.62 | 10.46 |
| 0.70 | 5.96 | 9.81 | 10.68 |
| 0.75 | 6.22 | 10.02 | 10.90 |
| 0.80 | 6.51 | 10.26 | 11.15 |
| 0.85 | 6.84 | 10.56 | 11.42 |
| 0.90 | 7.25 | 10.95 | 11.74 |
| 0.95 | 7.84 | 11.56 | 12.15 |
| 0.99 | 8.84 | 12.83 | 12.70 |

Source: Author’s calculations.

V. Conclusion

The statistical properties of granular-level inflation and inflation expectation datasets remain important and can be analysed through suitable statistical distributions. This article attempts to map the datasets of survey-based inflation and actual inflation through their long-run statistical distributions, which appears to be an unexplored area of research. Variants of the Burr distribution are found to be appropriate in explaining the statistical characteristics of both granular-level datasets, viz., survey-based inflation expectations and realised inflation. The aggregation of these datasets provides useful summary statistics such as headline inflation numbers.

As survey-based inflation expectations are forward-looking and have been useful in forecasting inflation over the short term, for which econometric tools are widely used, the functional relationship derived in the article through suitable statistical distributions may facilitate short-term forecasting as a non-econometric tool. Further, the round-wise estimated parameters for the survey-based inflation expectations may also be used as inputs to suitable econometric models. The identified distributions can also be used to measure Inflation-at-Risk for the observed inflation and survey-based inflation expectation datasets.

References

Andrade, P., Ghysels, E. and Idier, J. (2012), “Tails of Inflation Forecasts and Tales of Monetary Policy”. Working Papers, Banque de France.

Carreau, J. and Bengio, Y. (2009), “A hybrid Pareto model for asymmetric fat-tailed data: the univariate case”, Extremes 12, 53-76.

Cooray, K. and Ananda, M. M. A. (2005), “Modelling actuarial data with a composite Lognormal-Pareto model”, Scandinavian Actuarial Journal (5), 321-334.

Das, A., Lahiri, K. and Zhao, Y. (2019), “Inflation expectations in India: learning from household tendency surveys”, International Journal of Forecasting, 35 (3), 980-993.

Frigessi, A., Haug, O. and Rue, H. (2002), “Dynamic mixture model for unsupervised tail estimation without threshold selection”, Extremes, 5, 219-235.

McNeil, A. J. (1997), “Estimating the tails of loss severity distributions using extreme value theory”, ASTIN Bulletin, Vol. 27, No. 1, 117-137.

Muduli, S., Nadhanael, G. V. and Pattanaik, S. (2022), “Assessing inflation expectations adjusting for households’ biases”, Monthly Bulletin, Reserve Bank of India, December.

Nadarajah, S. and Bakar, S. A. A. (2014), “New Composite Models for the Danish Fire Insurance Data”, Scandinavian Actuarial Journal, 2, 180-187.

Salido, D. L. and Loria, F. (2021), “Inflation at Risk”, Federal Reserve Board, September 08.

Sastry, D. V. S. and Sinha, R. K. (2010), “A Revisit to Danish fire loss data”, Conference Proceedings, 12th Global Conference of Actuaries (GCA), Mumbai, India.

Scollnik, D. P. (2007), “On composite Lognormal-Pareto model”, Scandinavian Actuarial Journal, Vol. 2007, Issue 1/2007, 20-33.

Scollnik, D. P. and Sun, C. (2012), “Modelling with Weibull-Pareto models”, North American Actuarial Journal, 16 (2), 260-272.

Shaw, P. (2019), “Using rational expectations to predict inflation”, Reserve Bank of India Occasional Papers, Vol. 40, No. 1.

Sinha, R. K. (2023), “India’s Steady State Equilibrium Inflation: A Revisit”, Monthly Bulletin, Reserve Bank of India, May.


Annex

Chart A1: CPI-C Granular Data (Product Group-Wise Fitting of Inflation)

Table A1: CPI-C Granular Data (Goodness of Fit)
Product Group: Descriptive Statistics (DS); Best Fit; Goodness of Fit
1. Food and Beverages
DS: Mean 4.72; SD 9.46; Range 136.09; Skewness 0.8244; Kurtosis 10.5708
Best Fit: Cauchy (μ, σ); μ = 4.455; σ = 3.420
Goodness of Fit: K-S Statistic = 0.03111; Critical Value at 5 per cent = 0.01928
2. Pan and Tobacco
DS: Mean 6.22; SD 4.84; Range 53.85; Skewness 0.6025; Kurtosis 7.3770
Best Fit: Johnson SU (γ, δ, λ, ξ); γ = -0.2939; δ = 1.5054; λ = 5.6234; ξ = 4.8419
Goodness of Fit: K-S Statistic = 0.03005; Critical Value at 5 per cent = 0.03314
3. Cloth and Footwear
DS: Mean 5.65; SD 3.32; Range 30.16; Skewness 0.0040; Kurtosis 3.5698
Best Fit: Generalised Gamma (k, α, β, γ); k = 1.6752; α = 94.813; β = 3.5953; γ = -48.695
Goodness of Fit: K-S Statistic = 0.02143; Critical Value at 5 per cent = 0.0331
4. Housing
DS: Mean 5.44; SD 2.92; Range 19.38; Skewness 1.7588; Kurtosis 7.5862
Best Fit: Burr (k, α, β, γ); k = 0.3489; α = 28.030; β = 20.683; γ = -17.264
Goodness of Fit: K-S Statistic = 0.02326; Critical Value at 5 per cent = 0.04537
5. Fuel and Light
DS: Mean 5.45; SD 6.30; Range 104.16; Skewness 0.2176; Kurtosis 7.8465
Best Fit: Johnson SU (γ, δ, λ, ξ); γ = -0.0883; δ = 1.4241; λ = 6.8561; ξ = 4.9099
Goodness of Fit: K-S Statistic = 0.03855; Critical Value at 5 per cent = 0.02995
6. Miscellaneous
DS: Mean 5.08; SD 3.69; Range 74.42; Skewness 0.5900; Kurtosis 6.3318
Best Fit: Johnson SU (γ, δ, λ, ξ); γ = -0.3685; δ = 1.6384; λ = 4.8154; ξ = 3.7645
Goodness of Fit: K-S Statistic = 0.00780; Critical Value at 5 per cent = 0.02852
Source: Author’s calculations.
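
The goodness-of-fit columns of Table A1 can be read with the standard Kolmogorov-Smirnov decision rule: the fitted distribution is not rejected at the 5 per cent level when the K-S statistic is below the critical value. The sketch below applies that rule to the figures copied from the table. The helper name is hypothetical, and the rule is the textbook one, assumed here rather than quoted from the article.

```python
# (K-S statistic, critical value at 5 per cent) copied from Table A1.
KS_RESULTS = {
    "Food and Beverages": (0.03111, 0.01928),
    "Pan and Tobacco": (0.03005, 0.03314),
    "Cloth and Footwear": (0.02143, 0.0331),
    "Housing": (0.02326, 0.04537),
    "Fuel and Light": (0.03855, 0.02995),
    "Miscellaneous": (0.00780, 0.02852),
}


def fit_not_rejected(statistic, critical_value):
    """Textbook K-S rule: do not reject the fit if the statistic is below the critical value."""
    return statistic < critical_value


for group, (stat, crit) in KS_RESULTS.items():
    verdict = "not rejected" if fit_not_rejected(stat, crit) else "rejected"
    print(f"{group}: D = {stat:.5f}, critical = {crit:.5f} -> fit {verdict} at 5 per cent")
```

Read this way, the statistics for Food and Beverages and for Fuel and Light exceed their critical values, while the other four groups fall below them.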

Table A2: IESH Data (Round-Wise Fitting of 3-Month Ahead Inflation)
Survey Rounds Period of Survey 4-Parameter Burr Distribution (Estimated parameters) Descriptive Statistics Published Prints
k α β γ Mean* SD Skewness Kurtosis IESH 3-month ahead Mean Inflation IESH 3-month ahead Median Inflation Realised CPI-C Inflation#
35 Mar-14 0.3890 7.320 18.287 -9.292 17.601 14.993 2.486 10.233 12.3 12.9 6.77
36 Jun-14 0.2289 11.157 23.521 -16.494 20.748 19.468 2.112 7.531 12.5 14.0 5.63
37 Sep-14 1.0350 2.039 15.455 0.324 21.805 19.744 2.004 7.277 12.7 14.6 4.28
38 Dec-14 0.9008 2.974 8.637 -1.012 10.365 9.494 3.827 25.393 8.8 8.3 5.25
39 Mar-15 0.7021 3.756 8.626 -1.498 10.731 9.528 3.882 25.077 9.0 8.5 5.40
40 Jun-15 0.6444 4.283 10.484 -2.526 12.231 10.429 3.580 21.570 10.1 10.0 4.41
41 Sep-15 0.5529 4.307 10.101 -2.216 13.519 11.909 3.255 17.591 10.6 10.4 5.61
42 Dec-15 0.3618 6.313 10.005 -3.071 13.749 12.144 2.983 14.159 10.5 9.9 4.83
43 Mar-16 0.6304 3.272 6.807 -0.547 10.946 11.036 3.730 21.811 8.8 8.1 5.77
44 Jun-16 0.4702 5.158 10.030 -3.533 12.364 11.773 3.231 16.556 9.7 9.2 4.39
45 Sep-16 0.6900 3.805 9.991 -2.241 12.109 11.521 3.573 19.768 9.7 9.5 3.41
45B Nov-16 0.5190 4.441 8.395 -2.288 11.356 11.504 3.709 21.040 9.1 8.2 3.65
46 Dec-16 0.6270 5.581 10.085 -4.180 8.843 7.692 4.331 32.604 7.9 7.3 3.89
47 Mar-17 0.6111 4.951 9.175 -3.145 9.477 8.723 3.982 24.903 8.2 7.5 1.46
47B May-17 0.5241 6.050 9.976 -4.386 9.401 9.014 4.182 26.427 8.1 7.3 3.28
48 Jun-17 0.6939 5.146 9.604 -3.202 8.899 7.203 4.481 34.669 8.1 7.5 3.28
49 Sep-17 0.4139 7.261 9.362 -4.067 9.503 8.886 4.255 27.648 8.2 7.2 5.21
49B Nov-17 0.3860 6.654 8.724 -3.494 10.304 10.210 3.773 21.608 8.4 7.5 4.44
50 Dec-17 0.4357 5.831 8.724 -3.502 10.159 10.146 3.736 21.247 8.4 7.5 4.28
51 Mar-18 0.4533 6.737 10.251 -4.550 9.967 8.848 4.042 26.851 8.6 7.8 4.92
51B May-18 0.4994 4.642 8.454 -2.125 11.544 10.892 3.553 19.713 9.4 8.7 3.69
52 Jun-18 0.3841 6.238 10.144 -4.293 12.201 11.690 3.122 15.356 9.5 8.9 3.70
53 Sep-18 0.3098 6.523 9.215 -3.723 13.224 12.953 2.779 11.973 9.9 9.4 2.11
53B Nov-18 0.3102 7.708 10.619 -5.227 12.166 11.315 3.025 14.568 9.7 9.0 2.57
54 Dec-18 0.3726 6.174 8.590 -3.194 11.263 11.273 3.522 18.303 9.1 8.2 2.86
55 Mar-19 0.4732 5.271 8.028 -2.320 10.162 9.210 3.986 25.473 8.7 7.8 3.18
55B May-19 0.3909 6.603 8.869 -3.649 10.119 9.225 3.642 21.563 8.6 7.6 3.28
56 Jul-19 0.4151 5.723 7.895 -2.590 10.254 9.702 3.646 20.489 8.6 7.6 4.62
57 Sep-19 0.4767 4.941 8.098 -2.364 10.554 9.325 3.119 16.027 8.9 8.0 7.35
57B Nov-19 0.2914 8.021 10.718 -5.315 12.540 11.751 3.018 14.512 9.8 9.2 6.58
58 Jan-20 0.2456 9.788 11.045 -6.055 12.192 11.575 3.088 15.106 9.5 8.6 NA
59 Mar-20 0.4305 5.954 8.876 -2.782 10.980 9.610 3.510 19.517 9.2 8.5 6.23
59B May-20 0.1809 16.522 19.864 -14.896 14.411 13.375 2.654 11.462 10.6 10.4 6.69
60 Jul-20 0.1979 15.068 20.132 -14.933 14.735 13.551 2.582 11.019 10.8 10.5 7.61
61 Sep-20 0.1767 16.053 20.088 -15.110 15.261 14.410 2.492 10.252 10.8 10.4 4.59
61B Nov-20 0.2505 8.788 11.589 -5.986 14.202 13.503 2.653 10.997 10.4 10.1 5.03
62 Jan-21 0.2865 7.708 10.275 -4.722 13.110 12.510 2.869 12.823 10.0 9.3 4.23
63 Mar-21 0.2244 10.097 12.130 -6.629 14.178 13.369 2.689 11.339 10.4 10.1 6.26

Table A2: IESH Data (Round-Wise Fitting of 3-Month Ahead Inflation) (Concld.)
Survey Rounds Period of Survey 4-Parameter Burr Distribution (Estimated parameters) Descriptive Statistics Published Prints
k α β γ Mean* SD Skewness Kurtosis IESH 3-month ahead Mean Inflation IESH 3-month ahead Median Inflation Realised CPI-C Inflation#
63B May-21 0.1680 15.823 19.861 -14.492 16.561 15.527 2.299 8.642 11.3 10.8 5.30
64 Jul-21 0.1335 23.541 26.553 -21.489 16.904 15.415 2.149 7.879 11.7 11.3 4.48
65 Sep-21 0.1571 16.348 19.905 -14.466 17.231 16.112 2.193 8.020 11.4 10.8 5.66
65B Nov-21 0.2908 6.865 13.349 -6.371 17.914 15.994 2.141 7.856 11.9 12.3 6.07
66 Jan-22 0.1776 14.816 17.495 -12.027 15.480 14.241 2.474 9.884 11.1 10.6 7.79
67 Mar-22 0.1706 16.596 20.122 -14.744 15.628 14.173 2.312 8.772 11.1 10.7 7.01
67B May-22 0.1419 23.030 26.925 -21.733 16.545 14.856 2.233 8.514 11.4 10.8 7.00
68 Jul-22 0.3039 7.324 10.986 -4.521 14.302 12.710 2.633 11.188 10.7 10.3 6.77
69 Sep-22 0.2155 11.932 17.219 -11.210 16.070 14.675 2.465 9.847 11.3 10.8 5.72
69B Nov-22 0.2651 9.224 13.205 -6.982 14.432 12.714 2.691 11.788 10.9 10.4 6.44
70 Jan-23 0.3470 6.865 11.370 -4.566 13.917 11.883 2.747 12.329 10.8 10.5 4.70
71 Mar-23 0.3251 7.186 10.753 -4.299 13.504 11.829 2.937 13.665 10.5 10.2 4.87
71B May-23 0.4260 5.205 9.225 -2.338 13.186 11.666 2.939 13.570 10.3 10.1 6.83
Note: *: Computed from the raw data (without incorporating any filter/trimming) and is unweighted.
#: Realised CPI-C Inflation is 3-month ahead print from the month of the survey.
NA: The CPI-C inflation for April 2020 (3-month ahead from Round 58) was not initially published by MoSPI; it was imputed and published subsequently.
Source: Author’s calculations.
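
The round-wise parameters in Table A2 can be turned into quantiles once a functional form is fixed. The sketch below assumes the common four-parameter Burr form F(x) = 1 - [1 + ((x - γ)/β)^α]^(-k) for x > γ, with k and α as shape parameters, β as scale and γ as location. This parametrisation is an assumption, since this part of the article does not spell it out, and the function names are hypothetical.

```python
def burr4_cdf(x, k, alpha, beta, gamma):
    """CDF of the assumed 4-parameter Burr form: 1 - (1 + ((x - gamma)/beta)**alpha)**(-k)."""
    if x <= gamma:
        return 0.0
    z = (x - gamma) / beta
    return 1.0 - (1.0 + z ** alpha) ** (-k)


def burr4_quantile(p, k, alpha, beta, gamma):
    """Inverse of burr4_cdf for 0 < p < 1, obtained by solving F(x) = p for x."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return gamma + beta * ((1.0 - p) ** (-1.0 / k) - 1.0) ** (1.0 / alpha)


# Parameters for Round 71B (May-23), copied from Table A2.
ROUND_71B = dict(k=0.4260, alpha=5.205, beta=9.225, gamma=-2.338)

median = burr4_quantile(0.50, **ROUND_71B)
upper_tail = burr4_quantile(0.95, **ROUND_71B)
print(f"Implied median: {median:.2f}; implied 0.95 quantile: {upper_tail:.2f}")
```

Under this assumed form, the implied median for Round 71B is close to 9.75, compared with the published median of 10.1 in the same row. A gap of this kind is consistent with the note that raw-data statistics and published prints need not coincide.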

Table A3: Correlation Matrix of IESH Indicators (with Published CPI-C Inflation)
  Realised CPI-C Inflation k α β γ Mean* SD Skewness Kurtosis Published Mean Published Median
Realised CPI-C Inflation 1.000                    
k -0.378 1.000                  
α 0.406 -0.774 1.000                
β 0.433 -0.577 0.894 1.000              
γ -0.414 0.738 -0.967 -0.954 1.000            
Mean* 0.424 -0.380 0.514 0.766 -0.600 1.000          
SD 0.367 -0.408 0.522 0.749 -0.603 0.982 1.000        
Skewness -0.496 0.591 -0.625 -0.746 0.655 -0.927 -0.919 1.000      
Kurtosis -0.474 0.628 -0.586 -0.672 0.602 -0.878 -0.888 0.983 1.000    
Published Mean 0.482 -0.429 0.524 0.751 -0.599 0.976 0.937 -0.940 -0.900 1.000  
Published Median 0.426 -0.271 0.396 0.695 -0.502 0.975 0.932 -0.878 -0.825 0.975 1.000
Source: Author’s calculations.

^ The author is from the Monetary Policy Department (MPD). The views expressed in this article are those of the author and do not represent the views of the Reserve Bank of India.

1 The comparative study is also possible using the information on inflation expectations of other respondents such as professional forecasters. The same is not explored in this article.

2 The study considers 22 larger States/UTs, which have individual weights of more than 0.25 per cent in the CPI-C basket. These States/UTs collectively cover 98.30 per cent of the CPI-C basket and have a greater level of granularity than the set of smaller States.

3 The probability density function of the disaggregate and aggregate inflation data would be different though the central tendency derived from these two datasets would be comparable. However, other statistical moments of the data (viz., standard deviation, skewness and kurtosis) may differ significantly. For example, the standard deviation of the granular data would be higher than that of the aggregate data.

4 The Dagum distribution is the inverse Burr distribution, i.e., the distribution of the reciprocal of a Burr-distributed variable; it is used to fit heavy-tailed data.

5 The appropriate weights of the left and right tails, derived from the observed data, are around 0.41 and 0.59, respectively (Table 2).

6 We use probability-weighted counts of observations in our context to reflect the varied and appropriate weights of the products/sub-products. This is unlike many other studies, which apply simple counts to derive the mean excess function; simple counts are meaningful for loss distributions such as the Danish fire loss dataset, as many studies have demonstrated.

7 The evolution and stabilisation of granular-level data of CPI-C based inflation were discussed in Sinha (2023), covering the pre-COVID and post-COVID periods with additions of half-yearly data. The study found a consistent rightward shift in the distribution in the post-COVID period.

8 The mean inflation at a month (t) is a simple average of inflation starting from January 2014 to the month (t).

9 The number of surveyed households has increased in recent times with the introduction of new cities/centres for the survey.

10 The published mean of the IESH may not match the mean computed from the granular-level data due to various aggregation and methodological issues.

11 Johnson System Bounded (SB) distribution is a system of curves for bounded data, which can be transformed to an approximately normal distribution through an appropriate transformation function. Similarly, Johnson System Unbounded (SU) distribution is a system of curves for unbounded data, which is found to be suitable in this study for select product groups in the CPI-C granular dataset (Annex Table A1).

12 A copula is a multivariate cumulative distribution function for which the marginal (individual) probability distribution of each variable is uniform on the interval [0, 1]. It models the dependence (inter-correlation) structure between random variables and is widely used with financial datasets.

13 There are different approaches to defining and measuring Inflation-at-Risk (IaR). IaR was originally introduced by Andrade et al. (2012) to assess the risks to the inflation outlook. The study constructed a Value-at-Risk (VaR)-type measure of tail risk for inflation using survey-based conditional density forecasts and observed that the magnitude and the asymmetry of inflation risks varied over time. A recent study by Salido and Loria (2021) used a probability-type measure for IaR and highlighted the importance of skewness in IaR. In the current context, we use a simple VaR-type measure for IaR.

