Ground Validation of GPM IMERG-F Precipitation Products with the Point Rain Gauge Records on the Extreme Rainfall Over a Mountainous Area of Sumatra Island

Article Info: Received November 23, 2021; Revised January 9, 2022; Accepted January 11, 2022; Published January 31, 2022

Abstract: Accurate satellite precipitation estimation over areas of complex topography remains challenging, yet such accuracy is important for the adoption of satellite data in hydrological applications. This study evaluated the ability of the Integrated Multi-satellitE Retrievals for GPM Final (IMERG-F) V06 product to observe extreme rainfall over a mountainous area of Sumatra Island. Fifteen years of optical rain gauge (ORG) observations at Kototabang, West Sumatra, Indonesia (100.32°E, 0.20°S, 865 m above sea level), were used as the reference surface measurement. The performance of IMERG-F was evaluated using 13 extreme rain indexes formulated by the Expert Team on Climate Change Detection and Indices (ETCCDI). IMERG-F overestimated the values of all precipitation amount-based indices (PRCPTOT, R85P, R95P, and R99P), three precipitation frequency-based indices (R1mm, R10mm, and R20mm), one precipitation duration-based index (CWD), and one precipitation intensity-based index (RX5day). In contrast, IMERG-F underestimated the values of one precipitation frequency-based index (R50mm), one precipitation duration-based index (CDD), and one precipitation intensity-based index (SDII). In terms of correlation, only five indexes have a correlation coefficient (R) > 0.5, consistent with the Kling–Gupta Efficiency (KGE) values. These results confirm the need to improve the accuracy of the IMERG-F data in mountainous areas.


Introduction
Extreme rainfall has significant impacts on society, the economy, and the environment. It is responsible for various natural disasters, including flash flooding and erosion, that can lead to damage, injury, and death (Adfy & Marzuki, 2021; Ulfah et al., 2021; Utami & Marzuki, 2020; Wahyuni et al., 2015). Several factors can trigger extreme rainfall, including mesoscale convective systems (MCSs), synoptically forced weather systems, and tropical cyclones (Schumacher & Johnson, 2006). MCSs are the leading cause of extreme rainfall, especially in tropical areas, including the Indonesian Maritime Continent (IMC). Several tropical phenomena contribute to this rain in the IMC, such as tropical disturbances (Mulyana et al., 2018), the El Niño-Southern Oscillation (ENSO) (Dewi & Marzuki, 2020; Supari et al., 2018; Vitri & Marzuki, 2014), and the Madden-Julian Oscillation (MJO) (Baranowski et al., 2020). Therefore, accurate observation and analysis of extreme rainfall in the IMC are necessary to minimize the impact of the disasters caused by this rain.
Surface rainfall observation using rain gauges and radar is an essential component in observing extreme rainfall. A dense rain gauge network provides real-time and accurate precipitation information. However, in the IMC, the density of rain gauge observations is still very low due to several factors, including the topography, the large number of islands, and procurement costs. To overcome the limitation of surface observations in the IMC, precipitation data from satellite products is an option. Although the temporal resolution of satellite products is not as good as that of rain gauges and surface-based radars, satellite products have several advantages, such as broader observation coverage and good observation continuity. In addition, their spatial resolution is homogeneous, unlike rain gauges, for which the observation density can vary from one area to another.
One of the satellite products with potential for extreme rain observations in the IMC is the Integrated Multi-satellitE Retrievals for GPM (IMERG). IMERG provides bulk data by combining all passive microwave instruments in the Global Precipitation Measurement (GPM) constellation (Huffman et al., 2015). GPM provides worldwide rain and snow observations every three hours. IMERG calibrates, aggregates, and interpolates all satellite microwave precipitation estimates, along with microwave-calibrated infrared (IR) satellite estimates, rain gauge analyses, and other potential rainfall estimates. IMERG has a temporal resolution of 30 minutes and a spatial resolution of 0.1° (Huffman et al., 2015), which is finer than that of many other satellite products. However, IMERG has various uncertainties and errors (J. Tan et al., 2016), so it must be evaluated and validated before use.
IMERG validation depends on the availability of surface data, both rain gauge and radar data. This is often a problem, especially in areas where the availability of surface data is low, such as highlands and mountains. IMERG, which is based on passive microwave (PMW) and infrared (IR) sensors, has difficulty detecting deep convection caused by unstable air masses in mountainous regions (Kim et al., 2017), so validation for these areas is increasingly needed. This study evaluated the performance of IMERG in a mountainous area of Sumatra by utilizing 15 years (2002-2016) of optical rain gauge (ORG) observations. The ORG is located at the Equatorial Atmosphere Radar (EAR) observation site in Kototabang, West Sumatra, Indonesia, at an elevation of 865 m above sea level (Marzuki et al., 2009; Marzuki, Hashiguchi, et al., 2016; Marzuki, Hiroyuki, et al., 2016). Evaluation and validation of IMERG data in Indonesia, especially in Sumatra, remain limited. Hence, this research provides important information regarding the performance of IMERG in observing extreme rainfall in Indonesia, especially in the mountainous area of Sumatra.

GPM IMERG Final Run Precipitation Product
Global Precipitation Measurement (GPM) was launched on 27 February 2014. GPM carries many sensors with frequencies between 10 GHz and 183 GHz. Observational data from the GPM sensor is combined with multi-satellite observation data to produce IMERG Precipitation Products (Hou et al., 2014).
IMERG data is available in three types, namely IMERG-Early (IMERG-E), IMERG-Late (IMERG-L), and IMERG-Final (IMERG-F), with latencies of 4 hours, 12 hours, and 2.5-3.5 months, respectively (Huffman et al., 2019). In addition to the latency differences, the algorithms of the three data types also differ. Details of the IMERG algorithms can be found in Huffman et al. (2019). In this work, we validated the IMERG-F data for 2002-2016. Each IMERG product has its own advantages for particular applications. IMERG-F data is used in hydrological cycle studies and model validation (Sungmin et al., 2017). One of the essential factors in observing the hydrological cycle and validating models is accurate daily to annual rainfall (Wong et al., 2017). The annual rainfall total is often influenced by extreme rainfall because of its high intensity.

Optical Rain Gauge Data
IMERG-F data was validated using Optical Rain Gauge (ORG) observation data. The ORG is installed in the mountainous area of Sumatra, at the atmospheric observation station operated by the National Institute of Aeronautics and Space (LAPAN) in Kototabang (100.32°E, 0.20°S, 865 m above sea level), West Sumatra, Indonesia. The location of the observation station can be seen in Figure 1. The ORG is a laser-based instrument. Rainfall is measured from the intensity variation captured by the sensor when raindrops pass through the beam. This technology is commonly known as scintillation technology. The detailed specifications and working principles of the ORG can be found on the company's website (OSI, n.d.).

The ORG installed in Kototabang is of the ORG-815 type. This ORG can observe rainfall in the range of 0.1 mm/h to 500 mm/h. The temporal resolution of the ORG observations is 1 minute. These 1-min data are summed to obtain the daily data used to validate IMERG-F. As a quality control, we only used days for which the observations were complete. The ORG data at Kototabang has been widely used in several studies, which shows its reliability in rainfall observations (Marzuki et al., 2009, 2013; Marzuki, Hiroyuki, et al., 2016). The ORG in Kototabang collected data from 2002 to 2016. Therefore, this study validates IMERG-F for extreme rain over 15 years (2002-2016). The validation process only used years with a minimum data availability of 80%. This percentage refers to the annual data threshold used in observing the extreme rain index in the IMC (Tangang et al., 2018). We obtained ten years of observation that meet this threshold (Figure 2).
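The aggregation and quality-control steps described above can be sketched as follows. This is only an illustrative sketch, not the authors' processing code: the function names are hypothetical, and it assumes that each 1-min ORG sample is a rain rate in mm/h, that a missing sample is marked as `None`, and that the 80% threshold is counted in valid days per year.

```python
from collections import defaultdict

MINUTES_PER_DAY = 1440  # a complete day of 1-min samples


def daily_totals(records):
    """Sum 1-min ORG samples into daily totals (mm).

    records: iterable of (day_label, rain_rate_mm_per_h); a rate of None
    marks a missing sample. Assumption: samples are rain rates in mm/h.
    Only days with all 1440 samples are returned (completeness check).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, rate in records:
        if rate is None:
            continue  # missing minute: the day will fail the completeness check
        sums[day] += rate / 60.0  # mm/h sustained for one minute -> mm
        counts[day] += 1
    return {d: sums[d] for d in sums if counts[d] == MINUTES_PER_DAY}


def year_is_usable(n_valid_days, days_in_year=365, min_fraction=0.8):
    """Apply the 80% annual availability threshold (assumed to be per day)."""
    return n_valid_days / days_in_year >= min_fraction
```

A day with even one missing minute is dropped here, which is the strictest reading of "complete within one day"; a looser tolerance would need a different check.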

Validation Metrics
The extreme rain from IMERG-F was validated based on the ability of IMERG-F to reproduce the extreme rain indexes observed at Kototabang. We used several extreme rain indexes formulated by the Expert Team on Climate Change Detection and Indices (ETCCDI) (Table 1). All indexes are calculated for each year of observation, so that we obtain one value per index per year. The algorithm for calculating each index can be found on the ETCCDI website (ETCCDI Climate Change Indices, n.d.). The extreme rain indexes in Table 1 are divided into four categories: precipitation amount-based indices, precipitation frequency-based indices, precipitation duration-based indices, and precipitation intensity-based indices. The Kling-Gupta Efficiency (KGE) (Gupta et al., 2009) was applied to all extreme rain indexes as the validation metric. This statistical method is relatively new and has been widely used to evaluate extreme rain observations from satellites (Hosseini-Moghari & Jiang & Bauer-Gottwein, 2019; Tang et al., 2020). Mathematically, KGE is expressed by:

$$\mathrm{KGE} = 1 - \sqrt{(R - 1)^2 + (\beta - 1)^2 + (\gamma - 1)^2} \qquad (1)$$

where R is the Pearson correlation coefficient, β is the mean ratio, and γ is the variance ratio between the extreme rain index from IMERG-F and ORG observations. KGE ranges from -∞ to 1, with 1 being the perfect value. The R value ranges from -1 to 1, with 1 indicating a perfect positive correlation, -1 a perfect negative correlation, and 0 no correlation. Furthermore, β and γ have perfect values of 1; values > 1 indicate overestimation and values < 1 indicate underestimation.
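As a minimal sketch of how Equation (1) can be evaluated for one index, the function below takes the yearly index values from the satellite and from the gauge and returns KGE with its three components. The function names are hypothetical, and one point is an assumption: the variability term is computed here as the ratio of standard deviations, as in Gupta et al. (2009), which may differ from the exact "variance ratio" used in the study.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _std(xs):
    """Population standard deviation."""
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def kge(sat, ref):
    """Return (KGE, R, beta, gamma) for paired yearly index values.

    sat: index values from the satellite product (e.g. IMERG-F)
    ref: index values from the reference gauge (e.g. ORG)
    Assumption: gamma is the ratio of standard deviations.
    """
    m_sat, m_ref = _mean(sat), _mean(ref)
    s_sat, s_ref = _std(sat), _std(ref)
    cov = sum((s - m_sat) * (o - m_ref) for s, o in zip(sat, ref)) / len(ref)
    r = cov / (s_sat * s_ref)   # Pearson correlation coefficient
    beta = m_sat / m_ref        # mean ratio (>1 means overestimation)
    gamma = s_sat / s_ref       # variability ratio
    value = 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return value, r, beta, gamma
```

For example, a satellite series that is exactly twice the gauge series has R = 1 but β = γ = 2, so KGE = 1 - √2, which is negative despite the perfect correlation. This illustrates why KGE and R need not agree for every index.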
In addition to the extreme rain indexes, we also validated the rainfall intensity from IMERG-F using the Probability Density Function (PDF) and Cumulative Distribution Function (CDF). PDF and CDF methods are commonly used to validate many satellite products (Ma et al.).

Figure 3 shows the CDF and PDF of all data at Kototabang. The percentile values shown on the CDF plot are the 50th, 75th, 95th, and 99th percentiles, respectively. The daily rainfall values (ORG/IMERG-F) for the 50th, 75th, 85th, 95th, and 99th percentiles are 3.27/4.60 mm/day, 12.43/12.32 mm/day, 20.04/18.94 mm/day, 36.84/32.73 mm/day, and 60.93/54.12 mm/day, respectively. Thus, for the same daily rainfall, the CDF value of IMERG-F is larger than that of ORG, except for low rainfall (50th percentile or less). These percentile values show that IMERG-F overestimates daily rainfall at the lower percentile (50th) and tends to underestimate it at the higher percentiles, most clearly at the 95th and 99th percentiles (Figure 3a). The 95th and 99th percentiles are usually associated with extreme rainfall events with a small recurrence frequency (Zhang et al., 2008). The CDF result is consistent with the PDF (Figure 3b). For the PDF, daily rainfall values are divided into seven categories following the validation study in Singapore (M. L. Tan & Duan, 2017). IMERG-F strongly underestimates the occurrence of rainfall below 0.1 mm/day, which is consistent with the result obtained in Singapore (M. L. Tan & Duan, 2017). Underestimation of very low rainfall contributes to the high false alarm ratio (FAR) of the IMERG-F data. IMERG-F overestimated the occurrence of rainfall above 0.1 mm/day and below 50 mm/day. For rainfall above 50 mm/day, IMERG-F underestimated, as can also be seen from the CDF (Figure 3a).
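The percentile comparison can be illustrated with a short sketch. The `percentile` helper is a hypothetical implementation using linear interpolation between order statistics; the study does not state which percentile definition was used, so this is an assumption. The `reported` dictionary simply repeats the ORG/IMERG-F percentile values quoted in the text.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed method)."""
    if not values:
        raise ValueError("values must not be empty")
    xs = sorted(values)
    pos = (q / 100.0) * (len(xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def relative_difference(sat, ref):
    """Relative difference of the satellite value from the reference, in %."""
    return 100.0 * (sat - ref) / ref


# Percentile values quoted in the text: percentile -> (ORG, IMERG-F), mm/day
reported = {
    50: (3.27, 4.60),
    75: (12.43, 12.32),
    85: (20.04, 18.94),
    95: (36.84, 32.73),
    99: (60.93, 54.12),
}

for q, (org, imerg) in reported.items():
    print(f"P{q}: {relative_difference(imerg, org):+.1f}%")
```

A positive value means that the IMERG-F daily rainfall at that percentile is higher than the ORG value, and a negative value means that it is lower.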

Precipitation amount-based indices assessment
Precipitation amount-based indices describe the total annual rainfall (PRCPTOT) from the entire data and the rainfall above several percentiles (R85p, R95p, and R99p). The value of each precipitation amount-based index can be seen in Figure 4. In general, IMERG-F overestimated rainfall for all indexes compared with the ORG observations. The overestimation decreases with increasing percentile. This is related to the tendency of IMERG to overestimate low and medium rainfall and to underestimate very low and very high rainfall (reference), as also seen in the PDF for rain from 0.1 mm/day to 50 mm/day (Figure 3).
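A minimal sketch of the amount-based indices for one year of daily data is given below. It follows the general ETCCDI convention that a wet day has at least 1 mm of rain. The function names are hypothetical, and the percentile threshold is passed in as an argument because the reference period used to derive it is not restated here.

```python
WET_DAY_MM = 1.0  # ETCCDI wet-day threshold (mm/day)


def prcptot(daily_mm):
    """Annual total precipitation on wet days (mm)."""
    return sum(p for p in daily_mm if p >= WET_DAY_MM)


def rxxp(daily_mm, threshold_mm):
    """Annual total from wet days exceeding a percentile threshold (mm).

    threshold_mm is the rainfall at the chosen percentile (85th, 95th,
    or 99th); how that threshold is derived is left to the caller.
    """
    return sum(p for p in daily_mm if p >= WET_DAY_MM and p > threshold_mm)


# Hypothetical daily values (mm/day), for illustration only
year = [0.0, 0.5, 1.0, 12.0, 40.0, 65.0]
total = prcptot(year)
very_wet = rxxp(year, 36.84)  # 36.84 mm/day: ORG 95th percentile from the text
```

Because higher percentiles keep fewer and larger events, an RXXp value depends increasingly on how well the product captures the most intense days.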

Precipitation frequency-based indices assessment
Precipitation frequency-based indices were evaluated with four rainfall thresholds. Rainfall of 1 mm/day (R1mm) represents the number of wet days per year. R10mm and R20mm indicate the number of days with heavy and very heavy rain, respectively. Furthermore, rainfall of 50 mm/day is chosen as the threshold for extreme rainfall. This classification follows previous research conducted in the IMC (Supari et al., 2017; Tangang et al., 2018). In addition, the R50mm threshold also corresponds to the 99th percentile obtained through the CDF (Figure 3a). The validation metrics of the frequency-based indices are given in Table 2. The R1mm, R10mm, and R20mm indexes show overestimation (β > 1), while R50mm shows underestimation (β < 1). This trend is also shown in the boxplot of frequency-based indices (Figure 5). Thus, the number of days with rainfall of at least 50 mm/day in the IMERG-F data was smaller than in the ORG data, although rainfall above the 99th percentile of IMERG-F was slightly higher than that of ORG (Figure 4d). Furthermore, the values of γ for the frequency-based indices varied, indicating an inconsistency in the temporal variability. The correlation decreases significantly with increasing rainfall threshold. R1mm and R10mm showed good (R > 0.7) and moderate (0.5 < R < 0.7) correlations, respectively, consistent with the KGE values (Table 2). This result is also consistent with the frequency-based indices in the mountainous area of Nepal (Nepal et al., 2021).
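The frequency-based indices reduce to counting days at or above a threshold, as in the sketch below. The function names and the sample data are hypothetical; the four thresholds are the ones named in the text.

```python
THRESHOLDS_MM = {"R1mm": 1.0, "R10mm": 10.0, "R20mm": 20.0, "R50mm": 50.0}


def rnnmm(daily_mm, threshold_mm):
    """Number of days in the year with rainfall >= threshold_mm."""
    return sum(1 for p in daily_mm if p >= threshold_mm)


def frequency_indices(daily_mm):
    """All four frequency-based indices for one year of daily data."""
    return {name: rnnmm(daily_mm, t) for name, t in THRESHOLDS_MM.items()}


# Hypothetical daily values (mm/day), for illustration only
sample = [0.0, 0.4, 1.0, 9.9, 10.0, 25.0, 50.0, 72.5]
counts = frequency_indices(sample)
```

Computing these counts per year for both data sets gives the paired yearly series from which the mean ratio, the correlation, and KGE can then be derived.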

Precipitation duration-based indices assessment
Precipitation duration-based indices were evaluated through the ability of IMERG-F to observe consecutive dry days (CDD) and consecutive wet days (CWD). Dry days are days with rainfall below 1 mm/day, while wet days are days with rainfall of 1 mm/day or more. Figure 6 shows the CDD and CWD values at Kototabang. The CDD value of IMERG-F was lower than that of ORG, while IMERG-F overestimated the CWD index. This can also be seen in the values of β > 1 for CWD and β < 1 for CDD (Table 2). The same pattern is shown by the γ parameter: γ < 1 for CDD and γ > 1 for CWD. On the other hand, the CDD index shows a better correlation than CWD. Better β and R values for CDD result in a better KGE value for CDD than for CWD (Table 2). The same pattern was also observed in several studies (Ning et al., 2017; Nepal et al., 2021). This is related to the overestimation of rainfall with an intensity of around 1 mm/day (Figure 3b), which causes the high number of false alarms produced by IMERG-F in identifying wet days at Kototabang.
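CDD and CWD are both longest-run statistics, so they can be sketched with one shared helper. The function names are hypothetical, and the sketch treats each year in isolation; the full ETCCDI definition lets spells continue across year boundaries, which is not handled here.

```python
WET_DAY_MM = 1.0  # wet-day threshold (mm/day)


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def cdd(daily_mm):
    """Maximum number of consecutive dry days (< 1 mm/day)."""
    return longest_run(p < WET_DAY_MM for p in daily_mm)


def cwd(daily_mm):
    """Maximum number of consecutive wet days (>= 1 mm/day)."""
    return longest_run(p >= WET_DAY_MM for p in daily_mm)


# Hypothetical daily values (mm/day), for illustration only
sample = [0.0, 0.0, 0.5, 2.0, 3.0, 0.0, 5.0, 6.0, 7.0, 0.9]
```

The sketch also shows why a wet bias near the 1 mm/day threshold matters: turning a single dry day inside a wet spell into a wet day merges two spells, which raises CWD and can lower CDD at the same time.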

Precipitation intensity-based indices assessment
Precipitation intensity-based indices were evaluated based on the maximum 1-day rainfall (RX1day), the maximum consecutive 5-day precipitation (RX5day), and the Simple Daily Intensity Index (SDII). SDII is the ratio between the total annual rainfall and the number of annual wet days. The intensity-based index values from 10 years of ORG and IMERG-F observations at Kototabang are shown in Figure 7. A good KGE value is shown by RX5day (0.65), while RX1day and SDII show low KGE values (< 0). The low KGE values of RX1day and SDII were associated with underestimation of the mean and variance (β and γ < 1). This result shows the low ability of IMERG-F to observe rainfall of high intensity, as indicated by the CDF (Figure 3a). The better KGE value for RX5day is related to the better ability of IMERG to observe 5-day rainfall than daily rainfall (Yang et al., 2020).
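The three intensity-based indices can be sketched as below for one year of daily data. The function names and sample values are hypothetical, and the sketch assumes that the wet-day threshold of 1 mm/day is used for SDII.

```python
WET_DAY_MM = 1.0  # wet-day threshold (mm/day)


def rx1day(daily_mm):
    """Maximum 1-day rainfall in the year (mm)."""
    return max(daily_mm)


def rx5day(daily_mm):
    """Maximum rainfall over any 5 consecutive days in the year (mm)."""
    if len(daily_mm) < 5:
        raise ValueError("need at least 5 days of data")
    return max(sum(daily_mm[i:i + 5]) for i in range(len(daily_mm) - 4))


def sdii(daily_mm):
    """Mean rainfall on wet days: wet-day total / number of wet days."""
    wet = [p for p in daily_mm if p >= WET_DAY_MM]
    return sum(wet) / len(wet) if wet else 0.0


# Hypothetical daily values (mm/day), for illustration only
sample = [0.0, 10.0, 20.0, 0.0, 5.0, 30.0, 0.0, 0.5, 1.0, 2.0]
```

Because RX5day sums over a 5-day window, timing errors of a day or two between satellite and gauge tend to cancel, whereas RX1day depends on getting a single day right.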

Conclusion
This study shows that the performance of IMERG-F in extreme rainfall observations varies, depending on the type of extreme rainfall index. IMERG-F overestimated the value of the majority of extreme rainfall indexes. IMERG-F performed well in estimating extreme rainfall indices based on precipitation amount (PRCPTOT, R85P, R95P, and R99P), as indicated by the correlation coefficient (R) and Kling-Gupta Efficiency (KGE) values. For the frequency-based indices, the performance is good only for weak (R1mm) and moderate (R10mm) rain, while for heavy (R20mm) and extreme (R50mm) rain, the performance is not good. Good performance was also seen for one precipitation duration-based index (CWD) and one precipitation intensity-based index (RX5day). This study shows that IMERG products can be used to fill data gaps in poorly gauged regions, but they need to be used with caution, especially in mountainous areas. In addition, the results of this study emphasize the need to improve the accuracy of IMERG data for mountainous areas. The current study uses only one observation site and certainly cannot represent the entire Sumatra region. A more comprehensive study on the ground validation of all IMERG products (not only IMERG-F) is being carried out, and the results are being reviewed for publication in another journal.