1. Introduction
A dryline is a boundary between moist and dry air masses that can be identified by a strong horizontal gradient of moisture (Grasso 2000). It commonly occurs over the Southern Great Plains in the United States (Owen 1966), peaking in strength and movement during spring to early summer (April–June), because this is the region where a dry continental air mass from the southwestern states or the Mexican plateau and a moist air mass from the Gulf of Mexico come together and form a surface boundary. The dryline resembles a front in that it is a boundary between two different air masses, but the dynamical forcing to lift the air is weaker than along cold or warm fronts because the thermal contrast, and hence the density difference across the boundary, is smaller. Nevertheless, drylines often provide the lift and focus for severe storms when the environment is favorable. Therefore, many studies have developed ways to identify a dryline, and because of the nonlinearity of the problem, some recent studies use machine learning techniques to do so (Clark et al. 2015). Synoptic upper-level patterns over the Great Plains that are favorable for convection to develop along the dryline include a deep trough west of the Rockies and a strong downstream ridge resulting in amplified flow (Mitchell and Schultz 2020). Along with such synoptic patterns, the amount of low-level moisture, and especially its east–west gradient, is an important quantity for identifying a dryline and predicting convective initiation along it. However, since low-level moisture observations tend to be spatially and/or temporally limited, nowcasting of convective initiation along the dryline has been challenging (Ziegler et al. 1997).
Radiosondes are regarded as the most trusted water vapor observations and have high vertical resolution. Ground-based instruments such as lidars provide a detailed description of the vertical distribution of water vapor near the ground. Raman lidar retrieves water vapor profiles by using a backscattered signal with a wavelength shift (Whiteman et al. 1992). The differential absorption lidar (DIAL) system uses backscattered signals at two wavelengths, one near a water vapor absorption line and the other away from it, to retrieve water vapor profiles (Browell et al. 1979). While both instruments provide high-resolution water vapor profiles, the data are spatially limited to small regions.
For water vapor observations with global coverage, satellite data are used. The radio occultation technique uses the bending angle of signals from the Global Navigation Satellite System (GNSS) to derive humidity profiles through the relationship between refractivity and water vapor pressure (Kursinski et al. 1997). Because of the high accuracy of the data and their wide coverage over the globe, either the bending angle or total precipitable water (TPW) retrieved from the bending angle is assimilated into operational models (Dowell et al. 2022; ECMWF 2023). Passive microwave sensors also provide TPW measurements with higher spatial resolution. One of the most popular TPW retrieval algorithms using passive microwave sensors is the microwave integrated retrieval system (MiRS). MiRS is based on a one-dimensional variational technique that uses the Community Radiative Transfer Model (CRTM) as a forward operator and climatology as a background (Boukabara et al. 2011). Taking multiple brightness temperature (Tb) observations as inputs, the algorithm finds a state vector including skin temperature (Tskin), humidity, hydrometeor water contents, and surface emissivity by minimizing a cost function. TPW is obtained in postprocessing by integrating the moisture content in the vertical. Although TPW retrieved from one microwave sensor is not enough to cover the entire globe, a global map of TPW can be produced by applying the MiRS algorithm to multiple sensors and combining the results. MiRS provides retrievals to create global TPW products such as the blended TPW (Kidder and Jones 2007), the Cooperative Institute for Meteorological Satellite Studies (CIMSS) Morphed Integrated Microwave Imagery at CIMSS TPW (MIMIC-TPW) (Wimmers and Velden 2011), and Advected Layered Precipitable Water (ALPW) (Forsythe et al. 2015; Gitro et al. 2018), which are widely used in operations. Each product uses a different blending method. The blended TPW product overlays the most recent data on top of old data, while MIMIC-TPW and ALPW use wind data from a numerical weather prediction (NWP) model to advect TPW to a common time. The difference between MIMIC-TPW and ALPW is that the ALPW product has precipitable water in four vertical layers (surface to 850, 850–700, 700–500, and 500–300 hPa) and each layer is advected by the wind nearest the midpoint of that pressure layer. Such global products are useful for monitoring synoptic-scale features such as atmospheric rivers, but their spatial and temporal resolutions (0.25° for MIMIC-TPW and 16 km for ALPW; 1 h) might not be enough to track a rapidly developing dryline, which has a sharp moisture gradient.
Infrared (IR) data from geostationary satellites provide a way of monitoring drylines with high spatial and temporal resolutions. The Geostationary Operational Environmental Satellite (GOES) Advanced Baseline Imager (ABI) has three water vapor channels, but they are mostly sensitive to mid- to upper-tropospheric water vapor (Hilburn 2020). Because of this low sensitivity to low-level water vapor, additional water vapor channels at 5.15 and 0.91 μm are planned for the next-generation Geostationary Extended Observations (GeoXO) Imager (GXI). Until those observations become available in the 2030s, the split window difference (SWD) technique, which uses the Tb difference between two channels around the IR window, is used to infer low-level water vapor (Dostalek et al. 2021). It is known to be useful for identifying near-surface boundaries and predicting convective initiation (Lindsey et al. 2018). Accordingly, the operational GOES TPW product (Li et al. 2019) is retrieved using Tb values from all three water vapor channels and the channels used for the SWD method, along with ancillary NWP model outputs such as temperature or humidity profiles (Schmit et al. 2019). Although ABI data are mostly used to infer the vertically integrated value, a recent study by Haynes et al. (2024) demonstrates the potential to use ABI data with machine learning techniques to correct entire humidity profiles, not just the integrated value. That study uses several machine learning models to postprocess temperature and humidity profiles from the Rapid Refresh (RAP) model, and all of the machine learning models show overall improvements in humidity profiles, mostly in the middle and upper atmosphere.
This study focuses on low-level moisture over the Southern Great Plains, where drylines form most frequently. A U-Net model is developed to retrieve boundary layer precipitable water (BLPW) from GOES ABI data. BLPW is the depth of liquid water that would result if all of the water vapor within the boundary layer were condensed, and its spatial and temporal evolution is critical information for identifying the dryline. Since the inputs of the U-Net model are GOES data, BLPW can be estimated as frequently as GOES data are available, which is as often as every 1 min, and thus, the rapid development of the dryline can be observed. The U-Net model is trained against BLPW from the High-Resolution Rapid Refresh (HRRR) model because of its high spatial and temporal resolutions, which ground-based or radiosonde observations cannot provide. The results are validated with radiosonde data as well as with the HRRR-based BLPW.
2. Data
a. Domain of interest
This study focuses on the Southern Great Plains, where the dryline is most commonly located over North America; however, we note that drylines are not limited to North America (Bechis et al. 2020; van Schalkwyk et al. 2023; Akter and Tsuboki 2017). According to the climatology shown in Hoch and Markowski (2005), the dryline is most commonly found around 101°W at 0000 UTC in spring to early summer. Based on this climatology, the domain of interest is defined as 94°–106°W and 30°–40°N, where the dryline most frequently occurs, and it covers parts of Texas, Oklahoma, Colorado, and New Mexico as shown in Fig. 1. Daytime data from 1200 to 2300 UTC are used in this study because SWD loses its sensitivity to moisture at night when temperature inversions form, although inversions can persist into the first few hours of this period. The period ends at 2300 UTC because, by that point, convective initiation has generally occurred, and the cloud cover in the scene obscures low-level moisture information. These choices are based on previous studies (Lindsey et al. 2014, 2018), which found that SWD is most valuable for characterizing the preconvective environment to improve forecasts of the location and timing of convective initiation.


The boundary of the domain of interest is colored in red, with a blue line at 101°W indicating the longitude at which the dryline most frequently forms.
Citation: Journal of Applied Meteorology and Climatology 63, 12; 10.1175/JAMC-D-24-0048.1
b. HRRR model
The HRRR model is an hourly updating, convection-allowing model developed at the National Oceanic and Atmospheric Administration (NOAA) (Dowell et al. 2022). It provides accurate prediction and guidance for severe weather over CONUS and Alaska through the assimilation of a high volume of observational data including conventional observations, aircraft soundings (James and Benjamin 2017), and ground-based radar reflectivity (Weygandt et al. 2022). It has a spatial resolution of 3 km and 51 vertical levels.
In this study, a fixed height is used to represent the boundary layer height for simplicity, instead of the variable planetary boundary layer height, even though the latter varies diurnally and regionally. Although using a fixed height is a limitation, defining a variable boundary layer height can also introduce additional uncertainties. In addition, GOES is sensitive to a column of water vapor within some atmospheric layers, the depths of which are determined by the weighting functions of the two channels, which in turn depend on moisture and temperature profiles, and these depths do not necessarily extend from the surface to the boundary layer height. Nevertheless, water vapor estimates at various heights tend to be correlated, and thus, setting a fixed layer around a typical boundary layer height is considered a reasonable assumption. Therefore, a fixed height of 1.0 km is used throughout the study.
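As a rough illustration of how precipitable water over a fixed-depth layer could be computed from a model profile, the sketch below integrates specific humidity over pressure, PW = (1/(ρ_w g)) ∫ q dp, from the surface up to 1.0 km above ground. This is the standard precipitable water integral and is not necessarily the exact form used in this study; the function name, the profile layout, and the trapezoidal rule are assumptions made for illustration.

```python
G = 9.80665      # gravitational acceleration (m s-2)
RHO_W = 1000.0   # density of liquid water (kg m-3)


def layer_precipitable_water(p_hpa, q, z_agl_m, top_m=1000.0):
    """Trapezoidal integral of q dp from the surface to top_m above ground.

    p_hpa   : pressure at each level (hPa), ordered from the surface upward
    q       : specific humidity at each level (kg kg-1)
    z_agl_m : height above ground of each level (m)
    Returns precipitable water in mm.
    """
    total = 0.0  # water vapor mass per unit area (kg m-2)
    for k in range(len(p_hpa) - 1):
        z0, z1 = z_agl_m[k], z_agl_m[k + 1]
        if z0 >= top_m:
            break
        p0, p1 = p_hpa[k] * 100.0, p_hpa[k + 1] * 100.0  # hPa -> Pa
        q0, q1 = q[k], q[k + 1]
        if z1 > top_m:
            # Linearly interpolate the upper level to the layer top.
            w = (top_m - z0) / (z1 - z0)
            p1 = p0 + w * (p1 - p0)
            q1 = q0 + w * (q1 - q0)
        total += 0.5 * (q0 + q1) * (p0 - p1) / G
    # kg m-2 of vapor -> depth of liquid water in m -> mm
    return total / RHO_W * 1000.0
```

For a hypothetical profile with pressures of 1000, 950, 900, and 850 hPa at 0, 440, 900, and 1380 m above ground, only the part of the third layer below 1.0 km contributes to the sum.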
c. GOES-16 ABI
GOES-16 is the first geostationary satellite in the GOES-R series. It currently operates at 75.2°W and views the eastern part of CONUS as well as the Atlantic Ocean. The ABI on GOES-16 is a passive imaging radiometer with 16 channels. The ABI full disk, CONUS, and mesoscale sector data have 10-, 5-, and 1-min temporal resolutions, respectively, and such high temporal resolution helps monitor and deliver early warnings for rapidly evolving severe weather events. ABI’s high spatial resolution (0.5 km for one of the visible channels, 1 km for other visible and near-infrared channels, and 2 km for infrared channels) is also useful for pinpointing the exact location of weather phenomena. ABI has three infrared channels around the infrared window: channel 13 (10.3 μm) is the clean window channel with the least absorption by water vapor, channel 15 (12.3 μm) is the dirty window channel with the most absorption, and channel 14 (11.2 μm) lies in between. Although ABI is an imager, not a sounder, and thus does not provide detailed vertical profiles, slight absorption differences between these channels allow us to infer the amount of integrated water vapor in the atmospheric column, with stronger weighting toward the surface.
1) SWD
SWD is the Tb difference between a “clean” window channel at 10.3 μm and a “dirty” window channel at 12.3 μm, the latter so called because the emissivity of water vapor at 12.3 μm is slightly larger than that at 10.3 μm. Because of the different absorption characteristics of the two channels, the SWD technique is used for various purposes, such as detecting dust, cirrus clouds, or low-level moisture. Dust tends to have negative SWD due to more absorption at 10.3 μm, and thus, dust products such as the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Dust red, green, blue (RGB) product (Lensky and Rosenfeld 2008) or the Dynamic Enhancement Background Reduction Algorithm (DEBRA; Miller et al. 2017) use the inverse of SWD to identify dust. On the other hand, clouds tend to have SWD values that are close to zero or positive, depending on the size and phase of the cloud particles. SWD is larger for smaller hydrometeors and also depends on the imaginary part of the refractive index, a measure of absorption that differs between ice and water. Thin cirrus clouds with small ice particles exhibit high SWD, while deep convective clouds with large particles tend to have values close to zero. In clear-sky scenes, SWD values provide information about the amount of low-level moisture, although they also depend on temperature and humidity profiles and the surface emissivity (Lindsey et al. 2014). According to sensitivity tests in Lindsey et al. (2014), the steeper the lapse rate and the greater the depth of the moisture, the larger the split window difference becomes. Using these characteristics in a clear-sky scene, a dryline that is hard to observe in single-channel satellite imagery can be identified prior to convective initiation (Lindsey et al. 2018).
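The definition above amounts to a pixel-by-pixel subtraction. A minimal sketch, assuming the two channels are already on the same grid as nested lists of Tb in kelvin (the function name and the sample values are hypothetical):

```python
def split_window_difference(tb_clean_k, tb_dirty_k):
    """SWD (K): Tb at 10.3 um minus Tb at 12.3 um, pixel by pixel."""
    return [
        [clean - dirty for clean, dirty in zip(row_clean, row_dirty)]
        for row_clean, row_dirty in zip(tb_clean_k, tb_dirty_k)
    ]


# Hypothetical clear-sky pixels: a moist column (left) and a dry column (right).
tb_c13 = [[295.0, 300.0]]
tb_c15 = [[291.5, 299.0]]
swd = split_window_difference(tb_c13, tb_c15)
```

In this made-up example the moist pixel has the larger SWD, which is the qualitative behavior described for clear-sky scenes.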
2) TPW
TPW is the depth of liquid water that would result if all of the water vapor in the atmospheric column were condensed, and it is usually expressed in millimeters or inches. It is one of the ABI level 2 products and is derived from the GOES legacy atmospheric profile (LAP; Schmit et al. 2019) algorithm. The LAP algorithm retrieves temperature and moisture profiles using regression and one-dimensional variational (1DVAR)-based physical methods. The regression method produces the first guess for the 1DVAR using ABI Tb values, surface pressure, latitude, month, and a land/ocean flag, and the 1DVAR method produces the final temperature and moisture profiles using the regression-derived profiles as background and first guess. Finally, TPW is derived by integrating the moisture profiles over the column. The GOES TPW product has been helpful for forecasters to locate regions with deep moisture (Goodman et al. 2019). Despite its coarse spatial resolution of 10 km, TPW can be a good starting point for retrieving BLPW at 3-km resolution, since most moisture occurs at lower levels. Therefore, experiments using the GOES TPW product as an additional input to the machine learning model are presented in section 4b.
d. Radiosonde
The Integrated Global Radiosonde Archive (IGRA), version 2, is used for validation (Durre et al. 2006, 2018). This provides 3529 validation samples over our domain of interest, mostly from 1200 UTC and a few from 1800 to 1900 UTC, at six locations: Dodge City, Topeka, Amarillo, Norman, Midland, and Dallas. Radiosonde BLPW is obtained by calculating vapor pressure from dewpoint temperature following Bolton (1980) and then integrating vertically using the formulation from Wentz (1997).
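The first step of that calculation can be sketched with the widely used Bolton (1980) fit for saturation vapor pressure, evaluated at the dewpoint. The conversion to specific humidity is the standard thermodynamic relation and is included only so the result can be fed to a vertical integral; the function names are hypothetical, and the Wentz (1997) integration itself is not reproduced here.

```python
import math

EPSILON = 0.622  # ratio of the gas constants of dry air and water vapor


def vapor_pressure_hpa(dewpoint_c):
    """Vapor pressure (hPa) from dewpoint (deg C) using the Bolton (1980) fit."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))


def specific_humidity(pressure_hpa, dewpoint_c):
    """Specific humidity (kg kg-1) from pressure (hPa) and dewpoint (deg C)."""
    e = vapor_pressure_hpa(dewpoint_c)
    return EPSILON * e / (pressure_hpa - (1.0 - EPSILON) * e)
```

For example, a dewpoint of 20°C at 1000 hPa gives a vapor pressure near 23.4 hPa and a specific humidity of roughly 14.7 g kg⁻¹.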
3. Methodology
Machine learning model
1) Input and output data
Candidate input variables for the machine learning model that are relevant to BLPW include GOES-16 ABI Tb values at 10.3, 11.2, and 12.3 μm, SWD, solar zenith angle (SZA), satellite zenith angle, Tskin from the HRRR model, and GOES TPW. Models are tested with different combinations of these inputs, and the results are compared to find the optimal input variables. The values used to scale the inputs into the range of 0–1 are chosen empirically and summarized in Table 1. Note that for SZA and satellite zenith angle, the cosine is applied rather than using raw values. The output of the machine learning model is a map of BLPW, and the model is trained against BLPW derived from the HRRR model outputs using Eq. (1).
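A minimal sketch of this preprocessing is given below. The bounds are placeholders rather than the values in Table 1, and clipping values that fall outside the bounds is an assumption; the text only states that inputs are scaled to 0–1 and that the cosine of each angle is used.

```python
import math


def min_max_scale(value, vmin, vmax):
    """Scale value linearly to [0, 1], clipping anything outside [vmin, vmax]."""
    scaled = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, scaled))


def cos_angle(angle_deg):
    """Cosine of a zenith angle given in degrees."""
    return math.cos(math.radians(angle_deg))


# Hypothetical bounds for a window-channel Tb (K); see Table 1 for the real ones.
TB_MIN_K, TB_MAX_K = 200.0, 330.0

tb_scaled = min_max_scale(295.0, TB_MIN_K, TB_MAX_K)
sza_feature = cos_angle(60.0)
```

Using the cosine keeps the angle inputs bounded and makes them vary smoothly with the slant path through the atmosphere.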
Minimum and maximum values used to normalize each input variable.


Since GOES data and HRRR model outputs have different spatial resolutions, GOES data at 2-km resolution are interpolated onto the 3-km HRRR grid to produce 384 × 384 images for both inputs and outputs. Hourly data during April, May, June, and July from 2017 to 2020 are used for training, and data in 2021 and 2022 are used for validation and testing, respectively. Only daytime data from 1200 to 2300 UTC are used, both for training and for runs after the training, because the SWD technique is only effective during daytime, and the ABI level 2 clear-sky mask (ACM) product is used to mask out cloudy points. If more than half of the grid points within a 3 × 3 box are confidently cloudy according to the ACM product, the pixel is set to 0 in both the input and output images so that cloudy pixels do not affect the loss function. Training, validation, and testing data periods are summarized in Table 2. For the experiments that use TPW as an input, the number of training data is limited because TPW is only available from 2020 onward; therefore, 2023 data are added for training, and the same 2022 data are used for both validation and testing so that a sufficient number of training data remain. Although it is not optimal to use the same data for validation and testing, this is done to evaluate the model performance on the same dataset as in the previous experiments without losing too many data for training. Results using TPW as an input are compared with the results not using TPW as a proof of concept for using the level 2 product as an input. That way, our machine learning model serves to refine and sharpen our moisture knowledge, building on the physical knowledge incorporated into the level 2 retrieval.
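The cloud-masking rule can be sketched as follows. Here the 3 × 3 box is read as a moving window centered on each pixel of the target grid, and edge pixels use only the neighbors that exist; both are assumptions, as are the function names.

```python
def cloud_mask_3x3(cloudy):
    """Return True where more than half of the 3 x 3 neighborhood is cloudy.

    cloudy : 2D list of booleans (True = confidently cloudy in the mask product).
    """
    n_rows, n_cols = len(cloudy), len(cloudy[0])
    masked = [[False] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            count = total = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n_rows and 0 <= jj < n_cols:
                        total += 1
                        count += int(cloudy[ii][jj])
            masked[i][j] = 2 * count > total
    return masked


def apply_mask(field, masked, fill=0.0):
    """Set masked pixels of a 2D field to fill (0 in this study)."""
    return [
        [fill if is_masked else value for value, is_masked in zip(row_v, row_m)]
        for row_v, row_m in zip(field, masked)
    ]
```

Applying the same mask to every input channel and to the target BLPW image keeps cloudy pixels consistent between inputs and outputs.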
A summary of training, validation, and testing datasets. The number of data is written in a bold font, next to the year used for each dataset.


2) Model architecture
The U-Net model, first introduced by Ronneberger et al. (2015), is one of the most widely used image-to-image translation models. It consists of convolution and pooling layers and uses skip connections to bring fine-scale information from the encoder to the decoder, which would otherwise be lost when upsampling. There are many variations, such as U-Net++ (Zhou et al. 2018), U-Net3+ (Huang et al. 2020), or attention U-Net (Oktay et al. 2018), which differ in the nature of the skip connections or in the use of attention modules. This study uses a classic U-Net model because its skip connections are sufficient for this application. The architecture of the U-Net model used in this study is described in Fig. 2. Hyperparameters are randomly sampled from the options summarized in the left column of Table 3 to create 100 combinations, and the final hyperparameters chosen from this random search are shown in the right column of Table 3. Note that the final hyperparameters are chosen from the experiments using channel 13 Tb and SWD as inputs, and thus, these hyperparameters might not be optimal for other input options. However, since the purpose of this study is to evaluate model performance using different input options, the model structure and hyperparameters are kept the same for all the experiments. The number of kernels is doubled as the depth of the U-Net increases, and thus, the numbers of kernels are 16, 32, 64, 128, and 256 with five maxpooling layers. Mean-square error (MSE) is used as the loss function, but mean absolute error (MAE) is also evaluated.
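A random search of this kind can be sketched as below. The search space is hypothetical (the options actually searched are those in Table 3), and drawing 100 distinct combinations is an assumption about how the combinations were created.

```python
import random

# Hypothetical search space; not the options listed in Table 3.
SEARCH_SPACE = {
    "learning_rate": [1e-2, 1e-3, 1e-4, 1e-5],
    "batch_size": [8, 16, 32],
    "first_layer_kernels": [8, 16, 32],
    "kernel_size": [3, 5],
    "dropout": [0.0, 0.1, 0.2],
}


def sample_configurations(space, n_samples, seed=0):
    """Draw n_samples distinct hyperparameter combinations at random."""
    names = sorted(space)
    n_possible = 1
    for name in names:
        n_possible *= len(space[name])
    if n_samples > n_possible:
        raise ValueError("search space has fewer combinations than requested")
    rng = random.Random(seed)
    seen = set()
    configs = []
    while len(configs) < n_samples:
        choice = tuple(rng.choice(space[name]) for name in names)
        if choice not in seen:  # skip combinations that were already drawn
            seen.add(choice)
            configs.append(dict(zip(names, choice)))
    return configs


configs = sample_configurations(SEARCH_SPACE, 100)
```

Each configuration would then be used to train one model, and the configuration with the lowest validation error would be kept.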


U-Net model architecture. Blue and gray arrows represent convolution layers, while black, yellow, and green arrows represent concatenation, maxpooling, and upsampling, respectively. The number of kernels for each convolution layer is shown in pink. A rectified linear unit (ReLU) is used after all convolution layers.
Variables used for a hyperparameter search.


4. Results
a. Model performance using different input combinations
Machine learning models are trained to find nonlinear relationships between sets of input and output variables. If convergence to a global minimum were guaranteed, then using individual channels versus channel differences would not matter. However, this is not guaranteed, so input feature engineering through taking channel differences can help a model converge to a global minimum more quickly. This study tries to gain insight into whether providing SWD as an input helps the model learn better than training with channel information only. If only the channel information is given as input, the model has to discover that the difference between the two or three channels is related to BLPW, but if SWD is given as input, the model can use SWD directly to infer BLPW.
Table 4 summarizes the model performance for the testing data (2022) against the HRRR-based BLPW with the 95% confidence interval. The confidence interval is generated through bootstrapping with 10 000 resamples. The first experiment uses channel 13 Tb and SWD, the two variables considered to be the most important. However, since MSE using Tb values from all three channels is lower than MSE using channel 13 Tb and SWD, the rest of the experiments include all three channels. Adding SWD to the three channels reduces MSE even further, which suggests that directly telling the model where to focus helps the model training and shows the importance of incorporating relational inductive biases into a machine learning model (Battaglia et al. 2018). Ancillary information that improves the accuracy of the BLPW estimates is Tskin and solar zenith angle. Since SWD is strongly affected by the temperature profile or lapse rate, information about near-surface temperature and about the time of day or SZA, which helps infer the presence of a temperature inversion that might still exist in the early morning, is important. When only one of Tskin or SZA is added as an input, MSE increases, while having both as inputs shows improvements. It seems that Tskin alone is not useful, but together with information about the time of day or SZA, it allows BLPW to be estimated better. Therefore, the minimum MSE is achieved using channels 13, 14, and 15 Tb values, SWD, Tskin, and SZA (the NOTPW model highlighted in bold in Table 4; the NOTPW model is named to facilitate comparison with subsequent experiments using TPW). Satellite zenith angle was thought to be helpful because GOES-16 shifted from the checkout position (89.5°W) to its current location (75.2°W) in 2017 and because the integrated water vapor depends on the length of the viewing path, which depends on the satellite zenith angle. However, having satellite zenith angle as an input did not improve the results, which may be because over this small domain, there is a strong inverse correlation between the viewing angle and the time-mean moisture amount.
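One common way to obtain such an interval is the percentile bootstrap, sketched below. Which quantity is resampled (for example, per-image or per-pixel squared errors) is not stated in the text, so resampling a list of per-sample errors is an assumption, as is the function name.

```python
import random


def bootstrap_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=n)  # draw n values with replacement
        means.append(sum(resample) / n)
    means.sort()
    lower_idx = max(0, round((alpha / 2) * n_resamples))
    upper_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return means[lower_idx], means[upper_idx]
```

With `values` holding the squared errors of the test samples, the returned pair would bracket the MSE with approximately 95% coverage when `alpha=0.05`.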
Model performance in terms of MSE and MAE against HRRR-based BLPW (mm² and mm, respectively) for the testing data with the 95% confidence interval. C13, C14, and C15 represent Tb values at channels 13, 14, and 15, respectively. The NOTPW model highlighted in bold is the model using channel 13, 14, and 15 Tb values, SWD, Tskin, and SZA as inputs, which shows the minimum MSE and MAE.


b. Model performance using GOES TPW as an additional input
Additional experiments are conducted with GOES TPW added to the inputs, but the model performance is evaluated separately because these experiments use different datasets. The GOES TPW product is only available from 2020 onward, and thus, 4 years of data from 2020 to 2023 are used for training, validation, and testing as shown in Table 2. Three years of data (2020, 2021, and 2023) are used for training, and the same 2022 data are used for both validation and testing so that a sufficient number of data remain for training. For these experiments, cloudy pixels are masked in the way that the TPW product masks cloudy pixels, which is slightly different from the ACM product. The results are evaluated using 2022 data, the same data used to evaluate the previous experiments without TPW in Table 4. Table 5 summarizes the model performance for three experiments. The “CTL model” is trained with the same inputs as the NOTPW model in Table 4, but it should be noted that its training data period and clear-sky mask are different from those of the NOTPW model. It is used as a control experiment for direct comparison with the “ALL model,” in which TPW is used as an additional input, to examine the impact of having TPW as an input. The “ABI model” is trained only with ABI data, including the TPW product, and does not include the HRRR model output. Since the GOES TPW product is retrieved using NWP model output such as temperature and moisture profiles, it already has some of this information encoded in the retrievals. The results in Table 5 show that adding TPW improves the model performance in general. Even the ABI model, which does not use Tskin as an input, shows improvement in MSE, probably because NWP information is embedded in the TPW. The ALL model has the minimum MSE, as expected, because it uses all the information available. Another thing to note is that all the MSE and MAE values in Table 5 are lower than those in Table 4. This can be attributed to two factors: the different data period used for training (as shown in Table 2) and the different clear-sky masks used during training and when evaluating MSE and MAE. The GOES TPW product is only retrieved in clear-sky pixels, and the way that it is masked is slightly different from the ACM product, which is used for the previous experiments. Therefore, the experiments in Table 5 are trained and evaluated based on the TPW clear-sky mask, which tends to mask a larger area but does not mask small low clouds. It appears that the difference in MSE is mainly due to the different clear-sky masks used during training.
Model performance in MSE and MAE against HRRR-based BLPW and radiosonde BLPW when GOES TPW is added as an input. The models for the three experiments are named the CTL, ALL, and ABI models.


The performance of the three models is further evaluated in Fig. 3. Figure 3a shows MSE by time of day for each experiment against the HRRR-based estimates. The model performance for all three experiments is worse in the early morning (1200–1400 UTC; 0700–0900 CDT) and toward the terminator (2300 UTC; 1800 CDT). This is expected because SWD signals become less sensitive to low-level moisture as the lapse rate decreases, and the temperature inversion that developed overnight may persist into the early morning. Both the ABI model and the ALL model show improvements in the early morning compared to the CTL model. Figure 3b, on the other hand, shows validation results against the radiosonde data, which are considered to be the most accurate observations. It shows a scatterplot of radiosonde BLPW against U-Net model BLPW at the 409 clear-sky grid points that were available during 2022. Note that the majority of the data points (392 out of 409) are collected at 1200 UTC because radiosondes are regularly launched every 12 h at 0000 and 1200 UTC, and only the 1200 UTC launch falls within the daytime period used here. Only 17 data points are collected outside of the regular launch times (1400, 1800, or 1900 UTC). Note that in Fig. 3b, HRRR-based estimates are also compared for reference. The HRRR-based estimates in green outperform the other model results. Among the U-Net models, the ALL model still performs the best in terms of MSE, but MSE and the coefficient of determination R² are very similar between the ALL and ABI models. The CTL model performs much worse than the models using GOES TPW because, as shown in Fig. 3a, the CTL model tends to perform worse during early morning, and most of the data points are taken at 1200 UTC (0700 CDT). However, when the models are validated only against the 17 data points collected in the afternoon, when the positive lapse rate has recovered, all the models perform better (MSE of 5.16, 3.17, and 3.38 and R² of 0.76, 0.86, and 0.85, respectively, for the CTL, ALL, and ABI models).


(a) MSE by the time of day for the CTL, ALL, and ABI models (purple, blue, and orange lines, respectively). (b) Comparisons between U-Net model BLPW estimates and radiosonde BLPW for each model. HRRR-based estimates (green) are also compared for reference in (b).
c. Case study results
Although MSE and MAE are generally good metrics for evaluating model performance, they do not quantify everything important for meteorological applications. For example, they do not indicate whether the model captures strong gradients, which is essential for observing the dryline. In this section, model performance is evaluated using two case studies, one with a dryline and the other with the highest MSE value, to show in what situations the model performs well or struggles. U-Net-based BLPW from the three models in Table 5 (CTL, ABI, and ALL) is used.
1) A case with a dryline: 30 May 2022
According to the surface analysis in Fig. 4a, a dryline (orange line in Fig. 4a) developed across Texas and Oklahoma on 30 May 2022, and severe storms were initiated along the northern part of the dryline. As indicated by the dewpoint temperature observations in the surface analysis (Fig. 4a), there is a large moisture gradient between the east and west sides of the dryline. However, ABI’s lowest water vapor channel (channel 10; 7.3 μm) cannot capture the large moisture gradient near the surface because its weighting function peaks in the middle atmosphere.


(a) Surface analysis from the Storm Prediction Center at 2100 UTC and (b) channel 10 Tb (low-level water vapor channel) at 2101 UTC 30 May 2022.
Figure 5 compares the progression of BLPW throughout the day from the HRRR-based product and the three U-Net model estimates. Gray colors represent cloudy pixels, for which BLPW is not retrieved. Orange dots mark the easternmost points that have a BLPW gradient larger than 5 mm (100 km)⁻¹, a threshold determined empirically in this study to represent a dryline by analogy with the specific humidity gradient threshold used to define a dryline [3 g kg⁻¹ (100 km)⁻¹; Hoch and Markowski 2005]. At 1300 UTC, when the CTL model has the highest MSE as shown in Fig. 3a, the CTL model does not capture the dryline as well as the other models or the HRRR-based BLPW, but it begins to capture the high moisture gradient better from 1500 UTC. This is due to the weak SWD signal in the early morning, which does not highlight the high gradient. Figure 6 compares SWD at 1300 and 1700 UTC. A high SWD gradient, which can be a good indicator of a dryline, is present at 1700 UTC (Fig. 6b) but is missing at 1300 UTC (Fig. 6a). However, a clearer dryline is depicted when TPW is added, as shown in Figs. 5c and 5d. In addition, the ABI model results in Fig. 5d show that ABI data alone can resolve BLPW without additional HRRR model information. As the dryline slowly moves eastward and strengthens, convection starts to develop along the northern part of the dryline, resulting in several hail and wind reports in Kansas by 2300 UTC. All three models perform well in the afternoon and correctly show high moisture gradients along the empirically determined dryline (orange dots), along which severe storms are initiated. One thing to note is that U-Net-based BLPW looks smoother than the HRRR-based BLPW and does not capture the fine-resolution features seen in it. This is likely the well-known problem that convolutional neural networks (CNNs) produce smoothed predictions as a way to hedge uncertainty (Ravuri et al. 2021).
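The empirical dryline marker can be sketched as a scan along each grid row. The version below assumes that only the east–west component of the gradient is used, that it is estimated with a centered difference on the 3-km grid, and that cloudy pixels are stored as `None`; these details and the function name are assumptions for illustration.

```python
def easternmost_dryline_index(blpw_row_mm, dx_km=3.0, threshold=5.0):
    """Index of the easternmost point whose east-west BLPW gradient magnitude
    exceeds threshold (mm per 100 km), or None if no point qualifies.

    blpw_row_mm : BLPW (mm) along one grid row, ordered west to east;
                  None marks a cloudy pixel with no retrieval.
    """
    for j in range(len(blpw_row_mm) - 2, 0, -1):  # scan from east to west
        west, east = blpw_row_mm[j - 1], blpw_row_mm[j + 1]
        if west is None or east is None:
            continue
        # Centered difference, converted from mm per km to mm per 100 km.
        gradient = abs(east - west) / (2.0 * dx_km) * 100.0
        if gradient > threshold:
            return j
    return None
```

Running this for every row of a BLPW map would give one candidate point per row, comparable to the dots drawn on each panel.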


(a) HRRR-based BLPW, (b) CTL model–based BLPW, (c) ALL model–based BLPW, and (d) ABI model–based BLPW, shown every 2 h from 1300 to 2300 UTC 30 May 2022. Gray color indicates the presence of clouds, and orange dots mark the location of the empirically determined dryline, where the moisture gradient is high.


Maps of SWD at (a) 1300 UTC and (b) 1700 UTC.
2) A case in which the HRRR-based BLPW shows the highest MSE: 16 April 2022
The case at 1200 UTC 16 April 2022 is when the HRRR model–based BLPW shows the largest error compared to radiosonde data. Figure 7 shows the HRRR-based BLPW and the BLPW retrieved from the three models. The star in Fig. 7a marks the location of a radiosonde observation in Texas, and its color represents the radiosonde-based BLPW value. The BLPW from all three models and the HRRR-based BLPW are similar in terms of the overall pattern in the domain. However, when compared to the radiosonde data (marked with the star in Fig. 7a), the retrieved BLPW as well as the HRRR-based BLPW are overestimated, with the HRRR-based BLPW having the largest difference. The radiosonde-based BLPW is 9.24 mm, while the HRRR-based, CTL model, ABI model, and ALL model values are 17.00, 11.97, 12.99, and 15.29 mm, respectively. This case shows a limitation of using the HRRR model as the truth during training. Although using HRRR-based BLPW as the truth was the best option available, the HRRR model has been shown to have biases in near-surface humidity (He et al. 2023), which can be passed down to the machine learning model. Nevertheless, in this case, the machine learning–based estimates are closer to the observation, which might suggest that even a dataset with biases or uncertainties can teach the machine learning model to discern general relationships between input and output datasets.
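To make the size of the overestimation explicit, the short Python sketch below computes the signed difference between each estimate and the radiosonde value quoted above. The variable names are illustrative only, and the numbers are taken directly from the text.

```python
# BLPW values (mm) quoted in the text for 1200 UTC 16 April 2022.
radiosonde_blpw = 9.24
estimates = {"HRRR": 17.00, "CTL": 11.97, "ABI": 12.99, "ALL": 15.29}

# Signed error: estimate minus radiosonde (positive = overestimate).
errors = {name: value - radiosonde_blpw for name, value in estimates.items()}

for name, err in errors.items():
    relative = 100.0 * err / radiosonde_blpw
    print(f"{name}: {err:+.2f} mm ({relative:+.0f}% of the radiosonde value)")
```

All four differences are positive. The HRRR-based value departs most from the radiosonde (7.76 mm) and the CTL model least (2.73 mm).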


BLPW derived from the (a) HRRR model, (b) CTL model, (c) ABI model, and (d) ALL model at 1200 UTC 16 Apr 2022. The color of the star in (a) represents the radiosonde-based BLPW value at that location.
5. Discussion
Two case studies as well as MSE and MAE evaluations show that BLPW estimates from the U-Net models are comparable to the HRRR model BLPW and radiosonde-based BLPW despite some limitations. The U-Net model–based BLPW is only available during daytime, and BLPW estimates tend to be less certain during early morning, when the lapse rate has not recovered from the temperature inversion of the night before. However, considering that the main purpose is to develop a product to monitor BLPW prior to the onset of severe storms, which are usually initiated in the daytime, nighttime data are of less interest. The main advantage of the U-Net-based BLPW over the existing BLPW product is that it can be produced in a few seconds at 3-km resolution. Although Fig. 5 only shows the predicted BLPW every 2 h, the U-Net model outputs can be produced whenever the input data are available. The technique could be applied to the CONUS sector to provide 5-min updates on the dryline for predicting rapidly developing convection along it.
In addition, although the model is trained on the HRRR grid, it can be easily applied to the raw ABI data because their spatial resolutions are very similar (ABI and HRRR data have 2- and 3-km resolution, respectively). Figure 8 shows the results of applying the ABI model to the 2-km ABI data that are not interpolated onto the 3-km HRRR grid, for the same dryline case as in Fig. 5. The results are only plotted from 1300 to 1900 UTC since those are the times prior to convective initiation. The BLPW estimates in Figs. 8 and 5d, as well as their gradients, are very similar, which implies that the model does not need to be retrained or modified to be applied to a slightly different grid. However, it is important to note that the model results outside of the training domain should be treated with caution because they are based on data that the model has not seen during training.


BLPW derived by applying the ABI model to the 2-km ABI data at (a) 1300 UTC, (b) 1500 UTC, (c) 1700 UTC, and (d) 1900 UTC 30 May 2022. This is the same dryline case as in Fig. 5d. As in Fig. 5, gray colors indicate the presence of clouds, and orange dots mark the location of the empirically determined dryline, where the moisture gradient is high.
One of the main factors affecting the model performance is the clear sky mask used for training. The clear sky mask is required for training because under a cloudy scene, SWD signals become less relevant to low-level moisture and instead depend on cloud type, and it is almost impossible to derive BLPW since the IR signal is dominated by the cloud above. However, the clear sky mask itself has its own uncertainty, much of it arising from low clouds (Jiménez 2020), and accordingly, the quality and the spatial resolution of the clear sky mask used for training become important factors for the model performance. The GOES TPW product tends to mask cloudy pixels at a coarser resolution, and oftentimes, low cumulus clouds, which are usually small, are not classified as cloudy pixels. Therefore, only the thick clouds that exhibit clear radiance signatures, together with their surrounding points, are masked in the TPW product. A model trained with such a coarse-resolution clear sky mask that ignores some clouds seems to outperform one trained with a fine-resolution clear sky mask. The MSE and R2 of the NOTPW model against radiosonde data are 6.05 and 0.57, respectively, and these values are worse than those of the CTL model. The reason might be that the moisture field does not vary much under such small low clouds, and thus, BLPW can be inferred from the neighboring clear sky pixels through the U-Net model, which learns spatial features and has a theoretical receptive field of 600 km. In addition, such clouds appear as a large group of speckles in the image, which can hinder the learning of spatial features.
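For readers who want to reproduce this kind of comparison, the following Python sketch shows one common way to compute MSE and R2 between collocated radiosonde and model BLPW values. It assumes that R2 is the coefficient of determination, 1 − SS_res/SS_tot. The text does not state which definition is used (a squared correlation coefficient is another possibility), and the function name and sample values are hypothetical.

```python
def mse_and_r2(observed, predicted):
    """Return (MSE, R2) for paired observed and predicted values.

    R2 is computed here as the coefficient of determination,
    1 - SS_res / SS_tot, which is an assumed definition.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    # Sum of squared residuals between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares of the observations about their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return ss_res / n, 1.0 - ss_res / ss_tot

# Hypothetical collocated BLPW values in mm (not data from this study).
radiosonde = [10.0, 12.0, 14.0, 16.0]
model = [11.0, 11.0, 15.0, 15.0]
mse, r2 = mse_and_r2(radiosonde, model)
print(f"MSE = {mse:.2f} mm^2, R2 = {r2:.2f}")
```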
One case that shows a difference between using the fine-resolution ACM product and the TPW-based clear sky mask is 3 May 2022, shown in Fig. 9. This is the case in which the NOTPW model shows its maximum MSE, and its results are compared with the other model results in Fig. 10. Channel 2 reflectance and channel 13 Tb are presented in Figs. 9a and 9b to show clouds over the scene. Note that channel 2 reflectance is plotted at its original spatial resolution of 0.5 km, rather than at 3-km resolution as in the other figures, to better represent low clouds in the scene. SWD is shown in Fig. 9c, and the corresponding HRRR-based BLPW is shown in Fig. 9d. Figures 10a–d show BLPW estimates from the NOTPW, CTL, ALL, and ABI models. Note that the cloudy pixels (gray colors) in Figs. 10a and 10b differ because different clear sky masks are used. Some of the low clouds that are dominant over Texas are not classified as cloudy pixels in the CTL, ALL, and ABI model results because the TPW-based clear sky mask is used. Based on low SWD values over Texas, the NOTPW model predicts low BLPW, while the other models correctly predict BLPW values close to the HRRR BLPW. This case study result confirms that including cloudy pixels during training helps predict BLPW in the presence of low clouds.


(a) ABI channel 2 reflectance, (b) channel 13 Tb, (c) SWD, and (d) HRRR-based BLPW for the case at 1600 UTC 3 May 2022.


(a) NOTPW model–based BLPW, (b) CTL model–based BLPW, (c) ALL model–based BLPW, and (d) ABI model–based BLPW for the case at 1600 UTC 3 May 2022.
6. Summary and conclusions
Since the dryline, by definition, forms in a region of strong moisture gradients, accurate observation of low-level moisture is essential to locate the dryline and predict convective initiation in advance. To monitor the evolution of low-level moisture and drylines, data with both high spatial and high temporal resolution are required, but there is no observation-based product with such resolution that focuses on low-level moisture. This study develops a machine learning model to retrieve BLPW from GOES-16 so that the output has the high spatial and temporal resolutions needed to monitor BLPW. Among different combinations of input variables, channel 13, 14, and 15 Tb values, SWD, Tskin, and SZA are found to be the best combination, and adding the GOES TPW product improves the model performance even more. Because NWP information is already embedded in the GOES TPW product, surface temperature information from the HRRR model is no longer needed when TPW is included, and the model performs well solely using ABI data. Another thing to highlight from the results using different input data is that even if the information content is the same, the model results can be improved by providing SWD in addition to the Tb values, which emphasizes the direct relationship between SWD and BLPW. The validation results show that the model performance depends on the time of day and on how many cloudy pixels are excluded in the training. Although cloudy pixels are set to zero in the training, they can still affect the model performance depending on the accuracy of the clear sky mask, since the U-Net model learns spatial features.
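The input combination summarized above can be sketched as a simple array-stacking step. The Python example below is an illustration under stated assumptions, not the preprocessing code of this study: the function name and channel ordering are hypothetical, the SWD is assumed to be the channel 13 minus channel 15 brightness temperature difference, and any normalization applied before training is omitted. Setting cloudy pixels to zero follows the description in the text.

```python
import numpy as np

def build_input_stack(tb13, tb14, tb15, tskin, sza, clear_mask):
    """Stack per-pixel predictors into a (channels, rows, cols) array.

    tb13, tb14, tb15: ABI brightness temperatures (K)
    tskin: skin temperature (K); sza: solar zenith angle (degrees)
    clear_mask: boolean array, True where the pixel is clear
    """
    # Assumed SWD definition: channel 13 Tb minus channel 15 Tb.
    swd = tb13 - tb15
    stack = np.stack([tb13, tb14, tb15, swd, tskin, sza]).astype(np.float32)
    # Cloudy pixels are set to zero in every channel.
    stack[:, ~clear_mask] = 0.0
    return stack

# Tiny synthetic 2 x 2 scene with one cloudy pixel (hypothetical values).
shape = (2, 2)
clear = np.array([[True, False], [True, True]])
inputs = build_input_stack(
    tb13=np.full(shape, 290.0),
    tb14=np.full(shape, 289.0),
    tb15=np.full(shape, 287.0),
    tskin=np.full(shape, 300.0),
    sza=np.full(shape, 30.0),
    clear_mask=clear,
)
print(inputs.shape)
```

Passing SWD as its own channel, even though it is a linear combination of two other channels, mirrors the point made above: the information content is unchanged, but the relationship between SWD and BLPW is presented to the network directly.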
While it is evident that the longwave infrared bands used for the SWD are not ideal for moisture retrieval, they do provide useful information for capturing strong moisture gradients at high spatial and temporal resolutions, which ground-based measurements cannot provide in many areas. The next-generation Geostationary Extended Observations (GeoXO) series is planned to add two new bands that are sensitive to low-level moisture (0.91 and 5.15 μm) (Lindsey et al. 2024). These bands will provide powerful new predictors for ML models to retrieve low-level moisture from the GeoXO Imager (GXI).
Acknowledgments.
We gratefully acknowledge support from NOAA GOES-R Program under Grant NA19OAR4320073.
Data availability statement.
HRRR model data are accessed each day from ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod/. GOES-16 data including ABI brightness temperatures, clear sky mask product, and TPW product were accessed on 20 October 2023 from https://registry.opendata.aws/noaa-goes. Radiosonde data were accessed on 14 June 2023 from ftp://ftp.ncdc.noaa.gov/pub/data/igra/data/data-por.
REFERENCES
Akter, N., and K. Tsuboki, 2017: Climatology of the premonsoon Indian dryline. Int. J. Climatol., 37, 3991–3998, https://doi.org/10.1002/joc.4968.
Battaglia, P. W., and Coauthors, 2018: Relational inductive biases, deep learning, and graph networks. arXiv, 1806.01261v3, https://doi.org/10.48550/arXiv.1806.01261.
Bechis, H., P. Salio, and J. J. Ruiz, 2020: Drylines in Argentina: Synoptic climatology and processes leading to their genesis. Mon. Wea. Rev., 148, 111–129, https://doi.org/10.1175/MWR-D-19-0050.1.
Bolton, D., 1980: The computation of equivalent potential temperature. Mon. Wea. Rev., 108, 1046–1053, https://doi.org/10.1175/1520-0493(1980)108<1046:TCOEPT>2.0.CO;2.
Boukabara, S.-A., and Coauthors, 2011: MiRS: An all-weather 1DVAR satellite data assimilation and retrieval system. IEEE Trans. Geosci. Remote Sens., 49, 3249–3272, https://doi.org/10.1109/TGRS.2011.2158438.
Browell, E. V., T. D. Wilkerson, and T. J. McIlrath, 1979: Water vapor differential absorption lidar development and evaluation. Appl. Opt., 18, 3474–3483, https://doi.org/10.1364/AO.18.003474.
Clark, A. J., A. MacKenzie, A. McGovern, V. Lakshmanan, and R. A. Brown, 2015: An automated, multiparameter dryline identification algorithm. Wea. Forecasting, 30, 1781–1794, https://doi.org/10.1175/WAF-D-15-0070.1.
Dostalek, J. F., L. D. Grasso, Y.-J. Noh, T.-C. Wu, J. W. Zeitler, H. G. Weinman, A. E. Cohen, and D. T. Lindsey, 2021: Using GOES ABI split-window radiances to retrieve daytime low-level water vapor for convective forecasting. Electron. J. Severe Storms Meteor., 16 (2), https://doi.org/10.55599/ejssm.v16i2.79.
Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Wea. Forecasting, 37, 1371–1395, https://doi.org/10.1175/WAF-D-21-0151.1.
Durre, I., R. S. Vose, and D. B. Wuertz, 2006: Overview of the integrated global radiosonde archive. J. Climate, 19, 53–68, https://doi.org/10.1175/JCLI3594.1.
Durre, I., X. Yin, R. S. Vose, S. Applequist, and J. Arnfield, 2018: Enhancing the data coverage in the integrated global radiosonde archive. J. Atmos. Oceanic Technol., 35, 1753–1770, https://doi.org/10.1175/JTECH-D-17-0223.1.
ECMWF, 2023: IFS Documentation CY48R1 - Part I: Observations. ECMWF Tech. Rep., 88 pp., https://www.ecmwf.int/en/elibrary/81367-ifs-documentation-cy48r1-part-i-observations.
Forsythe, J. M., S. Q. Kidder, K. K. Fuell, A. Leroy, G. J. Jedlovec, and A. S. Jones, 2015: A multisensor, blended, layered water vapor product for weather analysis and forecasting. J. Oper. Meteor., 3, 41–58, https://doi.org/10.15191/nwajom.2015.0305.
Gitro, C. M., and Coauthors, 2018: Using the multisensor advected layered precipitable water product in the operational forecast environment. J. Oper. Meteor., 6, 59–73, https://doi.org/10.15191/nwajom.2018.0606.
Goodman, S. J., T. J. Schmit, J. Daniels, and R. J. Redmon, 2019: The GOES-R Series: A New Generation of Geostationary Environmental Satellites. Elsevier, 306 pp.
Grasso, L. D., 2000: A numerical simulation of dryline sensitivity to soil moisture. Mon. Wea. Rev., 128, 2816–2834, https://doi.org/10.1175/1520-0493(2000)128<2816:ANSODS>2.0.CO;2.
Haynes, K., J. Stock, J. Dostalek, C. Anderson, and I. Ebert-Uphoff, 2024: Exploring the use of machine learning to improve vertical profiles of temperature and moisture. Artif. Intell. Earth Syst., 3, e220090, https://doi.org/10.1175/AIES-D-22-0090.1.
He, S., D. D. Turner, S. G. Benjamin, J. B. Olson, T. G. Smirnova, and T. Meyers, 2023: Evaluation of the near-surface variables in the HRRR weather model using observations from the ARM SGP site. J. Appl. Meteor. Climatol., 62, 769–780, https://doi.org/10.1175/JAMC-D-23-0003.1.
Hilburn, K., 2020: Inferring airmass properties from GOES-R ABI observations. 2020 Fall Meeting, Online, Amer. Geophys. Union, Abstract A008-0009, https://ui.adsabs.harvard.edu/abs/2020AGUFMA008.0009H/abstract.
Hoch, J., and P. Markowski, 2005: A climatology of springtime dryline position in the U.S. Great Plains region. J. Climate, 18, 2132–2137, https://doi.org/10.1175/JCLI3392.1.
Huang, H., and Coauthors, 2020: UNet 3+: A full-scale connected UNet for medical image segmentation. ICASSP 2020–2020 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, Institute of Electrical and Electronics Engineers, 1055–1059, https://doi.org/10.1109/ICASSP40776.2020.9053405.
James, E. P., and S. G. Benjamin, 2017: Observation system experiments with the hourly updating Rapid Refresh model using GSI hybrid ensemble–variational data assimilation. Mon. Wea. Rev., 145, 2897–2918, https://doi.org/10.1175/MWR-D-16-0398.1.
Jiménez, P. A., 2020: Assessment of the GOES-16 clear sky mask product over the contiguous USA using CALIPSO retrievals. Remote Sens., 12, 1630, https://doi.org/10.3390/rs12101630.
Kidder, S. Q., and A. S. Jones, 2007: A blended satellite total precipitable water product for operational forecasting. J. Atmos. Oceanic Technol., 24, 74–81, https://doi.org/10.1175/JTECH1960.1.
Kursinski, E. R., G. A. Hajj, J. T. Schofield, R. P. Linfield, and K. R. Hardy, 1997: Observing Earth’s atmosphere with radio occultation measurements using the Global Positioning System. J. Geophys. Res., 102, 23 429–23 465, https://doi.org/10.1029/97JD01569.
Lensky, I. M., and D. Rosenfeld, 2008: Clouds-Aerosols-Precipitation Satellite Analysis Tool (CAPSAT). Atmos. Chem. Phys., 8, 6739–6753, https://doi.org/10.5194/acp-8-6739-2008.
Li, J., T. J. Schmit, X. Jin, G. Martin, and Z. Li, 2019: GOES-R Advanced Baseline Imager (ABI) algorithm theoretical basis document for legacy atmospheric moisture profile, legacy atmospheric temperature profile, total precipitable water, and derived atmospheric stability indices. NOAA NESDIS Center for Satellite Applications and Research Tech. Doc., 110 pp., https://www.star.nesdis.noaa.gov/goesr/documents/ATBDs/Enterprise/ATBD_Enterprise_Soundings_Legacy_Atmospheric_Profiles_v3.1_2019-11-01.pdf.
Lindsey, D. T., L. Grasso, J. F. Dostalek, and J. Kerkmann, 2014: Use of the GOES-R split-window difference to diagnose deepening low-level water vapor. J. Appl. Meteor. Climatol., 53, 2005–2016, https://doi.org/10.1175/JAMC-D-14-0010.1.
Lindsey, D. T., D. Bikos, and L. Grasso, 2018: Using the GOES-16 split window difference to detect a boundary prior to cloud formation. Bull. Amer. Meteor. Soc., 99, 1541–1544, https://doi.org/10.1175/BAMS-D-17-0141.1.
Lindsey, D. T., and Coauthors, 2024: GeoXO: NOAA’s future geostationary satellite system. Bull. Amer. Meteor. Soc., 105, E660–E679, https://doi.org/10.1175/BAMS-D-23-0048.1.
Miller, S. D., R. L. Bankert, J. E. Solbrig, J. M. Forsythe, Y.-J. Noh, and L. D. Grasso, 2017: A dynamic enhancement with background reduction algorithm: Overview and application to satellite-based dust storm detection. J. Geophys. Res. Atmos., 122, 12 938–12 959, https://doi.org/10.1002/2017JD027365.
Mitchell, T., and D. M. Schultz, 2020: A synoptic climatology of spring dryline convection in the southern Great Plains. Wea. Forecasting, 35, 1561–1582, https://doi.org/10.1175/WAF-D-19-0160.1.
Oktay, O., and Coauthors, 2018: Attention U-Net: Learning where to look for the pancreas. arXiv, 1804.03999v3, https://doi.org/10.48550/arXiv.1804.03999.
Owen, J., 1966: A study of thunderstorm formation along dry lines. J. Appl. Meteor., 5, 58–63, https://doi.org/10.1175/1520-0450(1966)005<0058:ASOTFA>2.0.CO;2.
Ravuri, S., and Coauthors, 2021: Skilful precipitation nowcasting using deep generative models of radar. Nature, 597, 672–677, https://doi.org/10.1038/s41586-021-03854-z.
Ronneberger, O., P. Fischer, and T. Brox, 2015: U-Net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, N. Navabe et al., Eds., Lecture Notes in Computer Science, Vol. 9351, Springer, 234–241.
Schmit, T. J., and Coauthors, 2019: Legacy atmospheric profiles and derived products from GOES-16: Validation and applications. Earth Space Sci., 6, 1730–1748, https://doi.org/10.1029/2019EA000729.
van Schalkwyk, L., R. C. Blamey, M. Gijben, and C. J. C. Reason, 2023: A climatology of dryline-related convection on the western plateau of subtropical southern Africa. J. Geophys. Res. Atmos., 128, e2023JD038, https://doi.org/10.1029/2023JD038966.
Wentz, F. J., 1997: A well-calibrated ocean algorithm for Special Sensor Microwave/Imager. J. Geophys. Res., 102, 8703–8718, https://doi.org/10.1029/96JC01751.
Weygandt, S. S., S. G. Benjamin, M. Hu, C. R. Alexander, T. G. Smirnova, and E. P. James, 2022: Radar reflectivity–based model initialization using specified Latent Heating (Radar-LHI) within a diabatic digital filter or pre-forecast integration. Wea. Forecasting, 37, 1419–1434, https://doi.org/10.1175/WAF-D-21-0142.1.
Whiteman, D. N., S. H. Melfi, and R. A. Ferrare, 1992: Raman lidar system for the measurement of water vapor and aerosols in the Earth’s atmosphere. Appl. Opt., 31, 3068–3082, https://doi.org/10.1364/AO.31.003068.
Wimmers, A. J., and C. S. Velden, 2011: Seamless advective blending of total precipitable water retrievals from polar-orbiting satellites. J. Appl. Meteor. Climatol., 50, 1024–1036, https://doi.org/10.1175/2010JAMC2589.1.
Zhou, Z., M. M. Rahman Siddiquee, N. Tajbakhsh, and J. Liang, 2018: UNet++: A nested U-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, D. Stoyanov et al., Eds., Lecture Notes in Computer Science, Vol. 11045, Springer, 3–11.
Ziegler, C. L., T. J. Lee, and R. A. Pielke Sr., 1997: Convective initiation at the dryline: A modeling study. Mon. Wea. Rev., 125, 1001–1026, https://doi.org/10.1175/1520-0493(1997)125<1001:CIATDA>2.0.CO;2.
