Gas consumption zero for 5 months. How to proceed with regression?

Q: I have to do a regression for gas consumption in a mall in Romania under the whole facility approach.
I have 12 months of data for total gas consumption and tenant consumption. I subtracted the tenant consumption from the total to get the common-area gas consumption. However, for May to September the result of the subtraction is zero. So I have 12 months of common-area gas consumption, of which 5 months are zero. In this scenario, should the regression be done on the remaining 7 months of data, or should I use the whole 12 months? (ASHRAE says there should be no missing data in the baseline period, but does a calculated zero consumption count as missing data?)

Also, an HDD base of 12.5 gives me a CV of 0.21. Is that acceptable? My gas savings are 15%.

Kindly advise how to proceed. I have attached the file.

A: First, the calculated zero consumption should not be treated as missing data, since it represents no usage. Second, the situation you describe does occur for certain building types and climate zones. For example, we saw it when looking at a lot of gas data for a project in California on why natural gas savings projects are not participating in NMEC programs (which use Option C). The issue is exactly this: low gas use during the warmer months causes the CV(RMSE) to blow up, because the average energy use of the baseline year is in the denominator of the CV(RMSE).
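To illustrate the denominator effect, here is a minimal Python sketch. It assumes the common form of CV(RMSE), the RMSE with n - p degrees of freedom divided by the mean of the measured use; check the exact definition your protocol requires. All numbers are made up for illustration: two 12-month profiles get identical residuals, so their RMSE is the same and only the mean differs.

```python
import math

def cv_rmse(actual, predicted, n_params=2):
    """RMSE divided by mean actual use (n - n_params degrees of freedom assumed)."""
    n = len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    rmse = math.sqrt(sse / (n - n_params))
    return rmse / (sum(actual) / n)

# Hypothetical monthly gas use with five zero months (May to Sep).
with_zeros = [100, 90, 70, 40, 0, 0, 0, 0, 0, 30, 60, 90]
# Same profile plus a constant 40-unit baseload, so the mean doubles.
with_baseload = [v + 40 for v in with_zeros]

# Give both models identical residuals of +/-5 units so the RMSE is the same.
residuals = [5 if i % 2 == 0 else -5 for i in range(12)]
pred_zeros = [a - r for a, r in zip(with_zeros, residuals)]
pred_baseload = [a - r for a, r in zip(with_baseload, residuals)]

cv_with_zeros = cv_rmse(with_zeros, pred_zeros)
cv_with_baseload = cv_rmse(with_baseload, pred_baseload)
print(round(cv_with_zeros, 3), round(cv_with_baseload, 3))
```

The absolute model error is identical in both cases, yet the profile with the zero months reports twice the CV(RMSE), purely because its mean use is half as large.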

To get around this, we are recommending that the low-use period data be separated from the high-use period data, and that a separate baseline model be developed for each. It would be useful to define an average temperature that separates low-use from high-use periods, e.g. when the average monthly temperature is above X, there is low usage. This helps when predicting adjusted baseline use under reporting-period conditions, because it tells you when to apply the low-use model and when to apply the high-use model. The procedure is described next:

Make a model using the low-use period only; calculate the CV(RMSE), NMBE, R2, etc.; and check the t-statistics of the model coefficients to make sure you have a good model. Repeat for the high-use period. (Don’t rely too much on R2 as a criterion, though.)
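The split-and-fit procedure above could be sketched in plain Python as follows. This is only an illustration under stated assumptions: the monthly temperatures and gas values are invented, the 15-degree threshold is a hypothetical stand-in for X, the warmer months are taken as the low-use period, and each model is a simple straight line against average temperature with n - 2 degrees of freedom. The helper `fit_line` is my own name, not part of any package. Note that if the low-use months are all exactly zero, there is no variation to fit and this helper would divide by zero.

```python
import math

def fit_line(x, y):
    """Least-squares fit y = a + b*x with basic statistics.
    Assumes y is not constant and has a nonzero mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in resid)
    sst = sum((yi - my) ** 2 for yi in y)
    dof = n - 2                      # two fitted coefficients
    rmse = math.sqrt(sse / dof)
    se_b = rmse / math.sqrt(sxx)     # standard error of the slope
    return {
        "intercept": a,
        "slope": b,
        "cv_rmse": rmse / my,
        "nmbe": sum(resid) / (dof * my),
        "r2": 1 - sse / sst,
        "t_slope": b / se_b if se_b > 0 else float("inf"),
    }

# Hypothetical monthly average temperature and common-area gas use, Jan to Dec.
temps = [-1, 1, 6, 11, 17, 20, 22, 21, 17, 11, 5, 0]
gas = [310, 285, 210, 130, 12, 9, 8, 8, 11, 125, 220, 300]

threshold = 15.0  # hypothetical "X": months at or above it count as low use

high = [(t, g) for t, g in zip(temps, gas) if t < threshold]
low = [(t, g) for t, g in zip(temps, gas) if t >= threshold]

high_model = fit_line([t for t, _ in high], [g for _, g in high])
low_model = fit_line([t for t, _ in low], [g for _, g in low])

for name, model in (("high use", high_model), ("low use", low_model)):
    print(name, {k: round(v, 3) for k, v in model.items()})
```

When predicting adjusted baseline use, the same threshold would decide which of the two models applies to each reporting-period month.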

Some modeling algorithms are piecewise linear, such as ASHRAE’s change-point models. The 3-parameter model has two segments, a sloped portion and a flat portion. I would expect the software packages to find the temperature (or HDD) at which these segments meet. Software packages that fit change-point models include:

Energy Explorer from Prof. Kelly Kissock https://academic.udayton.edu/kissock/http/research/energysoftware.htm

Energy Charting and Metrics tool (an Excel add-in, free) https://sbwconsulting.com/ecam/ (probably the best option for Excel users)

nmecr (R code implementation of multiple modeling algorithms) https://github.com/kW-Labs/nmecr (also free and for users of R software)
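For readers who want to see the idea behind the 3-parameter heating change-point model mentioned above, here is a rough Python sketch. It is not how any of the listed packages implement it; it simply tries a grid of candidate change points, fits a straight line to max(0, change point - temperature) for each, and keeps the candidate with the smallest sum of squared errors. The function name `fit_3p_heating`, the candidate grid, and the data are all hypothetical; the data were generated from a base of 0, a slope of 20, and a change point of 15 so the result can be checked.

```python
def fit_3p_heating(temps, energy, candidates):
    """Grid-search fit of E = base + slope * max(0, change_point - T)."""
    n = len(temps)
    best = None
    for cp in candidates:
        x = [max(0.0, cp - t) for t in temps]
        mx, my = sum(x) / n, sum(energy) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0:
            continue  # every month lies above this change point; slope undefined
        slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, energy)) / sxx
        base = my - slope * mx
        sse = sum((yi - (base + slope * xi)) ** 2 for xi, yi in zip(x, energy))
        if best is None or sse < best["sse"]:
            best = {"change_point": cp, "base": base, "slope": slope, "sse": sse}
    return best

# Hypothetical monthly average temperatures, Jan to Dec.
temps = [-1, 1, 6, 11, 17, 20, 22, 21, 17, 11, 5, 0]
# Synthetic gas use: zero above 15 degrees, 20 units per degree below it.
gas = [20.0 * max(0.0, 15.0 - t) for t in temps]

# Candidate change points from 10.0 to 20.0 in steps of 0.5.
candidates = [10 + 0.5 * k for k in range(21)]

best = fit_3p_heating(temps, gas, candidates)
print(best["change_point"], round(best["base"], 3), round(best["slope"], 3))
```

A single model of this shape covers the flat zero-use months and the sloped heating months together, which is an alternative to maintaining two separate models.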