DIY Zoning: Fault Tolerance


Equipment controlled by DZ is extremely expensive. Failures are often catastrophic and cause damage running into four figures even under normal circumstances, when your equipment is unmodified and your warranty company or the manufacturer takes care of fixing the problem.

We are treading on thin ice here by using an "unapproved" system, so we want to eliminate every possible complication. This includes early warning systems, so that we don't miss a problem brewing, and safety shutoffs that prevent damage to the whole system should a part of it fail.


Filter

Although the air filter doesn't seem to be related to fault tolerance at all, a neglected filter is a primary source of serious problems, all of which come from restricted airflow. The most severe is compressor slugging, which almost invariably shortens the compressor's lifespan or kills it outright (unless it is a scroll-type compressor).

Taking care of the filter is relatively easy. Two pressure sensors are required, one before the filter and the other after it. When the pressure difference across the filter becomes significant enough, a warning is issued. In paranoid mode, we could simply shut down the system until the filter is replaced.
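
The check described above can be sketched roughly as follows. This is only an illustration, not the DZ implementation: the function name, the `FilterStatus` values and the 125 Pa threshold are assumptions, and a real threshold depends on the particular filter and blower.

```python
from enum import Enum


class FilterStatus(Enum):
    OK = "ok"
    WARNING = "warning"
    SHUTDOWN = "shutdown"


def check_filter(p_before: float, p_after: float,
                 warn_drop: float = 125.0,
                 paranoid: bool = False) -> FilterStatus:
    """Classify the filter state from the pressure on both sides of it.

    Pressures are in pascals. warn_drop is an assumed threshold,
    not a value taken from DZ.
    """
    drop = p_before - p_after  # pressure lost across the filter
    if drop < warn_drop:
        return FilterStatus.OK
    # A clogged filter: warn, or shut down when running in paranoid mode.
    return FilterStatus.SHUTDOWN if paranoid else FilterStatus.WARNING
```

For example, `check_filter(200.0, 50.0)` yields a warning, and the same readings with `paranoid=True` request a shutdown.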

Indoor Fan

A complete failure of the indoor fan (a.k.a. air handler) is easy to detect with two pressure sensors, one before the blower and the other after it. Identical pressure readings (a zero differential) while the fan is supposed to be running indicate a complete failure.
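
A minimal sketch of this test, assuming the controller knows whether it has commanded the fan on. The function name and the `min_rise` tolerance (which allows for sensor noise around a true zero) are hypothetical.

```python
def blower_failed(p_inlet: float, p_outlet: float,
                  fan_commanded_on: bool,
                  min_rise: float = 5.0) -> bool:
    """Return True if the blower appears to be completely dead.

    Pressures are in pascals. min_rise is an assumed tolerance: a
    differential smaller than this is treated as zero.
    """
    if not fan_commanded_on:
        # No differential is expected while the fan is off.
        return False
    return abs(p_outlet - p_inlet) < min_rise
```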

A partial failure is more difficult to detect. Variable-speed blowers add further complications: the pressure differential changes quite gradually, and some assistance from the PID controllers will be required.


Compressor

A complete failure of the compressor will manifest itself as a drastically reduced air temperature differential across the indoor coil or, alternatively, a drastically reduced refrigerant temperature differential across the indoor coil.

A partial failure of the compressor (including improper charge) will result in a temperature difference between the incoming and outgoing refrigerant lines that is either too low or too high. A temperature differential that is too high is also a sign of restricted airflow, which can be acceptable for a zoned forced-air system.
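
The classification above might look like the sketch below. All three thresholds are placeholders chosen for illustration, in degrees Celsius; they are not values from DZ and would have to be established for the actual unit. The names are hypothetical as well.

```python
from enum import Enum


class CoilStatus(Enum):
    FAILED = "complete failure suspected"
    LOW = "differential too low"
    NORMAL = "normal"
    HIGH = "differential too high"


def classify_coil_differential(t_in: float, t_out: float,
                               dead_band: float = 1.0,
                               low: float = 8.0,
                               high: float = 14.0) -> CoilStatus:
    """Classify the temperature differential across the indoor coil.

    Meant to be called only while the compressor is commanded on.
    dead_band, low and high are assumed thresholds.
    """
    diff = abs(t_in - t_out)
    if diff < dead_band:
        return CoilStatus.FAILED   # practically no heat transfer
    if diff < low:
        return CoilStatus.LOW      # possible partial failure or bad charge
    if diff > high:
        return CoilStatus.HIGH     # partial failure or restricted airflow
    return CoilStatus.NORMAL
```

Because a high differential may simply reflect the restricted airflow of a zoned system, a `HIGH` result is better treated as a hint to look further than as proof of a compressor fault.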

Another, more expensive and intrusive, way of monitoring compressor health is to introduce pressure sensors into the refrigerant line, on the suction and discharge sides.


Dampers

Currently, the only way to detect a damper failure is to check the status of the I/O operation on the damper driver. If the operation fails, the damper is considered inoperable (and usually more than one damper will be affected, since a typical controller drives anywhere from 4 to 16 dampers).

When such a condition is detected, the last known state of the failed damper is recalled, and the balancing of the part of the system that is still operable is adjusted accordingly. If all the dampers have failed, the system analyzes whether the remaining damper opening is enough to keep the system running; if it is, operation continues in non-zoned mode. If the remaining damper opening is insufficient for more or less normal operation, the system is shut down.
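
The decision logic can be sketched as follows. This is an illustration under stated assumptions rather than the actual DZ code: the `Damper` class, the mode names and the `min_total_opening` threshold (expressed in "fully open damper" equivalents) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Damper:
    name: str
    position: float       # last known opening, 0.0 (closed) to 1.0 (open)
    failed: bool = False  # set when the driver I/O operation fails


def plan_after_failure(dampers: list[Damper],
                       min_total_opening: float = 1.5) -> str:
    """Pick an operating mode given the current damper failures."""
    if not any(d.failed for d in dampers):
        return "zoned"
    if not all(d.failed for d in dampers):
        # Failed dampers stay at their last known position; the
        # remaining ones are rebalanced around them.
        return "zoned_degraded"
    # Every damper is stuck: is there enough opening left to run at all?
    total = sum(d.position for d in dampers)
    return "non_zoned" if total >= min_total_opening else "shutdown"
```

For instance, two stuck dampers at 1.0 and 0.8 leave enough opening for non-zoned operation under the assumed threshold, while two stuck at 0.2 and 0.3 lead to a shutdown.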


Sensors

In the existing implementation, sensors may fail one by one or all at once, intermittently or permanently, due to the nature of the 1-Wire® network.

These are known causes of sensor failure:

  • Water (rain) getting into a sensor assembly causes a failure to complete the temperature conversion. Insufficient power has the same effect;
  • Lightning may cause a temporary malfunction of the whole network;
  • Electrostatic discharge may cause devices to temporarily depart or permanently fail;
  • A short circuit in the network will cause all the devices to fail indefinitely.

Since the exact cause cannot be determined, it is not possible to predict whether a failure is temporary or permanent. Therefore, all failures are treated as permanent, with the corresponding changes in behavior. However, the user is not notified for a short while (one to five minutes) in the hope that the failure is temporary.
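
One way to express this policy is sketched below: the failure takes effect immediately, while the notification waits for a grace period. The class and method names are hypothetical, and the 180-second default is simply one value within the one-to-five-minute range mentioned above.

```python
class SensorFailureTracker:
    """Treat failures as permanent at once, but delay user notification."""

    def __init__(self, grace_seconds: float = 180.0):
        self.grace_seconds = grace_seconds
        self._failed_since: dict[str, float] = {}

    def report(self, sensor_id: str, ok: bool, now: float) -> bool:
        """Record a read attempt; return True when a notification is due."""
        if ok:
            # The sensor came back, so the failure was temporary after all.
            self._failed_since.pop(sensor_id, None)
            return False
        started = self._failed_since.setdefault(sensor_id, now)
        return now - started >= self.grace_seconds

    def is_failed(self, sensor_id: str) -> bool:
        """True from the first failed read, without any grace period."""
        return sensor_id in self._failed_since
```

Passing `now` in explicitly (for example from `time.monotonic()`) keeps the class easy to test and independent of wall-clock adjustments.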

A temperature sensor failure may occur in several locations, each with its own mitigation strategy:

  • A zone sensor failure. In this case, the zone is declared disabled, but its damper is opened completely.
  • A compressor safety shutoff sensor failure. Nothing can be done about this, so the user is simply notified about the failure after a "good faith" timeout expires.

Pattern Analysis

There are failures and inadequacies that cannot be easily determined by a dedicated algorithm, but become obvious as soon as we pay attention to trends. A few examples:

  • A temperature sensor failure can be declared if a sensor reading doesn't change for an extended period of time: readings are always noisy, so a perfectly constant value indicates a failure.
  • A zone can be declared deficient, or a damper failed, if the HVAC unit is on for an extended period of time but the temperature in the zone doesn't change in the desired direction. One cause may be a damper failure; another, inadequate ductwork; yet another, an open window, or a lot of people or equipment in the room preventing the HVAC unit from doing its job satisfactorily.
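
Both examples can be sketched in a few lines. The function names, the window size and the `min_progress` threshold are assumptions made for illustration; a practical implementation would tune them against real data.

```python
def sensor_stuck(readings: list[float], min_samples: int = 30,
                 epsilon: float = 0.0) -> bool:
    """True if the last min_samples readings show no noise at all."""
    if len(readings) < min_samples:
        return False  # not enough history to judge
    window = readings[-min_samples:]
    return max(window) - min(window) <= epsilon


def zone_deficient(temps: list[float], heating: bool,
                   min_progress: float = 0.2) -> bool:
    """True if the zone temperature failed to move in the desired direction.

    temps holds readings taken while the HVAC unit was running for an
    extended period. min_progress (degrees) is an assumed threshold.
    """
    if len(temps) < 2:
        return False
    change = temps[-1] - temps[0]
    progress = change if heating else -change
    return progress < min_progress
```

Note that `zone_deficient` only reports the symptom. Telling a failed damper from an open window or an overloaded room still requires further evidence or a human.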

In any case, pattern analysis makes it possible to uncover problems that are not otherwise obvious.