Friday, October 21, 2016

An example in refactoring the CEA

This post is the result of a refactoring effort between Shanshan Hsieh and I for the City Energy Analyst, particularly on the pull request #371.

Compare these two snippets of code:

Original

tHC_corr = [0, 0]
delta_ctrl = [0, 0]

# emission system room temperature control type
if control_system == 'T1':
    delta_ctrl = [2.5, -2.5]
elif control_system == 'T2':
    delta_ctrl = [1.2, -1.2]
elif control_system == 'T3':
    delta_ctrl = [0.9, -0.9]
elif control_system == 'T4':
    delta_ctrl = [1.8, -1.8]

# calculate temperature correction
if heating_system == 'T1':
    tHC_corr[0] = delta_ctrl[0] + 0.15
elif heating_system == 'T2':
    tHC_corr[0] = delta_ctrl[0] - 0.1
elif heating_system == 'T3':
    tHC_corr[0] = delta_ctrl[0] - 1.1
elif heating_system == 'T4':
    tHC_corr[0] = delta_ctrl[0] - 0.9
else:
    tHC_corr[0] = 0

if cooling_system == 'T1':
    tHC_corr[1] = delta_ctrl[1] + 0.5
elif cooling_system == 'T2':  # no emission losses but emissions for ventilation
    tHC_corr[1] = delta_ctrl[1] + 0.7
elif cooling_system == 'T3':
    tHC_corr[1] = delta_ctrl[1] + 0.5
else:
    tHC_corr[1] = 0

return tHC_corr[0], tHC_corr[1]

Refactored

control_delta_heating = {'T1': 2.5, 'T2': 1.2, 'T3': 0.9, 'T4': 1.8}
control_delta_cooling = {'T1': -2.5, 'T2': -1.2, 'T3': -0.9, 'T4': -1.8}
system_delta_heating = {'T0': 0.0, 'T1': 0.15, 'T2': -0.1, 'T3': -1.1, 'T4': -0.9}
system_delta_cooling = {'T0': 0.0, 'T1': 0.5, 'T2': 0.7, 'T3': 0.5}
try:
    result_heating = 0.0 if heating_system == 'T0' else (control_delta_heating[control_system] +
                                                         system_delta_heating[heating_system])
    result_cooling = 0.0 if cooling_system == 'T0' else (control_delta_cooling[control_system] +
                                                         system_delta_cooling[cooling_system])
except KeyError:
    raise ValueError(
        'Invalid system / control combination: %s, %s, %s' % (heating_system, cooling_system, control_system))

return result_heating, result_cooling

Ideally, these two codes produce the same result. I would argue though, that by using a table-based method, that is, keeping the data in lookup tables like dicts, makes the code a bit clearer to understand than using a lot of if...elif...else statements. This simplifies the code down to it’s core message: The result is the sum of the deltas for the control system and the system type.

Another improvement, in my mind, is getting rid of the X[0] and X[1] constructs. In the original code, these were used to group the heating and the cooling values respectively. While in a small snippet like this, that can still be understood, this paradigm breaks down quickly. One way to improve that would be to assign HEATING = 0 and COOLING = 1 and then index like this: tHC_corr[HEATING]. That communicates the intention much better. Using separate dictionaries for heating and cooling values for both emission systems and control systems sidesteps the issue alltogether.

Documentation

@shanshan and I went a step further than just changing the code. Here is a model of what the documentation of the function could look like:

"""
Model of losses in the emission and control system for space heating and cooling.

Correction factor for the heating and cooling setpoints. Extracted from EN 15316-2

(see cea\databases\CH\Systems\emission_systems.xls for valid values for the heating and cooling system values)

T0 means there's no heating/cooling systems installed, therefore, also no control systems for heating/cooling.
In short, when the input system is T0, the output set point correction should be 0.0.
So if there is no cooling systems, the setpoint_correction_for_space_emission_systems function input: (T1, T0, T1) (type_hs, type_cs, type_ctrl),
return should be (2.65, 0.0), the control system is only specified for the heating system.
In another case with no heating systems: input: (T0, T3, T1) return: (0.0, -2.0), the control system is only
specified for the heating system.

PARAMETERS
----------

:param heating_system: The heating system used. Valid values: T0, T1, T2, T3, T4
:type heating_system: str

:param cooling_system: The cooling system used. Valid values: T0, T1, T2, T3
:type cooling_system: str

:param control_system: The control system used. Valid values: T1, T2, T3, T4 - as defined in the
    contributors manual under Databases / Archetypes / Building Properties / Mechanical systems.
    T1 for none, T2 for PI control, T3 for PI control with optimum tuning, and T4 for room temperature control
    (electromagnetically/electronically).
:type control_system: str

RETURNS
-------

:returns: two delta T to correct the set point temperature, dT_heating, dT_cooling
:rtype: tuple(double, double)
"""

Note: The documentation explains what the function is for and also mentions the relevant standards (EN 15316-2) - this could also be references to research papers or whatever. Further, the input values and types are explained and also a link to the documentation for reference. A list of valid inputs is given.

What is missing? Boundary case behaviour - this should actually also be in the documentation! We test this behaviour in the next section:

Unit tests

Check out the file tests/test_sensible_loads.py:

import unittest


class TestCorrectionFactorForHeatingAndCoolingSetpoints(unittest.TestCase):
    def test_calc_t_em_ls_raises_ValueError(self):
        from cea.demand.sensible_loads import setpoint_correction_for_space_emission_systems
        self.assertRaises(ValueError, setpoint_correction_for_space_emission_systems, heating_system='T1',
                          cooling_system='T1', control_system=None)
        self.assertRaises(ValueError, setpoint_correction_for_space_emission_systems, heating_system='T1',
                          cooling_system='XYZ', control_system='T1')
        self.assertRaises(ValueError, setpoint_correction_for_space_emission_systems, heating_system='T1',
                          cooling_system=None, control_system='T1')

    def test_calc_t_em_ls_T0(self):
        from cea.demand.sensible_loads import setpoint_correction_for_space_emission_systems
        self.assertEqual(setpoint_correction_for_space_emission_systems('T1', 'T0', 'T1'), (2.65, 0.0))
        self.assertEqual(setpoint_correction_for_space_emission_systems('T0', 'T3', 'T1'), (0.0, -2.0))

Right now, it contains a single class, TestCorrectionFactorForHeatingAndCoolingSetpoints that inherits from unittest.TestCase. When the Jenkins runs, it will pick up this class, a) because the filename starts with test_ and b) because the classes inherit from TestCase. The Jenkins will then run all the test cases. Each method starting with test_ is run and checked. The self.assert* calls test expected values and actual computations of the setpoint_correction_for_space_emission_systems function. We can see that:

  • calling the method with None or 'XYZ' as one of the parameters should raise an instance of ValueError - this is an edge case that is being tested
  • the setpoint correction for heating or cooling emission systems of type 'T0' (no emission system) should be 0.0

Unit tests like this are a great way to describe how a function should behave, independant of the implementation. It also represents another “mode” of thinking, where you consider edge cases, expected values, failure modes etc. at a very low level. At the level of a “unit” of code.

I believe that especially for code like the CEA, we should use this technique to specify the expected behavior of the system, as we will be on the hook for bugs for the next decade or so. We want to be able to prove the correctnes of our implementation!

No comments:

Post a Comment