[Zigbee] Uninitialized callback pointers cause crash with heap allocation #12082

@dbn-b4e

Board

ESP32_C6

Device Description

Bug Report: Uninitialized callback pointers in ZigbeeEP and ZigbeeLight cause crash with heap allocation

Environment

  • ESP32 Arduino Core Version: 3.3.4
  • Board: ESP32-C6
  • Zigbee Library: Built-in Zigbee library (esp32-arduino)
  • IDE: Arduino CLI / PlatformIO

Summary

When using new to dynamically allocate Zigbee endpoint objects (e.g., ZigbeeLight, ZigbeeTempSensor), the firmware crashes with an instruction access fault when receiving Zigbee commands from a coordinator (e.g., Zigbee2MQTT). The crash occurs because several callback function pointers are not initialized to nullptr in the constructors.

Root Cause

In ZigbeeEP.cpp (lines 25-44):

ZigbeeEP::ZigbeeEP(uint8_t endpoint) {
  _endpoint = endpoint;
  _ep_config.endpoint = 0;
  _cluster_list = nullptr;
  _on_identify = nullptr;        // ✓ Initialized
  _on_ota_state_change = nullptr; // ✓ Initialized
  _read_model = NULL;
  _read_manufacturer = NULL;
  // ...
  // _on_default_response is NOT initialized! ✗
}

Missing initialization: _on_default_response is declared in ZigbeeEP.h (line 195) but never initialized to nullptr.

In ZigbeeLight.cpp (lines 18-25):

ZigbeeLight::ZigbeeLight(uint8_t endpoint) : ZigbeeEP(endpoint) {
  _device_id = ESP_ZB_HA_ON_OFF_LIGHT_DEVICE_ID;
  esp_zb_on_off_light_cfg_t light_cfg = ESP_ZB_DEFAULT_ON_OFF_LIGHT_CONFIG();
  _cluster_list = esp_zb_on_off_light_clusters_create(&light_cfg);
  _ep_config = {/* ... */};
  // _on_light_change is NOT initialized! ✗
}

Missing initialization: _on_light_change is declared in ZigbeeLight.h (line 49) but never initialized to nullptr.

Why This Causes a Crash

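When an object is created with new, its constructor runs on heap memory that is not zero-initialized, so any member the constructor does not assign keeps whatever bytes were already there. The uninitialized callback pointers therefore contain garbage values.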

When Zigbee commands are received:

  1. zbDefaultResponse() in ZigbeeEP.cpp (lines 638-643) checks:

    if (_on_default_response) {  // Garbage value passes this check!
      _on_default_response(...);  // Crashes here - calling garbage address
    }
  2. Similarly, lightChanged() in ZigbeeLight.cpp (lines 42-47) checks:

    if (_on_light_change) {  // Garbage value passes this check!
      _on_light_change(_current_state);  // Crashes here
    }

Why Examples Don't Crash

The official examples use static/global allocation:

ZigbeeLight light(1);  // Global - memory is zero-initialized

Objects with static storage duration are zero-initialized before their constructors run (guaranteed by the C++ standard), so the pointers the constructor never assigns happen to be nullptr. This masks the bug.

With dynamic allocation:

ZigbeeLight* light = new ZigbeeLight(1);  // Heap - NOT zero-initialized

The bug is exposed because heap memory contains garbage.

Crash Output

Guru Meditation Error: Core  0 panic'ed (Instruction access fault). Exception was unhandled.
Core  0 register dump:
MEPC    : 0x65656266  RA      : 0x42009c84  SP      : 0x40826a20  GP      : 0x408113b4
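  • MEPC: 0x65656266 is not a valid code address. Its bytes in memory order (little-endian) are 66 62 65 65, which is "fbee" in ASCII, so the pointer slot most likely held leftover heap data rather than a known debug fill pattern
  • The CPU tried to execute code at a garbage address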

Steps to Reproduce

  1. Create a simple sketch using heap allocation:
#include <Zigbee.h>
#include <ep/ZigbeeLight.h>

ZigbeeLight* light = nullptr;

void setup() {
  Serial.begin(115200);

  light = new ZigbeeLight(1);  // Heap allocation
  light->setManufacturerAndModel("Test", "Test");
  // Note: NOT setting onLightChange callback

  Zigbee.addEndpoint(light);
  Zigbee.begin(ZIGBEE_END_DEVICE);
}

void loop() {
  delay(100);
}
  2. Pair with a Zigbee coordinator (e.g., Zigbee2MQTT)
  3. Send an on/off command from the coordinator
  4. Result: ESP32 crashes with instruction access fault

Workaround

Set all callbacks explicitly before using the endpoint:

light = new ZigbeeLight(1);
light->onLightChange([](bool) {});  // Set to empty lambda
light->onIdentify([](uint16_t) {});
light->onDefaultResponse([](zb_cmd_type_t, esp_zb_zcl_status_t) {});

Suggested Fix

In ZigbeeEP.cpp constructor, add:

_on_default_response = nullptr;

In ZigbeeLight.cpp constructor, add:

_on_light_change = nullptr;

Similar fixes needed for other endpoint classes:

  • ZigbeeTempSensor - check for uninitialized callbacks
  • ZigbeeColorDimmableLight - check for uninitialized callbacks
  • Other Zigbee* endpoint classes
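
As an alternative to adding assignments in each constructor body, the callback members could be given default member initializers in the headers, so every constructor (including those of future subclasses) starts from nullptr. The following is only a minimal sketch of that idea: DemoEndpoint and DemoLight are hypothetical stand-ins, not the real ZigbeeEP and ZigbeeLight classes, and the callback signatures are simplified assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a base endpoint class (NOT the real ZigbeeEP).
class DemoEndpoint {
public:
  explicit DemoEndpoint(uint8_t endpoint) : _endpoint(endpoint) {}
  virtual ~DemoEndpoint() = default;

  void onDefaultResponse(void (*cb)(int status)) { _on_default_response = cb; }

  // Invokes the callback if one is set; returns true if it was called.
  bool defaultResponse(int status) {
    if (_on_default_response) {
      _on_default_response(status);
      return true;
    }
    return false;
  }

protected:
  uint8_t _endpoint;
  // Default member initializer: null regardless of how the object is allocated.
  void (*_on_default_response)(int) = nullptr;
};

// Hypothetical stand-in for a light endpoint (NOT the real ZigbeeLight).
class DemoLight : public DemoEndpoint {
public:
  explicit DemoLight(uint8_t endpoint) : DemoEndpoint(endpoint) {}

  void onLightChange(void (*cb)(bool state)) { _on_light_change = cb; }

  // Stores the new state, then invokes the callback if one is set.
  bool lightChanged(bool state) {
    _current_state = state;
    if (_on_light_change) {
      _on_light_change(state);
      return true;
    }
    return false;
  }

private:
  bool _current_state = false;
  void (*_on_light_change)(bool) = nullptr;
};
```

With this pattern a heap-allocated object such as std::make_unique<DemoLight>(1) has well-defined null callbacks, so the existing if (_on_light_change) style checks behave as intended even when no callback was registered.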

Additional Notes

This is a classic C++ pitfall: pointer members are not initialized automatically unless the constructor (or a default member initializer) does so. Static allocation masks the issue, but production code that allocates endpoints dynamically (factories, runtime configuration, etc.) can hit this crash.


Reported by: OpenZigbee Project
Date: 2025-11-30

Hardware Configuration

GPIO9 button

Version

v3.3.4

Type

Bug

IDE Name

Arduino-cli

Operating System

Mac OS X

Flash frequency

80

PSRAM enabled

no

Upload speed

115200

Sketch

/**
 * Minimal reproduction of Zigbee callback crash bug
 * ESP32-C6 with Arduino ESP32 Core 3.3.4
 *
 * BUG: ZigbeeLight._on_light_change and ZigbeeEP._on_default_response
 *      are not initialized to nullptr in constructors.
 *      With heap allocation (new), they contain garbage.
 *      When coordinator sends command, garbage pointer is called -> crash.
 *
 * To reproduce:
 * 1. Upload this sketch
 * 2. Pair with Zigbee2MQTT or other coordinator
 * 3. Toggle the light from coordinator
 * 4. CRASH: Guru Meditation Error, MEPC: 0x65656266
 */

#ifndef ZIGBEE_MODE_ED
#error "Set Tools -> Zigbee mode -> Zigbee ED (End Device)"
#endif

#include <Zigbee.h>
#include <ep/ZigbeeLight.h>

// Using NEW (heap allocation) - causes crash
ZigbeeLight* light = nullptr;

// Using static allocation - works fine (memory is zero-initialized)
// ZigbeeLight light(1);

void setup() {
    Serial.begin(115200);
    delay(1000);

    Serial.println("Zigbee Callback Bug Demo");
    Serial.println("========================");

    // Heap allocation - uninitialized pointers contain garbage
    light = new ZigbeeLight(1);
    light->setManufacturerAndModel("BugDemo", "CrashTest");

    // NOT setting callbacks - this is the bug trigger
    // light->onLightChange([](bool) {});        // If set, won't crash on light change
    // light->onDefaultResponse([](zb_cmd_type_t, esp_zb_zcl_status_t) {});  // If set, won't crash on default response

    Zigbee.addEndpoint(light);

    if (!Zigbee.begin(ZIGBEE_END_DEVICE)) {
        Serial.println("Zigbee failed!");
        while(1) delay(1000);
    }

    Serial.println("Zigbee started - pair with coordinator");
    Serial.println("Then toggle light -> CRASH expected");
}

void loop() {
    delay(100);
}

Key points:
  • Uses new ZigbeeLight(1) (heap allocation)
  • Does NOT set any callbacks
  • Crash happens when coordinator sends on/off command

To confirm the cause, uncomment the onLightChange and onDefaultResponse lines; with both callbacks set, the crash stops.

Debug Message

oz> Guru Meditation Error: Core  0 panic'ed (Instruction access fault). Exception was unhandled.

Core  0 register dump:
MEPC    : 0x65656266  RA      : 0x4200a016  SP      : 0x40826a30  GP      : 0x408113b4
TP      : 0x40826be0  T0      : 0x40030dca  T1      : 0x0000000f  T2      : 0x2f0a0200
S0/FP   : 0x40826a88  S1      : 0x4081f1b8  A0      : 0x0000000a  A1      : 0x00000000
A2      : 0x00000001  A3      : 0x00000006  A4      : 0x65656267  A5      : 0x40826a88
A6      : 0x00000006  A7      : 0x00000006  S2      : 0x40814000  S3      : 0x7fffffff
S4      : 0x001bc354  S5      : 0x00000000  S6      : 0x000186a0  S7      : 0xfffffffe
S8      : 0x00000000  S9      : 0x00000000  S10     : 0x00000003  S11     : 0x00000000
T3      : 0x4081f05c  T4      : 0x00000005  T5      : 0x001b0006  T6      : 0x01040001
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000001  MTVAL   : 0x65656266
MHARTID : 0x00000000

Stack memory:
40826a30: 0x4081f054 0x40880000 0x40818c70 0x40022574 0xffffffe4 0x00000000 0x00000000 0xfffffffe
40826a50: 0x000186a0 0x00000000 0x001bc354 0x7fffffff 0x4081f790 0x00000008 0x00000000 0x42013962
40826a70: 0x000186a0 0x00000000 0x40819000 0x40022498 0x40880000 0x4081f054 0x00000000 0x00002500

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
