In the field of video object segmentation (VOS), low-light conditions pose a significant challenge: image quality degrades severely, and the similarity computed between query and memory frames leads to inaccurate matching. Event cameras offer high dynamic range and capture the motion information of objects. These characteristics preserve object visibility and can assist VOS methods under low-light conditions. In this paper, we introduce a novel framework for low-light VOS that incorporates event camera data to improve segmentation accuracy. Our approach consists of two key components: Event-Guided Memory Matching (EGMM) and Adaptive Cross-Modal Fusion (ACMF). The EGMM module is designed to resolve the inaccurate matching that arises under low-light conditions, while the ACMF module aims to extract valuable features from noisy inputs by adaptively fusing the image and event modalities. In addition, we construct a simulated Low-Light Event DAVIS (LLE-DAVIS) dataset and collect a real-world Low-Light Event Object Segmentation (LL-EOS) dataset, both containing paired frames and events. Experiments validate the effectiveness of our method on both datasets.
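As a rough illustration of what adaptive cross-modal fusion can look like in practice, the PyTorch sketch below implements a simple channel-wise gated fusion of image and event features. This is a minimal sketch under assumed design choices; the class name, gating mechanism, and feature dimensions are hypothetical and do not reflect the paper's actual ACMF architecture.

```python
import torch
import torch.nn as nn


class GatedCrossModalFusion(nn.Module):
    """Illustrative channel-wise gated fusion of image and event features.

    NOTE: an assumed sketch, not the paper's ACMF module. A gate predicted
    from the concatenated modalities weights each channel of the image
    features against the corresponding event-feature channel.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, img_feat: torch.Tensor, evt_feat: torch.Tensor) -> torch.Tensor:
        # g in [0, 1] decides, per channel and spatial location, how much to
        # trust the (possibly noisy) low-light image features versus the
        # event features.
        g = self.gate(torch.cat([img_feat, evt_feat], dim=1))
        return g * img_feat + (1.0 - g) * evt_feat


# Example: fuse 256-channel feature maps extracted from the two modalities.
fusion = GatedCrossModalFusion(channels=256)
img_feat = torch.randn(1, 256, 30, 54)
evt_feat = torch.randn(1, 256, 30, 54)
fused = fusion(img_feat, evt_feat)  # shape: (1, 256, 30, 54)
```

The gating lets the network lean on event features where the low-light image signal is unreliable, which is one common way such adaptive fusion is realized.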