Large motion is a central challenge in Video Frame Interpolation (VFI), because the temporal gap between the input frames can be significant. Existing algorithms are often constrained by limited receptive fields or rely on local refinement, which leads to suboptimal performance in scenes with large motion. In this paper, we introduce a sparse global matching algorithm that specifically targets large motion in VFI. Our two-step approach captures both local details and global motion correlations. First, we estimate a pair of initial intermediate flows using a hybrid CNN and Transformer structure that captures local details. Then, we incorporate a sparse global matching block that identifies flaws in the initial flow estimates and generates sparse flow fixes within a global receptive field. Our method achieves state-of-the-art performance on the most challenging subsets of commonly used large-motion benchmarks (X-Test, Xiph, and the hard and extreme subsets of SNU-FILM), while remaining highly competitive on the easy and medium subsets of SNU-FILM.
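To make the second step more concrete, the following is a minimal NumPy sketch of what a sparse global correction of a flow field could look like. It is an illustration under stated assumptions, not the method of this paper: the abstract does not specify how flaws are detected or how matches are computed, so the sketch assumes a per-pixel `error_map` is already available, selects the `k` worst pixels, and matches each of them against every target position with a softmax-weighted feature correlation. The function name `sparse_global_flow_fix` and all of its parameters are hypothetical.

```python
import numpy as np


def sparse_global_flow_fix(feat_src, feat_tgt, flow_init, error_map, k=2, temperature=0.01):
    """Illustrative sketch only (hypothetical helper, not the paper's implementation).

    feat_src, feat_tgt: (H, W, C) feature maps of the source and target images.
    flow_init:          (H, W, 2) initial flow, stored as (dx, dy) per pixel.
    error_map:          (H, W) assumed flaw score; higher means less reliable flow.
    Returns a copy of flow_init in which the k highest-error pixels are re-estimated
    by matching against all H*W target positions (a global receptive field).
    """
    h, w, c = feat_src.shape

    # 1. Sparse selection: keep only the k pixels with the largest flaw score.
    flat_idx = np.argsort(error_map.ravel())[-k:]
    ys, xs = np.unravel_index(flat_idx, (h, w))

    # 2. Global matching: correlate each selected feature with every target feature.
    queries = feat_src[ys, xs]                  # (k, C)
    keys = feat_tgt.reshape(h * w, c)           # (H*W, C)
    logits = queries @ keys.T / temperature     # (k, H*W)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # 3. Softmax-weighted expected target coordinate, stored as (x, y).
    grid_y, grid_x = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([grid_x.ravel(), grid_y.ravel()], axis=1).astype(np.float64)
    matched = weights @ coords                  # (k, 2)

    # 4. Sparse flow fix: overwrite the flow only at the selected pixels.
    flow_fixed = flow_init.copy()
    flow_fixed[ys, xs] = matched - np.stack([xs, ys], axis=1)
    return flow_fixed, (ys, xs)
```

Because only `k` query pixels are matched, the correlation matrix has size `k × H·W` rather than `H·W × H·W`, which is the practical appeal of a sparse formulation. In this sketch the fixes simply replace the initial flow at the selected pixels; how the fixes should be combined with the initial estimate is a design choice the sketch does not address.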