Empirical statistical downscaling (ESD) methods seek to refine global climate model (GCM) outputs via processes that glean information from a combination of observations and GCM simulations. They aim to create value-added climate projections by reducing biases and adding finer spatial detail. Analysis techniques such as cross-validation allow assessments of how well ESD methods meet these goals during observational periods. However, the extent to which an ESD method’s skill might differ when applied to future climate projections cannot readily be assessed in the same manner. Here we present a “perfect model” experimental design that quantifies aspects of ESD method performance for both historical and late 21st century time periods. The experimental design tests a key stationarity assumption inherent to ESD methods: namely, that ESD performance when applied to future projections is similar to that during the observational training period. Case study results employing a single ESD method (an Asynchronous Regional Regression Model variant) and a single climate variable (daily maximum temperature) demonstrate that violations of the stationarity assumption can vary geographically, seasonally, and with the amount of projected climate change. For the ESD method tested, the greatest challenges in downscaling daily maximum temperature projections occur along coasts, in summer, and under conditions of greater projected warming. We conclude with a discussion of the potential use and expansion of the perfect model experimental design, both to inform the development of improved ESD methods and to provide guidance on the use of ESD products in climate impacts analyses and decision-support applications.
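The logic of the stationarity test can be illustrated with a minimal sketch, offered under stated assumptions rather than as the procedure used in the case study. It assumes that fine-scale model output plays the role of "truth" in both periods, that a coarse-scale counterpart serves as the predictor, and that a simple linear regression stands in for the ESD method (it is not the Asynchronous Regional Regression Model variant). All data are synthetic, and the function names `fit_linear_esd` and `perfect_model_test` are hypothetical.

```python
import numpy as np


def fit_linear_esd(coarse, fine):
    """Least-squares fit fine ~ slope * coarse + intercept.

    Hypothetical stand-in for an ESD method; NOT the ARRM variant.
    """
    slope, intercept = np.polyfit(coarse, fine, deg=1)
    return slope, intercept


def mean_abs_error(pred, truth):
    return float(np.mean(np.abs(pred - truth)))


def perfect_model_test(coarse_hist, fine_hist, coarse_fut, fine_fut):
    """Train on half of the historical period, then compare held-out
    historical error with future error against the model 'truth'."""
    half = coarse_hist.size // 2
    slope, intercept = fit_linear_esd(coarse_hist[:half], fine_hist[:half])

    # Held-out historical skill (a crude substitute for cross-validation).
    mae_hist = mean_abs_error(slope * coarse_hist[half:] + intercept,
                              fine_hist[half:])
    # Future skill, measurable only because the fine-scale future is known.
    mae_fut = mean_abs_error(slope * coarse_fut + intercept, fine_fut)

    # A ratio well above 1 suggests the stationarity assumption is strained.
    return {"mae_hist": mae_hist, "mae_fut": mae_fut,
            "ratio": mae_fut / mae_hist}


def synthetic_fine(coarse, rng):
    # Assumed, mildly nonlinear coarse-to-fine relationship plus noise,
    # chosen so that a linear fit extrapolates imperfectly under warming.
    return coarse + 0.02 * (coarse - 25.0) ** 2 + rng.normal(0.0, 0.5, coarse.size)


rng = np.random.default_rng(0)
n_days = 3000
coarse_hist = rng.normal(25.0, 5.0, n_days)  # synthetic daily Tmax, deg C
coarse_fut = rng.normal(29.0, 5.0, n_days)   # assumed 4 deg C warmer future

fine_hist = synthetic_fine(coarse_hist, rng)
fine_fut = synthetic_fine(coarse_fut, rng)

result = perfect_model_test(coarse_hist, fine_hist, coarse_fut, fine_fut)
print(result)
```

In this toy setting the future error exceeds the held-out historical error because the fitted relationship is applied outside the range on which it was trained. A real application would repeat the comparison per grid point and per season.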