Purchasing behaviour is driven by complex latent factors that are highly entangled in the real world. Learning a factorized latent representation of users can uncover the intentions behind the observed data (i.e., user-item interactions) and improve the robustness and interpretability of a recommender system. However, existing collaborative filtering methods that learn disentangled representations struggle to balance the trade-off between reconstruction quality and disentanglement. In this paper, we propose a controllable variational autoencoder framework for collaborative filtering. Specifically, we apply a modified Proportional-Integral-Derivative (PID) controller to the β-VAE objective to automatically tune the hyperparameter β, using the Kullback-Leibler divergence as the feedback signal. We further introduce item embeddings to guide the system, using a factorized Gaussian distribution, to learn representations related to real-world concepts. Experimental results show that our model achieves a substantial improvement over state-of-the-art baselines. We further evaluate how effectively our model controls the trade-off between reconstruction error and disentanglement quality in recommendation.
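To make the feedback loop concrete, the following is a minimal sketch of how a PID-style controller could adjust β from the observed KL divergence. It is an illustration under stated assumptions, not the exact controller used in this paper: the class name `BetaPIDController`, the gains `kp`, `ki`, `kd`, the bounds `beta_min` and `beta_max`, the sigmoid-shaped proportional term, and the anti-windup rule are all hypothetical choices.

```python
import math


class BetaPIDController:
    """Illustrative PID-style controller for the beta weight of a beta-VAE.

    All names, gains, and bounds here are assumptions made for this sketch.
    """

    def __init__(self, target_kl, kp=0.01, ki=0.001, kd=0.0,
                 beta_min=0.0, beta_max=1.0):
        self.target_kl = target_kl
        self.kp, self.ki, self.kd = kp, ki, kd
        self.beta_min, self.beta_max = beta_min, beta_max
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, observed_kl):
        """Return the beta to use next, given the KL observed at this step."""
        # Negative error means the KL is above its target, so beta should rise.
        error = self.target_kl - observed_kl

        # Proportional term (assumed sigmoid shape), bounded in (0, kp).
        # The exponent is capped to avoid overflow for very large errors.
        p_term = self.kp / (1.0 + math.exp(min(error, 50.0)))

        # Integral term: accumulates the (negated) error over time.
        candidate_integral = self.integral - self.ki * error

        # Derivative term: reacts to the change in error between steps.
        d_term = -self.kd * (error - self.prev_error)
        self.prev_error = error

        beta = p_term + candidate_integral + d_term + self.beta_min

        # Anti-windup (assumed): keep accumulating only while beta is in range.
        if self.beta_min <= beta <= self.beta_max:
            self.integral = candidate_integral

        # Clamp beta to the allowed interval.
        return min(max(beta, self.beta_min), self.beta_max)
```

In a training loop, `step` would be called once per iteration with the current batch KL, and the returned value would weight the KL term of the loss on the next iteration. A KL above the target pushes β up, which strengthens the regularisation, while a KL below the target lets β fall and favours reconstruction.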