How large a dataset can this implementation handle? For example, would it work with a 200k x 500k user-item matrix, producing a 500k x 500k item similarity matrix as output?