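Off the top of my head, there is no ready-made package or function that does this, and writing the code myself looks fairly hard. If we use Euclidean distance to measure similarity, the distance will be dominated by the numerical covariates. In other words, the categorical covariates are more or less ignored in the similarity metric. I googled it, and there is no simple off-the-shelf answer on similarity metrics for categorical variables; what turns up is basically all papers. Do any of the experts on this board have a good solution?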
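To make the problem concrete, here is one direction that might work: a rough Gower-style distance, where each numerical covariate is divided by its range and each categorical covariate contributes 0 on a match and 1 on a mismatch, so every covariate lands on the same [0, 1] scale. This is only a sketch under my own assumptions, not a validated solution: the function name `gower_distance`, the `numeric_ranges` argument, and the sample rows are all made up for illustration.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Sequence[Optional[float]],
) -> float:
    """Gower-style distance between two records with mixed covariates.

    numeric_ranges[i] is the range (max - min) of covariate i if it is
    numerical, or None if covariate i is categorical. (Hypothetical
    interface, chosen only for this sketch.)
    """
    total = 0.0
    for x, y, value_range in zip(a, b, numeric_ranges):
        if value_range is None:
            # Categorical: simple matching, 0 if equal, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif value_range > 0:
            # Numerical: absolute difference scaled into [0, 1] by the range.
            total += abs(x - y) / value_range
        # A numerical covariate with zero range contributes nothing.
    return total / len(numeric_ranges)


# Made-up records: (age, income, gender).
ranges = (40.0, 100000.0, None)
person_a = (30.0, 50000.0, "F")
person_b = (30.0, 50000.0, "M")  # differs from person_a only in gender
person_c = (50.0, 50000.0, "F")  # differs from person_a only in age

d_ab = gower_distance(person_a, person_b, ranges)
d_ac = gower_distance(person_a, person_c, ranges)
```

With this scaling a gender mismatch is no longer drowned out by income measured in the tens of thousands, which is exactly what goes wrong with plain Euclidean distance. Whether equal weight per covariate is the right choice is a separate question that I have not settled.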