m*f
#2
A simple homework assignment...
s*h
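#3
Thanks in advance.
I've only just started using R, and I need to write some R data out in a specific format. If you know a good tutorial, I'd appreciate a pointer.
The data is one array and two lists, all of the same length N (about 400k).
Here is what the data looks like in R:
df$itemsetID is an array of length N
head(df$itemsetID, 3)
[1] "2399" "2439" "2546"
The two lists are both N x 10
applist[1]
[[1]]
[1] "2021" "2042" "2067" "2099" "2112" "2147" "2186" "2204" "2043" "2053"
scorelist[1]
[[1]]
[1] 0.28386375 0.12140423 0.08740495 0.07996213 0.04783727 0.04678283 0.04569770 0.03860657 0.03166964 0.02983698
Output format: row i, for example, should be
df$itemsetID[i], applist[[i]][1], scorelist[[i]][1], applist[[i]][2], scorelist[[i]][2], ... applist[[i]][10], scorelist[[i]][10]
How do I do this? Any pointers are appreciated.
A tutorial recommendation is fine too.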
l*s
#4
I can't see the image.
k*a
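#5
If the vectors inside each list all have the same length, you can convert the lists to data frames.
For example:
cbind(df$itemsetID, as.data.frame(applist), as.data.frame(scorelist))
Then rearrange the column order,
combine everything, and you're done.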
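One thing to watch with the suggestion above: as.data.frame() treats each list element as a column, so a list of N length-10 vectors comes out as 10 rows by N columns, not N by 10. A minimal sketch on toy data (toy_app, wide and app_mat are made-up names, and the vectors are shortened to length 3) that stacks the vectors as rows with do.call(rbind, ...) instead:

```r
# Toy stand-in for applist: 2 elements, each a length-3 character vector
toy_app <- list(c("2021", "2042", "2067"),
                c("2099", "2112", "2147"))

# as.data.frame() makes one column per list element: 3 rows x 2 columns
wide <- as.data.frame(toy_app)
dim(wide)

# Stacking the vectors as rows gives one row per list element: 2 x 3
app_mat <- do.call(rbind, toy_app)
dim(app_mat)
```

The same call on scorelist would give a numeric matrix with the matching shape, which can then be combined with the ID column.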
n*3
#6
s*h
#8
Also, as.data.frame gives an error.
I used data.frame instead; it gives warnings, but the result is correct.
I don't understand what causes either the error or the warnings.
1.
tmp Error in data.frame(c("2038", "2087", "2059", "2169", "2099", "2186", :
arguments imply differing number of rows: 10, 9, 8
2.
tmp Warning messages:
1: In f(init, x[[i]]) :
number of columns of result is not a multiple of vector length (arg 2)
2: In f(init, x[[i]]) :
number of columns of result is not a multiple of vector length (arg 2)
3: In data.row.names(row.names, rowsi, i) :
some row.names duplicated: 3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,
46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,
71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,
96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,
116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,
135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,
154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,
173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,
211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,
230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,
249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,
268,269,270,271,2 [... truncated]
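The error above reports 10, 9 and 8 rows, which suggests that not every vector in the list has length 10; data.frame() refuses columns of unequal length. The "not a multiple of vector length" warning is what rbind prints when it recycles a shorter vector to fill a row, so the short rows in that result may contain repeated values and are worth checking. A sketch on toy data (toy_app and the helper pad_to are made-up names) of counting the lengths and padding short vectors with NA before stacking:

```r
# Toy stand-in for applist with unequal lengths (3, 2, 1 instead of 10, 9, 8)
toy_app <- list(c("2038", "2087", "2059"),
                c("2021", "2042"),
                c("2099"))

# How many elements of each length are there?
len_counts <- table(lengths(toy_app))

# Pad a vector with NA up to length k (hypothetical helper)
pad_to <- function(x, k) {
  length(x) <- k
  x
}

k <- max(lengths(toy_app))
padded <- lapply(toy_app, pad_to, k = k)

# Every row now has k entries, so rbind does not recycle anything
app_mat <- do.call(rbind, padded)
```

Whether NA padding is the right choice depends on what the output file should show for missing entries; that is an assumption here, not something the thread settles.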
o*n
#9
Python has a json library that can load a JSON object into a dictionary; you can then turn the dict into CSV. As I recall, R also has packages for handling JSON.
v*e
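#11
After years of experience, it comes down to one sentence: when wrangling data formats, avoid R whenever you can.
Reasons: 1. it is slow; 2. it has many pitfalls and is error-prone; 3. it is tedious.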
Y*a
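#12
The OP's code looks Python-style.
What exactly is the format of your applist and scorelist? Can you post the output of
length(applist)
class(applist)
class(applist[1])
if 'matrix', get dim(applist[1])
if 'list', get length(applist[1])
I need this information before I can write code for you.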
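For reference, a small sketch of what those checks return on a toy list shaped like the one in the question (toy_app is a made-up name, with vectors shortened to length 3). Note that single brackets always return a list, so class(applist[1]) is "list" regardless of what the element holds; double brackets are needed to inspect the element itself.

```r
toy_app <- list(c("2021", "2042", "2067"),
                c("2099", "2112", "2147"))

length(toy_app)        # number of elements in the list
class(toy_app)         # "list"
class(toy_app[1])      # "list": [ ] keeps the list wrapper
length(toy_app[1])     # 1: a list holding one element
class(toy_app[[1]])    # "character": [[ ]] extracts the vector itself
length(toy_app[[1]])   # 3 here; would be 10 for the data in the question
```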
t*g
#14
http://www.statmethods.net/input/exportingdata.html
OP, try write.table with sep=",".
I haven't tried writing CSV myself, but I have exported both text and Excel.
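A minimal sketch of that suggestion, assuming the data has already been arranged into a data frame with one row per output line (toy_df and out_file are made-up names, and a temporary file stands in for a real path):

```r
toy_df <- data.frame(itemsetID = c("2399", "2439"),
                     app1 = c("2021", "2099"),
                     score1 = c(0.28386375, 0.12140423),
                     stringsAsFactors = FALSE)

out_file <- tempfile(fileext = ".csv")

# sep = "," gives comma-separated output; row.names = FALSE drops the
# 1, 2, ... row labels and quote = FALSE leaves strings unquoted
write.table(toy_df, file = out_file, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

written <- readLines(out_file)
```

Dropping the header with col.names = FALSE is a choice made here to match the bare line format in the question; keep it TRUE if a header row is wanted.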
d*c
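#15
There are plenty of ready-made packages for writing CSV, but they won't necessarily meet your format requirement. With interleaved output like this, there may be no off-the-shelf method.
Since your data isn't large and the format is rather unusual, just write a for loop and build each line by string concatenation; it shouldn't be slow.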
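A sketch of that idea on toy data, assuming every vector has the same length (3 here rather than 10). The names ids, toy_app, toy_score and make_line are made up, and vapply stands in for the explicit for loop: each line is built by interleaving app and score values and pasting them together with commas.

```r
ids <- c("2399", "2439")
toy_app <- list(c("2021", "2042", "2067"),
                c("2099", "2112", "2147"))
toy_score <- list(c(0.28, 0.12, 0.09),
                  c(0.08, 0.05, 0.04))

# Build one output line: id, app1, score1, app2, score2, ...
make_line <- function(i) {
  # rbind puts apps in row 1 and scores in row 2; reading the matrix
  # column by column interleaves them
  pairs <- as.vector(rbind(toy_app[[i]], toy_score[[i]]))
  paste(c(ids[i], pairs), collapse = ",")
}

lines_out <- vapply(seq_along(ids), make_line, character(1))

# Write all lines at once (temporary file used for illustration)
out_file <- tempfile(fileext = ".csv")
writeLines(lines_out, out_file)
```

Building all lines first and writing once avoids reopening the file for each of the roughly 400k rows.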
p*r
#16
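# create matrix from applist, then transpose it
# so the matrix is N rows * 10 columns
app <- t(sapply(applist, unlist))
# Same for scorelist
score <- t(sapply(scorelist, unlist))
# generate column sequence (1,11,2,12...10,20) so as to reorder them after cbind
cols <- c(rbind(1:10, 11:20))
# or you can do cols <- order(rep(1:10, 2))
data <- cbind(app, score)
# reorder columns
data <- data[, cols]
# generate col_names: "applist1", "scorelist1", "applist2","scorelist2"...
colnames(data) <- paste0(c("applist", "scorelist"), rep(1:10, each = 2))
rownames(data) <- NULL
# add itemID column to the front
data <- cbind(itemsetID = df$itemsetID, data)
write.csv(data, "data.csv")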