g*g
Post #2
Take a middle path: keep a "file pool" that holds at most 5000 open
files, namely the 5000 most recently used. Track them in a queue.
When you write to a file that is already in the queue, remove it and
append it at the tail. When you need a new file and the pool already
holds 5000, pop the head, close that file, and append the new one at
the tail. To speed up lookups, use a hashmap to track which files are
open.
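
A minimal sketch of the file pool described here, not a definitive implementation. It assumes a tab-separated input, that each key is safe to use as a file name, and that output files are named `<key>.txt`; the names `FilePool` and `split_by_first_column` are made up for illustration. Python's `OrderedDict` plays the role of both the queue and the hashmap.

```python
from collections import OrderedDict
import os


class FilePool:
    """Keeps at most `capacity` output files open, closing the least recently used."""

    def __init__(self, out_dir, capacity=5000):
        self.out_dir = out_dir
        self.capacity = capacity
        self.open_files = OrderedDict()  # key -> file handle, oldest at the head

    def write(self, key, line):
        handle = self.open_files.get(key)
        if handle is not None:
            # Already open: move it to the tail (most recently used).
            self.open_files.move_to_end(key)
        else:
            if len(self.open_files) >= self.capacity:
                # Pool is full: pop the head and close that file.
                _, oldest = self.open_files.popitem(last=False)
                oldest.close()
            # Append mode, so a reopened file keeps what was written earlier.
            # Assumption: out_dir starts empty, otherwise old content is kept too.
            path = os.path.join(self.out_dir, key + ".txt")
            handle = open(path, "a", encoding="utf-8")
            self.open_files[key] = handle
        handle.write(line)

    def close(self):
        for handle in self.open_files.values():
            handle.close()
        self.open_files.clear()


def split_by_first_column(src_path, out_dir, capacity=5000, sep="\t"):
    """Split src_path into one file per first-column key (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    pool = FilePool(out_dir, capacity)
    try:
        with open(src_path, encoding="utf-8") as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                key = line.rstrip("\n").split(sep, 1)[0]
                pool.write(key, line)
    finally:
        pool.close()
```

Opening in append mode is what makes eviction safe: a file that gets closed and later reopened continues where it left off.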
[Quoted from j*******s's post]
: A question: I have a large file, a txt table, that needs to be split into several files by the key in its first column.
: For example
j*s
Post #3
Good approach, thanks a lot. This stack method is excellent.
[Quoted from g*****g's post]
: Do something in between, let's say you keep a "file pool",
: you can open a maximum of 5000, and you keep the most recent 5000
: open. Put it in a queue, pop the head out and append the new one
: at the tail when it's over 5000. When you write a file and the file
: is already in the queue, remove it and append it to the tail.
: To speed up search, you can use a hashmap to track if the files are
: open.
j*s
Post #4
Is a queue or a stack better? The keys in the first column are random, so FIFO vs. LIFO should make no difference, right?
[Quoted from g*****g's post]
: Do something in between, let's say you keep a "file pool",
: you can open a maximum of 5000, and you keep the most recent 5000
: open. Put it in a queue, pop the head out and append the new one
: at the tail when it's over 5000. When you write a file and the file
: is already in the queue, remove it and append it to the tail.
: To speed up search, you can use a hashmap to track if the files are
: open.
A*o
Post #9
Or keep all the file names in memory and make several passes over the
raw file, writing to only 10k files on each pass.
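
A rough sketch of this multi-pass idea, under the same assumptions as a typical setup for this problem: tab-separated input, keys usable as file names, outputs named `<key>.txt`. The function name `split_in_passes` is made up for illustration. Note that 10k simultaneously open files may exceed the per-process limit on some systems, so `batch_size` may need to be lower.

```python
import os


def split_in_passes(src_path, out_dir, batch_size=10000, sep="\t"):
    """Split by first-column key, opening at most batch_size files per pass."""
    os.makedirs(out_dir, exist_ok=True)

    # First read: collect the distinct keys (the future file names) in memory.
    keys = []
    seen = set()
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            key = line.rstrip("\n").split(sep, 1)[0]
            if key not in seen:
                seen.add(key)
                keys.append(key)

    # One full read of the raw file per batch; only that batch's files are open.
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        handles = {}
        try:
            for k in batch:
                path = os.path.join(out_dir, k + ".txt")
                handles[k] = open(path, "w", encoding="utf-8")
            with open(src_path, encoding="utf-8") as src:
                for line in src:
                    if not line.strip():
                        continue
                    key = line.rstrip("\n").split(sep, 1)[0]
                    handle = handles.get(key)
                    if handle is not None:
                        handle.write(line)  # lines for other batches are ignored
        finally:
            for handle in handles.values():
                handle.close()

    return len(keys)
```

The trade-off against the file pool: each output file is opened exactly once, but the raw file is read once per batch plus once to collect the keys.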
[Quoted from g*****g's post]
: Do something in between, let's say you keep a "file pool",
: you can open a maximum of 5000, and you keep the most recent 5000
: open. Put it in a queue, pop the head out and append the new one
: at the tail when it's over 5000. When you write a file and the file
: is already in the queue, remove it and append it to the tail.
: To speed up search, you can use a hashmap to track if the files are
: open.