Creating smaller chunks from a large file and sorting the chunks
I am implementing an external sort in Python and am currently stuck on the
following problem. I have divided a large text file containing integers into
small chunks, and I am trying to sort these chunks. This is what I have
written so far:
with open(fpath,'rb') as fin:
    input_iter = iter(lambda: fin.read(40 * 1024),'')
    for item in input_iter:
        print item
        current_chunk = list(item)
        # sort the buffers
        current_chunk.sort(key = lambda x : int(x))
When I execute this code, I get the following error:
File "problem3.py", line 68, in <lambda>
current_chunk.sort(key = lambda x : int(x))
ValueError: invalid literal for int() with base 10: ''
I guess the error comes from this line:
input_iter = iter(lambda: fin.read(40 * 1024),'')
Is there an alternative way to overcome this problem? Thank you.
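One possible way around this, as a rough sketch only. It assumes the file holds one integer per line, which the post does not actually state. The likely culprit is list(item) rather than the iter() call: calling list() on a string splits it into single characters, so the whitespace between numbers ends up being passed to int(). Fixed-size reads can also cut a number in half at a chunk boundary. Reading whole lines avoids both. The function name sorted_chunks and the parameter lines_per_chunk are made up for illustration, and the code is written for Python 3.

```python
from itertools import islice


def sorted_chunks(path, lines_per_chunk=100000):
    """Yield sorted lists of integers, one list per chunk of the file.

    Assumption: the file contains one integer per line.
    Blank lines are skipped.
    """
    with open(path) as fin:
        while True:
            # islice takes whole lines, so no number is split across chunks
            lines = list(islice(fin, lines_per_chunk))
            if not lines:
                break  # end of file reached
            # convert each non-blank line to an integer
            chunk = [int(line) for line in lines if line.strip()]
            if chunk:
                chunk.sort()
                yield chunk
```

Each yielded list could then be written to its own temporary file and merged later, which is the usual second phase of an external sort. The chunk size here is counted in lines rather than bytes, so it would need tuning to fit the available memory.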