Right on. It is too vague. What is contained in the log? Strings? Objects? If strings, use tries/hash. If objects, convert to binary, then use hash. Either way, split and store chunks of temp data on disk. cache them if needed.
Assume n is too big, and existed items are sparse compared to n, you may want to look at the multiple-dimension aggregation algorithm in data mining . Although it is multi-dimension, the algorithm have good performance on how to iterate through the whole dimension and acount for each item in cell.