How does Hadoop store email data?


I'm new to the big data field. I have started exploring tools like Hadoop and have some clarity on the framework and on the map/reduce framework, but I still have a lot of questions. I want to analyse emails, for example email categorization, so that I can organise emails into different categories, and I am wondering how I should store the emails in HDFS. Should I first convert the emails to a text file (composed of space-separated columns: date, author, subject, content, ...) or to a sequence file composed of binary key-value pairs, and then store that file in HDFS?
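To make the text-file option concrete, here is a rough sketch of what I have in mind: each email flattened into one line with the columns date, author, subject and content. This is only an illustration using Python's standard `email` module, nothing Hadoop-specific; the function names `email_to_record` and `clean` are made up for this example, and I use tabs rather than spaces as the separator only because subjects and bodies themselves contain spaces.

```python
from email import message_from_string


def clean(value) -> str:
    """Escape characters that would break a one-line, tab-separated record."""
    text = str(value)
    text = text.replace("\\", "\\\\")   # escape backslashes first
    text = text.replace("\t", "\\t")    # tabs are the column separator
    text = text.replace("\r", "")       # drop carriage returns
    text = text.replace("\n", "\\n")    # keep each email on one line
    return text


def email_to_record(raw: str) -> str:
    """Flatten one raw email into: date <TAB> author <TAB> subject <TAB> content."""
    msg = message_from_string(raw)
    if msg.is_multipart():
        # Assumption: only the text/plain parts matter for categorization.
        body = "\n".join(
            part.get_payload()
            for part in msg.walk()
            if part.get_content_type() == "text/plain"
        )
    else:
        body = msg.get_payload()
    fields = [
        msg.get("Date", ""),
        msg.get("From", ""),
        msg.get("Subject", ""),
        body.strip(),
    ]
    return "\t".join(clean(f) for f in fields)


if __name__ == "__main__":
    sample = (
        "Date: Mon, 1 Jan 2024 10:00:00 +0000\n"
        "From: alice@example.com\n"
        "Subject: Hello\n"
        "\n"
        "Line one\n"
        "Line two\n"
    )
    print(email_to_record(sample))
```

Many such lines could then be appended to one large text file before it is copied to HDFS. Whether this is better than a sequence file is exactly what I am unsure about.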

I'm not used to working with sequence files, but I have read many articles about how HDFS stores unstructured data in this type of file. Can anyone please enlighten me?

Thanks in advance.

