For an ad hoc task, I need to read files from multiple HDFS directories selected by a date range.
The HDFS directory structure looks like the following:
Given a date, e.g. 20170801, I need to read the files from the folder
/data/20170830, but not from the others.
To do this inside my Python script, I searched online and finally arrived at the following solution.
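One common way to list HDFS directories from Python is to shell out to `hdfs dfs -ls` and parse its output. The sketch below assumes the `hdfs` command is on the PATH and that each entry line has the usual eight whitespace-separated fields ending in the path. The function names `parse_ls_output` and `list_hdfs_dir` are hypothetical and only for illustration.

```python
import subprocess


def parse_ls_output(text):
    """Extract the paths from the text printed by `hdfs dfs -ls`.

    Assumption: each entry line has 8 fields
    (permissions, replication, owner, group, size, date, time, path).
    """
    paths = []
    for line in text.splitlines():
        # Split into at most 8 parts so a path containing spaces stays whole.
        parts = line.split(None, 7)
        # The "Found N items" header and blank lines have fewer fields.
        if len(parts) == 8:
            paths.append(parts[7])
    return paths


def list_hdfs_dir(path):
    """Return the entries directly under `path` (requires the hdfs CLI)."""
    out = subprocess.check_output(["hdfs", "dfs", "-ls", path])
    return parse_ls_output(out.decode("utf-8"))
```

Calling `list_hdfs_dir("/data")` would then return a plain Python list of paths. Keeping the parsing in its own function makes it possible to check it without a cluster.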
Then, to fit my specific needs, I just need to filter the list.
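The filtering step can be sketched as follows. This assumes the list holds paths whose last component is a date in `YYYYMMDD` form, and that the range is given as inclusive start and end dates. The function name `filter_by_date_range` and its parameters are hypothetical.

```python
import posixpath
from datetime import datetime


def filter_by_date_range(paths, start, end):
    """Keep paths whose last component is a YYYYMMDD date in [start, end]."""
    start_date = datetime.strptime(start, "%Y%m%d").date()
    end_date = datetime.strptime(end, "%Y%m%d").date()
    selected = []
    for path in paths:
        # HDFS paths use "/" separators, so posixpath is the right tool.
        name = posixpath.basename(path.rstrip("/"))
        # Skip anything that is not exactly eight digits, e.g. "_tmp".
        if len(name) != 8 or not name.isdigit():
            continue
        try:
            day = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            # Eight digits but not a valid calendar date.
            continue
        if start_date <= day <= end_date:
            selected.append(path)
    return selected
```

For example, `filter_by_date_range(dirs, "20170801", "20170830")` keeps only the folders dated within that range. Parsing the names into dates, rather than comparing strings, also rejects folder names that look numeric but are not valid dates.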