# Get a List of HDFS Files in Python

For an ad hoc task, I needed to read files from multiple HDFS directories based on a date range.

The HDFS data is organized into one folder per date under /data, with each folder named after its date in YYYYMMDD form.

Given a date, e.g. 20170801, I need to read the files from the folders /data/20170801, /data/20170802, …, /data/20170830, but not from any others.

To achieve this inside my Python script, I searched online and finally arrived at a solution that gets the HDFS paths into a Python list.
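One way to get such a list, sketched here under some assumptions, is to call the `hdfs dfs -ls` command through `subprocess` and keep the last column of each output line. This assumes the `hdfs` client is installed and on the `PATH`, and that the listing uses the usual eight-column layout. The function names `parse_ls_output` and `list_hdfs_paths` are hypothetical, chosen for illustration only.

```python
import subprocess


def parse_ls_output(ls_output):
    """Extract the path column from the text printed by `hdfs dfs -ls`."""
    paths = []
    for line in ls_output.splitlines():
        # Assumed layout: permissions, replication, owner, group,
        # size, date, time, path. The "Found N items" header line
        # has fewer columns and is skipped.
        fields = line.split(None, 7)
        if len(fields) == 8:
            paths.append(fields[7])
    return paths


def list_hdfs_paths(directory):
    """Run `hdfs dfs -ls` on a directory and return the listed paths."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", directory],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return parse_ls_output(result.stdout)
```

Calling `list_hdfs_paths("/data")` would then return the entries under /data as plain strings. Keeping the parsing in its own function makes it possible to check it against sample text without a cluster.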

Then, to fit my specific needs, I just need to filter the list.
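The filtering step could look roughly like the sketch below. It assumes each folder's final path component is a date in YYYYMMDD form, and that the range is given as a start date plus a number of days (30 days from 20170801 ends at 20170830). The name `filter_by_date_range` and its parameters are hypothetical.

```python
import posixpath
from datetime import datetime, timedelta


def filter_by_date_range(paths, start, days):
    """Keep only paths whose last component is a date in the wanted range."""
    first = datetime.strptime(start, "%Y%m%d")
    # Build the set of wanted folder names, e.g. "20170801" ... "20170830".
    wanted = {
        (first + timedelta(days=i)).strftime("%Y%m%d") for i in range(days)
    }
    # HDFS paths always use "/" as the separator, hence posixpath.
    return [p for p in paths if posixpath.basename(p.rstrip("/")) in wanted]


# Example with a hand-written list of paths:
all_paths = ["/data/20170731", "/data/20170801", "/data/20170830", "/data/20170831"]
selected = filter_by_date_range(all_paths, "20170801", 30)
# selected == ["/data/20170801", "/data/20170830"]
```

Using a set of generated date strings rather than comparing the names as numbers avoids accidentally matching folder names that are not valid dates.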