Question related to Python for machine learning

I have a question that came up recently. I have about 30 million files of unstructured data. Can I load them with Python and convert them into a structured form that is convenient to read?

Instead of pulling in that many files and working with all of them at once, why don't you start with a small subset of the files and get your processing working on that first?
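
To make the subset idea concrete, here is a minimal sketch in Python. It assumes the files sit under one directory tree and can be read as text; the function names and the record fields (`path`, `n_chars`, `preview`) are made up for illustration, so replace them with whatever structure your data actually needs. The point is that the directory is walked lazily, so you never build a list of all 30 million paths in memory.

```python
import itertools
import os
from pathlib import Path


def iter_files(root):
    """Yield file paths under root one at a time, without listing them all first."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield Path(dirpath) / name


def take_subset(root, limit):
    """Return at most `limit` file paths, stopping the walk once enough are found."""
    return list(itertools.islice(iter_files(root), limit))


def to_record(path):
    """Turn one raw file into a structured record (hypothetical fields)."""
    # errors="replace" keeps the sketch from failing on badly encoded bytes.
    text = path.read_text(encoding="utf-8", errors="replace")
    return {"path": str(path), "n_chars": len(text), "preview": text[:80]}
```

You could call `take_subset("your_data_dir", 1000)` (the directory name is a placeholder), map `to_record` over the result, and inspect the records. Once the parsing looks right on the subset, the same `iter_files` generator can feed the full set in batches.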