We collect data from clients via a number of methods, including FTP. We’d been collecting, but not processing, one particular client’s files for a while, and they had accumulated to over 200,000 files in a single folder.
That’s a fair few files, and we needed to process them and then remove them. Being the go-to Linux guy, I was tasked with sorting through them.
Sadly, the version of find we have on the server doesn’t have the option that lets me match files modified before or after a given date, but it can compare against a reference file:
find . -type f -not -newer ./mb-001.*****.log.csv -delete
This command finds every file that is not newer than the reference file (which includes the reference file itself) and deletes it, rather quickly too.
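If no existing file happens to sit at the cutoff you want, one workaround is to make your own reference file and stamp it with the date. Here's a minimal sketch of that idea; the temporary directory, file names and timestamps are all made up for illustration, and it assumes your touch supports -t and your find supports -delete (as ours does).

```shell
# Sketch: emulate a date cutoff with a hand-made reference file.
# All names and timestamps here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two sample files with known modification times ([[CC]YY]MMDDhhmm).
touch -t 202001010000 old.log.csv
touch -t 202301010000 new.log.csv

# Reference file stamped with the cutoff, kept outside the search directory
# so it is not deleted along with the matches.
ref=$(mktemp)
touch -t 202106010000 "$ref"

# Dry run first: print what would be removed, then actually delete it.
find . -type f ! -newer "$ref" -print
find . -type f ! -newer "$ref" -delete

# Whatever is left is newer than the cutoff.
remaining=$(ls)
echo "$remaining"
```

Running the -print version before the -delete version is cheap insurance when there are a couple of hundred thousand files on the line.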
As a side note, even rm had difficulty deleting all the files in another folder (which had 400,000+ files in it). Thankfully, using find and xargs allowed me to break it up a little:
find . -type f -print0 | xargs -0 rm -f
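The -0 flag on xargs only makes sense when find emits NUL-terminated names with -print0; the pair keeps file names containing spaces intact, and xargs splits the list into batches that fit within the argument-length limit. A small sketch in a throwaway directory (the file and directory names are invented for the demo):

```shell
# Sketch: NUL-delimited find | xargs on a scratch directory.
# File and directory names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch "plain.csv" "name with spaces.csv"
mkdir keep_dir

# -type f limits the match to regular files, so directories are left alone.
# -print0 and -0 pass names NUL-terminated, which is safe for odd characters.
find . -type f -print0 | xargs -0 rm -f

# Count the regular files that survived (tr strips any padding from wc).
left=$(find . -type f | wc -l | tr -d ' ')
echo "$left"
```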
Before deleting the files I’d compressed them all into a backup, which took them from 1.6GB down to 26MB. Even the ls -l file listing was larger than that.
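For the backup step, something along these lines would do the job. This is only a sketch: it assumes tar with gzip as the compression tool, and the directories and file contents are stand-ins rather than the real data.

```shell
# Sketch: archive a directory and check the archive before deleting anything.
# Paths and file contents are hypothetical.
src=$(mktemp -d)
backup="$(mktemp -d)/backup.tar.gz"

printf 'a,b,c\n1,2,3\n' > "$src/one.log.csv"
printf 'a,b,c\n4,5,6\n' > "$src/two.log.csv"

# Archive the directory itself rather than a shell glob, so a huge file
# count never has to fit on a single command line.
tar -czf "$backup" -C "$src" .

# List the archive contents and count the CSV entries as a sanity check.
count=$(tar -tzf "$backup" | grep -c '\.log\.csv$')
echo "$count"
```

Comparing that count against the number of files in the source folder before running any delete is a quick way to confirm the backup is complete.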