I’m trying to sanitize file names in a Bash script exactly the way the sanitize_file_name function from WordPress does. The script has to take a filename string and output a clean version identical to what that function would produce.

You can see the function here: https://github.com/WordPress/WordPress/blob/master/wp-includes/formatting.php#L1762

It’s too complicated for my limited Bash skills, so I’m wondering whether anyone can help or knows of an existing script that does this. I couldn’t find one. All I could find were scripts that change spaces or special characters into underscores or dashes, but that isn’t all the sanitize_file_name function does.
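
To show what I mean, here is a rough sketch of just the character-cleaning part in Bash. It rests on assumptions: the special-character list and the three steps (strip special characters, collapse runs of whitespace and dashes into a single dash, trim leading and trailing dots, dashes, and underscores) are my reading of the linked source and need to be checked against it. The multiple-extension and MIME-type handling and the WordPress filters are not reproduced, so the output will not match the real function in every case.

```bash
#!/usr/bin/env bash

# Sketch only: approximates the character-cleaning steps of WordPress's
# sanitize_file_name. The character list below is an assumption and should
# be verified against wp-includes/formatting.php.
sanitize_file_name() {
    local name=$1
    # Characters assumed to be stripped outright by WordPress.
    local specials='?[]/\=<>:;,'\''"&$#*()|~`!{}%+'
    local out='' ch i

    # Drop every character that appears in the special-character list.
    for ((i = 0; i < ${#name}; i++)); do
        ch=${name:i:1}
        [[ $specials == *"$ch"* ]] || out+=$ch
    done

    # Turn CR, LF, and tab into spaces, collapse runs of spaces and dashes
    # into one dash, then trim leading/trailing dots, dashes, underscores.
    printf '%s' "$out" \
        | tr '\r\n\t' '   ' \
        | sed -E 's/[ -]+/-/g; s/^[._-]+//; s/[._-]+$//'
    printf '\n'
}
```

For example, `sanitize_file_name 'My Video (final) #2.mp4'` prints `My-Video-final-2.mp4`. The part I cannot work out is everything this sketch leaves out.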

In case you are curious, the filenames have to be compatible with this WordPress function because of the way this website is set up to handle videos. People can upload videos through WordPress; these are sent to a separate video server for encoding and then to Amazon S3 and CloudFront for serving on the site. Videos can also be added through Dropbox using the External Media plugin (which currently duplicates the video upload alongside the Dropbox sync, but that’s another minor issue). The video server also syncs to a Dropbox account, whitelisting the folders in it. This Bash script uses inotifywait to watch a VideoServer Dropbox folder and copies videos from it to a temporary folder, where the video encoder encodes them. That way, when they update the videos in their Dropbox, the video is automatically re-encoded and updated on the site. They could just upload the files through WordPress, but for some reason they don’t seem to want to or don’t know how.
