PowerShell: Get-FileEncoding

VertigoRay/ February 4, 2015/ Uncategorized/ 0 comments

I often work in PowerShell, and one day I needed to write a script that would determine the encoding of a file.

Encodings

However, this proved to be difficult since most encodings don’t require a BOM (Byte Order Mark). Here’s some good information that I found on the subject:

Automatically determining the correct encoding for a given byte array is notoriously difficult. Sometimes, to be helpful, the author of the data will insert something called a BOM (Byte Order Mark) at the beginning of the data. If a BOM is present, that makes detecting the encoding painless, since each encoding uses a different BOM.

However, the problem remains, how do you automatically detect the correct encoding when there is no BOM? Technically it’s recommended that you don’t place a BOM at the beginning of your data when using UTF-8, and there is no BOM defined for any of the ANSI code pages. So it’s certainly not out of the realm of possibility that a text file may not have a BOM. If all the files that you deal with are in English, it’s probably safe to assume that if no BOM is present, then UTF-8 will suffice. However, if any of the files happen to use something else, without a BOM, then that won’t work.
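The BOM check described above can be sketched in a few lines of PowerShell. This is only a minimal illustration, not the original Get-FileEncoding code: the function name `Get-BomEncoding` and the returned labels are made up for this example, and a file without a BOM is simply reported as unknown rather than guessed at.

```powershell
# Hypothetical helper (not the original Get-FileEncoding): reports an encoding
# only when the file starts with a well-known BOM.
function Get-BomEncoding {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string] $Path
    )

    # Resolve to a full path, because .NET uses the process working directory,
    # which can differ from the current PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Read at most the first four bytes; none of the BOMs checked here is longer.
    $buffer = New-Object byte[] 4
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $count = $stream.Read($buffer, 0, 4)
    }
    finally {
        $stream.Dispose()
    }

    # Build a hex string from the bytes actually read, e.g. 'EFBBBF41'.
    $hex = ''
    for ($i = 0; $i -lt $count; $i++) {
        $hex += $buffer[$i].ToString('X2')
    }

    # Check the 4-byte UTF-32 BOMs before the 2-byte UTF-16 ones, because
    # FF FE 00 00 (UTF-32 LE) begins with FF FE (UTF-16 LE).
    if     ($hex -like '0000FEFF*') { 'UTF-32 BE' }
    elseif ($hex -like 'FFFE0000*') { 'UTF-32 LE' }
    elseif ($hex -like 'EFBBBF*')   { 'UTF-8' }
    elseif ($hex -like 'FEFF*')     { 'UTF-16 BE' }
    elseif ($hex -like 'FFFE*')     { 'UTF-16 LE' }
    else                            { 'Unknown (no BOM)' }
}
```

Calling `Get-BomEncoding -Path .\example.txt` (the file name is a placeholder) returns one of the labels above. The "Unknown" result is deliberate: as the quoted text points out, a missing BOM tells you nothing certain about the encoding, so any fallback such as UTF-8 is a policy decision for the caller.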

The Code

Credits

I came across some code on a PowerShell sharing site, POSHCode.org, that inspired me to do things a different way, so I made the amendments there as well. Unfortunately, since I wrote this post, it appears that POSHCode has gone down for the count:

poshcode.org is almost here! Upload your website to get started.

Screenshot taken: June 21, 2017.
