
Validating File Integrity with Get-FileHash


Posted by ASCEND TECHNICAL TEAM on 1/12/23 1:50 PM


 

Downloading a file and verifying that it hasn’t been inadvertently or maliciously changed has long been part of an admin’s job. Ideally, to make sure the file you downloaded is identical to the source, you would do a byte-by-byte comparison. But that’s rarely practical or possible for files you’ve downloaded.

Even then, a direct comparison only proves that your copy matches whatever you compared it against, and it requires having a trusted copy of the original on hand. To accomplish this comparison and validation practically, we can use something known as a hash.

 

What is a hash?

A hash is a string of characters generated by running the bytes of a file through a specific algorithm. The hash value is far smaller than the file itself and is typically published alongside the file you’re downloading, which lets you run the same hash algorithm against your downloaded copy and verify the hashes match.

There are different algorithms and utilities for generating these hash values. Different algorithms produce different hashes for the same file, but a given algorithm will always produce the same hash value for identical content, no matter which utility computes it.
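You can see that determinism for yourself without even touching a file. This is a minimal sketch (not from the original article) using Get-FileHash’s -InputStream parameter to hash the same bytes twice:

```powershell
# Hash the same bytes twice; a given algorithm always yields the same value
$bytes = [System.Text.Encoding]::UTF8.GetBytes('Hello World')

$first  = Get-FileHash -InputStream ([System.IO.MemoryStream]::new($bytes)) -Algorithm SHA256
$second = Get-FileHash -InputStream ([System.IO.MemoryStream]::new($bytes)) -Algorithm SHA256

# Identical input and algorithm produce an identical hash
$first.Hash -eq $second.Hash   # True
```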

 

 

PowerShell Cmdlet

Previously you needed a third-party tool to do this, but PowerShell provides a handy cmdlet to perform the computation for you. Get-FileHash is the built-in PowerShell cmdlet that generates a hash value, allowing you to verify against the reference hash. You can find more details on the cmdlet and its options in Microsoft’s Get-FileHash documentation.

Some vendors publish the information pretty consistently. HPE, for example, tends to include the hash values in the release notes and download pages, such as the download page for the iLO 5 firmware update.

You’ll see on the tab “Installation Instruction” that they have the hash/checksum values listed there.

Once you’ve downloaded a file, run Get-FileHash from PowerShell and specify the path to the file you want checked.

 

PS C:\Down\Blog> Get-FileHash -Path .\cp045967.exe

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          82F776A89483EB2192FC41637708A820938B9091D5964530A089092CB64AEBFB       C:\Down\Blog\cp045967.exe

 

You’ll see it generated a hash value of 82F776A89483EB2192FC41637708A820938B9091D5964530A089092CB64AEBFB, and you can compare that to the value on the web page and verify it matches. 
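Rather than eyeballing 64 hexadecimal characters, you can let PowerShell do the comparison. A minimal sketch, where $published holds whatever hash the vendor’s page lists (the value below is the one from the example above):

```powershell
# Paste the vendor-published hash here
$published = '82F776A89483EB2192FC41637708A820938B9091D5964530A089092CB64AEBFB'

$actual = (Get-FileHash -Path .\cp045967.exe).Hash
if ($actual -eq $published) { 'Hash matches' } else { 'HASH MISMATCH - do not use this file' }
```

Conveniently, PowerShell’s -eq operator is case-insensitive for strings, so it doesn’t matter whether the site publishes the hash in upper or lower case.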

You can check multiple files, too. If you have downloaded all three files on that page, you can use a wildcard in the path and get them all. 

 

PS C:\Down\Blog> Get-FileHash -Path .\*.*

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          82F776A89483EB2192FC41637708A820938B9091D5964530A089092CB64AEBFB       C:\Down\Blog\cp045967.exe
SHA256          71EF16D38226ED06E72B1A87493AA90521D62D18DCF68BB15014C3C1028FBF4C       C:\Down\Blog\cp045967_part1.compsig
SHA256          8B6A297F69E570D72111C076505BFC074AB84B618B9142129CC8746525DE49F6       C:\Down\Blog\cp045967_part2.compsig

 

Then you can do a comparison of each of those files.
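One way to script that comparison is a small lookup table of the published values. The hashes below are the ones shown above; the structure itself is just an illustrative sketch:

```powershell
# Published hashes keyed by file name (values copied from the vendor page)
$published = @{
  'cp045967.exe'           = '82F776A89483EB2192FC41637708A820938B9091D5964530A089092CB64AEBFB'
  'cp045967_part1.compsig' = '71EF16D38226ED06E72B1A87493AA90521D62D18DCF68BB15014C3C1028FBF4C'
  'cp045967_part2.compsig' = '8B6A297F69E570D72111C076505BFC074AB84B618B9142129CC8746525DE49F6'
}

# Hash every file and report whether each matches its published value
Get-FileHash -Path .\*.* | ForEach-Object {
  $name = Split-Path $_.Path -Leaf
  $ok   = $_.Hash -eq $published[$name]
  '{0}: {1}' -f $name, $(if ($ok) { 'OK' } else { 'MISMATCH' })
}
```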

 

Validating Across Sites

Not all sites use SHA256, Get-FileHash’s default algorithm. Some publish MD5 or SHA1 values (shorter and faster to compute, but no longer considered collision-resistant), and some publish SHA384 or SHA512 values (slower to compute, but with an even smaller chance of two different files producing the same hash).

In those cases, when you need to use a non-default algorithm, you run the cmdlet and provide the algorithm as a parameter, as shown below.

 

PS C:\Down\Blog> Get-FileHash -Path .\*.* -Algorithm SHA1

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA1            589038C7ED6F0271F16CDCE148534AAAE387BA0B                               C:\Down\Blog\cp045967.exe
SHA1            BD54FC0333A123F4558ACC5BBF7B6825DC3A45A6                               C:\Down\Blog\cp045967_part1.compsig
SHA1            B92AD34D54FFD5C0BE0289EFAAEB0B77F30AB7F1                               C:\Down\Blog\cp045967_part2.compsig

 

 

This allows you to be flexible and match based on the site’s information. 

Here are a few items to be aware of:

  • The larger the file, the longer it takes to compute the hash. For example, a 500GB image can take a couple of hours.
  • The hash is based on the contents of the file, not its name, date/time stamps, or other metadata. For example, if you create a file called “HelloWorld.txt” and put “Hello World” in it, the SHA256 hash is:
    A591A6D40BF420404A011733CFB7B190D62C65BF0BCDA32B57B277D9AD9F146E
  • If you rename the file to “GoodbyeWorld.txt”, the hash remains the same.
  • If you change the text inside the file to “Goodbye World” and save it, the hash is now:
    C96724127AF2D6F56BBC3898632B101167242F02519A99E5AB3F1CAB9FF995E7
  • And if you change the text back to “Hello World”, the hash returns to the original value.

You can create your own file, put the same text in it, and get the same hash value, as noted above. Hashes are computed from the contents alone, which is what makes them useful for verifying integrity. If you change “Hello World” to “Hello world” (lowercase “w” in “world”), you’ll see that you get a completely different hash.
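You can reproduce the “Hello World” example in a few lines. Note the -NoNewline switch: without it, Set-Content appends a trailing newline, which changes the contents and therefore the hash.

```powershell
# Write exactly the text "Hello World" (no trailing newline) and hash it
Set-Content -Path .\HelloWorld.txt -Value 'Hello World' -NoNewline
(Get-FileHash -Path .\HelloWorld.txt -Algorithm SHA256).Hash
# A591A6D40BF420404A011733CFB7B190D62C65BF0BCDA32B57B277D9AD9F146E

# Renaming doesn't touch the contents, so the hash stays the same
Rename-Item -Path .\HelloWorld.txt -NewName GoodbyeWorld.txt
(Get-FileHash -Path .\GoodbyeWorld.txt -Algorithm SHA256).Hash
# Same value as above
```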

 

Large Files

Beyond basic sanity checks that your downloads match what the site published, a hash is handy for validating large copies.

For example, I once had a fairly large file transfer interrupted by a network hiccup. The transfer auto-resumed, but I wasn’t confident the glitch hadn’t introduced corruption. Computing the hash on both the source and the destination confirmed the file was intact, saving me the hour I would otherwise have spent re-downloading it.
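The same check can be scripted to run whenever a big transfer completes. A quick sketch, with placeholder paths:

```powershell
# Compare source and destination hashes after a large copy (paths are examples)
$src = Get-FileHash -Path \\fileserver\share\big-image.vhdx
$dst = Get-FileHash -Path D:\Images\big-image.vhdx

if ($src.Hash -eq $dst.Hash) { 'Transfer verified' } else { 'Possible corruption - re-copy the file' }
```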

 

Validate Against Tampering

Another purpose of this might be to help ensure files weren’t tampered with. While you may have good security in place, there’s always some way that someone can get to files they shouldn’t.

If you want to ensure files weren’t modified at all, you can write the computed hashes for all the files to a text file and store it in an alternate location or on immutable storage. Then, whenever you need to validate, run the computation again and verify nothing has changed. This is perfect for the most security-conscious person in your organization.
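That baseline-and-recheck workflow can be sketched in a few lines; the folder and manifest paths here are hypothetical:

```powershell
# 1. Create a baseline manifest and store it where attackers can't reach it
Get-FileHash -Path C:\ImportantFiles\* |
  Export-Csv -Path E:\ImmutableStore\baseline.csv -NoTypeInformation

# 2. Later, recompute each hash and compare it against the baseline
foreach ($entry in Import-Csv -Path E:\ImmutableStore\baseline.csv) {
  $current = (Get-FileHash -Path $entry.Path).Hash
  if ($current -ne $entry.Hash) {
    Write-Warning "Modified since baseline: $($entry.Path)"
  }
}
```

Because Get-FileHash emits objects with Algorithm, Hash, and Path properties, Export-Csv captures everything needed to re-run the comparison later.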

Leveraging the PowerShell cmdlet Get-FileHash can bring reassurance that your files were transferred properly and match the published source. 

Do you still have questions? Check out more of our IT Tips, or let us know by reaching out to talk to an expert. We are here to help!

 

 

 


Posted in Data Management, IT Tips