How can I check if two files are identical?
Last modified: September 23, 2025

Sometimes you need to make sure that two files are exactly the same, meaning they contain exactly the same data. Such files are called duplicates.
We will look at several techniques in Windows to perform this check, some simpler, others more reliable.
1) Visual Comparison
Visual comparison is mentioned for completeness but it is not a reliable method. It consists of opening the two files with the appropriate software and comparing what you see on screen, for example two images in Paint or two text documents.
Be careful: two files that look identical can have very different encodings. For example, a BMP image and a PNG image may look identical to the naked eye, but their binary representation can be completely different.
2) File Size Comparison
Comparing the size in bytes of files does not verify that two files are identical, but it can quickly show that they are different.
- If the size is different, the files are not identical.
- If the size is identical, no conclusion can be drawn: you will need to use one of the methods below to confirm identity.
Note: Make sure to compare the size in bytes, not in other units like KB or MB.
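As an illustration, the size check can be scripted. The sketch below uses Python's standard `os.path.getsize`, which returns the size in bytes; the function name `sizes_differ` is a hypothetical helper chosen for this example.

```python
import os


def sizes_differ(path_a, path_b):
    """Return True when the two files have different sizes in bytes.

    True means the files cannot be identical.
    False means nothing can be concluded yet.
    """
    return os.path.getsize(path_a) != os.path.getsize(path_b)
```

Calling `sizes_differ("File1.dat", "File3.dat")` on the 8-character example files used later would return `False`, which proves nothing: a stronger method is still needed.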
3) Hash Function Comparison
A hash function provides a digital fingerprint of a file, condensing its content into a short string of characters; it’s similar to a real fingerprint. Several hashing algorithms exist, with varying reliability (MD5, SHA-1...).
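To illustrate the principle, let's take the example of a checksum (CS). In its simplest form, this hash function adds up all the bytes of the file; in the example below, we add up the digits of each file's content to keep the arithmetic readable. It gives very poor results but has the advantage of being easy to understand.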
Let’s consider four text files (example that will be used later) each composed of 8 characters:
File Name | File Content (8 characters) | Checksum (CS) |
---|---|---|
File1.dat | 00000123 | CS = 0+0+0+0+0+1+2+3 = 6 |
File2.dat | 00000123 | CS = 0+0+0+0+0+1+2+3 = 6 |
File3.dat | 00000124 | CS = 0+0+0+0+0+1+2+4 = 7 |
File4.dat | 00000142 | CS = 0+0+0+0+0+1+4+2 = 7 |
- Observation 1: Files File1.dat and File2.dat have the same checksum (CS = 6), as expected since they are identical.
- Observation 2: File File3.dat is clearly different from the previous two files and has a different checksum (CS = 7).
- Observation 3: However, there is an issue with files File3.dat and File4.dat: their content is different but they have the same checksum (CS = 7)! This is called a collision, and it is especially likely with weak hash functions such as simple checksums.
Note: This method does not compare content directly but only their digital fingerprint. For thousands of large files, comparing fingerprints is very fast and efficient.
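The checksum table above can be reproduced with a few lines of Python. This is only a sketch of the toy digit-sum used in the example, not a real hash function; the name `digit_checksum` is a hypothetical helper, and the file contents are hard-coded rather than read from disk.

```python
def digit_checksum(text):
    """Add up the decimal digits of the text, as in the table above."""
    return sum(int(ch) for ch in text)


# Contents of the four example files, copied from the table.
files = {
    "File1.dat": "00000123",
    "File2.dat": "00000123",
    "File3.dat": "00000124",
    "File4.dat": "00000142",
}

for name, content in files.items():
    print(name, "CS =", digit_checksum(content))
# File1.dat CS = 6
# File2.dat CS = 6
# File3.dat CS = 7
# File4.dat CS = 7
```

The last two lines of output show the collision: different contents, same checksum.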
Example of fingerprint calculation in Windows:
Here is a concrete example using the certutil command (built into Windows) with the MD5 hash function. Of the four files above, only the first two share the same MD5 fingerprint, 38d822d2c9b62e572837d1b98447e52f, which indicates that they are identical.
Syntax: C:\> certutil -hashfile filename MD5

“Although extremely reliable, an identical fingerprint does not guarantee 100% that the files are identical.”
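For readers who prefer a script to the command line, here is a minimal sketch of the same fingerprint calculation using Python's standard `hashlib` module. The function name `file_digest` and the 64 KB chunk size are assumptions made for this example; the default algorithm here is SHA-256, and `"md5"` can be passed instead to mirror the certutil call where the Python build allows it.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hexadecimal fingerprint of a file.

    The file is read in chunks so that large files
    do not have to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_fingerprint(path_a, path_b, algorithm="sha256"):
    """Return True when both files have the same fingerprint."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

As with certutil, two equal fingerprints make identical content extremely likely, not certain.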
4) Bit-by-Bit Comparison
Bit-by-bit comparison checks every single bit of both files. Unlike the previous method, this one gives a 100% reliable result but can take significantly more time.
Example of bit-by-bit comparison in Windows:
The example below shows two uses of the fc (file compare) command, also built into Windows.
Syntax: C:\> fc /b filename1 filename2

Results of pairwise comparison:
- Files File1.dat and File2.dat are identical.
- Files File3.dat and File4.dat are different.
The results match the previous method.
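The same kind of comparison can be sketched in Python. The function below, with the hypothetical name `files_identical`, first applies the size check from method 2 and then compares the two files chunk by chunk, stopping at the first difference. It is an illustration of the principle, not a description of how fc works internally.

```python
import os


def files_identical(path_a, path_b, chunk_size=65536):
    """Return True only if both files contain exactly the same bytes."""
    # Different sizes: the files cannot be identical, no need to read them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # first difference found
            if not chunk_a:
                return True  # both files ended without any difference
```

Python's standard library offers a comparable check with `filecmp.cmp(path_a, path_b, shallow=False)`, which also compares file contents rather than only metadata.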
5) Using Graphical Software
Although functional, the previous two solutions require command-line usage, which becomes inconvenient when handling large volumes of files.
Since Windows does not include a graphical tool for this type of comparison, you need third-party software, which requires installation. Among the many available solutions, I recommend Duplicate Media Finder, whose free version is sufficient for performing a binary comparison. Its approach to the problem (see this very complete tutorial) is particularly well thought out and original. It is best suited to situations where you have many files to manage.