So, for those that did not know: the hash() function in ColdFusion can do a cool trick. It can generate an MD5 hash of a file. Why would you want to do this? Well, if you want to compare two files at the byte level, a hash is the best way. If the files are identical byte for byte, the hashes will be the same; if even one byte of the file's content changes, the hash changes. Note that since we are hashing the file's contents, metadata changes like a new file attribute or an updated last-modified date will not affect the hash.

Here is the code to do it. It is very simple.

<cfset fileHash = hash(FileRead(myfullFilePath), 'MD5')>
This code would result in something that looks like this: 6168f305888c3d795e67c6de17bf8a21
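
For example, to check whether two files are byte-for-byte identical, you can just compare their hashes. A quick sketch (the file path variables here are hypothetical):

<cfset hashA = hash(FileRead(pathToFirstFile), 'MD5')>
<cfset hashB = hash(FileRead(pathToSecondFile), 'MD5')>
<!--- If the hashes match, the file contents are identical --->
<cfif hashA EQ hashB>
    <cfoutput>Files match.</cfoutput>
<cfelse>
    <cfoutput>Files differ.</cfoutput>
</cfif>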

This works perfectly in 90+% of cases. However, for what I was doing it only worked about 5% of the time. The simple fact is that I was trying to hash files in the 250MB+ range, and when I did, my server would start generating this lovely error:

"Error","Thread-26","12/29/09","15:27:48",,"Error invoking CFC for gateway ProcessFile: Java heap space."
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:216)
    at java.lang.StringBuffer.toString(StringBuffer.java:585)
    at coldfusion.tagext.io.FileUtils.readFile(FileUtils.java:174)

I have not been able to determine the exact file size that causes this. I have seen it with files as small as 50MB, but I see it very regularly with files over 100MB. The stack trace points at the cause: FileRead() builds the entire file into a single string in memory, so a big enough file exhausts the Java heap. I have not found a real fix yet; for now I just wrap the code inside a cftry to suppress the error.
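The wrap looks something like this (the empty-string fallback is just a placeholder for whatever error handling you prefer):

<cftry>
    <cfset fileHash = hash(FileRead(myfullFilePath), 'MD5')>
    <cfcatch type="any">
        <!--- Swallow the heap error; hypothetical fallback value --->
        <cfset fileHash = "">
    </cfcatch>
</cftry>

One avenue that might avoid the problem entirely, though I have not tried it in production, is skipping FileRead() and streaming the file through Java's MessageDigest so the whole file never has to live in memory at once. A minimal sketch, using only standard JDK classes:

<cfscript>
    // Sketch only: stream the file through java.security.MessageDigest
    // in small chunks instead of reading it all into a ColdFusion string.
    digest = createObject("java", "java.security.MessageDigest").getInstance("MD5");
    fis = createObject("java", "java.io.FileInputStream").init(myfullFilePath);
    buffer = repeatString(" ", 8192).getBytes();  // reusable 8KB byte buffer
    bytesRead = fis.read(buffer);
    while (bytesRead GT 0) {
        digest.update(buffer, 0, bytesRead);
        bytesRead = fis.read(buffer);
    }
    fis.close();
    // binaryEncode turns the raw digest bytes into the familiar hex string
    fileHash = lCase(binaryEncode(digest.digest(), "hex"));
</cfscript>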

Till next time...

--Dave