How to obfuscate and protect BIM IFC files

Dion Moult

2018-01-28

Within the architecture, engineering, and construction industry, we often share IFC files to transfer building information to others. IFC (Industry Foundation Classes) is an open standard for storing BIM information. This IFC file can be produced by software such as Revit, ArchiCAD, or SketchUp.

On occasion, however, we would like to strip out information from the IFC. Perhaps there is some classified information that is stored in one of the IFC properties, or perhaps you just need a clean set of a geometry in the respective classes to work with. In this article, we will assume that you want to obfuscate the IFC such that all geometric information is retained (specifically, IfcBuildingElement subtypes and IfcShapeRepresentation), but other BIM information is mangled.

This is a relatively easy task, and can be done with a standard text editor. IFC files are plain text files, which although generally do not lend themselves well to hand-editing, are quite easy to modify with regex. For this exercise, you will need Vim, which is the world's most advanced text editor which also happens to support regex commands, and not choke upon files which are many gigabytes in size. If you are not using Vim, you will need to convert the regex below to your own probably PCRE compliant flavour.

We'll take a look at stripping three types of data in the IFC: strings, non-strings, and the element classes themselves.

In the IFC format, apart from the file header, all strings represent user data with the exception of those hardcoded in the IfcShapeRepresentation (which really should be constants, ideally, but hey). The one other slight exception to this are IfcGloballyUniqueId strings which is recommended to be a unique 128-bit number, but since all parsers generally ignore this it is safe to throw this baby out with the bathwater. All string information are enclosed in single quotes which makes stripping with regex trivial.

Non string information is a bit trickier, but generally we need to only care about RGB codes and transparency codes.

Subtypes which share the same express specification can be swapped out interchangeably with one another, such as columns and beams. Others may exist but I haven’t looked too deeply.

As such, the file can be stripped by:

:g!/SHAPEREP/s/'.{-}'/'0'/g
:%s/RGB((.{-}),.*)/RGB(\1,0.5,0.5,0.5)/g
:%s/IFCSURFACESTYLERENDERING((.{-}),.{-},/IFCSURFACESTYLERENDERING(\1,1,/g
:%s/IFCCOLUMN/IFCBEAM/g

(note: maintain file headers)

As a bonus, we should also strip material information. Material metadata is already stripped by virtue of stripping all strings, but even though every material has no data, the material assignment information is still there. Unfortunately, some IFC outputs split the IfcRelAssociatesMaterial into multiple lines. Multi-line regexes should be treated with caution as it is easy to cause lexical errors. Here’s my attempt:

%s/RELASSOCIATESMATERIAL((.*_.{-},#).{-});/RELASSOCIATESMATERIAL(\100000);/g

Where 00000 is the ID of the IfcMaterial you'd like to reset it to.

The result is shown below, where all layer information, all property set information is trashed, all names are trashed, all materials reset, and it thinks a column is a beam, etc. I am viewing the results with Solibri IFC Model Checker.

Obfuscated IFC in Solibri

Keep in mind that with all the best intentions of IFC and the Building Smart folks behind the standard, the implementation is rather spotty in various software, especially Revit. So, your mileage may vary.

As a closing note, IFC is meant for interoperability. Stripping information in this manner is not particularly condoned by myself. You may want to consider alternatives such as... Well, just giving them the data :)

Comments

If you have any comments, please send them to dion@thinkmoult.com.