$INPTBffr =~ s/([^\x0A\x0D\x20-\x7E]+)//g;
The single line of Perl code shown above strips all non-text data from a
file, leaving behind only printable ASCII characters and line endings (tabs,
which fall outside the character class, are removed as well). The
entire contents of the file are loaded into $INPTBffr
by a simple read() call, and then binary
information is stripped from it in place by the substitution operator
(s///g) to the right of the =~ binding. This tiny fragment of code is not by any
means a whole program, but for a single statement, it does do a surprising
amount of the work necessary to transform files into a suitable form for
input to a semantic analysis engine.
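
To put the statement in context, here is a minimal sketch of how the load-and-strip step might look as a complete subroutine. The name strip_binary, the use of raw mode, and the single read() call sized by the file length are assumptions made for illustration; the text above only says that the file is read into $INPTBffr before the substitution runs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original text): load a whole file
# into $INPTBffr with read(), then apply the substitution shown above.
sub strip_binary {
    my ($path) = @_;

    # Open in raw mode so no encoding or newline translation is applied.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";

    my $INPTBffr = '';
    my $size = -s $fh;    # file size in bytes
    if ($size) {
        defined(read($fh, $INPTBffr, $size))
            or die "Cannot read $path: $!";
    }
    close($fh);

    # Keep LF (\x0A), CR (\x0D), and printable ASCII (space through tilde);
    # every other byte, including tabs and NULs, is removed.
    $INPTBffr =~ s/([^\x0A\x0D\x20-\x7E]+)//g;

    return $INPTBffr;
}
```

Called as strip_binary($path), the subroutine returns the cleaned text and leaves the file on disk untouched. Loading the whole file at once keeps the code short but assumes the file fits comfortably in memory; very large inputs would probably call for processing in chunks instead.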