How to Determine the File Format of Your Print Job

Wed, 10/07/2015 - 13:52 By Dave Brooks

Introduction

No single web page of manageable length can describe how to determine the format of every file. Nonetheless, we have noted here some basic tools and rules of thumb that we use when helping customers and in our own research and experimentation.

The Cygwin file utility

Cygwin is a Windows-compatible distribution of the utilities you would find on a typical Linux system. Some of us at the company find them invaluable.

You can find Cygwin by doing a web search; the steps to install are beyond the scope of this document, but it’s not difficult to find instructions if you need them.

The file utility is quite useful; it almost always recognizes a file type, including PCL, PostScript, PDF, and text in various formats. It also excels at image formats, so you’ll know in a moment if your print job is TIFF for instance.

The Cygwin utilities run in a DOS command window. We are not aware of a graphical interface.
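If you need to check many files, the file utility can be driven from a script. The following is a minimal Python sketch, not a definitive tool: it assumes the file utility is on your PATH and supports the -b (brief) option, and the keyword table is our own guess at typical output strings, so adjust it to what your version actually prints. The function names are hypothetical.

```python
import shutil
import subprocess
from typing import Optional

# Assumed substrings of typical `file` output, checked in order.
# These are illustrative guesses; your version of `file` may word things differently.
KEYWORDS = [
    ("HP Printer Job Language", "PCL/PJL"),
    ("PCL", "PCL"),
    ("PostScript", "PostScript"),
    ("PDF", "PDF"),
    ("TIFF", "TIFF"),
    ("text", "text"),
]


def classify_description(description: str) -> str:
    """Map a description printed by `file` to a coarse print-job category."""
    for needle, label in KEYWORDS:
        if needle in description:
            return label
    return "unknown"


def describe(path: str) -> Optional[str]:
    """Run `file -b` on a path; return its description, or None if `file` is missing."""
    exe = shutil.which("file")
    if exe is None:
        return None
    result = subprocess.run([exe, "-b", path], capture_output=True, text=True, check=False)
    return result.stdout.strip()
```

For example, classify_description(describe("myjob.prn") or "") would give a coarse category for a hypothetical file named myjob.prn.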

PCL jobs

PCL is a widely used format produced by software from many vendors, and the variety you can find in PCL is vast. Staggering. However, we can offer these considerations.

  • You can always do a web search for a PCL viewer and see if the file loads into the viewer. If you think you will need to do this often, it’s good to be familiar with a PCL viewer.
  • Using the Cygwin file utility mentioned above:
C:\Output\PCL>file XML__LibXML__Boolean.3pm.PCL
XML__LibXML__Boolean.3pm.PCL: HP Printer Job Language data JOB NAME = "XML__LibXML__Boolean.3pm"\015\011\011\011 \011\011\011 \011\011 \011\011

 

  • You can load the file into Notepad or another text editor. Here is a sample of what a simple, annotated PCL file looks like in Notepad:

[Figure: simple PCL in a text editor]

  • Many PCL jobs are binary files; you will need to use a viewer program or the file utility to recognize them.
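When neither a viewer nor the file utility is at hand, a rough check of the first bytes can help. The Python sketch below is a heuristic under stated assumptions, not a PCL parser: it looks for the PJL Universal Exit Language sequence (ESC%-12345X) or an @PJL command, and otherwise for an ESC byte followed by E or one of the characters that commonly start PCL commands. Other printer languages also use ESC, so treat a "pcl" result as a hint only. The function name and the 1024-byte window are our own choices.

```python
import re

# Universal Exit Language sequence that commonly opens a PJL-wrapped job.
UEL = b"\x1b%-12345X"

# ESC E (printer reset) or ESC followed by a character that commonly starts a PCL command.
PCL_ESCAPE = re.compile(rb"\x1b(?:E|[&()*])")


def guess_pcl(data: bytes) -> str:
    """Return 'pjl', 'pcl', or 'unknown' based on the first 1024 bytes."""
    head = data[:1024]
    if head.startswith(UEL) or b"@PJL" in head:
        return "pjl"
    if PCL_ESCAPE.search(head):
        return "pcl"
    return "unknown"
```

To try it on a file, read the file in binary mode, for example guess_pcl(open(path, "rb").read(1024)).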

Text jobs

RPM can read and process several text formats. For instance, the file utility mentioned above reports a text file as:

XML__LibXML__Boolean.3pm: ASCII text, with overstriking

ASA format files are harder to spot, because the file utility may report them as any of the following:

  • ASCII text, with CRLF, CR line terminators
  • ASCII text
  • data

It’s always easiest to recognize something if you know what you are looking for, so please feel free to enlist our help in this.
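Since the file utility cannot single out ASA files, it can help to look at the first column yourself. In ASA carriage control, the first character of each line is a control code: a space (single spacing), 0 (double spacing), - (triple spacing), + (overprint the previous line), or 1 (new page). The Python sketch below is only a heuristic with a hypothetical function name: it reports a likely ASA file when every non-empty line starts with one of those codes and at least one line uses a code other than a space. Ordinary text can pass this test by coincidence.

```python
# ASA carriage-control codes found in the first column of each line.
ASA_CONTROLS = set(" 0-+1")


def looks_like_asa(text: str) -> bool:
    """Heuristic: True if every non-empty line begins with an ASA control code
    and at least one of those codes is something other than a space."""
    lines = [line for line in text.splitlines() if line]
    if not lines:
        return False
    first_chars = [line[0] for line in lines]
    all_controls = all(ch in ASA_CONTROLS for ch in first_chars)
    has_non_space = any(ch != " " for ch in first_chars)
    return all_controls and has_non_space
```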

PostScript

The good news about PostScript files is that they are generally text, and they start with a line resembling the following:

%!PS-Adobe-3.0

You can look at a PostScript file in an editor, although you’re going to see a lot of text you may not understand. For instance:

%!PS-Adobe-3.0
%%Title: Xerox DocuTech 6135 Internet, Job 92
%%Creator: Pscript.dll Version 5.0
%%CreationDate: 6/2/2000 13:35:47
%%For: Administrator
%%BoundingBox: (atend)
%%Pages: (atend)
%%Orientation: Portrait
%%PageOrder: Special
%%DocumentNeededResources: (atend)

 

The first line and the prolific use of %% are good indicators you’re looking at PostScript.
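Both indicators are easy to check in a script. The following Python sketch, with a hypothetical function name, reports whether the first line starts with %!PS, counts the lines beginning with %%, and pulls out the %%Title value if one is present. It is a quick check under the assumption that the header sits on the very first line, not a PostScript validator.

```python
def inspect_postscript(data: bytes) -> dict:
    """Report the two indicators described above for a block of bytes."""
    lines = data.splitlines()
    first = lines[0] if lines else b""
    # Lines starting with %% are the structured comments seen in the sample above.
    comments = [line for line in lines if line.startswith(b"%%")]
    title = None
    for line in comments:
        if line.startswith(b"%%Title:"):
            title = line[len(b"%%Title:"):].strip().decode("latin-1")
            break
    return {
        "has_ps_header": first.startswith(b"%!PS"),
        "comment_count": len(comments),
        "title": title,
    }
```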

PDF

The Cygwin file utility does an excellent job of identifying PDF files. If there is any doubt, you could also try opening the file in any PDF viewer program.

If you were to open the file in Notepad, the first line would be something like:

%PDF-1.4

The version number may differ, but this line is a good indicator that you’re looking at PDF.
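The same check can be scripted. This Python sketch, with a hypothetical function name, searches the first 1024 bytes for a %PDF- header and returns the version string. Searching a window rather than only the very first byte is our own assumption, made to tolerate a few stray leading bytes; tighten it if you want a stricter test.

```python
import re
from typing import Optional

# Matches a header such as %PDF-1.4 and captures the version number.
PDF_HEADER = re.compile(rb"%PDF-(\d+\.\d+)")


def pdf_version(data: bytes) -> Optional[str]:
    """Return the PDF version found near the start of the data, or None."""
    match = PDF_HEADER.search(data[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii")
```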