LPR/LPD Advanced Topics

Thu, 08/22/2024 - 12:14 By Dave Brooks

Introduction

We have a separate page that introduces LPR and LPD. Our print server product, RPM Remote Print Manager(R) ("RPM") is an LPD server.

Sending jobs via LPD depends on cooperation between the person sending the print job and the person who sets up the queue on the print server, so that the print data is processed correctly.

The LPD protocol carries very little instruction on exactly how to process the data. For instance, there is no directive from the print client to print the job on a specific printer, or archive it in a specific folder, etc.

RPM therefore has an extensive configuration setup to process your print jobs exactly as you intended.

With that said, there are several nuances in the LPD protocol that RPM handles without needing to be configured.

I'll describe them here. Most of this functionality draws on the content of a print request, so I'll describe that first.

The content of an LPD print request

An LPD print request includes 5 parts:

  1. the name of a print queue to submit this print request to
  2. the name and size of a "control file" which contains metadata about the job
  3. the contents of the control file
  4. the name and size of the data file
  5. the actual data file

The control file and data file "names" do not resemble a regular file name on any computer system. They are more similar to:

cfA<several digits><text similar to the host name>

The data file name is similar but starts with "d" instead of "c".

There are usually 3 digits in the name, but we've seen longer digit strings. The length doesn't matter, but within a given print request the names should be unique, for reasons that will make sense shortly.
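To make the five parts concrete, here is a minimal Python sketch of the bytes a client would send for a single job. It is an illustration under assumptions, not RPM code: the function name and its parameters are hypothetical, the "cfA" and "dfA" name prefixes and the binary codes 02 and 03 follow RFC 1179, and the one-byte acknowledgements the server sends back after each part are left out.

```python
def build_print_request(queue, host, job_number, control_text, data):
    """Return the client-side messages of a single-job LPD request, in order."""
    control = control_text.encode("ascii")
    # File names per RFC 1179: prefix, three-digit job number, host name.
    control_name = f"cfA{job_number:03d}{host}"
    data_name = f"dfA{job_number:03d}{host}"
    return [
        # 1. "receive a printer job" command (code 02) plus the queue name
        b"\x02" + queue.encode("ascii") + b"\n",
        # 2. "receive control file" subcommand (code 02): size, space, name
        b"\x02" + f"{len(control)} {control_name}\n".encode("ascii"),
        # 3. control file contents, followed by a single zero octet
        control + b"\x00",
        # 4. "receive data file" subcommand (code 03): size, space, name
        b"\x03" + f"{len(data)} {data_name}\n".encode("ascii"),
        # 5. data file contents, followed by a single zero octet
        data + b"\x00",
    ]
```

The part numbers in the comments match the list above. Both file names share the same job number, which is what ties a control file to its data file.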

Copies

Control files are plain text files. Each line contains a single character code that describes what that line means, followed by a text string.

The lines that refer to a data file name carry either a format directive or the letter "U", which means "remove this data file when the job is complete". RPM handles that request differently, through queue settings.

So the number of format directives that name the data file, not counting "U", tells us how many copies to generate.

Also, the IBM AS/400 control file has a unique non-standard code to indicate copies. RPM recognizes this as well.
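The standard copy-counting rule can be sketched in a few lines of Python. This is only an illustration, not how RPM implements it: the function name is hypothetical, the set of format codes is the lowercase list from RFC 1179, and the AS/400 copies code is not handled.

```python
from collections import Counter

# Format directive codes listed in RFC 1179. "U" (unlink) is not a format
# directive, so it never counts toward copies.
FORMAT_CODES = set("cdfglnoprtv")

def count_copies(control_text):
    """Map each data file name to the number of format directives naming it."""
    copies = Counter()
    for line in control_text.splitlines():
        if not line:
            continue
        # First character is the code, the rest of the line is its operand.
        code, operand = line[0], line[1:]
        if code in FORMAT_CODES:
            copies[operand] += 1
    return dict(copies)
```

A control file with three "f" lines and one "U" line for the same data file name would therefore ask for three copies.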

Multiple print jobs in one request

Most of the time, we see a print request formatted as I describe above, with the 5 parts. The control file usually comes ahead of the data file; that is, parts 2 and 3 usually precede parts 4 and 5. However, it doesn't matter which comes first so long as they are not mixed together. Each of these commands starts with a binary command code, so there is no ambiguity.

However, you can send multiple print jobs in one request. All the print client needs to do is add more control and data files.

This is how it would work, assuming we see the control file followed by the data file:

  1. print on this named queue
  2. control file name and size
  3. control file data
  4. data file name and size
  5. data file contents
  6. control file name and size
  7. control file data
  8. data file name and size
  9. data file contents
  10. <steps 2 through 5 repeat indefinitely>

And so on. At the end of the transmission, the client closes the connection, just as it does when it sends a single print job.
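The sequence above can be sketched from the receiving side as a short Python loop. Treat it as a simplified illustration rather than RPM's receiver: the function names are hypothetical, the subcommand codes 02 and 03 come from RFC 1179, acknowledgements are omitted, and every declared size is assumed to be accurate. Because each file header starts with its own binary code, the loop accepts control and data files in either order and keeps going until the client closes the connection.

```python
import io

CONTROL_FILE, DATA_FILE = 0x02, 0x03  # subcommand codes from RFC 1179

def read_line(stream):
    """Read one LF-terminated line and return it without the LF."""
    line = stream.readline()
    return line[:-1] if line.endswith(b"\n") else line

def parse_request(stream):
    """Return (queue, files); files is a list of (kind, name, contents)."""
    first = read_line(stream)
    if first[:1] != b"\x02":
        raise ValueError("not a 'receive a printer job' request")
    queue = first[1:].decode("ascii")
    files = []
    while True:
        header = read_line(stream)
        if not header:                      # client closed the connection
            break
        code = header[0]
        if code not in (CONTROL_FILE, DATA_FILE):
            raise ValueError(f"unexpected subcommand {code:#04x}")
        size_text, name = header[1:].decode("ascii").split(" ", 1)
        contents = stream.read(int(size_text))
        stream.read(1)                      # zero octet that follows each file
        kind = "control" if code == CONTROL_FILE else "data"
        files.append((kind, name, contents))
    return queue, files

# Two jobs in one request; the second sends its data file first.
request = (b"\x02raw\n"
           b"\x023 cfA001h\nHh\n\x00" b"\x035 dfA001h\nhello\x00"
           b"\x033 dfA002h\nabc\x00" b"\x023 cfA002h\nHh\n\x00")
queue, files = parse_request(io.BytesIO(request))
```

Pairing each control file with its data file is then a matter of matching the names, which is why they need to be unique within a request.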

The advantages of sending multiple jobs at once include:

  • fewer computing and networking resources, as opening a new connection is fast but not free
  • fewer failures under heavy load: if you use a Windows client and send thousands of jobs, you may run out of resources (I do, occasionally). I have not seen this happen when stacking multiple jobs into a single connection between the print client and RPM.

Capture

RPM can optionally generate "capture" files for print job requests. A capture file records the exact content of the request, minus the actual data file content. For the data file, all we record is how many bytes it was supposed to be and the size in bytes of each message that originally carried data bytes.

We have identified a lot of very peculiar behavior from print clients this way. And we've also learned where we made faulty assumptions about what the LPD specification meant, or ways we can broaden our perspective and adapt without generating more bugs in the world.

We have a replay script in our system that recreates the events depicted in the capture file. We make up the content of the data file because it's not about troubleshooting how the job is processed, but rather about how we acquire the job.

No secrets are revealed in the capture file unless you are concerned about file names or user names.
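To illustrate the idea, and only the idea, here is a Python sketch of what a capture entry could hold. The record layout and the function name are hypothetical and are not RPM's capture file format. The point is that metadata is kept verbatim while data bytes are reduced to sizes.

```python
def capture_record(kind, declared_size, chunks):
    """Summarize one received file for a capture log (hypothetical format)."""
    record = {
        "kind": kind,
        "declared_size": declared_size,
        # Size of each network message that carried bytes for this file.
        "chunk_sizes": [len(chunk) for chunk in chunks],
    }
    if kind != "data":
        # Control files are small metadata, so keep their text verbatim.
        record["content"] = b"".join(chunks).decode("ascii", errors="replace")
    return record
```

A replay tool could read such a record and send made-up bytes in chunks of the recorded sizes, reproducing the timing and framing of the original transfer without the original document.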

Truncated data files

We get occasional reports of data files that transmit only partially, after which the job appears to "time out".

In each instance, there was a VPN involved. We urged the customer to involve their network support staff, as troubleshooting connections is beyond the scope of what we can accomplish remotely. Not to complain, but this is because:

  • we don't know your network architecture
  • we should not know too much about your network architecture as that could be a security breach for you
  • we don't know the particulars of your VPN management software

We have also not found other software vendors very willing to discuss their potential bugs with us.

However, in each instance, the customer was eventually able to resolve this issue with their VPN and eliminate this behavior.

Zero-byte data files

This is a feature of the LPD protocol, not a bug. The protocol allows the print client to define the size of the data file as zero bytes. This means the final size of the data file was not yet known when the client computer generated the print request.

We have run into this. RPM handles it by writing the bytes to the incoming data file as long as they continue to arrive. Then, once the client closes the connection, we record the total bytes received as the data file size.

This is called out in the LPD specification and was included in RPM early on.
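A minimal Python sketch of this behavior follows. It is an illustration with assumptions, not RPM's code: the function name and chunk size are hypothetical, and a file-like stream stands in for the client connection, with an empty read meaning the client has closed it.

```python
import io

def receive_data_file(stream, declared_size, chunk_size=4096):
    """Read a data file; a declared size of zero means 'read until close'."""
    received = bytearray()
    if declared_size == 0:
        # No boundary was given, so keep reading until the client closes.
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            received += chunk
    else:
        received += stream.read(declared_size)
    # The true size is whatever actually arrived.
    return bytes(received), len(received)

# Usage with an in-memory stream standing in for the client connection.
contents, size = receive_data_file(io.BytesIO(b"page one\fpage two"), 0)
```

One consequence of this design is that a zero-size data file has to be the last thing in the request, since nothing marks where it ends except the close.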

Enormous data file sizes

The opposite extreme of zero-byte data files could be the enormous sizes the Microsoft LPR client sometimes specifies. These values are typically over 4 GB.

This may not be accurate, but it seems that Windows does this when you print through the Microsoft LPR port monitor. The code could have simply specified a zero-byte size as discussed above. Instead, it shoots for a large number.

When RPM sees a data size of over 4 billion, we "play it by ear" so to speak, and do not assume it will be that large. We keep that large size on hand but overwrite it when the client closes the connection, updating it with the true data file size.
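Here is a small Python sketch of that "play it by ear" approach. The names and the exact threshold constant are assumptions made for illustration, and the sketch only counts bytes where a real server would write them to disk. A declared size over the threshold is treated as provisional, so an early close is normal and the received count becomes the final size. For ordinary sizes, a short transfer is reported as an error.

```python
import io

PROVISIONAL_THRESHOLD = 4_000_000_000  # assumed cutoff for "over 4 billion"

def receive_with_provisional_size(stream, declared_size, chunk_size=65536):
    """Return (final_size, was_provisional) for one incoming data file."""
    provisional = declared_size > PROVISIONAL_THRESHOLD
    received = 0
    remaining = declared_size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:                 # client closed the connection
            break
        received += len(chunk)        # a real server would store the chunk
        remaining -= len(chunk)
    if provisional:
        # Overwrite the declared size with what actually arrived.
        return received, True
    if received != declared_size:
        raise EOFError(f"expected {declared_size} bytes, got {received}")
    return received, False

# Usage: a client declares about 4 GB but sends 1000 bytes and closes.
final_size, was_provisional = receive_with_provisional_size(
    io.BytesIO(b"a" * 1000), 4_294_967_295)
```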