Why do we need yet another PostScript Interpreter?

(github repository)

Print-to-PDF print drivers are ubiquitous in the Windows world. Part of the reason may be that Microsoft ships pscript5.dll with Windows, and with it you can create a PostScript/PDF print driver in an afternoon.

However, all of those print drivers simply create PDF files. Well, duh! I guess that's obvious :)

But what's NOT obvious is that these drivers actually create a PostScript language file (.ps), turn that into a PDF file, and then throw away the .ps file.

So... what if the driver didn't throw away the .ps file?

One format for all

Consider for a moment that you could have one and only one file format available to you, with every single file on your system or network converted to that format automatically. No "File - Save As", no special software of a different flavor that you would have to launch for each different file type. Just a simple print, and voila, you've got a representation of anything whatsoever in a file format that is consistent across every type of file.

Maybe that format (.ps) isn't important to you yet; more on that later. For now, think about what you could do if you needed to analyze any type of file and could front-end that analysis by simply printing the file, i.e., without having to manually convert it to some common format.

A configurable PostScript print driver free for the asking.

My CursiVision system, built for electronic signature capture and document management, contains a PostScript print driver that I would gladly give you the installer for, free of charge. Simply email me and I'll get you that installer right away (I'll put a link to it here at a later date).

This driver does not throw away the intermediate PostScript file. In addition, the driver has a registry setting that you can use to specify what to "launch" against this retained PostScript file.

Couple these two together, and you can build a pretty significant software or document process with them.

The EnVisioNateSW PostScript interpreter

This PostScript Interpreter software fits perfectly into this scenario.

Over the years I have seen countless requirements that specify documents should be examined for content, and perhaps for the data generated within them. Those who may require such a thing may also believe they would have to develop custom software for each "type" of file to be examined. Per the above discussion, that is no longer required.

However, unlike a tool destined solely for the creation of PDF files, this PostScript interpreter is entirely different and is, from its conception, intended to be useful for ANY purpose its clientele may dream up, such as inspecting the content of documents for particular text or data.

As a robust COM object, this software is highly extensible in its configurability for unique purposes. Further, its implementation and architecture are clean and obvious enough that you can see where and how such configuration could be made. Isn't that one of the ideas of Open Source, to be able to bend it to your needs?

A term I use often in discussing software characteristics is "Extensibility".

For me, the true meaning of this is that I should be able to take any particular software system that relates, in some way, to the particular problem domain I'm trying to provide a solution for, and to "Extend" that system such that it performs perfectly to my needs in that regard.

Is that somehow too much to ask ?

I would think that the concept of Open Source, and "Extensibility" point to each other and utter "<- This is what I mean by that".

However, that is SO FAR from the truth it makes Open Source a travesty of lies and broken promises.

Why is Open Source like this?

Here are a couple of reasons:

  • It is sloppy, sloppy, sloppy. It is disorganized, disheveled, and difficult to read
  • It is not architected. It does not have any sort of formal integration technology
    • No formal adherence to any sort of interface definitions, no idl/odl files
    • Plugins are bullshit: capabilities are offered because a file (.dll) exists somewhere? Not good
    • Reliance on header files to dictate interoperability? Maintenance and versioning headache! No way
  • It is poorly structured, too hard to build, and frankly, too distracted by the need for cross-platform support
  • ..... many more

Thus, you might begin to understand why I am in this space. I am here to do what I can to improve the state of software development, using its most glaring example of what demands to be fixed: Open Source.

My work as an example

The fact that it is so hard to get Open Source to bend to your needs is part of the reason I became focused on writing examples of how it should be created to actually embody "Extensibility" as its primary nature.

I'm a COM fanatic

Don't take that to mean I'm a diehard Windows fanatic; it's not correct to assume COM is Windows-only. Nor am I saying that COM is not available on Linux. The fact is, I don't even know.

I'm also not necessarily a fanatic of any OS or platform; I've been there and done that. I was a diehard OS/2 devotee in the early 90's, and I even had the "OS2 MAN" customized license plate (I still do, on the wall). So I've seen far, far, far better platforms lose out to Windows, but what was I to do? Write software for a platform that nobody would use? Remember the OS/2 software for managing your CompuServe account? Me either. Remember CompuServe? Me either.

What COM is, is a formal set of concepts that provides the pathway to the extensibility and configurability that I so strongly require in literally every software project that I do, and there have been a lot of them.

What COM is not, is inextricably tied to Windows, or to any OS for that matter. This is, again, something that I feel is not appreciated by the masses.

On the surface, developers probably think that to use COM you need CoInitialize, CoCreateInstance, and other Windows API calls, as well as the "registration" facilities in Windows to manage versioning, install locations, platforms, bitness, etc. All of those things are great, but absolutely not required.

COM artifacts are just .dll files in the end - you absolutely don't need that Windows architecture to load a .dll, right? The ability to load dynamic libraries is native to every platform, OS, and development environment. At least any that's worth bothering with!
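To make that concrete, here is a minimal, portable C++ sketch of the COM idea with no Windows API in it. Everything named here (`IUnknownLike`, `IGreeter`, `CreateInstance`, the string interface IDs) is invented for illustration; real COM uses `IUnknown`, GUIDs, and `HRESULT`. In a real deployment the factory function would live in the library and be resolved by name with `LoadLibrary`/`GetProcAddress` on Windows or `dlopen`/`dlsym` on POSIX systems; here it sits in the same file so that the sketch runs on its own.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for IUnknown: interface discovery plus reference counting.
struct IUnknownLike {
    virtual long QueryInterface(const char* iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;  // objects are destroyed only through Release()
};

// Hypothetical interface IDs; real COM uses 128-bit GUIDs rather than strings.
constexpr const char* IID_Unknown = "IUnknownLike";
constexpr const char* IID_Greeter = "IGreeter";

// A hypothetical functional interface layered on top of the base contract.
struct IGreeter : IUnknownLike {
    virtual const char* Greeting() = 0;
};

// The concrete object. Clients never see this type, only the interfaces.
class Greeter final : public IGreeter {
    unsigned long refs_ = 1;  // the creator holds the first reference
public:
    long QueryInterface(const char* iid, void** out) override {
        if (std::strcmp(iid, IID_Greeter) == 0) {
            *out = static_cast<IGreeter*>(this);
        } else if (std::strcmp(iid, IID_Unknown) == 0) {
            *out = static_cast<IUnknownLike*>(this);
        } else {
            *out = nullptr;
            return -1;  // interface not supported
        }
        AddRef();       // the caller now owns one reference
        return 0;       // success
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    const char* Greeting() override { return "hello from a plain library"; }
};

// The one symbol a host would look up by name after loading the library.
extern "C" long CreateInstance(const char* iid, void** out) {
    assert(out != nullptr);
    Greeter* g = new Greeter();             // one reference held here
    long result = g->QueryInterface(iid, out);
    g->Release();                           // drop the creation reference
    return result;
}

// Host-side usage: obtain the interface, call it, release it.
std::string useGreeter() {
    void* raw = nullptr;
    if (CreateInstance(IID_Greeter, &raw) != 0) return "";
    IGreeter* greeter = static_cast<IGreeter*>(raw);
    std::string text = greeter->Greeting();
    greeter->Release();
    return text;
}
```

Calling `useGreeter()` returns the greeting string, and asking the factory for an interface the object does not implement fails cleanly and frees the object. The point of the sketch is that the host depends only on the interface definitions and one exported function, not on registration or the COM runtime.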

Once you throw out the Windows API calls around COM, and especially the Common Language Runtime (aach!!) where you don't need it, COM is a fantastic, elegant, and ultimately simple set of techniques that make software interoperability, maintainability, and extensibility very easy to implement. The documentation is not great, of course, being a product of inane documentation automation like Doxygen - don't get me started on that. But with a modern development environment, and especially a debugger, it's easy enough to figure out.

What does this have to do with the PostScript Interpreter?

This project, the interpreter, is used to parse PostScript language files. As above, those files are the product of a PostScript print driver, a version of which I offer for free, and they are the "one and only one" file format to which you can automatically convert any file on your system.

The interpreter has a simple events interface, the IPostScriptEvents interface, with which it can call into your software when it finds something interesting in those PostScript source files.

There is a very simple example application in the repository that shows how to use this interface. At present, the interface will tell your software, the client (sink) of the events interface, where and what text exists in that document.
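As a rough illustration of what a client (sink) of such an events interface might look like, here is a portable C++ sketch. It does not reproduce the actual IPostScriptEvents definition, which lives in the repository; the `ITextEvents` interface, the `OnText` method, the `TextRun` fields, and the `replayDocument` function that stands in for the interpreter are all assumptions made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payload of a text event: what the text is and where it sits.
struct TextRun {
    std::string text;
    double x;    // horizontal position on the page (units assumed: points)
    double y;    // vertical position on the page (units assumed: points)
    int page;    // 1-based page number
};

// Hypothetical stand-in for the events interface the interpreter calls.
struct ITextEvents {
    virtual void OnText(const TextRun& run) = 0;
protected:
    ~ITextEvents() = default;
};

// Client-side sink: remembers every text run that contains a keyword.
class KeywordSink final : public ITextEvents {
    std::string keyword_;
    std::vector<TextRun> hits_;
public:
    explicit KeywordSink(std::string keyword) : keyword_(std::move(keyword)) {
        assert(!keyword_.empty());
    }
    void OnText(const TextRun& run) override {
        if (run.text.find(keyword_) != std::string::npos) hits_.push_back(run);
    }
    const std::vector<TextRun>& hits() const { return hits_; }
};

// Stand-in for the interpreter: replays a few made-up text events to the sink.
void replayDocument(ITextEvents& sink) {
    sink.OnText({"Invoice No. 1042", 72.0, 720.0, 1});
    sink.OnText({"Total due: 310.00", 72.0, 120.0, 1});
    sink.OnText({"Signature:", 72.0, 90.0, 2});
}
```

A client would construct `KeywordSink sink("Total");`, hand it to the event source, and afterwards read `sink.hits()` to learn which text matched and where on which page it was found.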

But what about the future uses of the Interpreter? Can you find uses for it in your software processes?

If I have done my job correctly, you should be able to extend this software to do anything regarding data, text, graphics or whatever that you would typically find in documents, or any file you can print.

Look at the sources in the repository, see if you can't quickly locate those places within it that would be dealing with what you are interested in. Then, note the pattern of events interfaces and how they are defined and used within the PostScript Interpreter.

Note that most of the code in the system regarding events is in the folder "COM Events". I point that out as an example of what I mean about the failings of Open Source - I wish everybody would think about organization as an important concept in software maintainability.

From all of that, you should be able to see where within the interpreter you would make outgoing calls on an events interface, how to define that interface to the interpreter, and finally, how to hook up to that interface in your client software.
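Those three steps can be sketched in portable C++ as follows. This is only an outline of the general pattern, not code from the repository: the `IImageEvents` interface, the `EventSource` class, and the `Advise`/`Unadvise`/`FireImage` names are hypothetical. COM's own connection point interfaces (`IConnectionPointContainer`, `IConnectionPoint`) formalize the same idea, though whether the interpreter uses them is not assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Step 1: define the new events interface (hypothetical: image notifications).
struct IImageEvents {
    virtual void OnImage(int page, int widthPx, int heightPx) = 0;
protected:
    ~IImageEvents() = default;
};

// Step 2: the interpreter side keeps the connected sinks and calls outward.
class EventSource {
    std::vector<IImageEvents*> sinks_;
public:
    // Connect a client sink; the source does not take ownership of it.
    void Advise(IImageEvents* sink) {
        assert(sink != nullptr);
        sinks_.push_back(sink);
    }
    // Disconnect a previously connected sink.
    void Unadvise(IImageEvents* sink) {
        sinks_.erase(std::remove(sinks_.begin(), sinks_.end(), sink),
                     sinks_.end());
    }
    // Called from wherever the interpreter handles the construct of interest.
    void FireImage(int page, int widthPx, int heightPx) {
        for (IImageEvents* sink : sinks_) sink->OnImage(page, widthPx, heightPx);
    }
};

// Step 3: the client implements the interface and hooks itself up.
class ImageCounter final : public IImageEvents {
    int count_ = 0;
    long long pixels_ = 0;
public:
    void OnImage(int /*page*/, int widthPx, int heightPx) override {
        ++count_;
        pixels_ += 1LL * widthPx * heightPx;
    }
    int count() const { return count_; }
    long long pixels() const { return pixels_; }
};
```

In use, the client calls `Advise` with its sink before interpretation starts and `Unadvise` when it is done; any events fired after `Unadvise` no longer reach it.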

Should you need any assistance at all with a more complex or different requirement, don't hesitate to email me.