For other information, see the Ghostscript overview.
There are many projects that would improve Ghostscript and that we would like to do, but for which we don't have enough resources. If you would like to take responsibility for any of these projects, please contact us. Additional comments on implementation approaches or project goals are in italic type like this.
MS Windows has a "language monitor" capability which would allow Ghostscript to be invoked seamlessly to process input files in any language Ghostscript handles and for any printer for which Ghostscript has a driver. Doing this properly would require integrating Ghostscript with Windows' "Add Printer" dialog, using an appropriate PPD.
Russell Lang's RedMon program provides some, but not all, of this capability. See also lib/ghostpdf.ppd.
Currently, Ghostscript can work as a "helper application" for the Netscape browser, but not as a plug-in; the latter would integrate it more closely with the browser. We aren't sure what doing this would involve; we've also heard by rumor that it's already been done.
In order to integrate Ghostscript into XMetaL and other applications it would be convenient for Ghostscript to be distributed as a COM object along with the current gswin32.exe, gswin32c.exe and gsdll32.dll files.
Currently, Ghostscript implements the Visual Trace window for Windows only (see wdtrace.c). An implementation for X would be useful.
Currently, drivers can be written so that converting PostScript to a list of graphical objects can run in one thread, and rasterizing the objects can run in another thread. However, drivers must be written specially if they are going to do this. We would like to change the architecture so that any driver can work this way. We would also like to support dual-threaded operation for drivers that produce high-level output, such as the PDF writer. Doing this would require separating banding from the multithreaded logic. Also, currently each thread has its own allocation pool: this is unnecessary in the normal case, since Ghostscript now supports properly locked access to the C heap, but embedded systems still need to use a fixed-size area for the rasterizing thread. With a locked, shared allocator, the rasterizing thread could use the full set of band list functions; with a fixed-size area and a separate allocator, only a subset is available, as is the case now for dual-threaded drivers.
Currently, drivers must be linked into the executable. We would like to be able to load drivers dynamically. Doing this requires defining a platform-independent API (presumably extending the current gp_* APIs) that would work at least on Linux, vendor Unix, MS Windows, and Macintosh. Unix systems should include Sun, HP, AIX, IRIX, DEC; Linux ELF and a.out formats should both be supported. Consider the Netscape plug-in architecture.
The PostScript 'setpagedevice' function implements matching of media and page size requests to available media, page orientation, and paper handling (duplex, etc.). Currently it is implemented in PostScript code, which means it is not available for use with other input languages. (It is available for PDF, which Ghostscript implements on top of PostScript, but not for the not-yet-freely-available PCL interpreters that use the Ghostscript library, or for possible future SVG or similar interpreters.) We would like to move this function into C. The device driver will be required to send page parameters up to PostScript to be stored in a resource. Handling policy implementations in the device drivers are to be included in this project. DeferredMediaSelection should also be implemented.
In a few cases, it would be desirable to provide a 'tee' capability for drivers: specifically, for generating small, low-resolution 'thumbnail' images concurrently with other output. Probably the simplest way to do this is to generate a band list and then process it twice. This is not completely trivial, since the band list does include device resolution information and scaling would be required for some constructs.
OutputDevice resource category
Each available output device should provide an instance of the OutputDevice resource category, which gives the available page sizes, resolutions, media classes, process color models, and other information about the device. This would replace the current non-standard use of a 4-element PageSize in the InputAttributes entry of the page device dictionary.
Currently, the maximum length of the OutputFile parameter is a compile-time constant, gp_file_name_sizeof. This is appropriate for ordinary file names, since this constant is the platform's limit on the length of a file name. However, if OutputFile is a pipe, the length should not be limited in this way. This is probably a small project: it requires allocating the file name dynamically, and freeing it in the finalization routine that gets called when a driver instance is freed.
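The following is only a sketch of the allocate-and-finalize pattern this project calls for. The demo_device structure and both functions are hypothetical, and malloc/free stand in for whatever allocator the driver instance would really use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the part of a driver instance that holds
   the OutputFile value. */
typedef struct demo_device_s {
    char *fname;    /* heap-allocated, NUL-terminated; NULL if unset */
} demo_device;

/* Store a copy of `name`, whatever its length.
   Returns 0 on success, -1 if allocation fails (the old name is kept). */
static int demo_set_output_file(demo_device *dev, const char *name)
{
    size_t len;
    char *copy;

    assert(dev != NULL && name != NULL);
    len = strlen(name);
    copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len + 1);
    free(dev->fname);           /* release the previous name, if any */
    dev->fname = copy;
    return 0;
}

/* Finalization routine: to be called when the driver instance is freed. */
static void demo_device_finalize(demo_device *dev)
{
    free(dev->fname);
    dev->fname = NULL;
}
```

With this arrangement a long pipe command is stored the same way as a short file name; only the code that opens an ordinary file would still need to check the platform limit.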
We would like to provide (Adobe) PrintGear and (H-P) PPA output drivers for Ghostscript, but the specifications for these protocols are not published. If you can provide them to us without violating any agreements, please let us know. (Some work has already been done on reverse-engineering these protocols, but we don't have references to it.)
We would like to improve the high-level PostScript-writing pswrite driver to bring it up to parity with the PDF-writing driver (including the many improvements in the latter being implemented in Ghostscript 7.xx). Specifically, we want it to write text as text rather than bitmaps, and to consistently write images in their original high-level form. We have already started to factor out code that should be common to these two drivers, specifically for writing embedded fonts and compressed data streams.
There is one small part of this project that would be especially valuable and could be done independently (although it might have to be partly or entirely redone later): compressing images. Currently the driver only compresses character bitmaps, and doesn't compress other images at all. It should use the CCITTFaxEncode filter for 1-bit-deep images, and plane-separated LZWEncode compression for color images. When generating LL3 PS, the Flate compression will work better than miGIF. It may be worth trying several methods on each image and using the one that works best.
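The "try several methods and keep the best" idea can be sketched as a table of encoder procedures. Everything here is illustrative: the two encoders are toy stand-ins (a raw copy and a naive run-length coder), not the CCITTFaxEncode, LZWEncode or Flate filters, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An encoder writes at most out_cap bytes and returns the encoded size,
   or 0 if the output did not fit. */
typedef size_t (*demo_encode_proc)(const unsigned char *in, size_t n,
                                   unsigned char *out, size_t out_cap);

/* Stand-in 1: store the data unchanged. */
static size_t demo_encode_raw(const unsigned char *in, size_t n,
                              unsigned char *out, size_t out_cap)
{
    if (n > out_cap)
        return 0;
    memcpy(out, in, n);
    return n;
}

/* Stand-in 2: naive run-length coding as (count, value) pairs,
   with count in the range 1..255. */
static size_t demo_encode_rle(const unsigned char *in, size_t n,
                              unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    while (i < n) {
        size_t run = 1;

        while (i + run < n && run < 255 && in[i + run] == in[i])
            run++;
        if (o + 2 > out_cap)
            return 0;
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    return o;
}

/* Run every encoder on a non-empty image and return the index of the one
   with the smallest output (size stored in *best_size), or -1 if none fit. */
static int demo_pick_encoder(const demo_encode_proc *procs, int count,
                             const unsigned char *in, size_t n,
                             unsigned char *scratch, size_t cap,
                             size_t *best_size)
{
    int best = -1, i;

    assert(n > 0);
    for (i = 0; i < count; i++) {
        size_t size = procs[i](in, n, scratch, cap);

        if (size != 0 && (best < 0 || size < *best_size)) {
            best = i;
            *best_size = size;
        }
    }
    return best;
}
```

A real driver would then re-run the chosen filter into the output stream (or keep the best buffer), and would probably want to skip the trial for very small images.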
Currently, the PCL 5 drivers produce only bitmaps; the PCL XL driver produces high-level graphics and sometimes high-level images, but low-level text. We would like to improve these drivers to produce higher-level, smaller output. This was a very low-priority project; it has become more important now that H-P's laser printers are shipping with less memory.
We would like a "GDI driver" for MS Windows that would implement more higher-level constructs (specifically for text). The mswin and mswinprn drivers both do some of this. Some of the 'xfont' support code for MS Windows should be useful. We were frustrated in the past because the GDI calls for getting font sizes and metrics consistently returned incorrect information and provided no way to get the correct information; perhaps this has been fixed in 32-bit Windows. We believe that H-P, Russell Lang, and perhaps others are working in this area, but we can always use more help.
The PDF writer needs to be able to generate thumbnails (small previews). We might do this through the 'tee' capability mentioned above. However, we currently prefer the idea of implementing a completely separate program to add thumbnails to an arbitrary, existing PDF file: this would allow Ghostscript to add thumbnails to PDF files generated by other programs. Much of the code needed to do this has already been written for Ghostscript's PDF linearizer: see lib/pdfwrite.ps. A user has implemented this as well, using a separate program that calls Ghostscript: see http://www.uni-giessen.de/~g029/eurotex99/oberdiek/.
In addition to factoring out the error diffusion code as described below, we would like to see another attempt at reducing the enormous volume of code for color inkjet drivers. There are three sets of drivers (gdevcdj.c, gdevstc.c, gdevupd.c) with much overlapping functionality. The latter two driver families make good attempts at factoring out things like head geometry and canned control strings, but we think this problem deserves another pass, especially in the hope of consolidating these drivers into a single family.
See below under "Notification for glyph decaching."
Currently, all images are decompressed by the interpreter before being passed to the graphics library; the PDF writer may then compress them again. Ordinarily, this only slows things down a little, but in the case of DCT-encoded images that are being DCT-encoded in the output, image degradation may occur. Ideally, the implementation should be smart enough to not decode and re-encode the image. However, making this work properly is difficult. This would probably involve extending the library APIs for images so that they could pass a stream, possibly including filters, instead of the (fully decoded) data rows.
Currently, the PDF writer has no way to emit warnings. Users would like to see warnings when fonts cannot be embedded (this is actually required when the value of CannotEmbedFontPolicy is set to /Warning), and for some other questionable situations like non-existent Dests (Feature request #480853). Probably the right way to handle this is with a pseudo device parameter called "Warnings" that is a list of strings: the pdfwrite driver would add strings to this list, and the ps2pdf script (lib/gs_pdfwr.ps) would read them out, print them, and reset them at the end of each page.
Currently, the library supports a maximum of 32 bits of data per pixel; we would like to raise this limit to 64 bits on systems where the 'long' data type is 64 bits wide. The gx_color_index type is already defined as 'long', but there are many places where the type bits32 is used for pixel values; there is a 32-bit stored-image "device", but there is no 64-bit device; a few algorithms and tables have knowledge of the 32-bit width built into them, only because the C preprocessor doesn't have any kind of loop or repetition capability.
The PostScript specification includes an option for the interpreter to implement trapping (adjustments of object boundaries to prevent visual anomalies caused by slight misregistration of different ink layers): we would like to implement this. This is a complex and difficult area; even many Adobe RIPs don't do it.
Ghostscript includes a reduced TrueType bytecode interpreter branched from FreeType 1. It performs grid fitting for TrueType glyphs, except those involving instructions patented by Apple. A desired improvement is to implement a stem recognition algorithm similar to FreeType autohinting. It would also help with poorly designed Type 1 fonts, which have misplaced or missing hints.
Another useful improvement is to implement font antialiasing with TextAlphaBits values other than 1, 2, and 4.
Ghostscript 7.00 and later supports ICCBased color spaces of PDF using the icclib package from http://web.access.net.au/argyll/color.html but there is no facility to use ICC output (printer) profiles that may be embedded in the PDF. Also it would be useful for PostScript to be able to directly use a specific Intent from ICC profile to convert output colors (as CRD's are now used). The primary difficulty is that the graphics library and PostScript always use CIE XYZ as the connection space, but ICC profiles may use CIELAB as the connection space, requiring conversion for use with the graphics library.
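The connection-space conversion mentioned above uses the standard CIELAB definition. As a sketch only (the demo_ names are hypothetical and the caller supplies the white point, which this document does not specify), the CIELAB to XYZ direction could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Inverse of the CIELAB companding function f(t). */
static double demo_lab_finv(double t)
{
    const double delta = 6.0 / 29.0;

    if (t > delta)
        return t * t * t;
    return 3.0 * delta * delta * (t - 4.0 / 29.0);
}

/* Convert (L*, a*, b*) to (X, Y, Z), scaled by the given white point. */
static void demo_lab_to_xyz(const double lab[3], const double white[3],
                            double xyz[3])
{
    double fx, fy, fz;

    assert(lab != NULL && white != NULL && xyz != NULL);
    fy = (lab[0] + 16.0) / 116.0;
    fx = fy + lab[1] / 500.0;
    fz = fy - lab[2] / 200.0;
    xyz[0] = white[0] * demo_lab_finv(fx);
    xyz[1] = white[1] * demo_lab_finv(fy);
    xyz[2] = white[2] * demo_lab_finv(fz);
}
```

L* = 100 with a* = b* = 0 maps to the white point itself, and L* = 0 maps to (approximately) zero, which makes a convenient sanity check for any real implementation.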
Currently, knowledge of the specific data formats and algorithms for halftoning permeates too many places in the library. We would like halftoning to be more "object oriented" (using virtual procedures) so that we could support other halftoning methods such as direct use of threshold arrays, or the double-rectangle approach added in newer PostScript versions. Threshold arrays take much less space than the current representation, generally at the expense of longer rendering time for black-and-white images; double-rectangle representation would give us a better implementation of AccurateScreens. We might want to store both threshold arrays and the current representation.
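To illustrate what "direct use of threshold arrays" means, here is a minimal sketch that renders one row by comparing each gray sample with a tiled threshold array. The function name, the one-byte-per-output-pixel layout and the "greater than" convention are all assumptions made for the example, not a description of the library's code.

```c
#include <assert.h>
#include <stddef.h>

/* Halftone one row of 8-bit gray samples against a tw x th threshold
   array that tiles the page. bits[x] is set to 1 where the sample exceeds
   the threshold at (x mod tw, y mod th), else 0. */
static void demo_threshold_row(const unsigned char *gray, int width, int y,
                               const unsigned char *thresh, int tw, int th,
                               unsigned char *bits)
{
    const unsigned char *trow;
    int x;

    assert(tw > 0 && th > 0 && y >= 0);
    trow = thresh + (size_t)(y % th) * (size_t)tw;
    for (x = 0; x < width; x++)
        bits[x] = (unsigned char)(gray[x] > trow[x % tw]);
}
```

The per-pixel comparison is the reason for the longer rendering time noted above: every output pixel costs a lookup and a compare.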
Currently, several different inkjet drivers implement their own, very similar but slightly differing error diffusion methods. This has caused severe code bloat as well as tempting future driver writers to contribute to it further. We want to factor out error diffusion into a common set of facilities that drivers can use. We would like to design these facilities so that they can easily interface to the Even-Toned Screening algorithms, to the extent that these will be Open Source.
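As one possible shape for such a shared facility, the sketch below diffuses a single row with the well-known Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16). It is not taken from any of the existing drivers; the function name, the 1-bit output and the buffer layout are assumptions chosen for the example.

```c
#include <assert.h>

/* Diffuse one row of 8-bit gray values to 0/1 output.
   Errors are kept in units of 1/16 of a gray level. err_cur holds the
   errors carried into this row and err_next collects errors for the next
   row; both have width + 2 entries (one guard entry at each end), so the
   pixel at x uses index x + 1. */
static void demo_fs_row(const unsigned char *gray, int width,
                        int *err_cur, int *err_next, unsigned char *bits)
{
    int x;

    assert(width > 0);
    for (x = 0; x < width + 2; x++)
        err_next[x] = 0;
    for (x = 0; x < width; x++) {
        int value = gray[x] * 16 + err_cur[x + 1];
        int on = value >= 128 * 16;
        int err = value - (on ? 255 * 16 : 0);
        int right = err * 7 / 16;
        int below_left = err * 3 / 16;
        int below = err * 5 / 16;

        bits[x] = (unsigned char)on;
        err_cur[x + 2] += right;
        err_next[x] += below_left;
        err_next[x + 1] += below;
        /* Give the remainder to the last neighbor so no error is lost. */
        err_next[x + 2] += err - right - below_left - below;
    }
}
```

A driver would call this once per row and per color plane, swapping err_cur and err_next between rows. A shared facility would presumably also parameterize the weights and the number of output levels.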
The Ghostscript distribution includes a stochastic threshold array. This array has some gamma correction built into it, which works well for some output devices and not for others. We would like to provide a version of this array without (or with less) gamma correction. We have original data available from which this could be done fairly easily.
The PostScript language defines many functions relevant to graphics rendering as being implemented by arbitrary PostScript procedures: transfer (gamma correction), black generation, undercolor removal, several stages of CIE color space and rendering, and color mapping for Separation and DeviceN spaces. Since the graphics library can't call PostScript procedures, Ghostscript currently samples these procedures at a fixed number of points and interpolates linearly between the samples. As of Ghostscript 6.20, the library can interpret a restricted subset of PostScript procedures directly (basically those that only use arithmetic and comparisons: no loops, sub-procedures, or data structures). Changing the rendering functions to use this approach when possible would greatly improve output quality when the functions are very non-linear (which we have actually seen in practice). This should only be done if the function is, in fact, severely non-linear, since interpreting the function definition will almost always be much slower than interpolating in the table.
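The sample-and-interpolate scheme described above can be sketched as follows. The table size, the [0, 1] domain and all demo_ names are assumptions for the example; demo_square merely plays the role of a non-linear PostScript procedure.

```c
#include <assert.h>

typedef double (*demo_proc)(double);

/* Example of a non-linear procedure to be sampled. */
static double demo_square(double x)
{
    return x * x;
}

/* Sample f at n evenly spaced points covering [0, 1]. */
static void demo_sample(demo_proc f, double *table, int n)
{
    int i;

    assert(n >= 2);
    for (i = 0; i < n; i++)
        table[i] = f((double)i / (n - 1));
}

/* Evaluate by linear interpolation between the two nearest samples. */
static double demo_lookup(const double *table, int n, double x)
{
    double pos, frac;
    int i;

    assert(n >= 2);
    if (x <= 0.0)
        return table[0];
    if (x >= 1.0)
        return table[n - 1];
    pos = x * (n - 1);
    i = (int)pos;
    if (i >= n - 1)             /* guard against rounding at the top end */
        return table[n - 1];
    frac = pos - i;
    return table[i] + frac * (table[i + 1] - table[i]);
}
```

With only 5 samples, demo_lookup at x = 0.125 returns 0.03125 where the procedure itself gives 0.015625: an example of the error that interpreting the function definition directly would avoid when the function is strongly non-linear.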
Currently, there is a lot of tiresome code for doing callbacks with continuations for loading the caches that hold sampled values for the procedures listed under "Change sampled functions ..." above. For the Separation and DeviceN tint transform functions, and only for these, PostScript code associated with the setcolorspace operator actually converts the PostScript procedure to a Function object -- to a FunctionType 4 (PostScript subset) Function if possible, or to a FunctionType 0 (sampled) Function if not. This approach should be used for all the other sampled functions. Doing this would reduce the amount of C code significantly, while only increasing PostScript code slightly.
This change would require touching (and slightly changing) all PostScript operators that currently do such callbacks: for example, rather than a setblackgeneration operator that takes a PostScript procedure as its operand, we would have a .setblackgeneration operator that takes as operands both the PostScript procedure (so that currentblackgeneration can return it) *and* a Function derived from it (which will actually be used when loading the cache, or for sampling directly if desired).
In some cases, this approach has a non-negligible space cost. If the PostScript procedure cannot be represented as a FunctionType 4 Function, it must be sampled and represented as a FunctionType 0 Function. Then the BG / UCR / transfer / ... cache will essentially just hold a copy of the Function data. While it is likely that this situation will be rare in practice, it might be worth looking into changing the internal representation of these caches so that they were the same as the representation of a FunctionType 0 Function with a particular choice of parameters. Then the PostScript code that called .buildsampledfunction when necessary could arrange the parameters to have the same values as the internal representation of the cache, and the cache could use the Function data directly. This is probably not worth the trouble.
Currently, if a CIE rendering dictionary uses a lookup table for the final step, Ghostscript always interpolates linearly between the entries. Cubic interpolation should be supported as an option. A cubic interpolation option is also needed for general table-lookup Functions.
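This document does not say which cubic scheme should be used. Purely as an illustration, the sketch below uses Catmull-Rom interpolation over an evenly spaced one-dimensional table on [0, 1], clamping the neighbor indices at the table ends; the function name and the choice of Catmull-Rom are our assumptions.

```c
#include <assert.h>

/* Cubic (Catmull-Rom) lookup in a table of n evenly spaced samples
   covering [0, 1]. The curve passes through every sample. */
static double demo_cubic_lookup(const double *table, int n, double x)
{
    double pos, t, p0, p1, p2, p3;
    int i;

    assert(n >= 2);
    if (x <= 0.0)
        return table[0];
    if (x >= 1.0)
        return table[n - 1];
    pos = x * (n - 1);
    i = (int)pos;
    if (i >= n - 1)             /* guard against rounding at the top end */
        return table[n - 1];
    t = pos - i;
    p0 = table[i > 0 ? i - 1 : 0];
    p1 = table[i];
    p2 = table[i + 1];
    p3 = table[i + 2 < n ? i + 2 : n - 1];
    return p1 + 0.5 * t * ((p2 - p0)
                 + t * ((2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3)
                 + t * (3.0 * (p1 - p2) + p3 - p0)));
}
```

For a table of squares with 5 entries, the lookup at x = 0.375 gives 0.140625 (the exact square), where linear interpolation between the same entries would give 0.15625.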
Ghostscript has partial support for alpha channel and for alpha and RasterOp compositing. There is some architectural support for general compositing, but it postdates the RasterOp implementation, and most of the RasterOp code doesn't use it. We expect that the more extensive compositing and alpha capabilities of SVG will find their way into PDF (and probably PostScript as well) in the course of 2000 and 2001, and we will need to implement them.
Currently, when Ghostscript uses a band list, it does halftoning before banding. It should do halftoning after banding: this produces smaller band lists and shifts more work to the rasterizer (which is good because the rasterizer can be multi-threaded internally for higher performance on multiprocessors: see the next topic.)
When smoothed ("interpolated") images are written in the band list, extra rows must be written above and below each band in order to provide the data for interpolation. Currently, the number of such rows is computed very conservatively; instead, the final interpolation algorithm should be consulted to provide the correct value. This is a small task.
For high-resolution devices, rasterization dominates execution time. On multiprocessor systems, Ghostscript can do tasks in parallel:
We would want these facilities implemented so that no conditional compilation was involved: on uniprocessor systems, the locking API would simply have a vacuous implementation.
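One way to avoid conditional compilation is to reach the locks through a table of procedures and to supply a do-nothing table on uniprocessor systems. The sketch below shows only that vacuous table and one client; all names are hypothetical, and a multiprocessor build would supply a second table backed by the platform's real locks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical locking API: clients only ever see this table. */
typedef struct demo_lock_procs_s {
    void *(*create)(void);
    void (*acquire)(void *lock);
    void (*release)(void *lock);
    void (*destroy)(void *lock);
} demo_lock_procs;

/* Vacuous implementation for uniprocessor systems. */
static int demo_null_lock_object;
static void *demo_null_create(void) { return &demo_null_lock_object; }
static void demo_null_acquire(void *lock) { (void)lock; }
static void demo_null_release(void *lock) { (void)lock; }
static void demo_null_destroy(void *lock) { (void)lock; }

static const demo_lock_procs demo_uniprocessor_locks = {
    demo_null_create, demo_null_acquire, demo_null_release, demo_null_destroy
};

/* Client code is the same whichever table is passed in. */
static long demo_locked_add(const demo_lock_procs *procs, void *lock,
                            long *total, long amount)
{
    long result;

    assert(procs != NULL && total != NULL);
    procs->acquire(lock);
    *total += amount;
    result = *total;
    procs->release(lock);
    return result;
}
```

The cost on a uniprocessor is one indirect call per lock operation, which seems an acceptable price for keeping a single source tree without #ifdefs.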
Currently, drivers can't do a very good job of downloading rendered character bitmaps to the device they manage, because they can't find out when a bitmap is being deleted from Ghostscript's cache and therefore will never be referenced again. Here is a sketch of how we would add this capability to the graphics library:
text_begin call, simply to get access to a gs_imager_state that references the rendered character cache. (The driver could always simply call the default implementation of text_begin.)
text_begin procedure, the driver would call gs_glyph_decache_register(imager_state, notify_proc, proc_data) where proc_data was, or pointed to a structure that included, a pointer to the driver.
gs_glyph_decache_register would use the general notification mechanism defined in gsnotify.h to call notify_proc(proc_data, pchar_data) whenever a bitmap was removed from the character cache. pchar_data would point to some identification of the character; perhaps just the bitmap ID, but possibly a gx_cached_bits_common or even a cached_char.
char_cache structure would need an additional member, a gs_notify_list_t. It would also need to add finalization so that when it was freed, it would notify and unregister all clients, using gs_notify_all(list, NULL) and then gs_notify_release.
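The register / notify / finalize flow sketched above can be modeled in a few lines of self-contained C. This is not the gsnotify.h API: every demo_ name is a hypothetical stand-in, the client list is a simple linked list, and the event data is assumed to be just a bitmap ID.

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as the notify_proc(proc_data, pchar_data) call in the text. */
typedef int (*demo_notify_proc)(void *proc_data, void *event_data);

typedef struct demo_registration_s demo_registration;
struct demo_registration_s {
    demo_notify_proc proc;
    void *proc_data;
    demo_registration *next;
};

/* Stand-in for the character cache with its additional member. */
typedef struct demo_char_cache_s {
    demo_registration *clients;
} demo_char_cache;

/* Register a client; returns 0 on success, -1 on allocation failure. */
static int demo_glyph_decache_register(demo_char_cache *cache,
                                       demo_notify_proc proc, void *proc_data)
{
    demo_registration *reg;

    assert(cache != NULL && proc != NULL);
    reg = malloc(sizeof *reg);
    if (reg == NULL)
        return -1;
    reg->proc = proc;
    reg->proc_data = proc_data;
    reg->next = cache->clients;
    cache->clients = reg;
    return 0;
}

/* Called by the cache whenever a bitmap is removed. */
static void demo_notify_all(const demo_char_cache *cache, void *event_data)
{
    const demo_registration *reg;

    for (reg = cache->clients; reg != NULL; reg = reg->next)
        reg->proc(reg->proc_data, event_data);
}

/* Finalization: notify with NULL event data, then unregister everyone. */
static void demo_char_cache_finalize(demo_char_cache *cache)
{
    demo_notify_all(cache, NULL);
    while (cache->clients != NULL) {
        demo_registration *next = cache->clients->next;

        free(cache->clients);
        cache->clients = next;
    }
}

/* Example client: a driver that tracks what it has downloaded. */
typedef struct demo_driver_s {
    unsigned long last_decached_id;
    int cache_freed;
} demo_driver;

static int demo_driver_notify(void *proc_data, void *event_data)
{
    demo_driver *drv = proc_data;

    if (event_data == NULL)
        drv->cache_freed = 1;   /* the cache itself is going away */
    else
        drv->last_decached_id = *(const unsigned long *)event_data;
    return 0;
}
```

On receiving a notification, a real driver would delete the corresponding downloaded bitmap from the device, or mark its slot as reusable.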
This facility was requested by the Display Ghostscript project, but it could also be used to improve the output of the PCL XL driver and possibly the X and PCL5 drivers.
There is a project to create a GNU implementation of the OPENStep API, which involves extending Ghostscript to provide the full functionality of Adobe's Display PostScript system with some of the NeXT extensions. For more information, please contact Net-Community <scottc@net-community.com>.
For full Adobe PostScript compatibility, Ghostscript needs a real "job server" to encapsulate the execution of PostScript files. See the section on "Job Execution Environment" in the PostScript Language Reference Manual for details.
Ghostscript could be adapted with some work to read SVG. This would be an interesting and challenging project because SVG's graphics model would require extending the library (see above). If SVG turns out to be an important standard, it is important that there be a good free implementation of it.
%font% and other IODevices.
Currently, the %font% IODevice is not implemented. We would like to see this implemented using a general framework for implementing IODevices (%xxxx%) entirely in PostScript, in an "object oriented" manner very similar to the way Resource categories are implemented. An IODevice would be implemented as a dictionary with the following keys, whose values would be procedures that implemented the corresponding operation:
/File /DeleteFile /RenameFile /Status /FileNameForAll /GetDevParams /PutDevParams
There would only be global IODevices, no local ones; the dictionary keeping track of them would be stored in global VM.
This is an obscure feature that matters only because some PostScript code uses filenameforall with this IODevice, rather than filenameforall with the /Font Resource category, to enumerate available fonts.
Adobe Acrobat Reader can scan a PDF file whose end-of-lines have been converted by careless users transferring the file across operating systems as text rather than binary, and reconstruct the cross-reference table which the PDF interpreter requires. This only works if the file has no binary data in it, which with PDF 1.3 is rarely the case. However, users occasionally receive PDF files that have been damaged in this way, and it might be useful to have a program that can repair them. We think this should probably be done as a separate program, possibly in PostScript, similar to Ghostscript's PDF linearizer.
Currently, neither the PostScript interpreter nor the graphics library is fully re-entrant (no writable globals). Making them fully re-entrant would make Ghostscript usable in multi-threaded environments, and more easily usable in embedded environments. Note that this is necessary, but far from sufficient, for Ghostscript to allow simultaneous execution of a single Ghostscript interpreter instance by multiple threads: that is probably permanently out of the question. Almost all drivers, including all of the drivers in devs.mak which are maintained as part of the main Ghostscript code, are already fully re-entrant; making the remaining ones re-entrant should really be up to the driver author.
The %ram% device is documented in PS Supplement 3010 and 3011 dated August 30, 1999. This is probably not a major impediment to portability, but it would be handy.
On Unix, the suggested implementation would be to create a subdirectory of the temporary directory (usually /tmp), with the name chosen and the directory created in such a way as to avoid /tmp races and similar problems. Ghostscript should delete the subdirectory when it exits.
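On POSIX systems, mkdtemp is one way to pick the name and create the directory in a single race-free step (it creates the directory with mode 0700). The sketch below assumes a POSIX.1-2008 environment; the "gs_ram_" prefix and the demo_ function names are made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a private directory under tmpdir (default /tmp).
   On success returns 0 and leaves the directory name in path. */
static int demo_make_ram_dir(const char *tmpdir, char *path, size_t path_size)
{
    int n;

    assert(path != NULL && path_size > 0);
    if (tmpdir == NULL)
        tmpdir = "/tmp";
    n = snprintf(path, path_size, "%s/gs_ram_XXXXXX", tmpdir);
    if (n < 0 || (size_t)n >= path_size)
        return -1;              /* template did not fit */
    /* mkdtemp replaces the XXXXXX and creates the directory atomically. */
    return mkdtemp(path) == NULL ? -1 : 0;
}

/* Remove the directory at exit. rmdir only succeeds on an empty
   directory, so any files in it must have been deleted first. */
static int demo_remove_ram_dir(const char *path)
{
    return rmdir(path);
}
```

A real implementation would also have to delete the files it created before removing the directory, and arrange for cleanup on abnormal exit where possible.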
Copyright © 2000-2006 Artifex Software, Inc. All rights reserved.
This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at http://www.artifex.com/ or contact Artifex Software, Inc., 7 Mt. Lassen Drive - Suite A-134, San Rafael, CA 94903, U.S.A., +1(415)492-9861, for further information.
Ghostscript version 8.64, 27 January 2009