Discussion:
startup refactoring
Victor Mote
2003-06-16 23:38:14 UTC
FOP Developers:

I did a dry run of the startup refactoring (Session, Document, etc.) work
yesterday & this morning, and am satisfied that the concepts work. Here are
some comments:

1. I am going to start committing changes, hopefully this evening. Much of
the work is refactoring, but there are some substance changes as well,
specifically to allow multiple output options, multiple documents, etc. I
have therefore decided to implement it as a series of self-contained
changes, rather than dropping the entire change in at once. This will make a
better doc trail. Let me know if you object.

2. Logging. During the dry run, I realized that I don't know whether we want
to have logging for a Session, or for each Document, or even at some finer
level of granularity. Session will be implemented as a singleton, so if we
only need logging at that level, then a static construct can be used to get
logging from /anywhere/ without adding anything. Otherwise, I'll add logic
that allows the Document object to either be directly accessed or computed
(by going up the object hierarchy) to get the logger. That seems more
desirable to me than implementing logging in all of the classes, but perhaps
I am missing something there.
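
To make the second option concrete, here is a minimal sketch of the "go up the object hierarchy" lookup. All class names (SimpleLogger, TreeNode, DocumentNode) are made up for illustration and are not FOP classes. The only assumption is that every object holds a reference to its parent, so no object except the Document needs its own logger field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a logger; it records messages so the result is observable.
class SimpleLogger {
    final String name;
    final List<String> messages = new ArrayList<>();

    SimpleLogger(String name) { this.name = name; }

    void info(String msg) { messages.add(name + ": " + msg); }
}

// Any object in the tree; it knows only its parent.
class TreeNode {
    private final TreeNode parent;

    TreeNode(TreeNode parent) { this.parent = parent; }

    // Walk up until a node that owns a logger (the Document) answers.
    SimpleLogger getLogger() {
        if (parent == null) {
            throw new IllegalStateException("no Document above this node");
        }
        return parent.getLogger();
    }
}

// The Document sits at the root and holds the logger directly.
class DocumentNode extends TreeNode {
    private final SimpleLogger logger;

    DocumentNode(SimpleLogger logger) {
        super(null);
        this.logger = logger;
    }

    @Override
    SimpleLogger getLogger() { return logger; }
}

class LoggerLookupDemo {
    public static void main(String[] args) {
        DocumentNode doc = new DocumentNode(new SimpleLogger("doc-1"));
        TreeNode page = new TreeNode(doc);
        TreeNode block = new TreeNode(page);
        // The block has no logger of its own; the call is resolved by its ancestors.
        block.getLogger().info("laying out block");
        System.out.println(doc.getLogger().messages);
    }
}
```

With this shape, two Documents in the same Session could carry different loggers without any static state.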

Victor Mote (mailto:***@outfitr.com)
2025 Eddington Way
Colorado Springs, Colorado 80916
Voice +1 (719) 622-0650
Fax +1 (720) 293-0044
Glen Mazza
2003-06-17 02:22:08 UTC
Post by Victor Mote
1. I am going to start committing changes, hopefully
this evening. Much of
the work is refactoring, but there are some
substance changes as well,
specifically to allow multiple output options,
multiple documents, etc. I
have therefore decided to implement it as a series
of self-contained
changes, rather than dropping the entire change in
at once. This will make a
better doc trail. Let me know if you object.
Shouldn't be a problem (we can always go back to the
old code if it's a disaster!) When you're done, I
still would like to get the LayoutHandler and
StructureHandler classes out of the apps package
though (Bug 20397, although on these new classes)--any
objections from anyone on this part?

Glen


Jeremias Maerki
2003-06-17 07:04:56 UTC
Post by Victor Mote
I did a dry run of the startup refactoring (Session, Document, etc.) work
yesterday & this morning, and am satisfied that the concepts work.
1. I am going to start committing changes, hopefully this evening. Much of
the work is refactoring, but there are some substance changes as well,
specifically to allow multiple output options, multiple documents, etc. I
have therefore decided to implement it as a series of self-contained
changes, rather than dropping the entire change in at once. This will make a
better doc trail. Let me know if you object.
I'm a bit concerned that you call this "startup refactoring" where it's
really API redesign in the end, I think. And that's an important topic
where I would have expected you to ask for a "go for it" before starting
on this. Generally, JustDoIt (tm) is a good thing but this is about the
API and I'd like us to reach consensus on the direction. From the Wiki
page I read that there are substantial differences in our (you, Jörg, me)
ideas.

I'm a bit unhappy that you placed/left Session in the apps package. I
would like to see the apps package deprecated as a whole over time. I
would like a cli package that only contains the stuff needed for the
command line and I'd like to have (wish, not a demand) the main FOP api
in the org.apache.fop package.

And then I'm a bit unhappy about your terminology: Session, Document,
RenderContext... Session and Document seem like "rubber terms" to me.
Your Session looks very much like a FOProcessorFactory (analog to
TransformerFactory). A session to me is a short-lived object used during
one transaction. I think we should get consensus on the terminology
first.

Sorry if I'm a bit negative here. I certainly don't want to kill any
enthusiasm. FOP needs this more than ever.
Post by Victor Mote
2. Logging. During the dry run, I realized that I don't know whether we want
to have logging for a Session, or for each Document, or even at some finer
level of granularity. Session will be implemented as a singleton, so if we
only need logging at that level, then a static construct can be used to get
logging from /anywhere/ without adding anything. Otherwise, I'll add logic
that allows the Document object to either be directly accessed or computed
(by going up the object hierarchy) to get the logger. That seems more
desirable to me than implementing logging in all of the classes, but perhaps
I am missing something there.
You're talking about a singleton. I hope you don't mean a GoF Singleton.
A GoF Singleton in Java usually involves a static variable which I would
like to avoid in our new API.

Static logging à la Commons-Logging is the easiest thing, sometimes even
necessary if you ask people like Nicola Ken Barozzi. Even in
Avalon-enabled software. The normal Avalon style of logging means a
container is giving a logger instance to a service (see LogEnabled).

As long as we're talking about services (Renderer, FOTreeBuilder,
org.apache.excalibur.source.SourceResolver,
org.apache.excalibur.xml.sax.SAXParser etc.), Avalon-style logging is
superior (but more difficult to do right), because you can have multiple
instances of the same service in a VM but logging to separate log files,
for example, if you run two different applications in the same VM (in
Avalon Phoenix, for example). You could have an invoicing system and a
CRM system both running FOP for document production but logging to
different log files. Ok, they could be separated by different
classloaders but I still think not having statics in a server
environment is better.
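
A rough sketch of the idea, assuming nothing about the real Avalon classes: the AcceptsLogger interface below is a local stand-in modeled loosely on LogEnabled, and MemoryLog and RenderService are hypothetical names. It shows two instances of the same service in one VM, each writing to its own log because the logger is handed in rather than looked up statically.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical logger that writes to its own in-memory "log file".
class MemoryLog {
    final List<String> lines = new ArrayList<>();

    void info(String msg) { lines.add(msg); }
}

// Local stand-in for the LogEnabled idea: the service is given a logger by its container.
interface AcceptsLogger {
    void enableLogging(MemoryLog log);
}

// A service that never touches a static logger; it only uses what it was given.
class RenderService implements AcceptsLogger {
    private MemoryLog log;

    @Override
    public void enableLogging(MemoryLog log) { this.log = log; }

    void render(String doc) { log.info("rendered " + doc); }
}

class InjectedLoggingDemo {
    public static void main(String[] args) {
        MemoryLog invoicingLog = new MemoryLog();
        MemoryLog crmLog = new MemoryLog();

        // The "container" role: create each instance and give it its own logger.
        RenderService invoicing = new RenderService();
        invoicing.enableLogging(invoicingLog);
        RenderService crm = new RenderService();
        crm.enableLogging(crmLog);

        invoicing.render("invoice-42");
        crm.render("customer-letter");

        System.out.println(invoicingLog.lines);
        System.out.println(crmLog.lines);
    }
}
```

With a static logger, both render() calls would end up in the same place unless the two applications were separated by classloaders.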


Jeremias Maerki
Victor Mote
2003-06-17 15:10:19 UTC
Post by Jeremias Maerki
I'm a bit concerned that you call this "startup refactoring" where it's
really API redesign in the end, I think. And that's an important topic
where I would have expected you to ask for a "go for it" before starting
on this. Generally, JustDoIt (tm) is a good thing but this is about the
API and I'd like us to reach consensus on the direction. From the Wiki
page I read that there are substantial differences in our (you, Jörg, me)
ideas.
I certainly thought I had tacit consent.
Post by Jeremias Maerki
I'm a bit unhappy that you placed/left Session in the apps package. I
would like to see the apps package deprecated as a whole over time. I
would like a cli package that only contains the stuff needed for the
command line and I'd like to have (wish, not a demand) the main FOP api
in the org.apache.fop package.
My humble apologies. What is the easiest way to roll it back? Is there an
automated way?
Post by Jeremias Maerki
And then I'm a bit unhappy about your terminology: Session, Document,
RenderContext... Session and Document seem like "rubber terms" to me.
Your Session looks very much like a FOProcessorFactory (analog to
TransformerFactory). A session to me is a short-lived object used during
one transaction. I think we should get consensus on the terminology
first.
OK. In my terminology, Session is a long-lived object that lives, well, for
the Session. Documents are children of Session. The only part of Session that
you have seen is a renaming of Driver to Session. That is all that has been
committed so far. But eventually nearly all of existing Driver gets pushed
down to Document, RenderContext, and RenderTask. The only thing that really
still lives in Session is the logging stuff, and it might belong with
Document (under my scheme).
Post by Jeremias Maerki
Sorry if I'm a bit negative here. I certainly don't want to kill any
enthusiasm. FOP needs this more than ever.
Post by Victor Mote
2. Logging. During the dry run, I realized that I don't know whether we want
to have logging for a Session, or for each Document, or even at some finer
level of granularity. Session will be implemented as a singleton, so if we
only need logging at that level, then a static construct can be used to get
logging from /anywhere/ without adding anything. Otherwise, I'll add logic
that allows the Document object to either be directly accessed or computed
(by going up the object hierarchy) to get the logger. That seems more
desirable to me than implementing logging in all of the classes, but perhaps
I am missing something there.
You're talking about a singleton. I hope you don't mean a GoF Singleton.
A GoF Singleton in Java usually involves a static variable which I would
like to avoid in our new API.
Yes, I am talking about GoF Singleton. I understand the reluctance to use
static variables, but I haven't adopted the no-tolerance approach that you
have. The alternative is to either carry the static information around with
you in every method that might need it, or to actually store the information
in every object that might need it. I already see that with the logging. If
it is needed, then OK, but I don't see it yet.
Post by Jeremias Maerki
Static logging à la Commons-Logging is the easiest thing, sometimes even
necessary if you ask people like Nicola Ken Barozzi. Even in
Avalon-enabled software. The normal Avalon style of logging means a
container is giving a logger instance to a service (see LogEnabled).
As long as we're talking about services (Renderer, FOTreeBuilder,
org.apache.excalibur.source.SourceResolver,
org.apache.excalibur.xml.sax.SAXParser etc.), Avalon-style logging is
superior (but more difficult to do right), because you can have multiple
instances of the same service in a VM but logging to separate log files,
for example, if you run two different applications in the same VM (in
Avalon Phoenix, for example). You could have an invoicing system and a
CRM system both running FOP for document production but logging to
different log files. Ok, they could be separated by different
classloaders but I still think not having statics in a server
environment is better.
That sounds possibly useful for what we are doing.

Victor Mote
Victor Mote
2003-06-17 16:46:39 UTC
Post by Victor Mote
My humble apologies. What is the easiest way to roll it back? Is there an
automated way?
I rolled them back manually, and have committed the change. All should be
back as it was before, except the affected files are two revisions higher.

Victor Mote
Jeremias Maerki
2003-06-17 17:01:38 UTC
Post by Victor Mote
Post by Jeremias Maerki
I'm a bit concerned that you call this "startup refactoring" where it's
really API redesign in the end, I think. And that's an important topic
where I would have expected you to ask for a "go for it" before starting
on this. Generally, JustDoIt (tm) is a good thing but this is about the
API and I'd like us to reach consensus on the direction. From the Wiki
page I read that there are substantial differences in our (you, Jörg, me)
ideas.
I certainly thought I had tacit consent.
Well, I'm certainly guilty of letting that discussion trail off. I'm
sorry if I missed anything.
Post by Victor Mote
Post by Jeremias Maerki
I'm a bit unhappy that you placed/left Session in the apps package. I
would like to see the apps package deprecated as a whole over time. I
would like a cli package that only contains the stuff needed for the
command line and I'd like to have (wish, not a demand) the main FOP api
in the org.apache.fop package.
My humble apologies. What is the easiest way to roll it back? Is there an
automated way?
I wasn't vetoing your changes, merely pointing out possible differences.
You didn't need to revert. But do you agree with what I wrote?

<snip/>
Post by Victor Mote
OK. In my terminology, Session is a long-lived object that lives, well, for
the Session. Documents are children of Session. The only part of Session that
you have seen is a renaming of Driver to Session. That is all that has been
committed so far. But eventually nearly all of existing Driver gets pushed
down to Document, RenderContext, and RenderTask. The only thing that really
still lives in Session is the logging stuff, and it might belong with
Document (under my scheme).
That's not what I'm talking about. Do we all agree on this terminology? Can
we find better classnames for what they are? These are key elements in
FOP's architecture. I'd like to have good names for them so they are
easily understandable even by outsiders. If you use Session you'll get
all sorts of strange questions. I'm pretty sure about that.

<snip/>
Post by Victor Mote
Yes, I am talking about GoF Singleton. I understand the reluctance to use
static variables, but I haven't adopted the no-tolerance approach that you
have. The alternative is to either carry the static information around with
you in every method that might need it, or to actually store the information
in every object that might need it. I already see that with the logging. If
it is needed, then OK, but I don't see it yet.
So I guess it's a matter of community decision what is preferred. Going
for statics and singleton is certainly easier but it'll carry you away
from the Avalon style of life. I have no problem if the majority of
committers think that Avalon should be thrown out. I didn't manage to
introduce the Avalon-patterns, yet, and it looks like I'm the only one
really pushing it so far but without enough power.


Jeremias Maerki
Victor Mote
2003-06-17 19:48:03 UTC
Post by Jeremias Maerki
Post by Victor Mote
Post by Jeremias Maerki
I'm a bit unhappy that you placed/left Session in the apps package. I
would like to see the apps package deprecated as a whole over time. I
would like a cli package that only contains the stuff needed for the
command line and I'd like to have (wish, not a demand) the main FOP api
in the org.apache.fop package.
My humble apologies. What is the easiest way to roll it back? Is there an
automated way?
I wasn't vetoing your changes, merely pointing out possible differences.
You didn't need to revert. But do you agree with what I wrote?
Reversion is already completed -- if other changes start getting checked in,
then it becomes a much more difficult task. WRT the location of the
"Session" concept, I don't have strong feelings about it the way some others
do, and I viewed that as an issue that could be resolved later by simply
moving it wherever you guys decided it should go.
Post by Jeremias Maerki
Post by Victor Mote
OK. In my terminology, Session is a long-lived object that lives, well, for
the Session. Documents are children of Session. The only part of Session that
you have seen is a renaming of Driver to Session. That is all that has been
committed so far. But eventually nearly all of existing Driver gets pushed
down to Document, RenderContext, and RenderTask. The only thing that really
still lives in Session is the logging stuff, and it might belong with
Document (under my scheme).
That's not what I'm talking about. Do we all agree on this terminology? Can
we find better classnames for what they are? These are key elements in
FOP's architecture. I'd like to have good names for them so they are
easily understandable even by outsiders. If you use Session you'll get
all sorts of strange questions. I'm pretty sure about that.
I am not committed to the name "Session", but can't think of a better name,
even with the help of my trusty thesaurus. I think FrameMaker uses that name
in its FDK API. Microsoft uses "Application" for the equivalent concept in
its Office 2000 object model. I didn't like "Driver" much or I would have
left it alone (that implies a different concept to me). I suppose
FOPInstance, or RuntimeInstance might work. I didn't realize that there
would be misunderstandings of "Session".
Post by Jeremias Maerki
Post by Victor Mote
Yes, I am talking about GoF Singleton. I understand the reluctance to use
static variables, but I haven't adopted the no-tolerance approach that you
have. The alternative is to either carry the static information around with
you in every method that might need it, or to actually store the information
in every object that might need it. I already see that with the logging. If
it is needed, then OK, but I don't see it yet.
So I guess it's a matter of community decision what is preferred. Going
for statics and singleton is certainly easier but it'll carry you away
from the Avalon style of life. I have no problem if the majority of
committers think that Avalon should be thrown out. I didn't manage to
introduce the Avalon-patterns, yet, and it looks like I'm the only one
really pushing it so far but without enough power.
I'm not opposed to Avalon, but I might be if I understood it. It probably
seems lazy that I am not up to speed on it yet -- I have tried a couple of
times to get the big picture from the doc, but concluded that I won't get my
arms around it until I use it (IOW, the doc wasn't very helpful).

If the static construct and Avalon are mutually exclusive, the former seems
far less invasive, and easier both to implement and un-implement. However, I
realize that there are more things that you want to do with Avalon that
might require that invasiveness anyway. Sorry I am not more help here.

Victor Mote
J.Pietschmann
2003-06-17 21:02:01 UTC
Post by Victor Mote
I'm not opposed to Avalon, but I might be if I understood it.
Well, the good news about Avalon is that it can make
development much easier.
The bad news is that if too much of Avalon is exposed in the
APIs intended for common usage, it will drive many potential
embedders away.

I think if we want to use Avalon internally (and there are
interesting use cases, for example the Hyphenator), it would
make most sense to provide an Avalon component shell which can
also be used with a simple factory pattern similar to TrAX.
Have a look at the Avalonization Wiki.
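
As a rough illustration of the TrAX-like shape, here is a minimal sketch. FOProcessorFactory echoes the name Jeremias used earlier in the thread; FOProcessor, DefaultFOProcessorFactory and the process() method are hypothetical and only mirror the TransformerFactory.newInstance().newTransformer() pattern. Whether the implementation behind the factory is an Avalon component is invisible to the caller.

```java
// Hypothetical API surface; only the shape is borrowed from TrAX/JAXP.
abstract class FOProcessorFactory {
    // Callers obtain a factory without naming the implementation class.
    static FOProcessorFactory newInstance() {
        return new DefaultFOProcessorFactory();
    }

    abstract FOProcessor newProcessor();
}

// What an embedder would actually work with.
interface FOProcessor {
    String process(String foInput);
}

// The implementation could be a full component internally; callers never see that.
class DefaultFOProcessorFactory extends FOProcessorFactory {
    @Override
    FOProcessor newProcessor() {
        // Placeholder processing so the sketch produces an observable result.
        return foInput -> "processed(" + foInput + ")";
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        FOProcessor processor = FOProcessorFactory.newInstance().newProcessor();
        System.out.println(processor.process("hello.fo"));
    }
}
```

Note that newInstance() is itself a static method, as it is in TrAX, but it holds no static data.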
Post by Victor Mote
It probably
seems lazy that I am not up to speed on it yet -- I have tried a couple of
times to get the big picture from the doc, but concluded that I won't get my
arms around it until I use it (IOW, the doc wasn't very helpful).
Amen, brother.
Post by Victor Mote
If the static construct and Avalon are mutually exclusive, the former seems
far less invasive, and easier both to implement and un-implement.
Well, static data often interferes badly with multithreading.
We can use it in the CLI, but in the core, which is intended to be
embeddable in multithreaded, long-running server environments,
it is best to avoid it. Even if you can arrange to
synchronize properly (not as easy as many people think), threads
blocked on shared resources will almost certainly generate angry
comments.

J.Pietschmann
Victor Mote
2003-06-17 22:36:10 UTC
Post by J.Pietschmann
Well, static data often interferes badly with multithreading.
We can use it in the CLI, but in the core, which is intended to be
embeddable in multithreaded, long-running server environments,
it is best to avoid it. Even if you can arrange to
synchronize properly (not as easy as many people think), threads
blocked on shared resources will almost certainly generate angry
comments.
The only static data would be a pointer to the singleton object, which
contains the real data. However, it seems that your point would still be
valid with regard to that data.

The choices for data to be shared amongst threads are:
1. Allow concurrent access.
2. Allow synchronized access.
3. Allow no access (can't share data between threads).

We all agree that #1 is bad. Implementing #3 because #2 might be slow seems
counterproductive. If a person in line at the store complains that the line
is moving slowly, I don't see how it helps them to say that they can't shop
here any more (but it will shut them up). Or am I missing some fourth option
here?
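
For reference, choice #2 can be sketched as follows. SharedStats is a hypothetical stand-in for whatever data the singleton would hold, and the thread and iteration counts are arbitrary. Every access goes through the object's monitor, so concurrent updates are not lost, at the cost of threads waiting on each other.

```java
// Hypothetical shared state, standing in for the data held by the singleton.
class SharedStats {
    private int pagesRendered = 0;

    // Choice #2: all reads and writes are synchronized on this object.
    synchronized void addPages(int n) { pagesRendered += n; }

    synchronized int getPagesRendered() { return pagesRendered; }
}

class SynchronizedAccessDemo {
    // Starts the given number of threads, each adding one page perThread times.
    static int runWorkers(SharedStats stats, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    stats.addPages(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return stats.getPagesRendered();
    }

    public static void main(String[] args) {
        // With synchronized access, no increments are lost.
        System.out.println(runWorkers(new SharedStats(), 4, 10000));
    }
}
```

Removing the synchronized keywords turns this into choice #1, where the final count may come out lower than expected.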

Victor Mote
Glen Mazza
2003-06-17 19:28:29 UTC
Post by Jeremias Maerki
I'm a bit unhappy that you placed/left Session in
the apps package. I
would like to see the apps package deprecated as a
whole over time. I
would like a cli package that only contains the
stuff needed for the
command line and I'd like to have (wish, not a
demand) the main FOP api
in the org.apache.fop package.
Instinctively, I wouldn't trust any code in the
package root of org.apache.fop -- we wouldn't have a
very modularized design that way (knowing FOP's
current coding style, the main FOP API would then be
accessing objects all through the packages,
octopus-like, splitting of the business logic with the
actual objects doing the work, inextricable from the
XSL FO process.)

FOP is more a pipeline to me:

APPs package (CLI, TRAX/XSLTInputHandler, Avalonized
API, Victor's ideas) --> FOTreeBuilder/Layout/Area
Tree creation --> Rendering.

IMO FOTreeBuilder should just expose three functions
(perhaps another one for logging):

1.) SetXSLFOStream() (file or stream)
2.) SetRenderType() (those constants currently in
Driver or CommandLineStarter)
3.) Run(). (returns a stream or file)

You can have 500 methods of calling these functions
within the apps package--it's all fine/OK, because
they will only be able to work with FOTreeBuilder
(apps will have no access to Renderers or Layout,
etc.) and its three functions.

Such an FOTreeBuilder may be getting to the heart of
the XSLFO Spec:

Input: XSLFO Stream, RenderType
Output: Document

and may be close to Xalan's approach, a team which has
similar input/output requirements.

Also, embedded Java code should be able to run without
accessing the Apps class at all by directly calling
those three functions in FOTreeBuilder. (Although we
can certainly provide additional options in apps--but
FOTreeBuilder should be all that is strictly needed.)
Said another way, the processing paths of
FOTreeBuilder, once instantiated in embedded code,
should not access *any* functionality in the apps
package, because apps is to the left of the pipeline.
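
The three-function surface described above could be sketched like this. ThreeCallBuilder and the RenderType enum are hypothetical stand-ins, not the existing FOTreeBuilder or the constants in Driver, and run() is only a placeholder that reports what it was given instead of building a tree.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical render types; the real constants live in Driver or CommandLineStarter.
enum RenderType { PDF, POSTSCRIPT }

// The narrow surface: two setters and run().
class ThreeCallBuilder {
    private InputStream foStream;
    private RenderType renderType;

    void setXSLFOStream(InputStream in) { this.foStream = in; }

    void setRenderType(RenderType type) { this.renderType = type; }

    // Placeholder: a real run() would build the FO tree, lay it out and render.
    // Here it only counts the input bytes and tags the result with the render type.
    byte[] run() {
        if (foStream == null || renderType == null) {
            throw new IllegalStateException("input stream and render type must be set first");
        }
        try {
            int count = 0;
            while (foStream.read() != -1) {
                count++;
            }
            String result = renderType + ":" + count + " bytes";
            return result.getBytes(StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class ThreeCallDemo {
    public static void main(String[] args) {
        ThreeCallBuilder builder = new ThreeCallBuilder();
        builder.setXSLFOStream(new ByteArrayInputStream("<fo:root/>".getBytes(StandardCharsets.UTF_8)));
        builder.setRenderType(RenderType.PDF);
        System.out.println(new String(builder.run(), StandardCharsets.UTF_8));
    }
}
```

Anything in apps (CLI, TrAX handler, and so on) would then be just another caller of these three methods.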
Post by Jeremias Maerki
Session, Document,
RenderContext... Session and Document seem like
"rubber terms" to me.
Your Session looks very much like a
FOProcessorFactory (analog to
TransformerFactory). A session to me is a
short-lived object used during
one transaction. I think we should get consensus on
the terminology
first.
Again, I don't care much about what's in the app
package, but I think the idea of a one-way
FOTreeBuilder with just those three/four functions
exposed should be explored.

In general, too much business logic is in the apps
package; it even knows about ElementMappings and
Renderers -- it shouldn't have to work with anything
that implementation-specific. Also, AWTStarter and
PrintStarter appear to be in the wrong places, those
should somehow be moved out of apps and be *past*
FOTreeBuilder. Let FOTreeBuilder decide to activate
those objects should it get an appropriate
SetRenderType().

But my "mental model" is incomplete here... comments
most welcome!

Glen


Jeremias Maerki
2003-06-18 06:44:52 UTC
Post by Glen Mazza
Instinctively, I wouldn't trust any code in the
package root of org.apache.fop -- we wouldn't have a
very modularized design that way (knowing FOP's
current coding style, the main FOP API would then be
accessing objects all through the packages,
octopus-like, splitting of the business logic with the
actual objects doing the work, inextricable from the
XSL FO process.)
Not necessarily. Well, the public API has to have some way to control
the whole show. This will automatically lead to a little octopus if the
code is in ...fop or ...fop.api. But no problem with another package.
Post by Glen Mazza
APPs package (CLI, TRAX/XSLTInputHandler, Avalonized
API, Victor's ideas) --> FOTreeBuilder/Layout/Area
Tree creation --> Rendering.
Uh, thanks for that one. It shows very nicely why the current apps
package is a mess. SoC (Separation of Concerns) would suggest not mixing
things like CLI, public API and inner communication classes.
Post by Glen Mazza
IMO FOTreeBuilder should just expose three functions
Logging, at least in Avalon-land, is a lifecycle aspect (through
the LogEnabled interface). The methods below are lifestyle methods.
Lifecycle != lifestyle. It's best to talk about lifestyle primarily and
leave the lifecycle (instantiation, logging, configuration,
initialization) to a different discussion. Helps keeping the focus, I
believe. The lifecycle can eventually be handled automatically by a
container (such as Fortress or Merlin). Leaves you to care about the
lifestyle (=functionality).
Post by Glen Mazza
1.) SetXSLFOStream() (file or stream)
2.) SetRenderType() (those constants currently in
Driver or CommandLineStarter)
3.) Run(). (returns a stream or file)
This mixes concerns. A render type does not belong in a class called
FOTreeBuilder. The name already implies that the class is responsible
for building the FO tree. It should have nothing to do with the
rendering aspect, especially since FO tree building is independent of
the output format (wrt Renderers). The render type is better placed in a
class such as a RenderingRun/Document/<whatever-we-call-it>. The FO tree
builder is (to me) a service that simply accepts a SAX stream and builds
the FO tree. The layout engine, another (coarse grained) service, will
then access the FO tree to do the layout. This is all kept together by a
"supervising" class. At least, that's my personal high-level view of it.
Post by Glen Mazza
You can have 500 methods of calling these functions
within the apps package--it's all fine/OK, because
they will only be able to work with FOTreeBuilder
(apps will have no access to Renderers or Layout,
etc.) and its three functions.
Such an FOTreeBuilder may be getting to the heart of
the XSLFO Spec:
Input: XSLFO Stream, RenderType
Output: Document
and may be close to Xalan's approach, a team which has
similar input/output requirements.
Also, embedded Java code should be able to run without
accessing the Apps class at all by directly calling
those three functions in FOTreeBuilder.
The FOTreeBuilder should remain an inner service to FOP, not exposed in
the public API, if you ask me. The public API, IMHO, should be an easily
understandable construct that masks the internal complexity from the
user. Driver already does that today, it's just a bit too tightly packed
and not focused on as little as possible. We need to disentangle that
class to make it more like the JAXP API.
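
A rough sketch of that split, under stated assumptions: SketchTreeBuilder, SketchLayoutEngine and RenderingRun are hypothetical names, a flat list of element names stands in for the FO tree, and the layout step is a placeholder. Only the SAX and JAXP parser classes are real JDK APIs. The supervising class is the single place that knows the render type and wires the inner services together.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Inner service 1: consumes SAX events and records the elements it sees.
// A flat list stands in for the FO tree to keep the sketch short.
class SketchTreeBuilder extends DefaultHandler {
    final List<String> elements = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements.add(qName);
    }
}

// Inner service 2: reads the finished tree and knows nothing about SAX.
class SketchLayoutEngine {
    int layout(List<String> foTree) {
        return foTree.size(); // placeholder for real layout work
    }
}

// The supervising class: the only piece a caller would see.
class RenderingRun {
    private final String renderType;

    RenderingRun(String renderType) { this.renderType = renderType; }

    String process(String xml) {
        try {
            SketchTreeBuilder builder = new SketchTreeBuilder();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), builder);
            int laidOut = new SketchLayoutEngine().layout(builder.elements);
            return renderType + ": " + laidOut + " elements laid out";
        } catch (Exception e) {
            throw new IllegalStateException("processing failed", e);
        }
    }
}

class RenderingRunDemo {
    public static void main(String[] args) {
        System.out.println(new RenderingRun("pdf").process("<root><page/><page/></root>"));
    }
}
```

The tree builder never sees the render type here, which is the separation argued for above.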
Post by Glen Mazza
(Although we
can certainly provide additional options in apps--but
FOTreeBuilder should be all that is strictly needed.)
Said another way, the processing paths of
FOTreeBuilder, once instantiated in embedded code,
should not access *any* functionality in the apps
package, because apps is to the left of the pipeline.
Jeremias Maerki
Glen Mazza
2003-06-18 20:36:12 UTC
Post by Jeremias Maerki
Post by Glen Mazza
Instinctively, I wouldn't trust any code in the
package root of org.apache.fop -- we wouldn't have a
very modularized design that way (knowing FOP's
current coding style, the main FOP API would then be
accessing objects all through the packages,
Well, the public API has to have some way to control
the whole show.
You don't mean that literally--e.g., a servlet
programmer does not need useThisRenderer() and
attachAreaTree() functions in a public API. (i.e.,
the public (embedded) API would be a strict subset of
the functions available to the supervising octopus
running the show)
Post by Jeremias Maerki
This will automatically lead to a
little octopus if the
code is in ...fop or ...fop.api. But no problem with
another package.
For my discussion, apps=api (they're both supervising
octopi). (Although I'm sure your package would be
orders-of-magnitude cleaner and simpler!)
Post by Jeremias Maerki
Post by Glen Mazza
APPs package (CLI, TRAX/XSLTInputHandler, Avalonized
API, Victor's ideas) --> FOTreeBuilder/Layout/Area
Tree creation --> Rendering.
Uh, thanks for that one. It's a very nice show why the current apps
package is a mess.
I agree with you that the apps package is hideous--but
a pipeline may clean up nearly all of it, provided it
enforces the rules I mentioned earlier:

a.) The only access between apps/api and the rest of
the packages is via FOTreeBuilder and its "three"
functions.
b.) FOP is designed such that FOTreeBuilder *can* be
directly instantiated via a servlet (even if we don't
allow it).
c.) Once so instantiated, no code within apps/api
packages can be referenced.

Via these rules, look at what gets removed from apps:
(a) structure and layout handler are gone (those are
past the FOTreeBuilder)
(b) AWTStarter and PrintStarter are gone (those
processing decisions are either handled by
FOTreeBuilder or something that it delegates to.)
(c) App class has *no* knowledge of renderers, element
mappings, area trees, structure handlers.
(d) Business logic is kept with each object, and not
shared in multiple places.
Post by Jeremias Maerki
Post by Glen Mazza
IMO FOTreeBuilder should just expose three functions
It's best to talk about lifestyle primarily and
leave the lifecycle (instantiation, logging, configuration,
initialization) to a different discussion. Helps keeping the focus, I
believe.
Excellent! We can leave that distraction out of the
discussion. (Although they, in addition to threading
issues, *do* appear to strengthen your argument on the
need for a supervising class.)
Post by Jeremias Maerki
Post by Glen Mazza
1.) SetXSLFOStream() (file or stream)
2.) SetRenderType() (those constants currently in
Driver or CommandLineStarter)
3.) Run(). (returns a stream or file)
This mixes concerns. A render type does not belong in a class called
FOTreeBuilder.
I think it does, because it needs to know whether or
not to generate an Area Tree, *which type* of
structure handler, etc. Also, since the Area Tree
needs to know the render type, FOTreeBuilder will need
to pass that information on to it via the pipeline.
(Peter had said and you confirmed that the Area Tree
still needs to know the rendering type for font sizes,
etc.)

Furthermore a different implementation of
FOTreeBuilder may make different Area Tree decisions
based on its render_type.
Post by Jeremias Maerki
It should have nothing to do with the
rendering aspect, especially since FO tree building is independent of
the output format (wrt Renderers).
Indeed, nothing to do with *rendering* but it does
need to know the render_type as mentioned above.
APPS/API knows nothing but FOTreeBuilder, which knows
nothing but AreaTree, which knows nothing but the
Renderers.

So I also agree with you that FOTreeBuilder doesn't
need to know about the Renderers. (but it could be so
redesigned in the future, depending on the pluggable
implementation of FOTreeBuilder, and this design more
easily allows for that type of change.)
Post by Jeremias Maerki
The render type is better placed in a
class such as a RenderingRun/Document/<whatever-we-call-it>.
Sure, providing they pass that information on to
FOTreeBuilder. ;)
Post by Jeremias Maerki
The FO tree builder is (to me) a service that simply accepts a
SAX stream and builds the FO tree.
IMHO FOTreeBuilder is an object (C++/Java), not a
service (C).
Post by Jeremias Maerki
The layout engine, another (coarse grained) service, will
then access the FO tree to do the layout. This is all kept together by a
"supervising" class.
If we were doing C programming--my fear is that the
supervising class is going to end up eating FOP's
object-oriented design and splitting the business
logic too much in multiple places (just like apps
currently does). (I guess I'll just have to trust the
team to be disciplined in this regard! ;)
Post by Jeremias Maerki
The FOTreeBuilder should remain an inner service to
FOP, not exposed in
the public API, if you ask me.
OK, a *very* thin wrapper (for those not needing any
of the threading/logging goodies in apps/api):

public class Fop extends FOTreeBuilder { }

user's embedded code:
Fop.setXSLFOStream(blah);
Fop.setRenderType(RENDER_PS);
myDoc = Fop.Run();

Glen


Victor Mote
2003-06-19 18:03:11 UTC
Permalink
Post by Glen Mazza
Post by Jeremias Maerki
Well, the public API has to have
some way to control
the whole show.
You don't mean that literally--e.g., a servlet
programmer does not need useThisRenderer() and
attachAreaTree() functions in a public API. (i.e.,
the public (embedded) API would be a strict subset of
the functions available to the supervising octopus
running the show)
I am not sure what Jeremias meant, but yes, the API needs to at least
indirectly control all of the major decisions. In my vision of the API, the
servlet programmer would not need useThisRenderer(), but would need
something like setOutputType(OUTPUT_PDF), which would ultimately cause the
correct renderer to be selected. More likely, it would be set in a
constructor. Here is some simplified sample startup code:

session = new Session();
document = session.addDocument(inputFile);
document.addRenderType(RENDER_PDF, LAYOUT_SIMPLE, outputFile1);
document.addRenderType(RENDER_POSTSCRIPT, LAYOUT_SIMPLE, outputFile2);
document.addRenderType(RENDER_PDF, LAYOUT_CLASSIC, outputFile3);
document.addRenderType(RENDER_PRINT);
document.process();
-OR-
session.process(); // if you want session to manage a queue of documents
Document = FOTree
RenderContext = AreaTree
RenderType = Renderer

The document.addRenderType() also builds any RenderContext objects that are
needed, three in this example (the first two can share the same AreaTree,
each of the others requires a different one). When document.process() is
run, it looks at the RenderContext objects to determine whether an FO Tree
needs to be built. In this case it does (and yes, it should be in a different
package). It can then loop through the RenderContext objects to see what if
any layout work needs to be done, and build an Area Tree based on the output
type and the selected LayoutStrategy. Each RenderContext object will then
loop through the RenderTypes which are attached to it to fire up Renderers.
So in the example above, the same AreaTree will be used to spit out the
first two RenderTypes before trying to build the AreaTree needed for the
"CLASSIC" layout.

Of course, there are a number of configuration options available as well,
all of which can be attached to the appropriate object by a servlet
programmer. The actual using of those options is in other objects (and
indeed, should be in other packages), but the /control/ mechanism can live
in these four classes.
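
The grouping of RenderTypes into RenderContexts described above can be sketched roughly as follows. This is only an illustration, not code from FOP: the Document, RenderContext, RenderType, and OutputType classes here are hypothetical, and the "group" field that decides which outputs may share an area tree is an assumption made purely so that the sample startup code yields three contexts. process() returns a log of steps instead of doing any real layout or rendering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical output types. The "group" field is an assumption made only for
// this sketch: outputs in the same group are treated as able to share an area tree.
enum OutputType {
    PDF("paged"), POSTSCRIPT("paged"), PRINT("device");

    final String group;

    OutputType(String group) {
        this.group = group;
    }
}

// One requested output: which renderer to run and where to send the result.
class RenderType {
    final OutputType output;
    final String target;

    RenderType(OutputType output, String target) {
        this.output = output;
        this.target = target;
    }
}

// One area tree (not modelled here) plus the outputs rendered from it.
class RenderContext {
    final String key;
    final List<RenderType> renderTypes = new ArrayList<>();

    RenderContext(String key) {
        this.key = key;
    }
}

class Document {
    // Insertion order is kept so contexts are processed in the order requested.
    private final Map<String, RenderContext> contexts = new LinkedHashMap<>();

    // Reuses an existing context when layout strategy and output group match.
    void addRenderType(OutputType output, String layout, String target) {
        String key = layout + "/" + output.group;
        contexts.computeIfAbsent(key, RenderContext::new)
                .renderTypes.add(new RenderType(output, target));
    }

    int contextCount() {
        return contexts.size();
    }

    // Returns a log of the steps instead of doing real layout or rendering.
    List<String> process() {
        List<String> steps = new ArrayList<>();
        if (contexts.isEmpty()) {
            return steps; // nothing requested, so no FO tree is needed
        }
        steps.add("build FO tree");
        for (RenderContext context : contexts.values()) {
            steps.add("build area tree " + context.key);
            for (RenderType type : context.renderTypes) {
                steps.add("render " + type.output + " -> " + type.target);
            }
        }
        return steps;
    }
}
```

With the four addRenderType() calls from the sample startup code, this sketch ends up with three contexts, and the PDF and PostScript outputs that use the simple layout are rendered from the same area tree before the next one is built.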

(I have left eager/patient processing out of this example, but control
should be returned to the Document object at the end of each PageSequence to
see if eager processing needs to fire off some layout and rendering work
before continuing with parsing. I am not entirely sure whether eager/patient
processing is totally a function of the LayoutStrategy, or whether some
LayoutStrategies can tolerate either, which would make it need to be
configurable).
Post by Glen Mazza
Post by Jeremias Maerki
This will automatically lead to a
little octopus if the
code is in ...fop or ...fop.api. But no problem with
another package.
For my discussion, apps=api (they're both supervising
octopi). (Although I'm sure your package would be
orders-of-magnitude cleaner and simpler!)
Post by Jeremias Maerki
Post by Glen Mazza
APPs package (CLI, TRAX/XSLTInputHandler,
Avalonized
Post by Glen Mazza
API, Victor's ideas) -->
FOTreeBuilder/Layout/Area
Post by Glen Mazza
Tree creation --> Rendering.
Uh, thanks for that one. It's a very nice show why
the current apps
package is a mess.
I agree with you that the apps package is hideous--but
a pipeline may clean up nearly all of it, providing it
a.) The only access between apps/api and the rest of
the packages is via FOTreeBuilder and its "three"
functions.
b.) FOP is designed such that FOTreeBuilder *can* be
directly instantiated via a servlet (even if we don't
allow it).
c.) Once so instantiated, no code within apps/api
packages can be referenced.
(a) structure and layout handler are gone (those are
past the FOTreeBuilder)
(b) AWTStarter and PrintStarter are gone (those
processing decisions are either handled by
FOTreeBuilder or something that it delegates to.)
(c) App class has *no* knowledge of renderers, element
mappings, area trees, structure handlers.
(d) Business logic is kept with each object, and not
shared in multiple places.
Post by Jeremias Maerki
Post by Glen Mazza
IMO FOTreeBuilder should just expose three
functions
It's best to talk about
lifestyle primarily and
leave the lifecycle (instantiation, logging,
configuration,
initialization) to a different discussion. Helps
keeping the focus, I
believe.
Excellent! We can leave that distraction out of the
discussion. (Although they, in addition to threading
issues, *do* appear to strengthen your argument on the
need for a supervising class.)
Post by Jeremias Maerki
Post by Glen Mazza
1.) SetXSLFOStream() (file or stream)
2.) SetRenderType() (those constants currently in
Driver or CommandLineStarter)
3.) Run(). (returns a stream or file)
This mixes concerns. A render type does not belong
in a class called
FOTreeBuilder.
I think it does, because it needs to know whether or
not to generate an Area Tree, *which type* of
structure handler, etc., also, since the Area Tree
needs to know the render type, FOTreeBuilder will need
to pass that information on to it via the pipeline.
(Peter had said and you confirmed that the Area Tree
still needs to know the rendering type for font sizes,
etc.)
There is a semantics problem here. I agree with Jeremias' raw point about
the naming of the class. FOTreeBuilder should only build an FOTree under the
direction of some higher-level class (the one I call Document above). And it
wouldn't know about how the FOTree would be used.
Post by Glen Mazza
Furthermore a different implementation of
FOTreeBuilder may make different Area Tree decisions
based on its render_type.
Post by Jeremias Maerki
It should have nothing to
do with the
rendering aspect, especially since FO tree building
is independent of
the output format (wrt Renderers).
Indeed, nothing to do with *rendering* but it does
need to know the render_type as mentioned above.
APPS/API knows nothing but FOTreeBuilder, which knows
nothing but AreaTree, which knows nothing but the
Renderers.
So I also agree with you that FOTreeBuilder doesn't
need to know about the Renderers. (but it could be so
redesigned in the future, depending on the pluggable
implementation of FOTreeBuilder, and this design more
easily allows for that type of change.)
Post by Jeremias Maerki
The render type
is better placed in a
class such as a
RenderingRun/Document/<whatever-we-call-it>.
[Responding to Jeremias here] Or, better yet IMO, into a RenderType object
that is a child or grandchild of the Document, so that there can be multiple
RenderTypes for the same Document.
Post by Glen Mazza
Sure, providing they pass that information on to
FOTreeBuilder. ;)
Post by Jeremias Maerki
The FO
tree
builder is (to me) a service that simply accepts a
SAX stream and builds
the FO tree.
IMHO FOTreeBuilder is an object (C++/Java), not a
service (C).
Post by Jeremias Maerki
The layout engine, another (coarse
grained) service, will
then access the FO tree to do the layout. This is
all kept together by a
"supervising" class.
If we were doing C programming--my fear is that the
supervising class is going to end up eating FOP's
object-oriented design and splitting the business
logic too much in multiple places (just like apps
currently does). (I guess I'll just have to trust the
team to be disciplined in this regard! ;)
Post by Jeremias Maerki
The FOTreeBuilder should remain an inner service to
FOP, not exposed in
the public API, if you ask me.
OK, a *very* thin wrapper (for those not needing any
of the threading/logging goodies in apps/api):
public class Fop extends FOTreeBuilder { }
user's embedded code:
Fop.setXSLFOStream(blah);
Fop.setRenderType(RENDER_PS);
myDoc = Fop.Run();
In the model I am proposing, threading can be used or not, at whatever
granularity is desired. I think it would even be possible (I haven't
thought all the way through this), for Session to be passed a maximum number
of threads to control. If that number is two, and there is only one Document
in the queue, it might allow the Document to run two RenderTypes at a time, or
run two RenderContexts at a time, or whatever. (This idea is not
half-baked -- in truth, the batter is not even stirred, nor the ingredients
purchased. It just seems possible). For those who need simple, you just
create a Session, a Document, a RenderType, and process the Document.
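
To make the thread-limit idea a little more concrete, here is a minimal Java sketch, assuming a hypothetical Session class that is handed a maximum thread count and a queue of documents. It covers only the simplest case (one task per Document); running several RenderTypes or RenderContexts of a single Document in parallel is not modelled, and the "processed ..." strings merely stand in for real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical Session that caps the number of worker threads.
class Session {
    private final int maxThreads;
    private final List<Callable<String>> queue = new ArrayList<>();

    Session(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    // Queues one document; the task body is a placeholder for real processing.
    void addDocument(String name) {
        queue.add(() -> "processed " + name);
    }

    // Runs all queued documents on at most maxThreads threads and returns
    // the results in the order the documents were added.
    List<String> process() {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(queue)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("document failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller who wants the simple case would create the Session with a limit of one thread, which makes the queue run strictly one document at a time.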

Victor Mote
J.Pietschmann
2003-06-19 20:28:53 UTC
Permalink
Post by Victor Mote
In my vision of the API, the
servlet programmer would not need useThisRenderer(), but would need
something like setOutputType(OUTPUT_PDF), which would ultimately cause the
correct renderer to be selected.
This has already been discussed up and down. There is still the
problem that renderers may need a lot of renderer specific
configuration data. Take for example PDF encryption. I don't
think it is a good idea to always pass this in some generic
way through the Driver (or whatever the replacement) to the
renderer.
The other problem is the output. There are renderers writing to
a byte stream, there is the AWT and the print renderer which don't
write to a stream-like object at all, there may be renderers like
an SVG renderer which write a SAX event stream. Again, I don't
think the output should always be passed through the driver.
It seems quite natural to solve these problems by using separate
renderer objects
processor = new FOProcessor()
// configure processor
renderer = new PDFRenderer(output);
// configure renderer
processor.setRenderer(renderer);
If there is no complicated configuration (use all defaults)
you can still write with another constructor
processor = new FOProcessor();
processor.format(input,new PDFRenderer(output));
which should be easy enough for servlet programmers.

Note that TrAX uses a similar pattern, with Source and Result
abstracting a variety of input and output mechanisms to the
XSLT processor.
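
A minimal Java sketch of the separate-renderer-object pattern, for illustration only. The Renderer interface, the setOwnerPassword() option, and the CountingRenderer are hypothetical, and the classes named PDFRenderer and FOProcessor here are stand-ins written for this example, not the real FOP classes. The point is that the processor never sees output streams or renderer-specific settings.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical renderer contract: the processor only hands over laid-out pages.
interface Renderer {
    void renderPage(String pageContent);
}

// A stream-based renderer owns both its output and its own options.
class PDFRenderer implements Renderer {
    private final OutputStream out;
    private String ownerPassword; // renderer-specific setting (assumed name)

    PDFRenderer(OutputStream out) {
        this.out = out;
    }

    void setOwnerPassword(String password) {
        this.ownerPassword = password;
    }

    @Override
    public void renderPage(String pageContent) {
        // Placeholder output, not real PDF.
        String marker = ownerPassword == null ? "plain" : "encrypted";
        byte[] bytes = ("[" + marker + "] " + pageContent + "\n")
                .getBytes(StandardCharsets.UTF_8);
        try {
            out.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// A renderer with no stream at all, standing in for AWT or print output.
class CountingRenderer implements Renderer {
    int pages;

    @Override
    public void renderPage(String pageContent) {
        pages++;
    }
}

class FOProcessor {
    // The processor passes pages along and knows nothing about the output.
    void format(String[] pages, Renderer renderer) {
        for (String page : pages) {
            renderer.renderPage(page);
        }
    }
}
```

Usage would follow the two-line form above: construct the renderer with its output, configure it if needed, and pass it to format().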

[big snip]
Discussing the API may be fun, but IMO fixing bugs like the
broken text justification should get some attention too. The
best API is useless if the code inside doesn't work.

J.Pietschmann
Victor Mote
2003-06-20 16:32:41 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
In my vision of the API, the
servlet programmer would not need useThisRenderer(), but would need
something like setOutputType(OUTPUT_PDF), which would
ultimately cause the
Post by Victor Mote
correct renderer to be selected.
This has already been discussed up and down. There is still the
problem that renderers may need a lot of renderer specific
configuration data. Take for example PDF encryption. I don't
think it is a good idea to always pass this in some generic
way through the Driver (or whatever the replacement) to the
renderer.
I don't understand. The setOutputType() would be in the RenderType class,
which is one of the four classes that Driver would be split into. So there
might also be, for example, a setEncryptionKey() method to handle this. The
whole point of creating the three extra classes is to make the granularity
fine enough to properly control this kind of stuff. I'm sorry if I have
missed your point.
Post by Victor Mote
The other problem is the output. There are renderers writing to
a byte stream, there is the AWT and the print renderer which don't
write to a stream-like object at all, there may be renderers like
an SVG renderer which write a SAX event stream. Again, I don't
think the output should always be passed through the driver.
It seems quite natural to solve these problems by using separate
renderer objects
processor = new FOProcessor()
// configure processor
renderer = new PDFRenderer(output);
// configure renderer
processor.setRenderer(renderer);
If there is no complicated configuration (use all defaults)
you can still write with another constructor
processor = new FOProcessor();
processor.format(input,new PDFRenderer(output));
which should be easy enough for servlet programmers.
I think we are in agreement here, except that it looks like you are thinking
single Document, and I have added a mechanism (the Document class) that will
allow multiple Documents to be queued up and/or multithreaded.
Post by Victor Mote
Note that TrAX uses a similar pattern, with Source and Result
abstracting a variety of input and output mechanisms to the
XSLT processor.
[big snip]
Discussing the API may be fun, but IMO fixing bugs like the
broken text justification should get some attention too. The
best API is useless if the code inside doesn't work.
I assume you are talking about trunk here. If so, my reasoning for trying to
get API resolved (or really Control, from my perspective) is so that I can
continue with getting the layout cleanly separated and pluggable, which I
view as being the critical path toward project sanity. I have been trying
for a year now to work on fonts, and I can't get there from here.

Victor Mote
J.Pietschmann
2003-06-20 19:34:28 UTC
Permalink
Post by Victor Mote
I don't understand. The setOutputType() would be in the RenderType class,
which is one of the four classes that Driver would be split into. So there
might also be, for example, a setEncryptionKey() method to handle this. The
whole point of creating the three extra classes is to make the granularity
fine enough to properly control this kind of stuff. I'm sorry if I have
missed your point.
Could you outline your API ideas on the Wiki?
Post by Victor Mote
I think we are in agreement here, except that it looks like you are thinking
single Document, and I have added a mechanism (the Document class) that will
allow multiple Documents to be queued up and/or multithreaded.
I think processing multiple documents is a layer above processing
a single source document.
Post by Victor Mote
I assume you are talking about trunk here. If so, my reasoning for trying to
get API resolved (or really Control, from my perspective) is so that I can
continue with getting the layout cleanly separated and pluggable,
Not a chance, pal. The interface between layout and rendering is
much too complex, for example it includes the whole area tree mess.
If it would be possible to plug the old layout into it, I would have
done this a long time ago (rather: backporting the renderers to the
maintenance branch). Don't forget that SVG plays a role too, and you
can't have this feature broken because it's almost the only advantage
we have over other FO processors now.

J.Pietschmann
Jeremias Maerki
2003-06-20 19:40:09 UTC
Permalink
Post by J.Pietschmann
Could you outline your API ideas on the Wiki?
Yes, please. I'm also in the process of writing down my ideas. That way
it will be much easier to relate everyone's ideas to each other. At the
moment I wish we could all sit together and figure it out by talking
together and painting on a whiteboard. The Wiki will have to suffice I
guess.


Jeremias Maerki
Jeremias Maerki
2003-06-23 18:47:30 UTC
Permalink
I have done so now. I've added a new (sub)page to the Wiki to avoid
making the FOPAvalonization page even longer.

http://nagoya.apache.org/wiki/apachewiki.cgi?FOPAvalonization/AltAPIProposalJM

While writing down my thoughts about the API I have come to the
realization that I cannot make up my mind about the inner context
classes Victor has come up with. But I think he's quite close:

- Session: Looks like my and Jörg's FOProcessor to me. The user
interacts with this class to configure FOP and do processing runs.
- Document and RenderContext: I'm not sure but I think these two should
be merged. I've called it ProcessorContext in my proposal at first,
but then chose not to include them in the proposal right now because
I didn't quite know what to put in there. The thing I know is that we
need a central data object that keeps references while the FOProcessor
implementation coordinates the processing (data separated from logic).
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions. In my proposal I've
separated the logic from the data again (with JAXP as role-model) and
made the Renderer a totally normal Avalon service that is being looked
up by MIME type in the background. The FOPResult classes account for
the differences of output types. The FOProcessor impl is responsible
to establish the link between FOPResults and Renderer. For AWT (I'd
like to call it Java2D from now on if you guys agree. That's the
official name of the API after all.) I've tried to introduce an
interface that clients can use to interact with the Java2D renderer.

So, I don't have anything more concrete on the inner "glue" that keeps
the whole process together but maybe my proposal brings some new ideas.
I'd like to hear your thoughts about my proposal (flaws, nodding,
missing things etc.).

I hope that we can find a common path for the whole thing. Important to
me is to have a good terminology and an API that conforms to the
requirements I've written down on the FOPAvalonization page.

Side note WRT resolvers: I've only placed the SourceResolver in the API
because I think that the other resolvers are not necessary. I'm not
certain about that but this can be easily added later.
Post by Jeremias Maerki
Post by J.Pietschmann
Could you outline your API ideas on the Wiki?
Yes, please. I'm also in the process of writing down my ideas. That way
it will be much easier to relate everyone's ideas to each other. At the
moment I wish we could all sit together and figure it out by talking
together and painting on a whiteboard. The Wiki will have to suffice I
guess.
Jeremias Maerki
J.Pietschmann
2003-06-23 19:28:56 UTC
Permalink
Post by Jeremias Maerki
I have done so now. I've added a new (sub)page to the Wiki to avoid
making the FOPAvalonization page even longer.
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions.
Suppose the PDFRenderer needs a set of encryption options, as obtained from
the request in a web service environment (or the text renderer is
supposed to use the output encoding specified by the request). This
is parametrization, not configuration, and in any case not as easy
to press through a configure(File) or configure(InputStream).

Could you add an example how this would be handled in your proposal?
Post by Jeremias Maerki
Side note WRT resolvers: I've only placed the SourceResolver in the API
because I think that the other resolvers are not necessary. I'm not
certain about that but this can be easily added later.
Image and font resolvers are needed as hooks for users who want to
have their own caching implementation or want to implement metrics
access for fonts mapping to several odd files in all kind of even more
odd places. However, they can be implemented as Avalon services too.

And, well, the hyphenator is also a good Avalon service...

J.Pietschmann
Jeremias Maerki
2003-06-23 20:24:06 UTC
Permalink
Post by J.Pietschmann
Post by Jeremias Maerki
I have done so now. I've added a new (sub)page to the Wiki to avoid
making the FOPAvalonization page even longer.
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions.
Suppose the PDFRenderer needs a set of encryption options, as obtained from
the request in a web service environment (or the text renderer is
supposed to use the output encoding specified by the request). This
is parametrization, not configuration, and in any case not as easy
to press through a configure(File) or configure(InputStream).
Could you add an example how this would be handled in your proposal?
I've updated the Wiki page. The parametrization (as opposed to
configuration) is done through set/getOutputProperty (also see my
comment on the Wiki page). TrAX has setOutputProperty on the Transformer.
I've moved it to the Result class because that's the more intuitive
place for me.
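
For readers without the Wiki page at hand, a minimal Java sketch of parametrization through output properties on the result object. This is not the API from the proposal: the FOPResult shape, the "encoding" property name, the UTF-8 default, and the TextRendererStub class are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical result object: carries per-run parameters for whichever
// renderer ends up producing the output.
class FOPResult {
    private final String mimeType;
    private final Map<String, String> outputProperties = new HashMap<>();

    FOPResult(String mimeType) {
        this.mimeType = mimeType;
    }

    String getMimeType() {
        return mimeType;
    }

    void setOutputProperty(String name, String value) {
        outputProperties.put(name, value);
    }

    // Returns null when the property was never set.
    String getOutputProperty(String name) {
        return outputProperties.get(name);
    }
}

// Stand-in for a renderer looked up in the background; it reads its
// parameters from the result instead of from a configuration file.
class TextRendererStub {
    String describe(FOPResult result) {
        String encoding = result.getOutputProperty("encoding");
        return result.getMimeType() + " using " + (encoding == null ? "UTF-8" : encoding);
    }
}
```

A request handler would set the property on the result for that one run, leaving the shared configuration untouched.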
Post by J.Pietschmann
Post by Jeremias Maerki
Side note WRT resolvers: I've only placed the SourceResolver in the API
because I think that the other resolvers are not necessary. I'm not
certain about that but this can be easily added later.
Image and font resolvers are needed as hooks for users who want to
have their own caching implementation or want to implement metrics
access for fonts mapping to several odd files in all kind of even more
odd places. However, they can be implemented as Avalon services too.
Right. I think I'm not asking too much if I let someone, who wants
advanced functionality, implement an Avalon service and register it in
the main configuration file. Day-to-day usage is covered by the API,
special behaviour through the Avalon container.
Post by J.Pietschmann
And, well, the hyphenator is also a good Avalon service...
I guess so.


Jeremias Maerki
J.Pietschmann
2003-06-24 19:40:54 UTC
Permalink
Post by Jeremias Maerki
I've updated the Wiki page. The parametrization (as opposed to
configuration) is done through set/getOutputProperty (also see my
comment on the Wiki page).
Hmhm. The class you call FOPResult and subclasses has the same
role as what I called Renderer and subclasses. Your choice may be
more intuitive for many people.
Post by Jeremias Maerki
TraX has setOutputProperty on the Transformer.
Well, TrAX' output properties is a set of keyed properties each
with a relatively fixed set of values...

J.Pietschmann
Jeremias Maerki
2003-06-24 20:53:01 UTC
Permalink
It's not only meant to be intuitive (read: JAXP-like) but also to get
that Renderer thing into the background where it belongs IMHO. You also
don't get direct access to the serializer in JAXP, I think. This helps
decoupling the API from the inner workings which is one factor towards a
stable API.
Post by J.Pietschmann
Hmhm. The class you call FOPResult and subclasses has the same
role as what I called Renderer and subclasses. Your choice may be
more intuitive for many people.
Jeremias Maerki
Victor Mote
2003-06-26 16:40:58 UTC
Permalink
Post by Jeremias Maerki
I have done so now. I've added a new (sub)page to the Wiki to avoid
making the FOPAvalonization page even longer.
http://nagoya.apache.org/wiki/apachewiki.cgi?FOPAvalonization/AltAPIProposalJM
Post by Jeremias Maerki
While writing down my thoughts about the API I have come to the
realization that I cannot make up my mind about the inner context
- Session: Looks like my and Jörg's FOProcessor to me. The user
interacts with this class to configure FOP and do processing runs.
I think your name is fine. I am confused about whether it is an interface
(as written) or a class (I don't see any implementations). I guess I don't
understand the need for FOProcessorFactory, which seems to be an unnecessary
complication for the user. Since I don't understand Avalon, I am not sure
how factoring it into/out of this affects the API, but if that is needed, it
seems like it should be done in an implementation of FOProcessor instead of
creating separate Factories.
Post by Jeremias Maerki
- Document and RenderContext: I'm not sure but I think these two should
be merged. I've called it ProcessorContext in my proposal at first,
but then chose not to include them in the proposal right now because
I didn't quite know what to put in there. The thing I know is that we
need a central data object that keeps references while the FOProcessor
implementation coordinates the processing (data separated from logic).
RenderContext is only useful if you are trying to reuse an AreaTree for
multiple output options. I am frankly confused right now about whether the
dev team even wants to try to do that. I think it is a good idea, and
suspect that whatever difficulties have existed in implementing this in the
past are probably a result of our current tight coupling between FOTree,
layout, AreaTree, and Rendering, which is of course what I am trying to
unravel.

Your DocumentMetadata class holds information in it that would live in my
Document concept. That is information that is common to (but not necessarily
used by) all output media, and does not need to be set for each output
media. It looks like you are pushing the data that I envisioned in Document
and RenderContext down to RenderType/FOPResult. The net effect is that it
can't be reused.
Post by Jeremias Maerki
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions. In my proposal I've
separated the logic from the data again (with JAXP as role-model) and
made the Renderer a totally normal Avalon service that is being looked
up by MIME type in the background. The FOPResult classes account for
the differences of output types. The FOProcessor impl is responsible
to establish the link between FOPResults and Renderer. For AWT (I'd
like to call it Java2D from now on if you guys agree. That's the
official name of the API after all.) I've tried to introduce an
interface that clients can use to interact with the Java2D renderer.
Your FOPResult is I think roughly equivalent to what I am calling RenderType
or Renderer (somewhere along the way, I started calling mine RenderType to
make a more clear distinction between it and the workhorse Renderers), and
if I understand what you are doing, I think our goals are the same.
FOPResult is more of a control class, and the Renderer is a workhorse class.
Post by Jeremias Maerki
So, I don't have anything more concrete on the inner "glue" that keeps
the whole process together but maybe my proposal brings some new ideas.
I'd like to hear your thoughts about my proposal (flaws, nodding,
missing things etc.).
I hope that we can find a common path for the whole thing. Important to
me is to have a good terminology and an API that conforms to the
requirements I've written down on the FOPAvalonization page.
Here are the issues that I am still uncomfortable with:
1. complexity. In addition to the Factory issue already mentioned, I think
there is some benefit to arranging the data more intuitively for the user.
From the user's standpoint, there is input and output (ie. 2 classes). It
makes sense to have a third (Session/FOProcessor) primarily to facilitate
reuse. IMO, if the user has to interact with more than 3 classes to get the
work done, we are unnecessarily complex. (My 4th class, RenderContext, is
not exposed to the user).
2. reuse. I see no downside to arranging our objects to allow reuse, even if
we decide we don't want to facilitate it. This doesn't directly affect the
API (we can have 15 or 50 classes supporting those exposed to the user).

Sorry to be so slow responding.

Victor Mote
Jeremias Maerki
2003-06-26 17:19:37 UTC
Permalink
Post by Jeremias Maerki
Post by Jeremias Maerki
I have done so now. I've added a new (sub)page to the Wiki to avoid
making the FOPAvalonization page even longer.
http://nagoya.apache.org/wiki/apachewiki.cgi?FOPAvalonization/AltAPIProposalJM
Post by Jeremias Maerki
While writing down my thoughts about the API I have come to the
realization that I cannot make up my mind about the inner context
- Session: Looks like my and Jörg's FOProcessor to me. The user
interacts with this class to configure FOP and do processing runs.
I think your name is fine. I am confused about whether it is an interface
(as written) or a class (I don't see any implementations).
It's an API. You don't necessarily see implementing classes in the
specification. Compare to JAXP. Theoretically, RenderX could implement
the same API and we would have a common API for XSL-FO processing. It
would even be interesting to do an implementation of that API based on
the maintenance branch. It would probably not support all the features
but JAXP implementations also don't do that always.
Post by Jeremias Maerki
I guess I don't
understand the need for FOProcessorFactory, which seems to be an unnecessary
complication for the user. Since I don't understand Avalon, I am not sure
how factoring it into/out of this affects the API, but if that is needed, it
seems like it should be done in an implementation of FOProcessor instead of
creating separate Factories.
It's not an Avalon-specific reason why I specified the factory. For one,
it's almost the same as in JAXP (The whole API is, BTW). The distinction
between FOProcessorFactory and FOProcessor is the following:

FOProcessorFactory has a configuration (where do the fonts come from,
what's the caching strategy etc.etc.). You will want to have one such
object per environment where FOP is used.

FOProcessor could be argued to be superfluous if it weren't for the
add/removeEventListener methods which allow you to register a listener
to get events on a particular processing run (start page-sequence, new
page, content clipped, end page-sequence, this kind of thing). I
haven't come up with more, yet, but I can imagine that this class could
get additional methods later.

Basically you distinguish the basic environment from the processor run.
You wouldn't want to reload all the configuration stuff each time you
want to run FOP.
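
A minimal Java sketch of that split, assuming hypothetical class shapes (the method names newFOProcessor(), getSetting(), and process(), the ProcessingEventListener interface, and the event strings are illustrative only and are not taken from the Wiki proposal). The factory holds configuration that is loaded once; each FOProcessor represents one run and carries its own listeners.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener for events raised during one processing run.
interface ProcessingEventListener {
    void onEvent(String event);
}

// One processing run: cheap to create, carries its own listeners.
class FOProcessor {
    private final Map<String, String> configuration;
    private final List<ProcessingEventListener> listeners = new ArrayList<>();

    FOProcessor(Map<String, String> configuration) {
        this.configuration = configuration;
    }

    String getSetting(String name) {
        return configuration.get(name);
    }

    void addEventListener(ProcessingEventListener listener) {
        listeners.add(listener);
    }

    void removeEventListener(ProcessingEventListener listener) {
        listeners.remove(listener);
    }

    // Pretends to process a document with the given number of pages.
    void process(int pageCount) {
        fire("start page-sequence");
        for (int page = 1; page <= pageCount; page++) {
            fire("new page " + page);
        }
        fire("end page-sequence");
    }

    private void fire(String event) {
        for (ProcessingEventListener listener : listeners) {
            listener.onEvent(event);
        }
    }
}

// The environment: configured once, then reused for every run.
class FOProcessorFactory {
    private final Map<String, String> configuration;

    FOProcessorFactory(Map<String, String> configuration) {
        // Unmodifiable copy so individual runs cannot change shared settings.
        this.configuration = Collections.unmodifiableMap(new HashMap<>(configuration));
    }

    FOProcessor newFOProcessor() {
        return new FOProcessor(configuration);
    }
}
```

A servlet would keep one factory for its lifetime and ask it for a fresh FOProcessor per request, registering listeners only on the runs it wants to observe.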

A nice side-effect is the AvalonFOProcessorFactory that should fit nicely
as a component for use in Avalon-enabled systems (mainly Cocoon).
Post by Jeremias Maerki
Post by Jeremias Maerki
- Document and RenderContext: I'm not sure but I think these two should
be merged. I've called it ProcessorContext in my proposal at first,
but then chose not to include them in the proposal right now because
I didn't quite know what to put in there. The thing I know is that we
need a central data object that keeps references while the FOProcessor
implementation coordinates the processing (data separated from logic).
RenderContext is only useful if you are trying to reuse an AreaTree for
multiple output options. I am frankly confused right now about whether the
dev team even wants to try to do that.
From the first discussion we had I got the impression that it's not
worth the pains right now. In the worst case you simply hold the stuff
from RenderContext in the ProcessorContext.
Post by Jeremias Maerki
I think it is a good idea, and
suspect that whatever difficulties have existed in implementing this in the
past are probably a result of our current tight coupling between FOTree,
layout, AreaTree, and Rendering, which is of course what I am trying to
unravel.
Your DocumentMetadata class holds information in it that would live in my
Document concept. That is information that is common to (but not necessarily
used by) all output media, and does not need to be set for each output
media. It looks like you are pushing the data that I envisioned in Document
and RenderContext down to RenderType/FOPResult. The net effect is that it
can't be reused.
Of course, it can be reused. Nothing prevents you from passing in the
same DocumentMetadata into the FOPResult. But nobody will want to do
that anyway because metadata like "title" will probably change with each
document being processed. I think we don't talk about the same thing.

As I said before, what I'm trying to do is separate the inner workings
from the API as well as possible. Things from FOPResult (including the
metadata) will probably be placed in the RenderContext so you'll have
access to it if necessary. I see DocumentMetadata only as a pass-through
thing to the Renderer.
Post by Jeremias Maerki
Post by Jeremias Maerki
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions. In my proposal I've
separated the logic from the data again (with JAXP as role-model) and
made the Renderer a totally normal Avalon service that is being looked
up by MIME type in the background. The FOPResult classes account for
the differences of output types. The FOProcessor impl is responsible
to establish the link between FOPResults and Renderer. For AWT (I'd
like to call it Java2D from now on if you guys agree. That's the
official name of the API after all.) I've tried to introduce an
interface that clients can use to interact with the Java2D renderer.
Your FOPResult is I think roughly equivalent to what I am calling RenderType
or Renderer (somewhere along the way, I started calling mine RenderType to
make a more clear distinction between it and the workhorse Renderers), and
if I understand what you are doing, I think our goals are the same.
FOPResult is more of a control class, and the Renderer is a workhorse class.
Right, but I think RenderType is not the right name for this because it
only suggests information about which renderer is to be used. It
omits all the nice parameters you want to pass over.
Post by Jeremias Maerki
Post by Jeremias Maerki
So, I don't have anything more concrete on the inner "glue" that keeps
the whole process together but maybe my proposal brings some new ideas.
I'd like to hear your thoughts about my proposal (flaws, nodding,
missing things etc.).
I hope that we can find a common path for the whole thing. Important to
me is to have a good terminology and an API that conforms to the
requirements I've written down on the FOPAvalonization page.
1. complexity. In addition to the Factory issue already mentioned, I think
there is some benefit to arranging the data more intuitively for the user.
From the user's standpoint, there is input and output (ie. 2 classes). It
makes sense to have a third (Session/FOProcessor) primarily to facilitate
reuse. IMO, if the user has to interact with more than 3 classes to get the
work done, we are unnecessarily complex. (My 4th class, RenderContext, is
not exposed to the user).
As I said, it's the same as JAXP. I don't see why people will have a
problem with that. It's a clean separation into input, output, setup and
hook into the processing run.
Post by Jeremias Maerki
2. reuse. I see no downside to arranging our objects to allow reuse, even if
we decide we don't want to facilitate it. This doesn't directly affect the
API (we can have 15 or 50 classes supporting those exposed to the user).
The only thing I would reuse in my API is the factory instance because
this is the place where the Avalon Container would sit and which would
manage all the inner services. The image cache service for example.
Per-processing-run caching would be hooked into the ProcessorContext.
It's the FOProcessorFactory implementation that will be heavy-weight,
the rest is all light-weight.
Post by Jeremias Maerki
Sorry to be so slow responding.
No need, happens to me, too.


Jeremias Maerki
Victor Mote
2003-06-26 18:13:06 UTC
Permalink
Post by Jeremias Maerki
Post by Victor Mote
I think your name is fine. I am confused about whether it is an interface
(as written) or a class (I don't see any implementations).
It's an API. You don't necessarily see implementing classes in the
specification. Compare to JAXP. Theoretically, RenderX could implement
the same API and we would have a common API for XSL-FO processing. It
would even be interesting to do an implementation of that API based on
the maintenance branch. It would probably not support all the features
but JAXP implementations also don't do that always.
Well, it's a whole lot more than an API and there are some implementing
classes in your spec. However, thanks for clarifying.
Post by Jeremias Maerki
Post by Victor Mote
I guess I don't understand the need for FOProcessorFactory, which seems to be
an unnecessary complication for the user. Since I don't understand Avalon, I
am not sure how factoring it into/out of this affects the API, but if that
is needed, it seems like it should be done in an implementation of
FOProcessor instead of creating separate Factories.
It's not an Avalon-specific reason why I specified the factory. For one,
it's almost the same as in JAXP (The whole API is, BTW). The distinction:
FOProcessorFactory has a configuration (where do the fonts come from,
what's the caching strategy, etc.). You will want to have one such
object per environment where FOP is used.
FOProcessor could be argued to be superfluous if it weren't for the
add/removeEventListener methods which allow you to register a listener
to get events on a particular processing run (start page-sequence, new
page, content clipped, end page-sequence, that kind of thing). I
haven't come up with more, yet, but I can imagine that this class could
get additional methods later.
Basically you distinguish the basic environment from the processor run.
You wouldn't want to reload all the configuration stuff each time you
want to run FOP.
OK, so FOProcessorFactory is roughly equivalent to Session and FOProcessor
is roughly equivalent to Document (I don't care about the terminology, I'm
just trying to find where we are thinking the same and where we are not).
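
The factory/processor split discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed API: the names SketchProcessorFactory, SketchProcessor and ProcessingListener are hypothetical, the "configuration" is reduced to a single font directory string, and process() merely fires events instead of formatting anything. It shows a heavyweight factory that holds configuration once, and lightweight per-run processors that carry run state and event listeners.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback for events of one processing run. */
interface ProcessingListener {
    void onEvent(String eventName);
}

/** Lightweight per-run object. It holds run state, so use one per run/thread. */
final class SketchProcessor {
    private final String fontBaseDir;
    private final List<ProcessingListener> listeners = new ArrayList<>();
    private int pageCount;

    SketchProcessor(String fontBaseDir) {
        this.fontBaseDir = fontBaseDir;
    }

    void addEventListener(ProcessingListener l) { listeners.add(l); }

    void removeEventListener(ProcessingListener l) { listeners.remove(l); }

    /** Stand-in for a real run: counts pages and fires one event per page. */
    void process(int pages) {
        fire("start page-sequence");
        for (int i = 0; i < pages; i++) {
            pageCount++;
            fire("new page");
        }
        fire("end page-sequence");
    }

    /** State that can still be queried after the run has finished. */
    int getPageCount() { return pageCount; }

    String getFontBaseDir() { return fontBaseDir; }

    private void fire(String name) {
        for (ProcessingListener l : listeners) {
            l.onEvent(name);
        }
    }
}

/** Heavyweight shared object: configuration is set up once per environment. */
final class SketchProcessorFactory {
    private final String fontBaseDir;

    SketchProcessorFactory(String fontBaseDir) {
        this.fontBaseDir = fontBaseDir;
    }

    /** Each call returns a fresh, preconfigured processor. */
    SketchProcessor newProcessor() {
        return new SketchProcessor(fontBaseDir);
    }
}
```

In this sketch the factory plays the role Victor calls Session and the processor the role he calls Document; whether that mapping holds for the real proposal is exactly what the thread is trying to settle.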
Post by Jeremias Maerki
Post by Victor Mote
RenderContext is only useful if you are trying to reuse an AreaTree for
multiple output options. I am frankly confused right now about whether the
dev team even wants to try to do that.
From the first discussion we had I got the impression that it's not
worth the pains right now. In the worst case you simply hold the stuff
from RenderContext in the ProcessorContext.
OK. Since it doesn't affect the API, we can always add it back later if
needed.
Post by Jeremias Maerki
Post by Victor Mote
I think it is a good idea, and
suspect that whatever difficulties have existed in implementing this in the
past are probably a result of our current tight coupling between FOTree,
layout, AreaTree, and Rendering, which is of course what I am trying to
unravel.
Your DocumentMetadata class holds information in it that would live in my
Document concept. That is information that is common to (but not necessarily
used by) all output media, and does not need to be set for each output
media. It looks like you are pushing the data that I envisioned in Document
and RenderContext down to RenderType/FOPResult. The net effect is that it
can't be reused.
Of course, it can be reused. Nothing prevents you from passing in the
same DocumentMetadata into the FOPResult. But nobody will want to do
that anyway because metadata like "title" will probably change with each
document being processed. I think we don't talk about the same thing.
That is not reuse. The metadata example is a trivial one. A Collection of
Fonts used and Fonts to be embedded would be a more important one. However,
I don't care. You are correct that we aren't talking about the same thing.
Post by Jeremias Maerki
As I said before, what I'm trying to do is separate the inner workings
from the API as well as possible. Things from FOPResult (including the
metadata) will probably be placed in the RenderContext so you'll have
access to it if necessary. I see DocumentMetadata only as a pass-through
thing to the Renderer.
Post by Victor Mote
Post by Jeremias Maerki
- Renderer: You guys hate me for that, I know, but I still refuse to
give it so much visibility in these discussions. In my proposal I've
separated the logic from the data again (with JAXP as role-model) and
made the Renderer a totally normal Avalon service that is being looked
up by MIME type in the background. The FOPResult classes account for
the differences of output types. The FOProcessor impl is responsible
to establish the link between FOPResults and Renderer. For AWT (I'd
like to call it Java2D from now on if you guys agree. That's the
official name of the API after all.) I've tried to introduce an
interface that clients can use to interact with the Java2D renderer.
Your FOPResult is I think roughly equivalent to what I am calling RenderType
or Renderer (somewhere along the way, I started calling mine RenderType to
make a more clear distinction between it and the workhorse Renderers), and
if I understand what you are doing, I think our goals are the same.
FOPResult is more of a control class, and the Renderer is a workhorse class.
Right, but I think RenderType is not the right name for this because it
only suggests information about which renderer is to be used. It
omits all the nice parameters you want to pass over.
I don't understand your last statement, but I agree that FOPResult is a
better name than RenderType.
Post by Jeremias Maerki
Post by Victor Mote
Post by Jeremias Maerki
So, I don't have anything more concrete on the inner "glue" that keeps
the whole process together but maybe my proposal brings some new ideas.
I'd like to hear your thoughts about my proposal (flaws, nodding,
missing things etc.).
I hope that we can find a common path for the whole thing. Important to
me is to have a good terminology and an API that conforms to the
requirements I've written down on the FOPAvalonization page.
1. complexity. In addition to the Factory issue already mentioned, I think
there is some benefit to arranging the data more intuitively for the user.
From the user's standpoint, there is input and output (ie. 2 classes). It
makes sense to have a third (Session/FOProcessor) primarily to facilitate
reuse. IMO, if the user has to interact with more than 3 classes to get the
work done, we are unnecessarily complex. (My 4th class, RenderContext, is
not exposed to the user).
As I said, it's the same as JAXP. I don't see why people will have a
problem with that. It's a clean separation into input, output, setup and
hook into the processing run.
OK.
Post by Jeremias Maerki
Post by Victor Mote
2. reuse. I see no downside to arranging our objects to allow reuse, even if
we decide we don't want to facilitate it. This doesn't directly affect the
API (we can have 15 or 50 classes supporting those exposed to the user).
The only thing I would reuse in my API is the factory instance because
this is the place where the Avalon Container would sit and which would
manage all the inner services. The image cache service for example.
Per-processing-run caching would be hooked into the ProcessorContext.
It's the FOProcessorFactory implementation that will be heavy-weight,
the rest is all light-weight.
Hmmm. The whole thing seems pretty heavyweight to me (at least compared to
my proposal). Since my real interest is Control and not API, I think the
most productive thing for me to do is to simply defer on the API issues to
those of you who care more about them. As long as we can build whatever
Control infrastructure under that API that we want to (and it looks like we
can), and if you guys are happy with the API, we can defer that part of the
discussion until another day. It may very well be that when I see how your
vision is implemented, I will find the Control structures that (I think) I
need and that we are done anyway.

Victor Mote
Jeremias Maerki
2003-06-26 18:42:53 UTC
Permalink
Post by Victor Mote
Well, it's a whole lot more than an API and there are some implementing
classes in your spec. However, thanks for clarifying.
Only two classes (DefaultFOProcessorFactory and AvalonFOProcessorFactory)
to show how the two worlds can be made more or less compatible.
Post by Victor Mote
OK, so FOProcessorFactory is roughly equivalent to Session and FOProcessor
is roughly equivalent to Document (I don't care about the terminology, I'm
just trying to find where we are thinking the same and where we are not).
I guess so.
Post by Victor Mote
Post by Jeremias Maerki
Of course, it can be reused. Nothing prevents you from passing in the
same DocumentMetadata into the FOPResult. But nobody will want to do
that anyway because metadata like "title" will probably change with each
document being processed. I think we don't talk about the same thing.
That is not reuse. The metadata example is a trivial one. A Collection of
Fonts used and Fonts to be embedded would be a more important one. However,
I don't care. You are correct that we aren't talking about the same thing.
Then you mean reuse in the context of producing multiple output formats
in one rendering run?
Post by Victor Mote
Post by Jeremias Maerki
Right, but I think RenderType is not the right name for this because it
only suggests information about which renderer is to be used. It
omits all the nice parameters you want to pass over.
I don't understand your last statement, but I agree that FOPResult is a
better name than RenderType.
Let's try different wording: The name "RenderType" suggests that it is an
enumeration or a parameter, but it's more than that.
Post by Victor Mote
Hmmm. The whole thing seems pretty heavyweight to me (at least compared to
my proposal). Since my real interest is Control and not API, I think the
most productive thing for me to do is to simply defer on the API issues to
those of you who care more about them. As long as we can build whatever
Control infrastructure under that API that we want to (and it looks like we
can), and if you guys are happy with the API, we can defer that part of the
discussion until another day. It may very well be that when I see how your
vision is implemented, I will find the Control structures that (I think) I
need and that we are done anyway.
I think we're heading in the right direction. I think my API proposal
would enable a stable API and you can concentrate on what's necessary to
do inside FOP to handle the whole processing. You can even experiment
over time, changing the inner glue without changing the API. And a
stable and flexible API is one of the most important things to me. We
changed the Driver class too many times. Hopefully, my proposal could
improve this. Just trying to decouple...

But I disagree with the heavyweight thing. It's only the most necessary
things well separated by topic. Expand on your proposal to enable all
the functionality my proposal covers. I wonder how lightweight yours
stays.


Jeremias Maerki
Glen Mazza
2003-06-26 19:06:04 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
I don't understand your last statement, but I agree that FOPResult is a
better name than RenderType.
Let's try different wording: The name "RenderType" suggests that it is an
enumeration or a parameter, but it's more than that.
Errr...I may not understand everything here, but
aren't our inputs *just*:

1) xsl-fo stream (file, DOM Document, or inputStream)
2) render type (*is* either an enumeration directly,
or could be represented as such--PDF, PS, etc.)

Shouldn't we be leery of "render options" where one
specifies properties of how the output should look
outside of what is specified by the XSL-FO file? (If
there are output properties that cannot be specified
sufficiently by XSL-FO 1.0, well, that's what FOP
extensions or XSL-FO 2.0 would be for, correct?)

Just curious what others are thinking here.

Glen


__________________________________
Do you Yahoo!?
SBC Yahoo! DSL - Now only $29.95 per month!
http://sbc.yahoo.com
J.Pietschmann
2003-06-26 19:47:47 UTC
Permalink
Post by Glen Mazza
Shouldn't we be leery of "render options" where one
specifies properties of how the output should look
outside of what is specified by the XSL-FO file?
PDF encryption? Printer options? Text encoding? MIF version?
Post by Glen Mazza
(If
there are output properties that cannot be specified
sufficiently by XSL-FO 1.0, well, that's what FOP
extensions or XSL-FO 2.0 would be for, correct?)
I don't think the WG will accept output format specific stuff in
the FO source. I personally don't think this is a good idea either.

J.Pietschmann
Glen Mazza
2003-06-26 20:09:12 UTC
Permalink
OK, thanks for the enlightenment.
Post by J.Pietschmann
Post by Glen Mazza
Shouldn't we be leery of "render options" where one
specifies properties of how the output should look
outside of what is specified by the XSL-FO file?
PDF encryption? Printer options? Text encoding? MIF version?
Post by Glen Mazza
(If
there are output properties that cannot be specified
sufficiently by XSL-FO 1.0, well, that's what FOP
extensions or XSL-FO 2.0 would be for, correct?)
I don't think the WG will accept output format specific stuff in
the FO source. I personally don't think this is a good idea either.
J.Pietschmann
Victor Mote
2003-06-26 19:48:10 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
That is not reuse. The metadata example is a trivial one. A Collection of
Fonts used and Fonts to be embedded would be a more important one. However,
I don't care. You are correct that we aren't talking about the same thing.
Then you mean reuse in the context of producing multiple output formats
in one rendering run?
Yes. Even if we think we don't want that now, we should have the flexibility
to add it later.
Post by Victor Mote
Post by Victor Mote
Hmmm. The whole thing seems pretty heavyweight to me (at least compared to
my proposal). Since my real interest is Control and not API, I think the
most productive thing for me to do is to simply defer on the API issues to
those of you who care more about them. As long as we can build whatever
Control infrastructure under that API that we want to (and it looks like we
can), and if you guys are happy with the API, we can defer that part of the
discussion until another day. It may very well be that when I see how your
vision is implemented, I will find the Control structures that (I think) I
need and that we are done anyway.
I think we're heading in the right direction. I think my API proposal
would enable a stable API and you can concentrate on what's necessary to
do inside FOP to handle the whole processing. You can even experiment
over time, changing the inner glue without changing the API. And a
stable and flexible API is one of the most important things to me. We
changed the Driver class too many times. Hopefully, my proposal could
improve this. Just trying to decouple...
I suspect that the reason the Driver class had to change so often is that it
was doing duty for three different hierarchical concepts by my count. As
long as we have all three of them accounted for with your proposal, I'm OK
with it (although see one afterthought below).
Post by Victor Mote
But I disagree with the heavyweight thing. It's only the most necessary
things well separated by topic. Expand on your proposal to enable all
the functionality my proposal covers. I wonder how lightweight yours
stays.
Except for Avalonization (which I admit I don't understand, but which I
understood not to be a factor here), I don't see what functionality your
model provides over mine. On the Avalonization issue, I guess the easy way
to get to the heart of the matter is to simply ask whether Avalonization can
be implemented with the model I have proposed or not.

I did have one afterthought that I think I should mention. The JAXP thing
caught me a bit off-guard, and I only just now sorted out in my mind why. I
don't use these tools often enough or in enough detail to keep this fresh in
my mind (but I think I have the concepts straight), so please correct me if
I am wrong. If I want to use Xalan, for example, I could use a Xalan
interface, or I could use a Xalan implementation of the JAXP interface. You
mentioned in an earlier posting that the functionality between the two
didn't necessarily need to match up. This implies (and makes sense when you
think about interfaces) that it is a lowest-common-denominator. There may be
times when, to get the full richness of an implementation, you have to use
things that are specific to it because the higher-level interface couldn't
and shouldn't force all implementations to implement that feature. FOP and
RenderX might (and probably already have) evolve(d) to meet different
use-cases. So it makes sense to me to use a simple lightweight API for FOP
(if possible), and if we want to build an industry-standard API like that
which you have proposed, that it be done in a separate project or at least
package, and probably be done in concurrence with the other implementors.
That would be my preference, but I do not feel strongly enough about it to
do much more than mention it. I think this distinction probably also
explains the different wavelengths that we seem to be on.

Victor Mote
Jeremias Maerki
2003-06-28 10:45:30 UTC
Permalink
Post by Victor Mote
Post by Jeremias Maerki
Then you mean reuse in the context of producing multiple output formats
in one rendering run?
Yes. Even if we think we don't want that now, we should have the flexibility
to add it later.
Granted. I thought about that myself. I don't see why my API proposal
couldn't be enhanced (later), for example, with a MultipleFOPResult that
contains/holds a list of FOPResults. That would be a
backwards-compatible and even quite intuitive (See GoF Composite) way to
handle the case.
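
A minimal sketch of that Composite idea, assuming a much-simplified FOPResult: the real proposal's FOPResult is not shown in this thread, so the abstract class below, the MimeResultSketch leaf and the leaves() method are all hypothetical. MultipleFOPResult is the name suggested above. The point is only that a composite result can be passed wherever a single result is accepted, which is what would keep the change backwards-compatible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified stand-in for the proposal's FOPResult. */
abstract class FOPResult {
    /** Flattens this result into the concrete outputs to be produced. */
    abstract List<FOPResult> leaves();
}

/** Hypothetical leaf: a single output identified by its MIME type. */
final class MimeResultSketch extends FOPResult {
    final String mimeType;

    MimeResultSketch(String mimeType) {
        this.mimeType = mimeType;
    }

    @Override
    List<FOPResult> leaves() {
        return Collections.<FOPResult>singletonList(this);
    }
}

/** Composite: holds a list of results and is itself a result. */
final class MultipleFOPResult extends FOPResult {
    private final List<FOPResult> children = new ArrayList<>();

    MultipleFOPResult add(FOPResult child) {
        children.add(child);
        return this;
    }

    @Override
    List<FOPResult> leaves() {
        // Recurse so that nested composites are flattened as well.
        List<FOPResult> out = new ArrayList<>();
        for (FOPResult child : children) {
            out.addAll(child.leaves());
        }
        return out;
    }
}
```

A processor that only knows how to call leaves() would treat a single result and a MultipleFOPResult the same way, so existing client code would not need to change.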
Post by Victor Mote
Except for Avalonization (which I admit I don't understand, but which I
understood not to be a factor here), I don't see what functionality your
model provides over mine. On the Avalonization issue, I guess the easy way
to get to the heart of the matter is to simply ask whether Avalonization can
be implemented with the model I have proposed or not.
I can't answer that because I have the impression that your proposal is
incomplete. What I'm missing:
- How to specify output properties like encryption parameters
- How to handle Java2D?
- How to handle SAX input?
- How to handle FOP configuration?

On the other hand, it doesn't matter so much. I think it is important to
separate the inner glue from the client interface, so as to isolate FOP's
Avalon container (or whatever we will use/do) from its environment so
FOP provides a blackbox interface to the outside. Only in the
Cocoon-context should there be a possibility to pass in a parent
ServiceManager so FOP can use Cocoon's SourceResolver, parser factories
etc. Your proposal can probably be made to handle that.
Post by Victor Mote
I did have one afterthought that I think I should mention. The JAXP thing
caught me a bit off-guard, and I only just now sorted out in my mind why. I
don't use these tools often enough or in enough detail to keep this fresh in
my mind (but I think I have the concepts straight), so please correct me if
I am wrong. If I want to use Xalan, for example, I could use a Xalan
interface, or I could use a Xalan implementation of the JAXP interface.
The problem with this example is that I can't think of any reason why I
would need a Xalan-specific API (if there is still one). Xalan-1 had a
proprietary one. That got deprecated with the arrival of TraX/JAXP. But
my statement is not really based on extensive research. Maybe there ARE
use cases.
Post by Victor Mote
You
mentioned in an earlier posting that the functionality between the two
didn't necessarily need to match up.
Right, but having two different client APIs should be avoided
because you will have to maintain and more importantly support them.
Post by Victor Mote
This implies (and makes sense when you
think about interfaces) that it is a lowest-common-denominator.
Yes, but holding that open for extensions:
- Ability to add new subclasses of FOPResult which only certain
implementations support.
- Ability to pass in special output property objects to control special
features.
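
One way the second extension point could look, purely as an illustration: a result object carries optional, typed output property objects, and an implementation that does not know a given property type simply never asks for it. Every name here (OutputProperty, PdfEncryptionSketch, PrinterTraySketch, OutputPropertiesSketch) is hypothetical and not taken from the proposal.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical marker for implementation-specific output settings. */
interface OutputProperty {
}

/** Hypothetical property object, e.g. for PDF encryption parameters. */
final class PdfEncryptionSketch implements OutputProperty {
    final String userPassword;
    final boolean allowPrinting;

    PdfEncryptionSketch(String userPassword, boolean allowPrinting) {
        this.userPassword = userPassword;
        this.allowPrinting = allowPrinting;
    }
}

/** A second hypothetical property that some implementations may not support. */
final class PrinterTraySketch implements OutputProperty {
    final int tray;

    PrinterTraySketch(int tray) {
        this.tray = tray;
    }
}

/** Type-keyed bag of properties that a result object could carry. */
final class OutputPropertiesSketch {
    private final Map<Class<?>, OutputProperty> props = new HashMap<>();

    <T extends OutputProperty> void set(Class<T> type, T value) {
        props.put(type, value);
    }

    /** Returns the property of the given type, or null if none was set. */
    <T extends OutputProperty> T get(Class<T> type) {
        return type.cast(props.get(type));
    }
}
```

The shared API would then only need to know the marker interface, while each implementation defines and reads its own property classes.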
Post by Victor Mote
There may be
times when, to get the full richness of an implementation, you have to use
things that are specific to it because the higher-level interface couldn't
and shouldn't force all implementations to implement that feature. FOP and
RenderX might (and probably already have) evolve(d) to meet different
use-cases. So it makes sense to me to use a simple lightweight API for FOP
(if possible), and if we want to build an industry-standard API like that
which you have proposed, that it be done in a separate project or at least
package, and probably be done in concurrence with the other implementors.
You're right again. But then it happens that a good API gets adopted by
other parties. At any rate I'll gladly participate in formulating a
standard API as a separate project as my time allows it.

RenderX guys: If you're listening in, I'd like to hear your thoughts
about having/creating a common client API for XSL-FO processing in Java.
Post by Victor Mote
That would be my preference, but I do not feel strongly enough about it to
do much more than mention it. I think this distinction probably also
explains the different wavelengths that we seem to be on.
The biggest difference probably arises from your focus on the inner glue
and my focus on the client API. Your thoughts are very valuable but they
should evolve in the context of the inner glue because that's where you
invested most of your brain time, I guess.

Jeremias Maerki
J.Pietschmann
2003-06-28 11:23:21 UTC
Permalink
Post by Jeremias Maerki
The problem with this example is that I can't think of any reason why I
would need a Xalan-specific API (if there is still one).
Every time you need to do something which is not in JAXP, like
using an XPath to select nodes from a DOM.

J.Pietschmann
Jeremias Maerki
2003-06-28 11:35:34 UTC
Permalink
Ok, but that's a special purpose API that has (almost) nothing to do
with XSL transformation. And it's only a proprietary API because there
is no standard API for this.
Post by J.Pietschmann
Post by Jeremias Maerki
The problem with this example is that I can't think of any reason why I
would need a Xalan-specific API (if there is still one).
Every time you need to do something which is not in JAXP, like
using an XPath to select nodes from a DOM.
Jeremias Maerki
Victor Mote
2003-06-28 16:57:00 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
Except for Avalonization (which I admit I don't understand, but which I
understood not to be a factor here), I don't see what functionality your
model provides over mine. On the Avalonization issue, I guess the easy way
to get to the heart of the matter is to simply ask whether Avalonization can
be implemented with the model I have proposed or not.
I can't answer that because I have the impression that your proposal is
incomplete. What I'm missing:
- How to specify output properties like encryption parameters
- How to handle Java2D?
- How to handle SAX input?
- How to handle FOP configuration?
My proposal was not contemplating any significant changes to any of these.
Encryption parameters would be stored/set in the FOPResult object (I've
adopted your terms here even though the concepts might be a bit off -- I
don't know whether that is more or less confusing). Java2D would be a
FOPResult option, which would drive the creation of a RenderContext object
to handle the AreaTree creation for that scheme. SAX input would be handled
as it is now. FOP configuration would be handled as it is now, except for
the changes to the object model. Perhaps I am not catching the intent of
your questions?
Post by Victor Mote
Post by Victor Mote
This implies (and makes sense when you
think about interfaces) that it is a lowest-common-denominator.
Yes, but holding that open for extensions:
- Ability to add new subclasses of FOPResult which only certain
implementations support.
- Ability to pass in special output property objects to control special
features.
OK. One that is on the top of my mind is pluggable layout aka
LayoutStrategy. Please walk me briefly through how that might work at the
API level, considering that other implementations might not want/need such a
thing, and how to handle defaults. What I am really grappling with here is
whether you build such a concept into a shared, industry-wide API, or treat
it as an extension, and if it is an extension, what the ramifications are,
especially to those using the API. Specifically, if the outer shell of the
API doesn't handle the extension, don't you end up kind of building (via
subclassing or whatever) a separate API underneath the outer shell that is
FOP-specific?

(BTW, the above paragraph is not asking anyone to endorse the idea of
pluggable layout, but simply using the concept as a specific example of the
type of thing that we might want to have the flexibility to handle).
Post by Victor Mote
Post by Victor Mote
There may be
times when, to get the full richness of an implementation, you have to use
things that are specific to it because the higher-level interface couldn't
and shouldn't force all implementations to implement that feature. FOP and
RenderX might (and probably already have) evolve(d) to meet different
use-cases. So it makes sense to me to use a simple lightweight API for FOP
(if possible), and if we want to build an industry-standard API like that
which you have proposed, that it be done in a separate project or at least
package, and probably be done in concurrence with the other implementors.
You're right again. But then it happens that a good API gets adopted by
other parties. At any rate I'll gladly participate in formulating a
standard API as a separate project as my time allows it.
If there are any changes to the one that we propose, then we *must* change
our API in order to meet it (one of the things we have specifically said we
don't want to do), or not be compliant. And I don't think we want to wait
for such a process anyway.
Post by Victor Mote
RenderX guys: If you're listening in, I'd like to hear your thoughts
about having/creating a common client API for XSL-FO processing in Java.
BTW, we'd better change the name of FOPResult if we're going to try to sell
this to a wider audience :-)
Post by Victor Mote
The biggest difference probably arises from your focus on the inner glue
and my focus on the client API. Your thoughts are very valuable but they
should evolve in the context of the inner glue because that's where you
invested most of your brain time, I guess.
That's fair. I see the API as a function of the object model, and think the
best way to assure a stable API is to get the object model to better match
our processing needs. To the extent that the layer of API objects that you
have designed meets that end, I think we'll be OK.

If you and Joerg are satisfied that there are no major gotchas lurking, I am
quite happy to follow along.

Victor Mote
J.Pietschmann
2003-06-28 17:23:36 UTC
Permalink
Post by Victor Mote
BTW, we'd better change the name of FOPResult if we're going to try to sell
this to a wider audience :-)
abbr for FormattingObjectsProcessingResult. No need to change :-)

J.Pietschmann
Victor Mote
2003-06-28 17:40:57 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
BTW, we'd better change the name of FOPResult if we're going to try to sell
this to a wider audience :-)
abbr for FormattingObjectsProcessingResult. No need to change :-)
Just don't be surprised to see a counter-proposal RenderXResult, an abbr
for:
Really Exciting Newly Designed Eclectic Renderer for XML Result :-)

Victor Mote
Glen Mazza
2003-06-29 20:10:54 UTC
Permalink
I would judge the notion that FOP and RenderX share
common goals to be somewhat sentimental; perhaps we
should wait until Coke and Pepsi form such warm
collaborative bonds first!

At any rate, instead of "FOPResult", since we already
have an InputHandler class (and subclasses where
needed), perhaps we should call it OutputHandler?

Glen
Post by Victor Mote
Post by Victor Mote
BTW, we'd better change the name of FOPResult if
we're going to try to sell
Post by Victor Mote
this to a wider audience :-)
abbr for FormattingObjectsProcessingResult. No
need to change :-)
J.Pietschmann

J.Pietschmann
2003-06-26 19:40:04 UTC
Permalink
Post by Victor Mote
I guess I don't
understand the need for FOProcessorFactory, which seems to be an unnecessary
complication for the user.
It has something to do with the GoF Factory pattern. This means you can
choose the implementation of the FOProcessorAPI by setting a Java property,
or some similar mechanism at run time instead of choosing it at compile
time. Granted, this seems of not much use given that FOP will be the only
implementation at least for some time. There is another aspect: the
FOProcessorFactory holds a default configuration and is a mechanism to
quickly create preconfigured FOProcessor objects. You just can't use
a single FOProcessor, because it holds a state while rendering and is
therefore not MT safe in itself. Unless you can live with blocked threads
this means you have to create a FOProcessor for every thread. Furthermore
you might want to keep FOProcessor objects some time around after rendering
has finished in order to inquire the number of rendered pages or whatever
useful data the processor can supply then.
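A minimal Java sketch of the factory idea, for illustration only. The method names, the property name "example.foprocessor.factory" and the stub implementation are all assumptions made up for this sketch; none of them is the proposed FOP API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical processor: keeps per-run state, so use one instance per thread. */
interface FOProcessor {
    void render(String foSource);
    int getPageCount();
}

/** Hypothetical factory: picks an implementation at run time and holds defaults. */
abstract class FOProcessorFactory {
    /** Made-up property name, used only in this sketch. */
    static final String IMPL_PROPERTY = "example.foprocessor.factory";

    private final Map<String, String> defaults = new HashMap<>();

    static FOProcessorFactory newInstance() {
        String impl = System.getProperty(IMPL_PROPERTY);
        if (impl == null) {
            return new StubFactory(); // built-in fallback
        }
        try {
            return (FOProcessorFactory) Class.forName(impl)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + impl, e);
        }
    }

    void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    /** Each processor gets its own copy of the default configuration. */
    FOProcessor newProcessor() {
        return create(new HashMap<>(defaults));
    }

    protected abstract FOProcessor create(Map<String, String> config);
}

/** Stand-in implementation; a real one would run layout and rendering. */
class StubFactory extends FOProcessorFactory {
    @Override
    protected FOProcessor create(Map<String, String> config) {
        return new FOProcessor() {
            private int pages; // per-run state: the reason sharing is unsafe

            @Override
            public void render(String foSource) {
                pages++; // pretend every run produces exactly one page
            }

            @Override
            public int getPageCount() {
                return pages;
            }
        };
    }
}
```

A caller would configure the factory once, then ask it for a fresh processor per thread and keep that processor around afterwards to query the page count.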
Post by Victor Mote
RenderContext is only useful if you are trying to reuse an AreaTree for
multiple output options. I am frankly confused right now about whether the
dev team even wants to try to do that.
I think we have a few slightly more pressing problems:
- getting layout up to conformance
- make FOP MT safe
- improving the API to ease embedding (including Java2D embedding)
- improve performance and reduce memory consumption

Adding multiple output streams is certainly fun but I suspect the
bulk of the current users would be more interested in one of the
points above. And somehow I have the feeling that your approach could
easily get in the way; in particular, I wouldn't like it if it were to
*increase* memory usage in general.
Post by Victor Mote
It looks like you are pushing the data that I envisioned in Document
and RenderContext down to RenderType/FOPResult. The net effect is that it
can't be reused.
Why can't it be reused?
Post by Victor Mote
1. complexity. In addition to the Factory issue already mentioned, I think
there is some benefit to arranging the data more intuitively for the user.
Users should be able to recognize certain patterns, like the ones
used in JAXP.

J.Pietschmann
Victor Mote
2003-06-26 20:10:27 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
I guess I don't
understand the need for FOProcessorFactory, which seems to be
an unnecessary
Post by Victor Mote
complication for the user.
It has something to do with the GoF Factory pattern. This means you can
choose the implementation of the FOProcessorAPI by setting a Java property,
or some similar mechanism at run time instead of choosing it at compile
time. Granted, this seems of not much use given that FOP will be the only
Hmmm. I missed that in the "goals" section.
Post by Victor Mote
implementation at least for some time. There is another aspect: the
FOProcessorFactory holds a default configuration and is a mechanism to
quickly create preconfigured FOProcessor objects. You just can't use
a single FOProcessor, because it holds a state while rendering and is
therefore not MT safe in itself. Unless you can live with blocked threads
this means you have to create a FOProcessor for every thread. Furthermore
This aspect of it could/should be entirely hidden from the user. IOW, I
wouldn't make the API more complex just to achieve this. Also, the
alternative to sharing shareable data and blocking thread access is to
duplicate the data and the processing of it. I'll certainly leave that issue
to you performance gurus.
Post by Victor Mote
you might want to keep FOProcessor objects some time around after rendering
has finished in order to inquire the number of rendered pages or whatever
useful data the processor can supply then.
Post by Victor Mote
RenderContext is only useful if you are trying to reuse an AreaTree for
multiple output options. I am frankly confused right now about
whether the
Post by Victor Mote
dev team even wants to try to do that.
- getting layout up to conformance
- make FOP MT safe
- improving the API to ease embedding (including Java2D embedding)
- improve performance and reduce memory consumption
I am not trying to implement multiple output options, but merely to make
sure we have the flexibility to do so. And frankly, I don't care if it ever
gets implemented. What *is* important, but on which I seem not to be making
any progress in explaining, is that each of the major processing tasks in
FOP should have a high-level controlling class that is responsible for it. I
see all four of the items you mention as being dependent on that.
Post by Victor Mote
Adding multiple output streams is certainly fun but I suspect the
bulk of the current users would be more interested in one of the
points above. And somehow I have the feeling that your approach could
easily get in the way; in particular, I wouldn't like it if it were to
*increase* memory usage in general.
I agree with your first statement. On the second, I wouldn't like Jeremias's
proposal either if it caused a nuclear holocaust :-). I don't see why you
would suggest that my proposal would use more memory. I am quite sure it
would use less, but not enough to even mention.
Post by Victor Mote
Post by Victor Mote
It looks like you are pushing the data that I envisioned in Document
and RenderContext down to RenderType/FOPResult. The net effect
is that it
Post by Victor Mote
can't be reused.
Why can't it be reused?
See answer in my response to Jeremias (which crossed your inquiry in the
mail).
Post by Victor Mote
Post by Victor Mote
1. complexity. In addition to the Factory issue already
mentioned, I think
Post by Victor Mote
there is some benefit to arranging the data more intuitively
for the user.
Users should be able to recognize certain patterns, like the ones
used in JAXP.
Well, I don't think anyone is ever going to want to put a GUI on top of
Xalan, but I'll bet it will happen with FOP. The use case for FOP as it is
right now is fairly analogous to JAXP, i.e. a non-interactive process, but I
don't expect that to stay the same. Also, see my response to Jeremias about
the JAXP issue (also crossed in the mail). I'm not opposed to that pattern,
I'm just not sure it makes sense as our API.

Victor Mote
Clay Leeds
2003-06-26 20:24:42 UTC
Permalink
Forgive my newbie-ness here, but I'm having problems figuring out how
to send this [patch] in. I'm sending it here... Please enlighten me
(gently ;-p) so that next time I'm doing it as efficiently as possible.
Thanks!

PATCH INFO:
The info for XPath is missing the fact that it's pretty much Windows IE
5+ only (requires MSXML 3.0). However, it's still a really useful tool!
It'd be nice if it could be re-engineered to be cross-platform!

Here's the good part:
Victor Mote
2003-06-27 15:17:43 UTC
Permalink
Post by Clay Leeds
Forgive my newbie-ness here, but I'm having problems figuring out how
to send this [patch] in. I'm sending it here... Please enlighten me
(gently ;-p) so that next time I'm doing it as efficiently as possible.
Thanks!
Thanks for the patch. The instructions are here:
http://xml.apache.org/fop/dev/index.html#patches
Post by Clay Leeds
The info for XPath is missing the fact that it's pretty much Windows IE
5+ only (requires MSXML 3.0). However, it's still a really useful tool!
It'd be nice if it could be re-engineered to be cross-platform!
I modified (and applied) the patch to say something like "Requires Internet
Explorer 5+". IMO, comments like this are in a gray area -- if they do
change their product to be cross-platform, we're out-of-date. In spite of
the relative inconvenience to the user, it is almost better for them to find
this out by following the link.

Thanks for your continuing efforts to improve the doc.

Victor Mote
Clay Leeds
2003-06-27 15:26:32 UTC
Permalink
Post by Victor Mote
http://xml.apache.org/fop/dev/index.html#patches
Thanks. So essentially, I did everything correctly except I didn't open
a new bugzilla item, and put '[PATCH]' at the beginning of the SUMMARY.
I'll do that next time. I'll then attach the DIFF file (were there any
problems with that DIFF file?). Since much of what I'll be doing will be
(hopefully) improving the web site, Platform will be 'All'. As for
Severity, would you prefer I indicate "Enhancement", or stick with my
own assessment to severity (most would end up being 'minor' as
'nit-picking' is not an option ;-p).
--
Clay Leeds - ***@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc
Victor Mote
2003-06-27 17:33:50 UTC
Permalink
Post by Clay Leeds
Thanks. So essentially, I did everything correctly except I didn't open
a new bugzilla item, and put '[PATCH]' at the beginning of the SUMMARY.
I'll do that next time. I'll then attach the DIFF file (were there any
problems with that DIFF file?). Since much of what I'll be doing will be
The only problem I saw with the DIFF file was that it was not in "unified"
format, which helps the patch programs find the context more easily. On the
one that you submitted, it didn't matter because it was easier to
cut-and-paste than to run patch anyway. If you are using WinCVS, I know that
the newer versions have a checkbox on the diff page for "Unified diff",
which you will want to select.
Post by Clay Leeds
(hopefully) improving the web site, Platform will be 'All'. As for
Severity, would you prefer I indicate "Enhancement", or stick with my
own assessment to severity (most would end up being 'minor' as
'nit-picking' is not an option ;-p).
Enhancement is fine, unless there is a specific category for Doc.

Victor Mote
Clay Leeds
2003-06-27 17:54:48 UTC
Permalink
Post by Victor Mote
The only problem I saw with the DIFF file was that it was not in "unified"
format, which helps the patch programs find the context more easily. On the
one that you submitted, it didn't matter because it was easier to
cut-and-paste than to run patch anyway. If you are using WinCVS, I know that
the newer versions have a checkbox on the diff page for "Unified diff",
which you will want to select.
I'm launching MacCvsX as we speak (I don't run WinCVS ;-p). I just
checked and unfortunately, "Unified diff" is not an option. I thought I
could run it from the command line if you give me the command to run,
but nope, it won't work... I did a 'man cvs' and /Unified showed
"Pattern not found". Any ideas? I think I'll just do my edits onesy,
twosy and send you each edit separately. I doubt there'll be many. Any
that happen to be "large" we'll probably have a discussion about on
fop-dev in advance anyway, so you'll know it's coming.
Post by Victor Mote
Enhancement is fine, unless there is a specific category for Doc.
"Doc" is not an option in the "Severity" menu, but it is under the
"Component" menu. I'll choose "Enhancement" instead of "minor" for
Severity if that's what you prefer...

;-p
--
Clay Leeds - ***@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc
Victor Mote
2003-06-27 19:35:42 UTC
Permalink
Post by Clay Leeds
I'm launching MacCvsX as we speak (I don't run WinCVS ;-p). I just
checked and unfortunately, "Unified diff" is not an option. I thought I
Double-check the WinCVS/MacCVS page to see if they have a later version. I
am actually running a beta version of WinCVS, and seeing no known problems.
Post by Clay Leeds
could run it from the command line if you give me the command to run,
but nope, it won't work... I did a 'man cvs' and /Unified showed
"Pattern not found". Any ideas? I think I'll just do my edits onesy,
Try 'man diff'. CVS is largely "just" a wrapper around RCS and diff, so you
should *generally* be able to use the diff doc.

If MacCVS works like WinCVS, there is a command-line console available that
has CVS built into it, even if the OS doesn't have it natively.

Victor Mote
Clay Leeds
2003-06-27 19:56:58 UTC
Permalink
Victor,
Post by Victor Mote
Double-check the WinCVS/MacCVS page to see if they have a later version. I
am actually running a beta version of WinCVS, and seeing no known problems.
I'm using MacCvsX (which appears to be different from MacCVS ;( ). It
took a bit to get it running (mainly permissions issues), but it's
working. If it's all the same to you, I'll keep using MacCvsX... (That's
not to say I won't try to get MacCVS working, though).
Post by Victor Mote
Try 'man diff'. CVS is largely "just" a wrapper around RCS and diff, so you
should *generally* be able to use the diff doc.
Ahhh... --unified It looks like I can do it from the command line, now
that I know a bit more how it works...
Post by Victor Mote
If MacCVS works like WinCVS, there is a command-line console available that
has CVS built into it, even if the OS doesn't have it natively.
I saw a command-line utility when I was first playing around with it.
I'm having trouble finding it again... I'll keep looking.

;-0
--
Clay Leeds - ***@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc
J.Pietschmann
2003-06-26 20:30:40 UTC
Permalink
Post by Victor Mote
I don't see why you
would suggest that my proposal would use more memory. I am quite sure it
would use less, but not enough to even mention.
Well, your approach to "decoupling layout and rendering" seems
to include building a full area tree, or something equivalent.
FOP was implemented this way before 0.20.1. What you might see
as "tight coupling" is the result of improving the situation.

J.Pietschmann
Victor Mote
2003-06-27 05:06:03 UTC
Permalink
Post by J.Pietschmann
Post by Victor Mote
I don't see why you
would suggest that my proposal would use more memory. I am quite sure it
would use less, but not enough to even mention.
Well, your approach to "decoupling layout and rendering" seems
to include building a full area tree, or something equivalent.
FOP was implemented this way before 0.20.1. What you might see
as "tight coupling" is the result of improving the situation.
I have been misunderstood somewhere. There may be times when building a full
area tree (and writing to disk if necessary) is appropriate, esp. if it were
going to be reused. That is a dead issue. If the area tree is only used
once, then PageSequence is the largest chunk that might need to be built,
and then only if patient processing is used. I do not intend to force anyone
to do anything different than is now done in terms of how memory is managed,
or even the general flow of the processing. What I am asking for is that
control of those decisions be made at a higher level, which keeps the code
modular and gives us more options. I don't want layout starting the process
of rendering a page, but rather to notify the control mechanism that a page
is ready, and letting the control mechanism decide whether to cache it,
render it, whether that should be done in a separate thread, etc. IOW the
performance "smarts" are in the high-level control objects.
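To illustrate the notification idea, here is a minimal Java sketch under stated assumptions: PageListener, LayoutSketch, EagerControl and PatientControl are hypothetical names invented here, and pages are plain strings standing in for real area objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback: layout reports a finished page and does nothing else. */
interface PageListener {
    void pageReady(String page);
}

/** Stand-in for layout; it never calls a renderer itself. */
class LayoutSketch {
    private final PageListener control;

    LayoutSketch(PageListener control) {
        this.control = control;
    }

    void layOut(int pageCount) {
        for (int i = 1; i <= pageCount; i++) {
            control.pageReady("page-" + i); // hand off; the controller decides what happens next
        }
    }
}

/** Control object that "renders" each page as soon as it arrives (eager). */
class EagerControl implements PageListener {
    final List<String> rendered = new ArrayList<>();

    @Override
    public void pageReady(String page) {
        rendered.add(page);
    }
}

/** Control object that caches pages and "renders" them later (patient). */
class PatientControl implements PageListener {
    final List<String> cached = new ArrayList<>();
    final List<String> rendered = new ArrayList<>();

    @Override
    public void pageReady(String page) {
        cached.add(page);
    }

    void renderAll() {
        rendered.addAll(cached);
        cached.clear();
    }
}
```

The layout code is identical in both cases; only the control object passed in changes, which is where a threading or caching policy could also live.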

I'm sorry if that was not clear before.

Victor Mote
J.U. Anderegg
2003-06-26 21:08:36 UTC
Permalink
Post by J.Pietschmann
- improving the API to ease embedding (including Java2D embedding)
What has to be embedded?
Post by J.Pietschmann
Adding multiple output streams is certainly fun but I suspect the
bulk of the current users would be more interested in one of the
points above. And somehow I have the feeling that your approach could
easily get in the way; in particular, I wouldn't like it if it were to
*increase* memory usage in general.
Multiple output streams are already here with FOP 0.x.x: the AWT viewer
uses the area tree as a page cache, and its print function uses it again.

I'm experimenting with a new Graphics2D renderer. The attached description
gives some hints to configuration/parametrization requirements. Directions,
orientations, i18n will be the next topics. All I know at the time is that
present renderers need heavy upgrades.

Hansuli Anderegg
Victor Mote
2003-06-21 17:07:56 UTC
Permalink
Post by J.Pietschmann
Could you outline your API ideas on the Wiki?
Sure (give me a couple of days). Again, my point is not to push for a
particular API or framework, but only to show that the control concepts (PDF
encryption was the example given IIRC) can be added to the appropriate
classes as needed.
Post by J.Pietschmann
Post by Victor Mote
I think we are in agreement here, except that it looks like you
are thinking
Post by Victor Mote
single Document, and I have added a mechanism (the Document
class) that will
Post by Victor Mote
allow multiple Documents to be queued up and/or multithreaded.
I think processing multiple documents is a layer above processing
a single source document.
Right. So, if you are only processing one document in a Session, you don't
need a Document class to keep track of and control the differences between
documents. The information can be static-ish or singleton-ish.
Post by J.Pietschmann
Post by Victor Mote
I assume you are talking about trunk here. If so, my reasoning
for trying to
Post by Victor Mote
get API resolved (or really Control, from my perspective) is so
that I can
Post by Victor Mote
continue with getting the layout cleanly separated and pluggable,
Not a chance, pal. The interface between layout and rendering is
much too complex, for example it includes the whole area tree mess.
If we are talking about redesign/trunk here, that is precisely what we are
trying to fix. Layout shouldn't need to know anything about rendering
*directly*. The concept of RenderContext is built around knowing what the
RenderType *capabilities* are, and managing layout accordingly, so there is
indirect (abstract is probably the better term) knowledge.

If you can convince me that it is not possible for layout and rendering to
be refactored into distinct tasks or "services", then my interest in FOP
diminishes to zero, and I'll stop making so much noise. I don't see why that
should be true.
Post by J.Pietschmann
If it would be possible to plug the old layout into it, I would have
done this a long time ago (rather: backporting the renderers to the
maintenance branch). Don't forget that SVG plays a role too, and you
can't have this feature broken because it's almost the only advantage
we have over other FO processors now.
I agree that "would be possible" isn't there in the current design, but
again, I see no inherent reason why it can't be added. I would agree with
you if layout could not be cleanly separated from both FOTree building and
Rendering. I don't follow your point about SVG. From a big-picture, abstract
standpoint, why do we need to think of SVG any differently than any other
graphic type? We need to lay it out with the proper dimensions, and we need
to render it properly. Can't those be distinct?

Victor Mote
Peter B. West
2003-06-22 01:08:35 UTC
Permalink
...
Post by Victor Mote
If you can convince me that it is not possible for layout and rendering to
be refactored into distinct tasks or "services", then my interest in FOP
diminishes to zero, and I'll stop making so much noise. I don't see why that
should be true.
Victor,

Without presuming the result of this discussion, it wouldn't be the end
of the world if it were true. It just means that certain views of the
process are precluded.

Peter
--
Peter B. West http://www.powerup.com.au/~pbwest/resume.html
Victor Mote
2003-06-22 15:52:21 UTC
Permalink
Post by Victor Mote
...
Post by Victor Mote
If you can convince me that it is not possible for layout and
rendering to
Post by Victor Mote
be refactored into distinct tasks or "services", then my interest in FOP
diminishes to zero, and I'll stop making so much noise. I don't
see why that
Post by Victor Mote
should be true.
Victor,
Without presuming the result of this discussion, it wouldn't be the end
of the world if it were true. It just means that certain views of the
process are precluded.
You are right of course in a general way. However, for my purposes, I don't
think FOP will ever be a suitable solution without pluggable layout. The
existing user base is servlet- and performance-oriented, and my application
is much more beautiful-output-oriented, with almost no need for speed. I
think FOP can be made to accommodate both, and variety on some other axes as
well, through the pluggable layout or LayoutStrategy scheme. I don't have a
clear idea whether everyone thinks this is the right way to go, but I
*think* I am hearing that it is (at least in theory) feasible to do so.

Victor Mote
J.Pietschmann
2003-06-22 11:26:02 UTC
Permalink
Post by Victor Mote
If you can convince me that it is not possible for layout and rendering to
be refactored into distinct tasks or "services", then my interest in FOP
diminishes to zero, and I'll stop making so much noise. I don't see why that
should be true.
Layout and rendering can be factored into different packages
of services, actually this is already done to a large extent.
However, the interface is quite complex, and I don't think you
can plug in the old layout engine without rewriting significant
parts of it into the framework provided by HEAD.
Post by Victor Mote
I don't follow your point about SVG. From a big-picture, abstract
standpoint, why do we need to think of SVG any differently than any other
graphic type?
Inlined SVG code is parsed by the FOTreeBuilder. In any case,
the FOTreeBuilder must have at least some knowledge about anything
which can appear in instream-foreign-object.

J.Pietschmann
Victor Mote
2003-06-22 15:41:16 UTC
Permalink
Post by Victor Mote
Post by Victor Mote
If you can convince me that it is not possible for layout and
rendering to
Post by Victor Mote
be refactored into distinct tasks or "services", then my interest in FOP
diminishes to zero, and I'll stop making so much noise. I don't
see why that
Post by Victor Mote
should be true.
Layout and rendering can be factored into different packages
of services, actually this is already done to a large extent.
However, the interface is quite complex, and I don't think you
can plug in the old layout engine without rewriting significant
parts of it into the framework provided by HEAD.
Agreed.
Post by Victor Mote
Post by Victor Mote
I don't follow your point about SVG. From a big-picture, abstract
standpoint, why do we need to think of SVG any differently than
any other
Post by Victor Mote
graphic type?
Inlined SVG code is parsed by the FOTreeBuilder. In any case,
the FOTreeBuilder must have at least some knowledge about anything
which can appear in instream-foreign-object.
OK, now I follow. Yes, the difference between SVG & other graphics packages
would be the namespace issue. The FOTree builder needs to know what to do
with the namespace. So, for a renderer-agnostic FOTree builder to work
properly, it should parse the SVG & create the necessary objects in the
FOTree. The Renderer (this would be the workhorse object that actually
renders, not the somewhat-related RenderType control object that controls)
then should be responsible to either ignore it or place it into the output,
depending on its capabilities. But the FOTree builder doesn't need to know
or care. (At the risk of getting esoteric, if the Document object knows that
none of the output options requested support SVG, it could tell the FOTree
builder to ignore that namespace and save the memory. FOTree builder still
doesn't need to know or care about why, it simply "serves" Document).
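The parenthetical case could look roughly like the Java sketch below. It is only an illustration: FOTreeBuilderSketch and DocumentSketch are hypothetical names, and the "tree" is just a list of element names. The only real identifier is the SVG namespace URI.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical FO tree builder that skips namespaces it was told to ignore. */
class FOTreeBuilderSketch {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    private final Set<String> ignored = new HashSet<>();
    final List<String> nodes = new ArrayList<>();

    void ignoreNamespace(String uri) {
        ignored.add(uri);
    }

    /** Called once per element; the builder never asks why a namespace is ignored. */
    void startElement(String namespaceUri, String localName) {
        if (ignored.contains(namespaceUri)) {
            return; // no objects created, so no memory spent
        }
        nodes.add(localName);
    }
}

/** Hypothetical Document control object: knows the outputs, configures the builder. */
class DocumentSketch {
    static FOTreeBuilderSketch builderFor(boolean anyOutputSupportsSvg) {
        FOTreeBuilderSketch builder = new FOTreeBuilderSketch();
        if (!anyOutputSupportsSvg) {
            builder.ignoreNamespace(FOTreeBuilderSketch.SVG_NS);
        }
        return builder;
    }
}
```

The decision lives entirely in the control object; the builder only carries a set of namespaces to skip.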

Victor Mote
Glen Mazza
2003-06-22 17:01:31 UTC
Permalink
Post by Victor Mote
(At the risk of getting esoteric, if the
Document object knows that none of the output
options requested support SVG, it
could tell the FOTree builder to ignore that
namespace and save the memory.
That's one helluva smart document you're planning
there... ;)

Glen


Victor Mote
2003-06-23 00:55:36 UTC
Permalink
Post by Glen Mazza
Post by Victor Mote
(At the risk of getting esoteric, if the
Document object knows that none of the output
options requested support SVG, it
could tell the FOTree builder to ignore that
namespace and save the memory.
That's one helluva smart document you're planning
there... ;)
Just to clarify -- I don't intend to write such logic, or even suggest that
it would be a good thing. The only point that I am trying to make is that if
you get the control objects lined up right you can make the whole thing
pretty sophisticated & flexible, and still maintain Separation of Concerns.

Victor Mote
Victor Mote
2003-06-26 15:21:21 UTC
Permalink
Post by Victor Mote
Post by J.Pietschmann
Could you outline your API ideas on the Wiki?
Sure (give me a couple of days). Again, my point is not to push for a
particular API or framework, but only to show that the control
concepts (PDF
encryption was the example given IIRC) can be added to the appropriate
classes as needed.
OK, I have clarified some items and added some (informal) API information to
the main API wiki, toward the bottom of the "Startup Concepts Proposal"
section:
http://nagoya.apache.org/wiki/apachewiki.cgi?FOPAvalonization

Victor Mote
Glen Mazza
2003-06-19 21:15:52 UTC
Permalink
<victor quote>

[Responding to Jeremias here] Or, better yet IMO, into
a RenderType
object
that is a child or grandchild of the Document, so that
there can be
multiple
RenderTypes for the same Document.

</victor quote>

I can understand enhancements for logging and
threading, but multiple RenderTypes for the same
Document appears to be Cocoon's department, not ours.


We're at a level below that, very similar to Xalan:

Xalan input: xml document, xslt stylesheet
Xalan output: document

FOP input: xml (xsl-fo namespace) document, render
type
FOP output: document

Given that our Area tree is renderer-specific, and
since the area tree creation is tightly bound to fo
tree creation--I think any internal implementation we
would have of multiple rendering types would just
involve running FOP once for each rendering type. If
so, perhaps this is best left with Cocoon.

Glen
Glen Mazza
2003-06-19 21:29:48 UTC
Permalink
Errr...this came across as harsher-sounding than I
would have liked.

If the API has some convenient ways for the user to
specify multiple output types for a single xsl-fo
stream, that should be fine.

Glen
Post by Glen Mazza
<victor quote>
[Responding to Jeremias here] Or, better yet IMO,
into
a RenderType
object
that is a child or grandchild of the Document, so
that
there can be
multiple
RenderTypes for the same Document.
</victor quote>
I can understand enhancements for logging and
threading, but multiple RenderTypes for the same
Document appears to be Cocoon's department, not
ours.
Xalan input: xml document, xslt stylesheet
Xalan output: document
FOP input: xml (xsl-fo namespace) document, render
type
FOP output: document
Given that our Area tree is renderer-specific, and
since the area tree creation is tightly bound to fo
tree creation--I think any internal implementation
we
would have of multiple rendering types would just
involve running FOP once for each rendering type.
If
so, perhaps this is best left with Cocoon.
Glen
Victor Mote
2003-06-20 16:42:21 UTC
Permalink
Post by Glen Mazza
Given that our Area tree is renderer-specific, and
since the area tree creation is tightly bound to fo
tree creation--I think any internal implementation we
would have of multiple rendering types would just
involve running FOP once for each rendering type. If
so, perhaps this is best left with Cocoon.
I saw your later posting amending the conclusion, but I want to make sure
that the underlying logic is clear as well. AreaTree is not
renderer-specific, but RenderContext specific. So, for example, the same
AreaTree can be used to generate PostScript and PDF output. PostScript and
PDF are part of the same RenderContext, whereas AWT would use a different
one. (The user doesn't really have to know anything about RenderContexts
because the RenderTypes can be hard-wired to them).
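A rough Java sketch of that hard-wiring, assuming made-up context names (PRINT and SCREEN) and a hypothetical AreaTreePlanner helper; only the PDF, PostScript and AWT groupings come from the paragraph above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contexts: outputs in the same context can share one area tree. */
enum RenderContext { PRINT, SCREEN }

/** Each render type is hard-wired to a context, so users never name one. */
enum RenderType {
    PDF(RenderContext.PRINT),
    POSTSCRIPT(RenderContext.PRINT),
    AWT(RenderContext.SCREEN);

    final RenderContext context;

    RenderType(RenderContext context) {
        this.context = context;
    }
}

class AreaTreePlanner {
    /** Groups the requested outputs by context: one area tree per map key. */
    static Map<RenderContext, List<RenderType>> plan(List<RenderType> requested) {
        Map<RenderContext, List<RenderType>> plan = new EnumMap<>(RenderContext.class);
        for (RenderType type : requested) {
            plan.computeIfAbsent(type.context, c -> new ArrayList<>()).add(type);
        }
        return plan;
    }
}
```

Requesting PDF, PostScript and AWT together would then need two area trees rather than three.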

Also, while currently it is true that the AreaTree creation is somewhat
bound to the FOTree creation, that is one of the things that I think we are
trying to change (I am anyway). The only reason that I know of for the
processing to be the way it is now is to facilitate eager processing. If we
return control of that up to these Control/API classes, we get a clean
separation of these tasks, add the option for patient processing as well,
and add the possibility of pluggable layout.

Victor Mote
Glen Mazza
2003-06-20 18:58:58 UTC
Permalink
Post by Glen Mazza
AreaTree
is not
renderer-specific, but RenderContext specific. So,
for example, the same
AreaTree can be used to generate PostScript and PDF
output.
Are you sure? Please read
http://marc.theaimsgroup.com/?l=fop-dev&m=105455951226310&w=2
(Peter, Jeremias' writing)--they appear to indicate
that the area tree is dependent upon the specific
renderer being used, because of the fonts.
Post by Glen Mazza
Also, while currently it is true that the AreaTree
creation is somewhat
bound to the FOTree creation, that is one of the
things that I think we are
trying to change (I am anyway).
currently it's:

FOP -----> Driver -------> FOTreeBuilder
             |                  |
             |                  |
             |                  |
             \/                 |
      Structure Handler <-------

As in
(1) FOP class determines if the render_type requires
an area tree.
(2) Driver activates the FOTreeBuilder.
(3) Based on the render_type, Driver determines the
type of StructureHandler (AreaTree) that FOTreeBuilder
should send SAXEvents to, and gives it to
FOTreeBuilder.
(4) FOTreeBuilder sends SAX events to the
StructureHandler that was given to it by Driver.
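Steps (2) to (4) can be sketched in Java as below. This is a simplified illustration, not the actual Driver code: the class names ending in "Sketch", the two handler classes, the string render types and the assumption that "rtf" and "mif" are the outputs that skip the area tree are all hypothetical.

```java
/** Hypothetical sink for the events the FO tree builder emits. */
interface StructureHandler {
    void startPageSequence();
    int eventCount();
}

/** Handler for outputs that need layout; a real one would feed an area tree. */
class AreaTreeHandler implements StructureHandler {
    private int events;

    @Override
    public void startPageSequence() {
        events++;
    }

    @Override
    public int eventCount() {
        return events;
    }
}

/** Handler for structure-only outputs; a real one would write the file directly. */
class DirectOutputHandler implements StructureHandler {
    private int events;

    @Override
    public void startPageSequence() {
        events++;
    }

    @Override
    public int eventCount() {
        return events;
    }
}

/** Stand-in builder: it only knows the handler it was given. */
class TreeBuilderSketch {
    private final StructureHandler handler;

    TreeBuilderSketch(StructureHandler handler) {
        this.handler = handler;
    }

    void parse(int pageSequences) {
        for (int i = 0; i < pageSequences; i++) {
            handler.startPageSequence(); // step (4): events go to the given handler
        }
    }
}

/** Stand-in Driver covering steps (2) to (4). */
class DriverSketch {
    static StructureHandler handlerFor(String renderType) {
        // Assumption: "rtf" and "mif" are the outputs that skip the area tree.
        boolean needsAreaTree = !(renderType.equals("rtf") || renderType.equals("mif"));
        return needsAreaTree ? new AreaTreeHandler() : new DirectOutputHandler();
    }

    static StructureHandler run(String renderType, int pageSequences) {
        StructureHandler handler = handlerFor(renderType);   // step (3)
        new TreeBuilderSketch(handler).parse(pageSequences); // steps (2) and (4)
        return handler;
    }
}
```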

I wanted to move to:

            (1)
Doc/Dri
API/Apps --------> FOTreeBuilder ------> its Area Tree
Post by Glen Mazza
The only reason that
I know of for the
processing to be the way it is now is to facilitate
eager processing.
If we
return control of that up to these Control/API
classes, we get a clean
separation of these tasks, add the option for
patient processing as well,
and add the possibility of pluggable layout.
Victor Mote
Jeremias Maerki
2003-06-20 19:11:52 UTC
Permalink
Post by Glen Mazza
Post by Glen Mazza
AreaTree
is not
renderer-specific, but RenderContext specific. So,
for example, the same
AreaTree can be used to generate PostScript and PDF
output.
Are you sure? Please read
http://marc.theaimsgroup.com/?l=fop-dev&m=105455951226310&w=2
(Peter, Jeremias' writing)--they appear to indicate
that the area tree is dependent upon the specific
renderer being used, because of the fonts.
Today, that is so. My idea is to move the font registry out of the
renderers into a font subsystem with multiple font sources. The
involved renderers can be asked what font sources they support and so
the available set of fonts will be restricted. We've talked about the
difficulties of supporting multiple renderers in one processor run
before (see archives). Making the font subsystem standalone would make
it possible to use the same area tree for the PDF and PS renderers and
also make the AT independant of the renderer (I think).
Post by Glen Mazza
Post by Glen Mazza
Also, while currently it is true that the AreaTree
creation is somewhat
bound to the FOTree creation, that is one of the
things that I think we are
trying to change (I am anyway).
FOP -----> Driver -------> FOTreeBuilder
              |                 |
              |                 |
              |                 |
              \/                |
      Structure Handler <-------
I think now it's:
Driver-->FOTreeBuilder-->FOTree-->StructureHandler
---->LayoutEngine-->AreaTree-->Renderer
Post by Glen Mazza
As in
(1) FOP class determines if the render_type requires
an area tree.
(2) Driver activates the FOTreeBuilder.
(3) Based on the render_type, Driver determines the
type of StructureHandler (AreaTree) that FOTreeBuilder
should send SAXEvents to, and gives it to
FOTreeBuilder.
(4) FOTreeBuilder sends SAX events to the
StructureHandler that was given to it by Driver.
It's not directly SAX events but something similar.
Post by Glen Mazza
(1)
Doc/Dri
API/Apps --------> FOTreeBuilder ------> its Area
Tree
But then, where's the StructureHandler (output to MIF and RTF)?



Jeremias Maerki
Glen Mazza
2003-06-20 19:38:02 UTC
Permalink
--- Jeremias Maerki <***@greenmail.ch> wrote:
http://marc.theaimsgroup.com/?l=fop-dev&m=105455951226310&w=2
Post by Glen Mazza
Post by Glen Mazza
(Peter, Jeremias' writing)--they appear to
indicate
Post by Glen Mazza
that the area tree is dependent upon the specific
renderer being used, because of the fonts.
Today, that is so. My idea is to move the font
registry out of the
renderers in to a font subsystem with multiple font
sources. The
involved renderers can be asked what font sources
they support and so
the available set of fonts will be restricted.
This seems to be another layer of indirection--quite
useful still--but might the area tree *still* be
renderer-dependent? ("The involved renderers can be
asked"--It still has to know which renderers to ask,
right? It may very well get different answers and
work differently per renderer then.) If the Area Tree
wants to know if Tribune New Roman is supported and
PDF renderer says yes while PS says no, the area tree
generated would be different for both, I think.

Glen


Jeremias Maerki
2003-06-20 19:47:46 UTC
Permalink
Post by Glen Mazza
This seems to be another layer of indirection--quite
useful still--but might the area tree *still* be
renderer-dependent? ("The involved renderers can be
asked"--It still has to know which renderers to ask,
right? It may very well get different answers and
work differently per renderer then.) If the Area Tree
wants to know if Tribune New Roman is supported and
PDF renderer says yes while PS says no, the area tree
generated would be different for both, I think.
Right in general. I think this cannot be as detailed as individual fonts.
It should work like this: Renderer, do you support font sources X and Y?
Font Source X could be a service providing PostScript Type1 fonts, Font
Source Y could be a service providing fonts available to AWT. The whole
thing will have to happen before processing starts because different
Area Trees have to be generated if the requested renderers don't all
support the same font sources (AWT against PDF, for example). The
problem is, as we already gathered in the earlier discussion, that I
guess you wouldn't want to simply let FOP generate two different
(!!!pagination!!!) ATs. You will get two different documents. But the
main use case for multiple renderers is to have one doc for printing and
one for the archive. And in this respect, different output is not
acceptable. Therefore, IMO, multiple output formats in the same
processor run are not worth the pain. But the above concept is still
worthwhile.
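
To make the "Renderer, do you support font sources X and Y?" question
concrete, here is a rough sketch of how such a negotiation might look. All
names (Renderer, SimpleRenderer, FontSourceNegotiation) are made up for
illustration and are not existing FOP classes. Treating "at least one common
font source" as the condition for sharing an area tree is only one possible
reading of the idea above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// All names here are hypothetical; none of them are existing FOP classes.
interface Renderer {
    String name();
    Set<String> supportedFontSources();
}

// Trivial renderer stand-in that only knows its name and its font sources.
final class SimpleRenderer implements Renderer {
    private final String name;
    private final Set<String> sources;

    SimpleRenderer(String name, String... sources) {
        this.name = name;
        this.sources = new TreeSet<>(Arrays.asList(sources));
    }

    @Override public String name() { return name; }
    @Override public Set<String> supportedFontSources() { return sources; }
}

final class FontSourceNegotiation {
    private FontSourceNegotiation() { }

    /** Font sources that every requested renderer supports (the intersection). */
    static Set<String> commonSources(List<Renderer> renderers) {
        Set<String> common = null;
        for (Renderer r : renderers) {
            if (common == null) {
                common = new TreeSet<>(r.supportedFontSources());
            } else {
                common.retainAll(r.supportedFontSources());
            }
        }
        return common == null ? new TreeSet<String>() : common;
    }

    /** Assumption: one area tree can be shared if at least one source is common. */
    static boolean canShareAreaTree(List<Renderer> renderers) {
        return !commonSources(renderers).isEmpty();
    }
}
```

With a PDF renderer supporting "Type1" and "TrueType" and a PS renderer
supporting only "Type1", the common set is just "Type1" and the area tree
could be shared. Pairing PDF with an AWT-only renderer leaves nothing in
common, which is the case where two different area trees would be needed.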


Jeremias Maerki
Glen Mazza
2003-06-20 19:23:50 UTC
Permalink
Previous email was sent before I was done <red face/>,
Post by Glen Mazza
AreaTree
is not
renderer-specific, but RenderContext specific. So,
for example, the same
AreaTree can be used to generate PostScript and PDF
output.
Are you sure? Please read
http://marc.theaimsgroup.com/?l=fop-dev&m=105455951226310&w=2
(Peter, Jeremias' writing)--they appear to indicate
that the area tree is dependent upon the specific
renderer being used, because of the fonts.
Post by Glen Mazza
Also, while currently it is true that the AreaTree
creation is somewhat
bound to the FOTree creation, that is one of the
things that I think we are
trying to change (I am anyway).
currently it's:

FOP -----> Driver -------> FOTreeBuilder
 |            |                 |
 |            |                 |
 |            |                 |
 \/           |                 |
AWT/Print     \/                |
Starter   Structure Handler <----

As in
(1) FOP class determines if the render_type requires
an area tree, if so goes to App.Driver, else
App.AWT/Print Starter.
(2) Driver activates the FOTreeBuilder.
(3) Based on the render_type, Driver determines the
type of StructureHandler (AreaTree) that FOTreeBuilder
should send SAXEvents to, and gives it to
FOTreeBuilder.
(4) FOTreeBuilder sends SAX events to the
StructureHandler that was given to it by Driver.

I wanted to move to:

(1)
Doc/Dri
API/Apps --------> FOTreeBuilder ------> its Area
Tree

(1) Doc/Dri/whatever feed the FOTreeBuilder the xslfo
and the render_type, calls Run()

(2) Based on the render_type FOTreeBuilder either
creates an area tree of its choosing, or doesn't
(AWT/Print). If it does, it determines *which* type
of area tree to create (Structure or MIF or the other
one)--based again on the render_type. The business
logic for this would be in FOTreeBuilder.

Regardless of area tree decision, it is responsible
for generating the report via its Run() function.
I.e., the area tree may just be a local variable of
its run() rather than its current status as a member
variable.

Your idea is (non-graphical, I'm getting tired ;):

<document class>
foTree = foTreeBuilder.createFOTree();
areaTree = areaTree.createAreaTree(foTree);

You are correct--this is *very* elegant--problem is,
from Keiron's writing, we just can't separate the
FOTree from the area tree because of an unacceptable
performance hit.

See:
http://marc.theaimsgroup.com/?l=fop-dev&m=105270887501490&w=2

*If*---?#@%#$---the FOTree needs to encapsulate the
AreaTree, I thought the next best solution would be to
divorce your document from knowledge of the AreaTree,
rather than have both classes point to it.

Here's food for thought--if we rename FOTreeBuilder to
XSLFOProcessor, perhaps you would have fewer concerns
about feeding it the render type. We can make this
centrally-located class the "octopus" of the
application, rather than the left-side
Document/API/APPs class.
Post by Glen Mazza
If we
return control of that up to these Control/API
classes, we get a clean
separation of these tasks, add the option for
patient processing as well,
and add the possibility of pluggable layout.
I agree--splitting the business logic between multiple
classes tends to create spaghetti code.

Glen

Jeremias Maerki
2003-06-20 19:33:16 UTC
Permalink
Post by Glen Mazza
(1) Doc/Dri/whatever feed the FOTreeBuilder the xslfo
and the render_type, calls Run()
(2) Based on the render_type FOTreeBuilder either
creates an area tree of its choosing, or doesn't
(AWT/Print).
Even with AWT/Print an Area Tree is generated.
Post by Glen Mazza
If it does, it determines *which* type
of area tree to create (Structure or MIF or the other
one)--based again on the render_type. The business
logic for this would be in FOTreeBuilder.
But no area tree is generated when a StructureHandler (RTF/MIF) is
active.
Post by Glen Mazza
Regardless of area tree decision, it is responsible
for generating the report via its Run() function.
I.e., the area tree may just be a local variable of
its run() rather than its current status as a member
variable.
<document class>
foTree = foTreeBuilder.createFOTree();
areaTree = areaTree.createAreaTree(foTree);
You are correct--this is *very* elegant--problem is,
from Keiron's writing, we just can't separate the
FOTree from the area tree because of an unacceptable
performance hit.
It doesn't need to be separated. But the building (!) of the FO tree is
separated from other code like layout or RTF generation.
Post by Glen Mazza
http://marc.theaimsgroup.com/?l=fop-dev&m=105270887501490&w=2
*If*---?#@%#$---the FOTree needs to encapsulate the
AreaTree, I thought the next best solution would be to
divorce your document from knowledge of the AreaTree,
rather than have both classes point to it.
Here's food for thought--if we rename FOTreeBuilder to
XSLFOProcessor, perhaps you would have fewer concerns
about feeding it the render type. We can make this
centrally-located class the "octopus" of the
application, rather than the left-side
Document/API/APPs class.
YOU GET IT!



Jeremias Maerki
Glen Mazza
2003-06-20 19:48:11 UTC
Permalink
Post by Jeremias Maerki
Even with AWT/Print an Area Tree is generated.
But no area tree is generated when a
StructureHandler (RTF/MIF) is
active.
Yes, I got confused between the two sets of "weird"
render types: those that handle their processing with
their own starters (which I would like to push past
FOTreeBuilder/XSLFOProcessor, and have the latter
class handle as well) and those that don't need an
area tree because of no page numbering, etc.

Glen

J.U. Anderegg
2003-06-21 11:39:25 UTC
Permalink
A FOP renderer cannot support fonts. Physical devices and output systems do
this. But a FOP renderer may allow mapping fonts and translating Unicode
characters to bytes using codepages. The user has to take care that font
metrics and their character sets are accurate enough for all target devices
and had better not try to print Helvetica on a matrix printer or paint
color on a black/white laser printer.

Current FOP uses a mix of Adobe and Java font systems: less would be more.

Hansuli Anderegg
Peter B. West
2003-06-22 01:18:47 UTC
Permalink
Post by Jeremias Maerki
Post by Glen Mazza
If it does, it determines *which* type
of area tree to create (Structure or MIF or the other
one)--based again on the render_type. The business
logic for this would be in FOTreeBuilder.
But no area tree is generated when a StructureHandler (RTF/MIF) is
active.
To take this off on a bit of a tangent, I have always been a little
sceptical about the possibility of realising the "structure renderers"
without the area tree. It seems to me to depend on whether the renderer
in question follows a page definitions/flows model. I don't know
anything about the structure of either RTF or MIF, but, for example, if
one of these formats defined page formats (size, orientation, margins)
at the beginning of each section, but then had structures corresponding
to the data on individual pages, then pagination, i.e. layout, would
have to be performed before the structure could be rendered.

Can percentages be resolved without at least some degree of area tree
construction?

Bertrand is probably in the best position to comment wrt RTF. Is anyone
familiar with MIF? Does it simply define page structures and flows?

Peter
--
Peter B. West http://www.powerup.com.au/~pbwest/resume.html
J.Pietschmann
2003-06-20 19:38:57 UTC
Permalink
Post by Glen Mazza
foTree = foTreeBuilder.createFOTree();
areaTree = areaTree.createAreaTree(foTree);
Check old FOP versions, like 0.19 or so...
Post by Glen Mazza
You are correct--this is *very* elegant--problem is,
from Keiron's writing, we just can't separate the
FOTree from the area tree because of an unacceptable
performance hit.
Memory problems, not performance in general. The area
tree is usually *huge*.

J.Pietschmann
Victor Mote
2003-06-21 16:16:30 UTC
Permalink
Post by Glen Mazza
Are you sure? Please read
http://marc.theaimsgroup.com/?l=fop-dev&m=105455951226310&w=2
(Peter, Jeremias' writing)--they appear to indicate
that the area tree is dependent upon the specific
renderer being used, because of the fonts.
Answer 1: I interpreted these comments to relate to the differences between
PostScript/PDF and AWT/Print layout.

Answer 2: If we do have layout/AreaTree/rendering differences between
PostScript and PDF, then it seems (??!!) to me to be an anomaly that should
be fixed. Off the top of my head, I can't think of a good reason for it.
Same with differences between AWT and Print.

Answer 3: If the differences do exist & need to exist, then the fix to the
Session/Document, etc. approach is to remove the concept of RenderContext,
and place all of that data/logic into RenderType instead.
Post by Glen Mazza
one)--based again on the render_type. The business
logic for this would be in FOTreeBuilder.
Sorry, man, but I cringe every time you say this.
Post by Glen Mazza
<document class>
foTree = foTreeBuilder.createFOTree();
areaTree = areaTree.createAreaTree(foTree);
Close. The AreaTree creation would be done from the RenderContext. Document
controls FOTree creation (if any), RenderContext controls AreaTree creation
(if any), and RenderType controls rendering.
Post by Glen Mazza
You are correct--this is *very* elegant--problem is,
from Keiron's writing, we just can't separate the
FOTree from the area tree because of an unacceptable
performance hit.
http://marc.theaimsgroup.com/?l=fop-dev&m=105270887501490&w=2
I don't think that is the link you wanted -- it doesn't seem to be relevant
here. Nevertheless, Keiron and others are concerned about performance. That
is why I factored out the idea of eager vs. patient processing. One of the
benefits to getting all of the control logic up to these higher-level
classes is that we can achieve Separation of Concerns (at least my
conception of that term) without sacrificing the performance benefits of
eager processing when it is appropriate. The lower-level workhorse classes
don't have to know anything about what is going on "over the wall" or
whether eager or patient processing is occurring. All they know is that when
they are done with their task, they return control back to the higher-level
objects. Right now, for example, when a page is completely laid out, the
layout tells rendering to render it. What I am trying to achieve is that
control would come back to RenderContext instead, which decides whether to
fire up rendering logic to render (perhaps in a separate thread), or perhaps
to fire up two rendering tasks (because there are two output media sharing
the same AreaTree), or perhaps (down the road), it doesn't render yet at all
because it knows it eventually wants to create an optimized PDF instead of
the non-sequential page building we use now.

From that standpoint, I like the term "service" that Jeremias uses. FOTree
building is a service that is consumed by Document. AreaTree building is a
service that is consumed by RenderContext. Rendering is a service that is
consumed by RenderType.
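
Roughly, the hand-back of control described above could be sketched like
this. It is only an illustration under stated assumptions: pages are plain
strings, PageRenderer is a made-up interface, and RenderContext here is a toy
stand-in for the proposed class, not code from the refactoring. The point is
that layout only reports "page complete", and the context alone decides
whether that means rendering now (eager) or later (patient), and on how many
output media.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rendering service; a page is just a String in this sketch.
interface PageRenderer {
    void render(String page);
}

// Toy stand-in for the proposed RenderContext control class.
final class RenderContext {
    enum Mode { EAGER, PATIENT }

    private final Mode mode;
    private final List<PageRenderer> renderers = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    RenderContext(Mode mode) {
        this.mode = mode;
    }

    /** Several output media may share the same laid-out pages. */
    void addRenderer(PageRenderer renderer) {
        renderers.add(renderer);
    }

    /** Layout calls this when a page is done; it knows nothing about rendering. */
    void pageComplete(String page) {
        if (mode == Mode.EAGER) {
            dispatch(page);      // render immediately, page can be released
        } else {
            pending.add(page);   // hold the page until the whole document is laid out
        }
    }

    /** Called once layout has finished; flushes pages held in patient mode. */
    void documentComplete() {
        for (String page : pending) {
            dispatch(page);
        }
        pending.clear();
    }

    private void dispatch(String page) {
        for (PageRenderer renderer : renderers) {
            renderer.render(page);
        }
    }
}
```

The layout code would be identical in both modes; only the Mode handed to the
context changes. Rendering in a separate thread would be another decision
made inside dispatch(), which this sketch leaves out.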

I would love to have Keiron involved with this conversation. The last time
we discussed this, I asked him whether he liked the idea of moving these
control mechanisms into these higher-level objects, and he has so far not
responded.
Post by Glen Mazza
Here's food for thought--if we rename FOTreeBuilder to
XSLFOProcessor, perhaps you would have fewer concerns
about feeding it the render type. We can make this
No. The only place where it even remotely makes sense (IMO) for FOTree
building to know anything about the RenderType is in the StructureRenderers,
and the correct way to factor the functionality there is that the parsing is
tied to the rendering process (no FOTree built, no layout, go straight to
rendering). So, to the extent that the pure SAX parsing is tied to FOTree
building, it should be decoupled and made available to the
StructureRenderers (I think this may already be in place). So, with
StructureRenderers handled correctly, I don't see why FOTree building knows
anything at all except to build an FOTree.
Post by Glen Mazza
centrally-located class the "octopus" of the
application, rather than the left-side
Document/API/APPs class.
I'm not terribly hung up on what the API looks like. However, once the
control is properly factored out, it seems like the API naturally follows.
After all, what we are really trying to accomplish here is to give the user
/ servlet programmer the maximum amount of flexibility (control!) that we
can.

Victor Mote
Glen Mazza
2003-06-21 20:28:05 UTC
Permalink
--- Victor Mote <***@outfitr.com> wrote:
http://marc.theaimsgroup.com/?l=fop-dev&m=105270887501490&w=2
Post by Victor Mote
I don't think that is the link you wanted -- it
doesn't seem to be relevant
here.
That was the link--Keiron's writing appeared to
indicate that the area tree needs to be
created/processed *while* the FO Tree is being built,
and not *after* it is done.

Glen

J.U. Anderegg
2003-06-20 08:58:44 UTC
Permalink
The current FOP is not fit for i18n, directions, area orientations and does
not even support a dotted line. Renderers too. Design correct data objects in
the first place instead of fancy control mechanisms.
- what do pipelines look like?
- are there really pipelines or are partial document fragments passed on?
What's in memory at any time?

main
o initialize
o parse XSL
o layout pages
o render x times

initialize
o process config
output: global config objects
o get platform info
output: global platform objects

parse XSL
o process layout-master-sets, page-sequences
output: internal page description objects

o process flows
- standardize properties (measurements, property synonyms)
- resolve XSL inheritance to avoid tricky states, switches
- calculate FO dimensions as far as possible and transform
output content: to a suitable representation (my favorite: Graphics2D)
output document structure, areas: into FOTree with pointers to content
output unique property/attribute maps: memory savers

layout pages
o apply page descriptions to flows
o paginate: fill pages sequentially, split content objects
o resolve references
output: content objects supplemented with page coordinates in a
device-independent way
o baptize the result "Area Tree"

render
o process renderer configuration
o output content objects in a device-dependent format

Hansuli Anderegg
Jeremias Maerki
2003-06-20 19:01:31 UTC
Permalink
Post by Glen Mazza
Post by Jeremias Maerki
The FO
tree
builder is (to me) a service that simply accepts a
SAX stream and builds
the FO tree.
IMHO FOTreeBuilder is an object (C++/Java), not a
service (C).
I'm talking about Avalon-style services. These are components that
implement a certain work interface. In the background there's a main
implementation class (either alone or representing a whole subsystem).
The interface helps to decouple the whole system (FOP). Loose coupling
is the keyword here. The interface also makes it easy to switch
implementations (for example: font sources, layout engines, whatever).

FOTreeBuilder, in my view of things, will change to be an interface, not
a Java object. You'll then have an implementation of the FOTreeBuilder
(ex. DefaultFOTreeBuilder) which provides the service of building an FO
tree.
Post by Glen Mazza
Post by Jeremias Maerki
The layout engine, another (coarse
grained) service, will
then access the FO tree to do the layout. This is
all kept together by a
"supervising" class.
If we were doing C programming
We don't.
Post by Glen Mazza
--my fear is that the
supervising class is going to end up eating FOP's
object-oriented design and splitting the business
logic too much in multiple places (just like apps
currently does). (I guess I'll just have to trust the
team to be disciplined in this regard! ;)
Post by Jeremias Maerki
The FOTreeBuilder should remain an inner service to
FOP, not exposed in
the public API, if you ask me.
OK, a *very* thin wrapper (for those not needing any
public class Fop extends FOTreeBuilder { }
Fop.setXSLFOStream(blah);
Fop.setRenderType(RENDER_PS);
myDoc = Fop.Run();
Please not. This encourages tight coupling between the main classes of
FOP IMHO. Avalon encourages the assembly of a system (several
services/subsystems bound together to build a bigger system, association
instead of subclassing). The subsystems get separated by well-designed
interfaces that encourage SoC (separation of concerns). That's why I'd
like to see FOTreeBuilder as a service:

FOTreeBuilder could look like this:
public interface FOTreeBuilder {
    void setStructureHandler(StructureHandler handler);
    ContentHandler getContentHandler();
    FONode getRootNode();
    boolean isEmpty();
    void reset();
}
or
public interface FOTreeBuilder {
    /** Build an FO tree and store it in the process controller class. */
    ContentHandler buildFOTree(ProcessorRun/Document/<whatever>, StructureHandler handler);
}

That's all it does. Already today, it's quite a light-weight thing. It's
Victor's new classes (Document, for example) that will keep the whole
thing together, in as few places as possible. This improves
decoupling and centralizes control which in the end should make the
whole thing easier to understand.
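
To illustrate "association instead of subclassing", here is a cut-down,
compilable variant of the first interface. It is a sketch under assumptions:
the StructureHandler is left out, FONode is replaced by a plain list of
element names, and DefaultFOTreeBuilder and Document are illustrative
stand-ins, not the real classes. Only the SAX types (ContentHandler,
DefaultHandler, Attributes) are real JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

// Simplified service interface; getNodeNames() stands in for getRootNode().
interface FOTreeBuilder {
    ContentHandler getContentHandler();
    List<String> getNodeNames();
    boolean isEmpty();
    void reset();
}

// One possible implementation behind the interface.
final class DefaultFOTreeBuilder implements FOTreeBuilder {
    private final List<String> nodes = new ArrayList<>();

    // Records each element it sees; a real builder would create FO nodes here.
    private final ContentHandler handler = new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            nodes.add(qName);
        }
    };

    @Override public ContentHandler getContentHandler() { return handler; }
    @Override public List<String> getNodeNames() { return nodes; }
    @Override public boolean isEmpty() { return nodes.isEmpty(); }
    @Override public void reset() { nodes.clear(); }
}

// The controlling class holds the service by association, not by inheritance.
final class Document {
    private final FOTreeBuilder builder;

    Document(FOTreeBuilder builder) {
        this.builder = builder;
    }

    /** Where the SAX stream for this document should be sent. */
    ContentHandler input() {
        return builder.getContentHandler();
    }

    int nodeCount() {
        return builder.getNodeNames().size();
    }
}
```

Because Document only sees the interface, a different FOTreeBuilder
implementation could be passed to its constructor without touching Document
itself.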

I really don't know if this concrete example will work out exactly like this.
I didn't even investigate how this will fit with Peter's work. I'm just
getting the impression that you're heading in a problematic direction by
off-loading too much on something like the TreeBuilder.


Jeremias Maerki
J.Pietschmann
2003-06-17 21:03:22 UTC
Permalink
main FOP api in the org.apache.fop package.
org.apache.fop.api package?

J.Pietschmann
Jeremias Maerki
2003-06-18 05:44:19 UTC
Permalink
No problem with that.
Post by J.Pietschmann
main FOP api in the org.apache.fop package.
org.apache.fop.api package?
Jeremias Maerki
Arnd Beißner
2003-06-22 10:52:07 UTC
Permalink
Post by Peter B. West
Bertrand is probably in the best position to comment wrt RTF. Is anyone
familiar with MIF? Does it simply define page structures and flows?
I'm roughly familiar with MIF - did some rough HTML to MIF conversion
years ago.

Basically MIF is structured text that is annotated with stylenames, which
needn't even be defined in the MIF (but can, if I remember correctly).

As for the general discussion on renderer types: IMO, it's a mistake to
mingle "renderers" that produce formatted page-level output like PDF or
PostScript with "renderers" that produce flow output in other formatting
languages, like HTML, RTF or MIF. The latter is rather a conversion step,
and you would need not the area tree but rather the FO element tree to do
a good conversion.

Fundamentally, I think these two different kinds of "renderer tasks" just
have two things in common: a parser and an FO element tree, and that's it.

So my suggestion would be to implement only formatted output in FOP and
refactor the other outputs into a separate tool. If you need a clear
differentiation between the renderer types, you might take this one: do I
need to know the size of a glyph in a certain font/size to produce the
output? If yes, the appropriate renderer goes into FOP, if not, it goes
into a separate tool.

Just my 2 cents, of course.
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Glen Mazza
2003-06-22 16:26:13 UTC
Permalink
Arnd Beißner
2003-06-22 17:03:23 UTC
Permalink
So the XSL-FO spec--which FOP is trying to implement
for as many output types as possible--is not relevant
for those output types which don't need to know glyph
size? By putting it into a separate tool, that is
what you may be implying.
In a way, that's true. You need control over glyph placement (and, of
course, lines, area, etc.) if you want to do a conforming implementation.
I don't think you will be able to reach even basic conformance with output
types HTML, RTF and MIF - except perhaps in a very degenerate way (like
creating an image for each page). With RTF (but only with a very recent
version of it), you might stand a chance to reach at least roughly basic
conformance.

Conceptually, XSL:FO, RTF, MIF and HTML are the same thing (no nitpicking
please 8-)). A renderer implementation of each of these standards produces
glyphs, lines, and other low-level graphical objects that have a definite
place on a medium. This is structurally something very different from
converting between these languages.

To summarize my view on this: FO->RTF, FO->MIF, FO->HTML is a conversion
between similar high-level-languages where each language element has a
similar corresponding language element in the destination language.
FO->PDF, FO->PostScript, FO->PCL, FO->AWT are formatting processes. The
formatting part is what makes up the primary goal and especially most of
the complexity of FOP.

How about seeing things from this perspective: transformation of XSL:FO to
RTF, MIF and HTML could be done using an XSLT stylesheet (provided that
you create a trivial XML representation of RTF and MIF first, of course).
That stylesheet would be complex, of course, but on the 'possible side',
because these transformations fundamentally are tree transformations of
the same kinds of things.

What can be shared between conversion and formatting beside the things I
already mentioned? I think you can forget almost everything about fonts,
the layouters, the area tree.

How does this sound to you?
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Glen Mazza
2003-06-22 17:38:50 UTC
Permalink
Arnd Beißner
2003-06-22 19:15:39 UTC
Permalink
We don't do HTML at all (Xalan's department)-- HTML is
browser dependent. HTML is not an output format.
I know, the purpose was to illustrate my point, since
everyone is familiar with HTML, but perhaps not with
MIF or RTF. If you think it makes sense for FOP to
output RTF, then it also makes sense for FOP to support
HTML (at least in combination with CSS2).
w/RTF, "as for reaching even basic conformance"--my
impression on XSL-FO was that we must render up to the
maximum capabilities of the output format.
So where there is an FO object or property that RTF
simply can't support, this would *not* mean (1) we're
noncompliant with the XSL-FO spec w.r.t. the RTF
rendering, or (2) that RTF cannot be done via an XSL-FO
input.
Well, formatting objects are mandatory for a specific
conformance level regardless of the output medium
(sometimes different for visual and aural output media
of course). As for properties, the specification already
defines behaviour if certain rendering properties can
not be represented. This is true for color and fonts,
for example. The specification, however, seems to always
assume that you (the implementation) have precise
control over the formatting and pagination process,
which you don't have with RTF.

Example: expressions including functions seem to be basic
conformance level to me. RTF doesn't support expressions.
How will you resolve an expression that can only be
determined at formatting time if you can't hand over the
expression to RTF? You cannot anticipate the actual
formatting an RTF renderer (like Word) will do, if only
because a different hyphenation library is used.
I think the "binary" of PDF is also human-readable,
just like RTF is, albeit it's more complex and has
much more functionality. It appears that the main
difference between the two groups is back to some
needing page numbering and others not.
With this I don't agree at all. First, PDF is only
human-readable in a way that PostScript is, too - and
only if you totally leave out encoded objects. You
could even say that PDF is something like annotated
and restricted PostScript. Nitpicking aside - this is
basically true.

Second, it's not really about page numbering. The
difference is formatting. By formatting I mean deciding
where lines and pages are broken and where glyphs and
other graphical elements are put exactly. If you don't
know where exactly a glyph will end up, it's not
formatting. 8-) Extreme example: XSL:FO to formatted
ASCII. There are lots of constraints that you now
have, but the formatter will still know where each
glyph will display.

Before we're getting too philosophical, let me say
that we're now talking two different issues:

1. Is it possible to develop a conforming XSL:FO
implementation that produces RTF or MIF or similar
output?

2. Are there really two different groups of output
formats and does it really make sense to support
both in one tool (FOP).

My answers:

1. Probably no. Only if all XSL:FO specification
constraints are either directly supported by the
destination language (RTF, MIF, whatever) or can
be transformed into supported destination language
elements without anticipating concrete formatting
steps (like line-breaking). Anyway, this probably
cannot be definitely resolved without contacting
the W3C. Also, it's a rather academic (or
worse, legal) issue unless/until you want to
claim conformance for these output types for FOP. 8-)

2. Yes, there are, and no, it doesn't really.
That's a bit like taking a C compiler that
produces machine code, and then adding the capability
to produce Java source code output to that C compiler.
Yes, you can do it, but no, it doesn't make too
much sense, since the only code you share is the
parsing stuff and abstract syntax tree building.

Of course, the few parts that *can* be shared
*should* be shared. Maybe it's even worth
bundling both things into one "tool" or library
- if only for the user's convenience. I just think
it's absolutely not worth the bother to clutter
major parts of FOP (old or new) with stuff needed
by converter-type output renderers, when these
don't even profit from these same major parts of FOP.

All of the stuff needed for FO->RTF or FO->MIF
conversion is pretty straightforward stuff.
IMO not at all comparable with the complexities
lurking in actually *formatting* according to
the XSL:FO specification.

Sermon ends. 8-)
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bertrand Delacretaz
2003-06-23 05:52:38 UTC
Permalink
...Before we're getting too philosophical, let me say
1. Is it possible to develop a conforming XSL:FO
implementation that produces RTF or MIF or similar
output?
Probably not, XSL-FO is clearly meant for page output and RTF and MIF
(unless abused in strange ways) are document formats, not page formats.
2. Are there really two different groups of output
formats and does it really make sense to support
both in one tool (FOP)....
There are definitely two groups, and I agree with you that
....
All of the stuff needed for FO->RTF or FO->MIF
conversion is pretty straightforward stuff....
The whole point of the StructureHandler interface is to be able to
reuse FOP's "frontend" for structure-based renderers.
The impact of StructureHandler on the "standard" FOP output formats
(PDF mostly) is minor, but it allows the FOP "pipeline" to branch
cleanly, after the parsing and attributes resolution, to generate
either structure-based or page-based formats.

Besides sharing code, the advantage in RTF and similar output formats
being developed as part of FOP is the community: one project with more
audience instead of several, each with a smaller audience.

-Bertrand
J.U. Anderegg
2003-06-23 08:35:38 UTC
Permalink
Post by Bertrand Delacretaz
The whole point of the StructureHandler interface is to be able to
reuse FOP's "frontend" for structure-based renderers.
The impact of StructureHandler on the "standard" FOP output formats
(PDF mostly) is minor, but it allows the FOP "pipeline" to branch
cleanly, after the parsing and attributes resolution, to generate
either structure-based or page-based formats.
How do you plan to handle RTF styles?

Is it difficult to transform tables and lists?

Hansuli Anderegg
Bertrand Delacretaz
2003-06-23 08:43:51 UTC
Permalink
Post by J.U. Anderegg
...
How do you plan to handle RTF styles?
In jfor we defined an extension to XSL-FO (the "jfor-style" attribute)
to control RTF styles.

Another way would be to recognize sets of attribute values in the input
XSL-FO and map them to RTF styles.

I think some form of extension is needed as (AFAIK) the concept of
styles does not exist in XSL-FO, as it is meant for printed output.

If you import XML in the newest versions of FrameMaker for example, you
have to define a kind of "style map" which recognizes specific
constructions in the XML and assigns styles to them. This might also be
an option: use a second input file that tells FOP which (MIF or RTF)
styles to assign when certain patterns are recognized in the input.
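
A rough sketch of what such a "style map" could look like, assuming the patterns are simply sets of attribute values that must all be present on an element. The class name StyleMapSketch, the rule format and the style names are invented for this example; neither jfor nor FOP defines them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical style map: attribute patterns on one side, style names on the other. */
final class StyleMapSketch {
    private static final class Rule {
        final Map<String, String> pattern;
        final String styleName;
        Rule(Map<String, String> pattern, String styleName) {
            this.pattern = pattern;
            this.styleName = styleName;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    void addRule(Map<String, String> pattern, String styleName) {
        rules.add(new Rule(pattern, styleName));
    }

    /** Returns the first style whose pattern is fully present in attrs, or null. */
    String styleFor(Map<String, String> attrs) {
        for (Rule rule : rules) {
            if (attrs.entrySet().containsAll(rule.pattern.entrySet())) {
                return rule.styleName;
            }
        }
        return null;
    }

    /** Small helper to build an attribute map from name/value pairs. */
    static Map<String, String> attrs(String... nameValuePairs) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            map.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        StyleMapSketch map = new StyleMapSketch();
        map.addRule(attrs("font-size", "18pt", "font-weight", "bold"), "Heading1");
        map.addRule(attrs("font-family", "monospace"), "Code");

        // Extra attributes on the element do not prevent a match.
        System.out.println(map.styleFor(
                attrs("font-size", "18pt", "font-weight", "bold", "color", "#000000")));
        // No rule matches, so no style is assigned.
        System.out.println(map.styleFor(attrs("font-size", "12pt")));
    }
}
```

Rules are tried in the order they were added, so more specific patterns would have to come first.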
Post by J.U. Anderegg
...Is it difficult to transform tables and lists?
Not too much, but when we developed jfor we targeted it at RTF 1.5
which has no support for nested tables, so we had to fake them using
joined cells (as older versions of Word did).

Other than that, the tables and lists structures of XSL-FO map nicely
to RTF. Possible problems are relative dimensions, for example it might
be hard in RTF to say that a table must take 60% of the page height.

-Bertrand
J.U. Anderegg
2003-06-23 10:08:14 UTC
Permalink
Post by Bertrand Delacretaz
...
Post by J.U. Anderegg
How do you plan to handle RTF styles?
In jfor we defined an extension to XSL-FO (the "jfor-style" attribute)
to control RTF styles.
I think some form of extension is needed as (AFAIK) the concept of
styles does not exist in XSL-FO, as it is meant for printed output.
(1) This is not a FOP extension, but rather a fundamental change of the
XSL-FO language, which does not know style sheets.
Post by Bertrand Delacretaz
Another way would be to recognize sets of attribute values in the input
XSL-FO and map them to RTF styles.
(2) I wrote this a few weeks ago, and it is still my idea of how FOP
should store properties:

"Apply the principles of relational databases to eliminate redundancies: set
up tables of unique/used fonts, strokes, colors, ... and have the objects
reference table entries. This will cost table lookups, CPU. However, it will
also ease state processing. The program does not have to keep track of
inherited properties set 300 FO elements earlier. The nuisance is that style
sheets have acceptable redundancy (see DocBook), XSLT replicates properties
innumerable times and FOP has to recollect and normalize all this stuff."
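
A minimal sketch of the table idea, assuming each distinct property value is stored once and objects keep only an integer reference to it. PropertyTableSketch and its methods are names made up for this example and say nothing about how FOP's properties code is actually organized.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table of unique values (e.g. fonts); objects hold the returned id. */
final class PropertyTableSketch {
    private final Map<String, Integer> index = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the id of value, adding it to the table on first use. */
    int intern(String value) {
        Integer id = index.get(value);
        if (id == null) {
            id = values.size();
            values.add(value);
            index.put(value, id);
        }
        return id;
    }

    /** The table lookup that costs CPU when a property is needed again. */
    String lookup(int id) {
        return values.get(id);
    }

    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        PropertyTableSketch fonts = new PropertyTableSketch();
        int first = fonts.intern("courier new, 8pt");
        int second = fonts.intern("helvetica, 12pt");
        int third = fonts.intern("courier new, 8pt"); // replicated by XSLT, stored once

        System.out.println(first + " " + second + " " + third);
        System.out.println("table size: " + fonts.size());
        System.out.println(fonts.lookup(third));
    }
}
```

However many times the stylesheet replicates a value, the table grows only with the number of distinct values.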

My opinion:

(1) is off FOP territory
(2) is worth considering, if FOP is implemented in this way

Hansuli Anderegg
Bertrand Delacretaz
2003-06-23 11:50:34 UTC
Permalink
Post by J.U. Anderegg
...In jfor we defined an extension to XSL-FO (the "jfor-style"
attribute)
to control RTF styles....
(1) This is not a FOP extension, but rather a fundamental change of the
XSL-FO language, which does not know style sheets.
What I called "extension" applies to XSL-FO, or rather to the input
documents handled by jfor, which use the "jfor-style" attribute in
addition to standard XSL-FO. You're right that this is not a FOP
extension.

This can be implemented cleanly with namespaces, so as not to pollute
the XSL-FO input, something like

<fo:block font-size="12pt" style:name="someRtfStyle">

Meaning that the style names are "decorations" to the input document.
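
As a sketch of how a consumer could pick up such a decoration, the following uses the JDK's namespace-aware DOM parser. The namespace URI bound to the "style" prefix is a placeholder chosen for this example (no such namespace is defined by jfor or FOP), and StyleDecorationSketch is likewise a made-up name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

final class StyleDecorationSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";
    // Placeholder namespace for the style decoration; purely an assumption.
    static final String STYLE_NS = "http://example.org/rtf-style";

    /** Returns the style name on the root element, or "" if it has none. */
    static String styleNameOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getAttributeNS to work
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return root.getAttributeNS(STYLE_NS, "name");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String decorated = "<fo:block xmlns:fo='" + FO_NS + "' xmlns:style='" + STYLE_NS
                + "' font-size='12pt' style:name='someRtfStyle'>text</fo:block>";
        String plain = "<fo:block xmlns:fo='" + FO_NS + "' font-size='12pt'>text</fo:block>";

        System.out.println(styleNameOf(decorated));
        System.out.println(styleNameOf(plain).isEmpty()); // undecorated input still parses
    }
}
```

A processor that does not know the style namespace can simply ignore the attribute, which is what keeps the standard XSL-FO part of the input clean.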
Post by J.U. Anderegg
...(2) I wrote this a few weeks ago, and it is still my idea of how FOP
should
"Apply the principles of relational databases to eliminate
redundancies: set
up tables of unique/used fonts, strokes, colors, ... and have the objects
reference table entries....
Sounds reasonable, but I don't know enough about the current properties
code to comment on this.

-Bertrand
Clay Leeds
2003-06-23 16:51:45 UTC
Permalink
Post by J.U. Anderegg
Post by Bertrand Delacretaz
Post by J.U. Anderegg
How do you plan to handle RTF styles?
In jfor we defined an extension to XSL-FO (the "jfor-style" attribute)
to control RTF styles.
I think some form of extension is needed as (AFAIK) the concept of
styles does not exist in XSL-FO, as it is meant for printed output.
(1) This is not a FOP extension, but rather a fundamental change of the
XSL-FO language, which does not know style sheets.
Forgive my intrusion, and perhaps this is not related, or just using a
different namespace, but why not define a xsl:attribute-set for each
"style" and then use xsl:use-attribute-sets to help with the definition
of the RTF/MIF style? Consider this example (the comments are meant
for people who want to modify 'my' template to change their output):

<xsl:attribute-set name="attNormal">
  <!-- attNormal | attribute set (font-family, font-size, color &
       background-color) of entire EOR -->
  <xsl:attribute name="background-color">#ffffff</xsl:attribute>
  <!-- background-color for overall EOR page - default:#ffffff (white) -->
  <xsl:attribute name="color">#000000</xsl:attribute>
  <!-- color for text - default:#000000 (black) -->
  <xsl:attribute name="font-family">courier new, courier, monospace</xsl:attribute>
  <!-- font-family for text - default:"courier new,courier,monospace" -->
  <!-- NOTE: Changes to font-family affect FOP rendering of EOR and
       may require overhaul of template! -->
  <xsl:attribute name="font-size">8pt</xsl:attribute>
  <!-- font-size for text - default:8pt -->
  <!-- NOTE: Changes to font-size DRASTICALLY affect FOP rendering of
       EOR and may require COMPLETE overhaul of template! -->
</xsl:attribute-set>

Then couple this with these fo:page-sequence calls:

<!-- DEFINE PAGE SEQUENCE - repeating -->
<fo:page-sequence master-reference="repeating">
  <fo:static-content flow-name="xsl-region-before">
    <fo:block padding="0pt" xsl:use-attribute-sets="attNormal">
      <xsl:call-template name="tmpHeader"/>
    </fo:block>
  </fo:static-content>
  <fo:static-content flow-name="xsl-region-after">
    <fo:block xsl:use-attribute-sets="attNormal">
      <xsl:call-template name="tmpFooter"/>
    </fo:block>
  </fo:static-content>
  <fo:flow flow-name="xsl-region-body">
    <fo:block xsl:use-attribute-sets="attNormal">
      <xsl:call-template name="tmpBody"/>
      <fo:block padding="0pt" font-size="1pt">
        <fo:marker marker-class-name="table-continued"/>
      </fo:block>
    </fo:block>
  </fo:flow>
</fo:page-sequence>

Is this related to the discussion? I see that I'm using xsl: namespaces,
but I'm curious to learn how this relates...
--
Clay Leeds - ***@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc
Jeremias Maerki
2003-06-23 17:19:21 UTC
Permalink
Nice idea, but there's a problem. The xsl namespace gets filtered out by
the XSLT engine, or IOW expanded to the FO attributes before they reach
FOP. FOP never sees anything with the xsl: prefix.
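
To make that concrete, here is a small sketch that runs a stylesheet with an attribute set through the JDK's built-in XSLT processor (javax.xml.transform) and prints what comes out. The stylesheet is a cut-down stand-in for the one quoted below, and the class name AttributeSetExpansionSketch and the input element name "doc" are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class AttributeSetExpansionSketch {
    // Cut-down stylesheet: one attribute set, one block that uses it.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:attribute-set name='attNormal'>"
        + "<xsl:attribute name='font-size'>8pt</xsl:attribute>"
        + "</xsl:attribute-set>"
        + "<xsl:template match='/'>"
        + "<fo:block xsl:use-attribute-sets='attNormal'>"
        + "<xsl:value-of select='doc'/>"
        + "</fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet over inputXml and returns the serialized FO result. */
    static String transform(String inputXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(inputXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The printed FO carries a plain font-size attribute on fo:block;
        // the xsl:use-attribute-sets reference is gone by this point.
        System.out.println(transform("<doc>some text</doc>"));
    }
}
```

What FOP receives is the printed result, so the name of the attribute set is no longer available to map onto an RTF or MIF style.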
Post by Clay Leeds
Forgive my intrusion, and perhaps this is not related, or just using a
different namespace, but why not define a xsl:attribute-set for each
"style" and then use xsl:use-attribute-sets to help with the definition
of the RTF/MIF style? Consider this example (the comments are meant
<snip/>
Post by Clay Leeds
Is this related to the discussion? I see that I'm using xsl: namespaces,
but I'm curious to learn how this relates...
Jeremias Maerki
Clay Leeds
2003-06-23 17:33:23 UTC
Permalink
Post by Jeremias Maerki
Nice idea, but there's a problem. The xsl namespace gets filtered out by
the XSLT engine, or IOW expanded to the FO attributes before they reach
FOP. FOP never sees anything with the xsl: prefix.
Does this mean that just about every fo:block in the intermediary FO
file (the one you can only see if you run xalan.bat instead) has a
cagillion font-weight="bold" and font-family="courier new, courier,
monospace"? Wow! Sounds inefficient, and tedious but I guess that's how
it goes...

Speaking of which, how does the file created by '-at' differ from the
file generated by running xalan.bat?
--
Clay Leeds - ***@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc
Jeremias Maerki
2003-06-23 18:02:57 UTC
Permalink
Post by Clay Leeds
Post by Jeremias Maerki
Nice idea, but there's a problem. The xsl namespace gets filtered out by
the XSLT engine, or IOW expanded to the FO attributes before they reach
FOP. FOP never sees anything with the xsl: prefix.
Does this mean that just about every fo:block in the intermediary FO
file (the one you can only see if you run xalan.bat instead) has a
cagillion font-weight="bold" and font-family="courier new, courier,
monospace"? Wow! Sounds inefficient, and tedious but I guess that's how
it goes...
Yep.
Post by Clay Leeds
Speaking of which, how does the file created by '-at' differ from the
file generated by running xalan.bat?
That's the Area Tree XML: The laid-out pages serialized to a proprietary
XML format. It's only interesting for debugging purposes (in layout
engine development).

Jeremias Maerki
J.Pietschmann
2003-06-23 19:35:23 UTC
Permalink
Post by Jeremias Maerki
Post by Clay Leeds
Speaking of which, how does the file created by '-at' differ from the
file generated by running xalan.bat?
That's the Area Tree XML: The laid-out pages serialized to a proprietary
XML format. It's only interesting for debugging purposes (in layout
engine development).
And people who want to feed back results from the rendering process
into a second XSLT pass...

J.Pietschmann
Arnd Beißner
2003-06-22 19:18:07 UTC
Permalink
Post by Arnd Beißner
determined a formatting-time if you can't hand over the
determined at formatting-time if you can't hand over the
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bahnhofstr. 3, 71063 Sindelfingen, Germany
Email: ***@cappelino.de
Phone: +49-7031-463458
Fax: +49-7031-463460
Mobile: +49-173-3016917