Introduction
I have an Object-Orientation question, related to the Model-View-Controller (MVC) pattern. I suspect it will have a simple “here is the idiom everyone uses for this situation” type answer, but I haven’t found it. Here’s a fictional piece of software that demonstrates my problem.
Background
Suppose I am writing a program to solve, once and for all, all those school-boy questions about “Who would win a fight between a giraffe and a kangaroo?”
I am writing it in Python, which gives a fair bit (too much?) of flexibility in its Object Model, but I am happy to hear how the problem is solved in other (more strict?) languages.
I am using a nice Model-View-Controller (MVC) pattern.
My model includes a database of animals. It has a nice class hierarchy – everything inherits from the Animal class.
In version 1, my view includes a visual display of photographs of each of the animals. In version 2, I hope to add a separate view which contains cartoon representations of the losing animals’ carcass (with markings highlighting where the animal was injured) and another view which plays appropriate animal noises.
So that means (a) each animal class may have a corresponding view class that knows how to display it, and (b) each animal class may have more than one corresponding view class, depending on the purpose of the view.
So far, so good. I think I am describing an ordinary use of the MVC.
The Problem
Now, I have decided to display the first 20 animals on the screen, so I pass a collection of animals to the code responsible for creating the view instances.
Here’s the dilemma. That code knows only that each instance is some sort of animal. How can it create the correct corresponding view?
Solutions?
I have a number of potential solutions.
Model knows about the Views
Each animal class could contain a factory for the view class. For example, I could ask the Zebra instance to create a ZebraPhotoView for me.
This is an ugly reversal of the dependencies. I don’t mind the model being aware that there exist view classes (e.g. that might subscribe to changes) but I don’t like the idea that every time a new view is added, all of the model objects need to change.
Switch Statement
Each animal class could contain an “id” method, that simply returns a static value. Some separate factory code could ask the object for its id, and then contain a giant switch statement that creates a corresponding view depending on the value of that id.
Every time I see a switch statement, I think “Why didn’t the object model successfully hide this in some good clean polymorphism?”
Note also that the switch statement, in this case, is very fragile to changes in the model class hierarchy.
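A minimal sketch of this option, assuming hypothetical Zebra and Giraffe classes with an id() method and matching photo views (all names here are invented for illustration). An if/elif chain stands in for the switch statement.

```python
class Animal:
    def id(self):
        raise NotImplementedError


class Zebra(Animal):
    def id(self):
        return "zebra"


class Giraffe(Animal):
    def id(self):
        return "giraffe"


class PhotoView:
    def __init__(self, animal):
        self.animal = animal


class ZebraPhotoView(PhotoView):
    pass


class GiraffePhotoView(PhotoView):
    pass


def create_photo_view(animal):
    # The "giant switch": one branch per animal class, so every new
    # animal (or renamed id) means another edit here.
    animal_id = animal.id()
    if animal_id == "zebra":
        return ZebraPhotoView(animal)
    elif animal_id == "giraffe":
        return GiraffePhotoView(animal)
    raise ValueError("no photo view for %r" % animal_id)


views = [create_photo_view(a) for a in (Zebra(), Giraffe())]
```

The fragility is visible in create_photo_view: it is the one place that must be kept in step with the whole animal hierarchy.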
Reflection
Python (and several other modern languages) offer access to meta-data about the class. In particular, Python offers the ability to check if a variable is an instance of a specified class.
I could hard-code a static list of animal class, view class pairs – e.g. [(Aardvark, AardvarkView), (Aardwolf, AardwolfView), …]
Then for each object, iterate down the list, checking if it is an instance of that class, and, if so, create an instance of the corresponding view.
Iterate through a list looking for a match? Sounds inefficient. Yes, a dictionary would be faster if there were many different animal types, but I would lose one of the benefits: you can still take advantage of the inheritance hierarchy this way. For example, one of the entries could be (Bird, BirdView), which could display a generic picture for all the birds that I don’t have photographs for. I could make the list a little less fragile to additional animals being added.
So, this has a very slight improvement in “Object-Orientedness” over the switch statement, but it still requires careful maintenance to make sure the hard-coded list matches the rest of the code.
Also, while I appreciate the utility of features such as isinstance for debuggers, instrumentors, object browsers and other “meta-programs”, using reflection (e.g. isinstance()) in my regular code adds a rather pungent code smell to me.
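For concreteness, here is a rough sketch of the pair-list approach, assuming a small invented hierarchy (Bird, Penguin, Vulture) and two hypothetical view classes. The ordering of the list matters: more specific classes must come before their base classes.

```python
class Animal:
    pass


class Bird(Animal):
    pass


class Penguin(Bird):
    pass


class Vulture(Bird):
    pass


class View:
    def __init__(self, animal):
        self.animal = animal


class BirdView(View):
    pass


class PenguinView(View):
    pass


# Ordered from most specific to most general, so that a Penguin
# matches PenguinView before it can match the generic BirdView.
VIEW_PAIRS = [
    (Penguin, PenguinView),
    (Bird, BirdView),
]


def create_view(animal):
    for animal_class, view_class in VIEW_PAIRS:
        if isinstance(animal, animal_class):
            return view_class(animal)
    raise LookupError("no view for %s" % type(animal).__name__)
```

With this list, a Vulture falls through to BirdView without needing an entry of its own, which is the inheritance benefit a plain dictionary lookup would lose.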
Registration
Registration has the right feel to me, but I can’t quite get the mechanics.
The idea is that the ZebraView class, when it is first “elaborated” (to use an Ada term – this is after start up, but prior to the first instance being created), could call a class method on the Zebra class and say “Hey, if anyone comes around here asking for a “photograph” of you, send them to me.”
The Zebra class would have to know that view classes exist, to be able to store references to them in a dictionary, and to be able to return a reference to one when asked, but there would be no hard-coded knowledge of what they are, or even what “photograph” means, beyond being a cookie.
The trouble here is I am looking for inheritance behaviour on class (static) methods, and it doesn’t work that way. I can’t write a registerView static method on the Animal class, and call it on the Zebra class, let alone have it access the Zebra class’s static data member representing only the views registered on the Zebra class.
Actually, with Python’s ability to define lambda functions and decorators, and to add functions to classes at runtime (Hell, it even supports dynamically specified base classes!) it might be possible to tack such functionality on, but before I massively over-engineer what should be such a common problem, I thought I would ask the lazyweb first.
So, can anyone with more MVC experience, please explain the idiom I am missing?
Thanks; one of the downsides of no longer working with dozens of expert developers is that I haven’t anywhere else to bounce the “Surely someone must have been here before” technical questions.
Comment by David on September 12, 2008
I’ve been working in Sports Admin for three months, so it’s been a while since I’ve written any code. Indeed this might be as close as I get for a while 🙂 – so apologies if my rustiness leads to a poor answer.
Couldn’t things like “normal photo”, “carcass photo”, “sound” be attributes on the animal class. Then if you pass a collection of animals to the “carcass view”, it could interrogate each animal to see if it has a “carcass photo” and if so, display it.
This seems too obvious to me, so perhaps I’ve misunderstood the question.
Comment by Tom on September 12, 2008
Switch statement in ZebraView decides what it should do when asked to render/play its associated ZebraModel. You shouldn’t interrogate the models about anything related to the view at all, nor patch the models to have this information.
Comment by Julian on September 12, 2008
David,
Perhaps my examples of views: “photo”, “carcass cartoon” and “sound” weren’t the best – or weren’t enough – to illustrate my point.
I was trying to get across three concepts:
In many situations, these constraints would not apply. A real life example that occurs to me right now was an error-logging function. We made a method on each object that would return its name in an abbreviated form, suitable for the logging engine. Then the logger could simply interrogate each object for the suitable text to include in the log. In that project, there was only one such “view” required, and the code to display the name wasn’t complex or intrusive, so I didn’t mind going with such a solution.
Comment by Julian on September 12, 2008
Tom,
Once the ZebraView is associated with the ZebraModel, we are home and hosed. (We shouldn’t even need a switch statement at that point.)
The difficulty is creating a ZebraView in the first place, given that all we have is a reference to an AnimalModel.
Comment by Aristotle Pagaltzis on September 12, 2008
You are close, but the placement of your registry is all upside down.
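class View:
    # Maps each model class to the view class that renders it.
    _registry = {}

    @classmethod
    def register(cls, renderables):
        # Called on a View subclass, so cls is that subclass.
        for rcls in renderables:
            View._registry[rcls] = cls

    @staticmethod
    def class_for(renderable):
        return View._registry[renderable.__class__]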
Comment by Julian on September 12, 2008
Aristotle,
Thanks. That’s a definite improvement. The model objects don’t even need to know the view objects exist.*
It does use reflection/meta-data (e.g. renderable.__class__), which causes me to pause.
It also requires that every class be expressly mentioned as renderable. If I add a Vulture animal, I must either add a VultureView or update the BirdView to register itself as able to render vultures. I could work around that by walking up the __bases__ tree, until I find an entry in the registry but that seems over the top.
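A sketch of what that workaround might look like. Instead of recursing through __bases__ by hand, it walks __mro__, Python's linearised list of a class followed by its ancestors, which gives the same effect for this purpose. The animal and view classes, and the __init__ on View, are assumptions added for illustration.

```python
class View:
    # Maps each model class to the view class that renders it.
    _registry = {}

    def __init__(self, renderable):
        self.renderable = renderable

    @classmethod
    def register(cls, renderables):
        for rcls in renderables:
            View._registry[rcls] = cls

    @staticmethod
    def class_for(renderable):
        # __mro__ lists the object's own class first, then its ancestors,
        # so the nearest registered ancestor wins.
        for klass in type(renderable).__mro__:
            if klass in View._registry:
                return View._registry[klass]
        raise LookupError(
            "no view registered for %s" % type(renderable).__name__
        )


class Animal:
    pass


class Bird(Animal):
    pass


class Vulture(Bird):
    pass


class Penguin(Bird):
    pass


class BirdView(View):
    pass


class PenguinView(View):
    pass


BirdView.register([Bird])
PenguinView.register([Penguin])
```

Here a Vulture is rendered by BirdView without being mentioned anywhere in the view code, while Penguin still gets its own view.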
Despite these minor drawbacks, this is the way I will go in the future. Thanks again!
* This isn’t entirely true. View objects often influence the design of model objects – in particular to share far more of their private data than seems appropriate, and occasionally to store extra attributes that aren’t necessary for computation (including, for example, the animal’s name.)
Comment by Aristotle Pagaltzis on September 12, 2008
The only reasonable way to avoid reflection would be to use multi-method dispatch, which is not available in Python. The above is a half-arsed single-purpose implementation of a fraction of MMD.
Comment by Sunny Kalsi on September 14, 2008
If a zebra has a method “zebraSkill”, you’d need the Zebra View to know it’s dealing with the Zebra object. I don’t believe the model should know anything about the views. This stuff is also not bound at runtime, so having a registration framework sounds wrong. I’d have a factory method which returns a view, given a model. The individual views should take their appropriate model in the constructor (ZebraView would take a Zebra). The factory (with a switch statement) would make the view itself. If you don’t like factories, use inversion of control. In Java, Guice uses annotations to bind types of views to types of models.
Comment by Aristotle Pagaltzis on September 14, 2008
Sunny:
Yes, that is why there is ZebraView and not only View.
That’s why I proposed putting the registry in the abstract View class rather than the abstract Model class.
It is wrong, but the registration happens at global initialisation time, which is as close to compile time as you can get. If this were done with multiple dispatch, which is what the registry approach is emulating, it would trivially happen at compile time.
That misses the entire point of the exercise. Switch statements are the null answer in OO design. “You’ll do it like we used to do it in C and you’ll like it. Polymorphwhat?” The entire point of OO is to be able to introduce a bundle of behaviour (like adding a new kind of widget in a GUI toolkit) without having to touch all 314 switch statements in the program – instead, you subclass the right thing and then everywhere there would be a switch statement in a C program, you invoke a method, and dispatching routes the call to the right place. Polymorphism.
That is just a masqueraded initialisation-phase registration approach. If Julian cared, he could easily rewrite the approach I proposed as a decorator that could be used in View subclasses. In fact, Guido van Rossum implemented a very basic form of multi-method dispatch for Python in just that way. (That piece of code omits a whole range of subtleties of a good MMD implementation, though.)
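As a rough illustration of the decorator idea, here is one way it could look. The decorator name renders, the helper view_class_for, and the Zebra and Giraffe classes are all hypothetical; this is a sketch of the registration approach, not the multi-method implementation mentioned above.

```python
# Module-level registry: model class -> view class.
_registry = {}


def renders(*model_classes):
    """Class decorator that registers a view for the given model classes."""
    def decorator(view_class):
        for model_class in model_classes:
            _registry[model_class] = view_class
        return view_class
    return decorator


def view_class_for(model):
    # Exact-type lookup only; no inheritance walk in this sketch.
    return _registry[type(model)]


class Zebra:
    pass


class Giraffe:
    pass


@renders(Zebra)
class ZebraView:
    def __init__(self, model):
        self.model = model


@renders(Giraffe)
class GiraffeView:
    def __init__(self, model):
        self.model = model


zebra = Zebra()
view = view_class_for(zebra)(zebra)
```

The registration still happens when the view module is first imported, but each view class now declares what it renders right at its own definition.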
Comment by Tom on September 15, 2008
The hash(-based switch) registry is an extremely slight improvement over an explicit switch written in class_for(). Especially when it’s not even dynamic (“as close to compile time as you can get”).
Comment by Julian on September 15, 2008
[Thanks, Aristotle, I was halfway through writing essentially the same comment, which I include below despite being now redundant. When I got to the bit about Guice, I had to go and read about it. Then life intervened, and I never finished writing…]
Sunny,
Absolutely true. Every proposed solution takes care of that.
I agree. I do have a caution though, that sometimes, even though the model doesn’t know about the view type, its design is affected by it. For example, each animal type may have a way of returning its English (or even Latin) name so the view can display it, even though there is no need for the object to know its name in order to work out a fight winner. The object doesn’t have a code dependency on its view, but its design was influenced by the view’s need for methods.
In any case, the level of coupling varies. Aristotle’s solution has the lower level.
That’s slightly too strong. I would agree that this stuff could (in theory at least) be bound at compile-time. I agree that run-time binding does seem a bit late. Scripting languages, like Python, blur the compile-time/run-time distinction.
All of the proposed solutions do this; the question is how do they do it?
So, you are proposing that I go with the Switch Statement solution discussed above. Its drawbacks are also discussed.
I don’t object to factories, I am asking about the best way to implement them.
[That’s as far as I got…]
Comment by Julian on September 15, 2008
Suppose, the Animal Fight software was being written by a huge team; it has been split up so that each class is owned by one developer. (I am not saying this is realistic, but I am trying to illustrate the coupling issues with an extreme example.)
The Animal team have written hundreds of animal classes, and are constantly adding new ones.
The chief architect of the View team owns the top-level View class, and the View factory.
Another developer has written the BirdView class, which displays a traditional m-shaped children’s drawing of a flying bird. It is good enough for most birds, except the flightless birds.
The FBV (Flightless Bird View) team are diligently working away at adding PenguinView, OstrichView, KiwiView, etc.
In an ideal world, the architect could finish her work and then move onto another project; she never needs to care about the progress or design documents of the other teams. The developer of the BirdView class could then finish his work, and move on. Finally, the Flightless Bird View team could finish their work; they would have to have read the View Class and Bird Class design documentation (or, at least, the interfaces section).
None of the proposals offer an ideal world. (Aristotle claims it requires Multi-method Dispatch. Twelve years ago, I could have responded intelligently to this claim, but I have forgotten far more than I remember about MMD, and I can no longer see how it applies here.)
In the switch-statement world, the architect has to hang around and add extra lines to the switch statement, for every new animal class and every new view class.
In the registration world, the architect can leave. The BirdView developer has to hang around to add registrations for any new Bird objects, and remove registrations for any new views from the FBV team.
Not ideal, but a significant step forward.
Comment by Aristotle Pagaltzis on September 15, 2008
You can add new cases to that quasi-switch statement without touching the switch statement. So it achieves exactly the same as language-level polymorphism. How is that improvement an extremely slight one?
Can you name any drawbacks with this approach (other than ones which are merely due to its half-arsedness, such as the ones Julian listed)?
Comment by Aristotle Pagaltzis on September 15, 2008
The latter is what the former would fix. With proper MMD, BirdView would define a View.for variant that takes a BirdModel and returns a BirdView, and PenguinView would define a View.for that takes a PenguinModel and returns a FlightlessBirdView, and when you wrote View.for(some_kind_of_model) then dispatch would just figure out which one of them is the best match and call that.
Of course you can easily fix that particular issue in the manual implementation as well – you just have to walk the inheritance hierarchy. But there are more subtleties that real MMD accounts for. (I had thought of some when I wrote my first comment, but unfortunately I can’t recall now. They are along the same lines though: selecting a useful match in the face of various edge cases.)
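Python's standard library has since gained functools.singledispatch, which covers this particular case: it dispatches on the type of the first argument only (so it is not full MMD), but it does pick the most specific registered implementation along the inheritance hierarchy. A minimal sketch, with view_for standing in for View.for (for is a reserved word in Python) and the model and view classes invented for illustration:

```python
from functools import singledispatch


class AnimalModel:
    pass


class BirdModel(AnimalModel):
    pass


class PenguinModel(BirdModel):
    pass


class VultureModel(BirdModel):
    pass


class View:
    def __init__(self, model):
        self.model = model


class BirdView(View):
    pass


class FlightlessBirdView(View):
    pass


@singledispatch
def view_for(model):
    # Fallback used when no registered type matches.
    raise LookupError("no view for %s" % type(model).__name__)


@view_for.register(BirdModel)
def _(model):
    return BirdView(model)


@view_for.register(PenguinModel)
def _(model):
    return FlightlessBirdView(model)
```

A VultureModel has no registration of its own, so it resolves to the BirdModel variant, while a PenguinModel resolves to the more specific one.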
Comment by Tom on September 15, 2008
And I can ride my bicycle without touching the handlebars! (This is where I’m laughing as I type. 🙂)
I’m not trying to argue with you — sorry if it came off that way. I even agree that the registry is an improvement, especially for the reasons listed above. I just wouldn’t underestimate a simple, readable switch statement which performs the same function.
The main difference between any of the proposed solutions has been how distributed the hash table is, and the associated ugliness of the code to fetch it.
Comment by Mr Rohan on September 15, 2008
The main advantage of the registry approach is that the UI designer can decide to associate “N” different models with one particular view without the model designer caring too much.
Although in reality models that are close to the UI are always reasonably closely tied to the UI needs and so shouldn’t be designed separately.
Comment by Richard Atkins on September 15, 2008
Very briefly, multi-methods are methods where the call is dispatched based on the runtime type of not just the method target, but also the n parameters to the method. Dispatching on the target and one parameter is thus called double-dispatch. Various implementations have different schemes for deciding which method overload matches the parameters best (where CLOS has a reputation for one of the hardest to guess defaults on corner cases), or simply throwing an ambiguous call error at compile/runtime. This is more powerful than regular single-dispatch methods, even with overloads, as single-dispatch only uses the runtime type of the target and the static type of the parameters.
One other kludge for implementing double-dispatch multi-methods in a statically-typed OO language is the Visitor pattern [GoF] – create a ModelVisitor interface that declares methods visitFoo(Foo) for each subtype of the model (or any visitable object hierarchy), and add a method accept(ModelVisitor) to each Model type that dispatches back to the correct ModelVisitor.visitFoo() method. If there were any runtime object graph, you’d also need to decide what object controls how to traverse the graph, but in your example everything would be a simple Visitor calls FooModel, and FooModel calls Visitor back, but now with a type it knows, and without any messy casting going on. You would then write a ViewBuilder (this Visitor comes with a dash of Builder pattern [GoF] too) that implements the ModelVisitor interface, and in each visitFoo() method sets a view member to the new view that you want to create. A top level method would then call accept() on a Model object passing in the ViewBuilder, and then retrieve the new view from the builder.
In Python and most other dynamically typed OO languages, there’s no interface type, so it’s only by convention that there’s a ModelVisitor to implement. Otherwise, you could implement this without much trouble as well. Note that this still only helps for the one view to one model case, and it’s also not recommended for use when the graph of visitable objects is continuing to grow. It’s much better to use when the graph is steady, and you’re only adding more Visitors instead, as this means that the Visitor interface itself is stable, and you don’t have to regularly update the Visitor implementations.
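A bare-bones Python sketch of the Visitor-plus-Builder arrangement described above, using Python naming (visit_zebra rather than visitZebra). The Zebra and Giraffe models, their photo views and the build_view helper are assumptions made up for this example; the ModelVisitor interface exists only by convention.

```python
class Zebra:
    def accept(self, visitor):
        # Dispatch back to the visitor with a type the visitor knows.
        return visitor.visit_zebra(self)


class Giraffe:
    def accept(self, visitor):
        return visitor.visit_giraffe(self)


class ZebraPhotoView:
    def __init__(self, zebra):
        self.zebra = zebra


class GiraffePhotoView:
    def __init__(self, giraffe):
        self.giraffe = giraffe


class PhotoViewBuilder:
    """Plays the ModelVisitor role: one visit_* method per model type."""

    def __init__(self):
        self.view = None

    def visit_zebra(self, zebra):
        self.view = ZebraPhotoView(zebra)

    def visit_giraffe(self, giraffe):
        self.view = GiraffePhotoView(giraffe)


def build_view(model):
    # Top-level method: pass a builder in, then collect the new view.
    builder = PhotoViewBuilder()
    model.accept(builder)
    return builder.view
```

Note the trade-off mentioned above: adding a new animal means adding a visit_* method to every builder, so this suits a stable model hierarchy with a growing set of visitors.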
Comment by Alastair on September 15, 2008
Julian,
My first reaction on reading your problem statement was to respond with just two words, “Visitor” and “Factory”, and rely on the pattern definition of both of these to fill in the details. Fortunately Richard seems to have provided some of the details himself, which is nice.
Frankly though I don’t see the need for double-dispatch. Your controller (or whatever) will know what sort of view you want (photo, cartoon carcass, or sound) and hence which factory class to use. You just need a factory class for each (eg PhotoFactory, etc). Just instantiate the factory object as needed and pass it in as a visitor to the animal.
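One possible reading of this suggestion, sketched with one factory class per kind of view. PhotoFactory comes from the comment above; SoundFactory, the for_* method names, and the use of plain strings as stand-ins for real view objects are assumptions for illustration only.

```python
class Zebra:
    def accept(self, factory):
        return factory.for_zebra(self)


class Kangaroo:
    def accept(self, factory):
        return factory.for_kangaroo(self)


class PhotoFactory:
    def for_zebra(self, zebra):
        return "photo view of a zebra"

    def for_kangaroo(self, kangaroo):
        return "photo view of a kangaroo"


class SoundFactory:
    def for_zebra(self, zebra):
        return "sound view of a zebra"

    def for_kangaroo(self, kangaroo):
        return "sound view of a kangaroo"


def create_views(animals, factory):
    # The controller picks the factory; each animal picks the method.
    return [animal.accept(factory) for animal in animals]


animals = [Zebra(), Kangaroo()]
photos = create_views(animals, PhotoFactory())
sounds = create_views(animals, SoundFactory())
```

The animals only ever see "a factory", so adding a third kind of view means writing one new factory class and leaving the animal classes alone.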
Comment by Sunny Kalsi on September 21, 2008
OK I figured out what’s been bugging me about this: It’s that what Julian is talking about isn’t MVC. Put yourself in the position of the view. It gets given a bunch of objects and gets told “hey can you render these?” The answer should be “WTF?!? NO!”
The view (as a whole, i.e. there’s only 1 in the “application”) needs to ask the model (as a whole) questions. It can’t get given pieces of the model and be expected to know how to render it. The app is supposed to tell the view “hey view, go do your thing. Here’s the model and controller”. The view can then ask the model for its zebras specifically, and render them using the zebra renderer. If the model wants to do some reflection, it’s up to the model, but the view needs access to the model as a whole so it can adequately render it.
You might still be stuck in the same position, where your model just exposes a list of animals, and the view has to figure out how to render them, but ideally there’d be a variety of methods which give the view access to the objects in a way the view likes.
Comment by Julian on October 1, 2008
Sunny,
The word View is being used with several different meanings here. I think if we clarify our definitions, we’ll find we are agreeing wholeheartedly.
There is the concept that the View is all of the code in the application related to the presentation of the model. This is the one you describe. As you say, there is only one such view per application. Also, as you say, it makes little sense to present a bunch of objects to this amorphous concept, and demand that they be rendered.
Each item in the model may be presented in several ways, for example HTML versus RTF, Audio versus Text, Tables versus Charts. If you take a cross-section of all of the code related to one of these ways, you might call that a view (or perhaps a sub-view, if you want to make it distinct.)
In this way, one model might have several sub-views.
It’s clear that all this code doesn’t fit into one class.
One way to divide that code is to put all of the sub-view code into one place (e.g. all of the HTML-generating code into one class.)
Another way to divide the code is to put all of the code related to one item of the model in one place (e.g. all of the view code specific to Kangaroos into one class.) Assuming that is the case, it is natural to call this the KangarooView class, because it deals with the presentation of the Kangaroo element of the model.
It may well be that the KangarooView needs access to the whole model, but chances are, it really only needs access to the Kangaroo object within that model.
Indeed, that is a natural dilemma to be in. Consider a Cage class, which aggregates a number of Animals. It is natural to have a collection of animals, and an iterator over all of them (as opposed to one iterator over all the Aardvarks, another over all the Aardwolves, etc.). When it comes time to draw pictures of those animals, or generate an audio file of what such a cage would sound like, it is the view’s job to work out what the animals are.
This entire post is about how to best do that task.