Let's look at the problem through a simple example, one with few enough components to stay concise but still enough to make the point. We have the following classes:
interface Compressor {
    CompressedContent compress(UncompressedContent uncompressedContent);
}

interface CompressedContent { ... }

class UncompressedContent { ... }

class ContentDownloader {
    void downloadCompressAndSave(Compressor compressor) {
        UncompressedContent uncompressedContent = downloadContent();
        CompressedContent compressedContent = compressor.compress(uncompressedContent);
        save(compressedContent);
    }
}

class ZipCompressor implements Compressor { ... }

class RarCompressor implements Compressor { ... }
Our mini application downloads content from God knows where, compresses it and saves it. Different compression algorithms can be plugged in easily (the Strategy pattern). The ContentDownloader uses other components too, but they are not relevant for our example, so let's just say that from now on ContentDownloader stands for a bunch of classes. Similarly, there can be several other implementations of Compressor. The question is: how would you package your components?
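To make the Strategy part concrete, here is a minimal runnable sketch of the classes above. Everything the original snippet leaves out is an assumption made up for the demo: the payload field, the describe() accessor, the hard-coded "download", the in-memory "storage" and the fake compressors, which only tag the payload instead of compressing it.

```java
import java.util.ArrayList;
import java.util.List;

interface Compressor {
    CompressedContent compress(UncompressedContent uncompressedContent);
}

interface CompressedContent {
    String describe(); // hypothetical accessor, only here so the demo can show a result
}

class UncompressedContent {
    final String payload; // hypothetical field

    UncompressedContent(String payload) {
        this.payload = payload;
    }
}

class ContentDownloader {
    // Assumption: an in-memory list stands in for whatever storage the real app uses.
    final List<String> saved = new ArrayList<>();

    void downloadCompressAndSave(Compressor compressor) {
        UncompressedContent uncompressedContent = downloadContent();
        CompressedContent compressedContent = compressor.compress(uncompressedContent);
        save(compressedContent);
    }

    private UncompressedContent downloadContent() {
        return new UncompressedContent("hello"); // hard-coded instead of a real download
    }

    private void save(CompressedContent compressedContent) {
        saved.add(compressedContent.describe());
    }
}

class ZipCompressor implements Compressor {
    @Override
    public CompressedContent compress(UncompressedContent content) {
        return () -> "zip(" + content.payload + ")"; // fake: tags the payload, compresses nothing
    }
}

class RarCompressor implements Compressor {
    @Override
    public CompressedContent compress(UncompressedContent content) {
        return () -> "rar(" + content.payload + ")"; // fake: tags the payload, compresses nothing
    }
}

class StrategyDemo {
    static List<String> run() {
        ContentDownloader downloader = new ContentDownloader();
        // The downloader never names a concrete compressor; the caller plugs one in.
        downloader.downloadCompressAndSave(new ZipCompressor());
        downloader.downloadCompressAndSave(new RarCompressor());
        return downloader.saved;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point to notice is that ContentDownloader compiles against Compressor only, which is what makes the packaging question below interesting.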
Lacking both a UML visualisation plugin and the commitment to get one, I will use a simple notation in what follows. Dependencies between packages are represented by '->' and packages by ( <class1>, <class2>, ...). So a structure in which the package containing classA and classB depends on the package containing classC and classD, with both living in the same superpackage, is written as
( (classA, classB) -> (classC, classD) )
Solution 0
(ContentDownloader, Compressor, ZipCompressor, RarCompressor)
Everything in one bag. Obviously it's not a good solution. The packaging should one way or another represent the structure of the application and tell the developer something about it.
Solution 1
(ContentDownloader) -> (Compressor, ZipCompressor, RarCompressor)
Seems better. We've confined all the compressor code (interfaces and implementations) to one package. However, I like a package to hold classes that belong to the same abstraction level. Having interfaces and their implementations in the same place, I feel, violates this.
Solution 2
( ContentDownloader -> (Compressor <- (ZipCompressor, RarCompressor) ) )
This is something I see quite frequently: for example, the façade interface goes into the package org.something.app and the implementation into org.something.app.impl.
Solution 3
(ContentDownloader) -> ( (Compressor) <- (ZipCompressor, RarCompressor) )
A variation in which the interface is under org.something.app.client and the implementation under org.something.app.impl.
The common problem with the last two approaches is that (ContentDownloader) depends on a package that contains both the interfaces it uses and the implementations it has no knowledge of. Why is that a problem? Let's pause for a short intermezzo and ponder what we actually want from packages. What I want, for one, is that when I need to change some functionality, I have to touch as few packages as possible, and in the packages I do have to touch, I want as few classes as possible that are unrelated to the change (this is called the Common Closure Principle). Why? Because unrelated components sitting close to the place of change are noise.
Or, to view it from another perspective, let's run a small hypothetical experiment. Let's assume that packages are units of release (even if they are not, the general idea is worth considering). Adding another implementation will require recompiling the full package ( (Compressor) <- (ZipCompressor, RarCompressor) ) and (ContentDownloader) too. This is bad, since no code has changed in (ContentDownloader). And it wouldn't be enough to release ( (Compressor) <- (ZipCompressor, RarCompressor) ); we would have to release the bigger package containing everything. Experiment ends.
Again, this is only hypothetical, because in Java packages are not units of release. But I like to find general principles behind things at different scales, and I think structuring your deployable components (projects) or libraries has a lot in common with structuring packages. After all, you can always decide that a submodel in your component has grown big enough to earn its place as a separate library (or embedded component). That is always a possibility, so keeping it in mind while designing the package structure (as long as it doesn't conflict with considerations deemed more important) could make later refactoring and maintenance easier.
I hope these unorganized ramblings have made some sense and we can try another approach.
Solution 4
(ContentDownloader) -> (Compressor) <- (ZipCompressor, RarCompressor)
Seems much better. Adding another implementation only requires recompiling (and releasing) (ZipCompressor, RarCompressor) and nothing else. We have reached the point I was heading towards, but there is actually another interesting variation.
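The release experiment can be replayed mechanically. The sketch below is only an illustration under stated assumptions: the package names (downloader, compression, compressor.api, compressor.impl) are made up, and for Solution 3 the superpackage is treated as a single release unit, as in the experiment. Each structure is modelled as a map from a package to the packages it depends on, and the code collects everything that transitively depends on the changed package.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RebuildImpact {

    // Returns the changed package plus every package that depends on it, directly or transitively.
    static Set<String> affectedBy(String changed, Map<String, List<String>> dependsOn) {
        Set<String> affected = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        affected.add(changed);
        queue.add(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
                // A package is affected if it depends on something already affected.
                if (entry.getValue().contains(current) && affected.add(entry.getKey())) {
                    queue.add(entry.getKey());
                }
            }
        }
        return affected;
    }

    // Solution 3: interface and implementations share one superpackage,
    // treated here as a single release unit called "compression" (hypothetical name).
    static Map<String, List<String>> solution3() {
        return Map.of(
                "downloader", List.of("compression"),
                "compression", List.of());
    }

    // Solution 4: (ContentDownloader) -> (Compressor) <- (ZipCompressor, RarCompressor)
    static Map<String, List<String>> solution4() {
        return Map.of(
                "downloader", List.of("compressor.api"),
                "compressor.api", List.of(),
                "compressor.impl", List.of("compressor.api"));
    }

    public static void main(String[] args) {
        // Adding an implementation changes "compression" in Solution 3 ...
        System.out.println(affectedBy("compression", solution3()));
        // ... but only "compressor.impl" in Solution 4.
        System.out.println(affectedBy("compressor.impl", solution4()));
    }
}
```

In this model the Solution 3 change drags the downloader along, while in Solution 4 the change stays inside the implementation package.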
Solution 5
(ContentDownloader,Compressor) <- (ZipCompressor, RarCompressor)
This structure can be justified by pointing out that the ContentDownloader depends directly on the Compressor, so for the sake of package cohesion it might be a good idea to put them together. This is what Martin Fowler calls Separated Interface. After all, for the developer of the ContentDownloader the implementation of the Compressor is an irrelevant detail they don't even want to know about. But a code change in the ContentDownloader would lead to recompiling (ZipCompressor, RarCompressor), so for now I drop this solution.
Drawing conclusions
What happened here is that we started (not counting Solution 0) from the initial state
(ContentDownloader) -> (Compressor, ZipCompressor, RarCompressor)
and transformed it to
(ContentDownloader) -> (Compressor) <- (ZipCompressor, RarCompressor).
We had a depender and a dependee package. We extracted the visible part of the dependee package into a new package, and now both the depender package and the remainder of the original dependee package depend on it.
Doesn't it resemble something? If we replace the word "package" with "class" and "visible part" with "interface", what we get is an application of the DIP (Dependency Inversion Principle). An example with classes and interfaces:
UserService ------uses------> UserRepository <----implements----- MongoUserRepository
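As a hedged illustration of that diagram, here is a minimal Java sketch. Only the three type names come from the diagram; the findNameById method, the greeting logic and the in-memory map that stands in for a real MongoDB collection are all assumptions made up for the demo, so no database driver is involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The abstraction that both sides depend on.
interface UserRepository {
    Optional<String> findNameById(int id); // hypothetical method
}

// High-level code: knows the interface only, never the concrete repository.
class UserService {
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String greet(int id) {
        return repository.findNameById(id)
                .map(name -> "Hello, " + name)
                .orElse("Hello, stranger");
    }
}

// Low-level detail: implements the interface.
// Assumption: a HashMap stands in for a real MongoDB collection.
class MongoUserRepository implements UserRepository {
    private final Map<Integer, String> documents = new HashMap<>();

    MongoUserRepository() {
        documents.put(1, "Ada");
    }

    @Override
    public Optional<String> findNameById(int id) {
        return Optional.ofNullable(documents.get(id));
    }
}

class DipDemo {
    static String greet(int id) {
        // The concrete class is chosen at the edge of the application.
        UserService service = new UserService(new MongoUserRepository());
        return service.greet(id);
    }

    public static void main(String[] args) {
        System.out.println(greet(1));
        System.out.println(greet(2));
    }
}
```

Both arrows of the diagram end at UserRepository: UserService uses it and MongoUserRepository implements it, so neither concrete class refers to the other.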
And (not so) surprisingly, there is an existing equivalent principle for packages, called SDP (Stable Dependencies Principle), formulated by the same Robert C. Martin who came up with DIP (and the SOLID principles). If you haven't heard of them yet, I strongly advise looking them up.
Back to the game: after reinventing the wheel, I'm quite content with the result. We have found a general principle that not only justifies the "rightness" (from at least one point of view) of the package structure we ended up with, but might even give us some practical advice on how to achieve it. To put it in one sentence, if we have the package structure
(A) -> (B)
then move all classes in B that are not referenced directly from A to a new package B2, rename B to B1 and the dependencies will look like
(A) -> (B1) <- (B2)
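The rule is mechanical enough to sketch in code. The following is a toy illustration, not a refactoring tool: the input format (a map from each class in A to the classes it references directly) and the PackageSplitter name are assumptions made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PackageSplitter {

    // packageB: the classes currently in B.
    // referencesFromA: for each class in A, the classes it references directly.
    // Returns the proposed contents of B1 (referenced from A) and B2 (everything else).
    static Map<String, Set<String>> split(Set<String> packageB,
                                          Map<String, List<String>> referencesFromA) {
        Set<String> b1 = new TreeSet<>();
        for (List<String> references : referencesFromA.values()) {
            for (String reference : references) {
                if (packageB.contains(reference)) {
                    b1.add(reference); // directly visible from A, so it stays in B1
                }
            }
        }
        Set<String> b2 = new TreeSet<>(packageB);
        b2.removeAll(b1); // not referenced from A, so it moves to B2
        return Map.of("B1", b1, "B2", b2);
    }

    // The running example: A = (ContentDownloader), B = (Compressor, ZipCompressor, RarCompressor).
    static Map<String, Set<String>> example() {
        return split(
                Set.of("Compressor", "ZipCompressor", "RarCompressor"),
                Map.of("ContentDownloader", List.of("Compressor")));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> result = example();
        System.out.println("B1 = " + result.get("B1"));
        System.out.println("B2 = " + result.get("B2"));
    }
}
```

For the running example this puts Compressor into B1 and the two implementations into B2. Note that the sketch does not check whether a class in B1 references a class in B2; if one did, the result would be a cycle between the two packages rather than (B1) <- (B2).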
Of course, this is a simplistic example with only two packages involved, but I think the general idea shines through. And of course there are a lot of other important factors, package cohesion for example, that we haven't really taken into consideration. In the following posts I want to explore those and investigate the "usefulness" of some package coupling metrics.