Thursday, 31 October 2013

Searching for packaging guidelines - 1

Defining a satisfying package structure always gives me a headache. Sometimes a structure feels "right", but until recently I had no formal principle to rely on to confirm my hunch. A couple of weeks ago I started developing a plugin that can calculate design quality metrics, so I've dug into the topic, and I think I've found some useful stuff.

Let's see the problem in a simple enough example, one with very few components to keep it concise, but enough to demonstrate the point. So we have the following classes:
interface Compressor {
  CompressedContent compress(UncompressedContent uncompressedContent);
}
interface CompressedContent { ... } 
class UncompressedContent { ... } 
class ContentDownloader {
  void downloadCompressAndSave(Compressor compressor) {
       UncompressedContent uncompressedContent = downloadContent();
       CompressedContent compressedContent = compressor.compress(uncompressedContent);
       save(compressedContent);
  }
}
class ZipCompressor implements Compressor { ... }
class RarCompressor implements Compressor { ... }

Our mini application downloads contents from God knows where, compresses them and saves them. Different compression algorithms can be plugged in easily (the Strategy pattern). The ContentDownloader uses other components too, but they are not relevant for our example, so let's just say ContentDownloader from now on represents a bunch of classes. Similarly, there can be several other implementations of Compressor. The question is: how would you package these components?
In the following, lacking both a UML visualisation plugin and the commitment to get one, I will use a simple notation. A dependency between packages will be represented by '->' (pointing from the depender to the dependee, or '<-' with the direction reversed) and a package by a parenthesized list of its classes and subpackages: ( <class1>, <class2>, ... ). So the structure where the package containing classA and classB depends on the package containing classC and classD, and both live in the same superpackage, will be represented by

( (classA, classB) -> (classC, classD) )

Solution 0

(ContentDownloader, Compressor, ZipCompressor, RarCompressor)

Everything in one bag. Obviously it's not a good solution. The packaging should one way or another represent the structure of the application and tell the developer something about it.

Solution 1

 (ContentDownloader) -> (Compressor, ZipCompressor, RarCompressor)

Seems better. We've confined all the compressor code (interface and implementations) into one package. However, I like the classes in a package to belong to the same abstraction level, and having interfaces and their implementations in the same place, I feel, violates this.

Solution 2

(ContentDownloader) -> ( Compressor <- (ZipCompressor, RarCompressor) )

This is something I see quite frequently: for example, putting the façade interface directly under the package org.something.app and the implementations under org.something.app.impl.

Solution 3 

(ContentDownloader) -> ( (Compressor) <- (ZipCompressor, RarCompressor) )

A variation where the interface lives under org.something.app.client and the implementations under org.something.app.impl.

The common problem with both of the last two approaches is that (ContentDownloader) depends on a package containing both the interface it uses and the implementations it has no knowledge of. Why is that a problem? Let's stop here for a short intermezzo and ponder what we actually want from packages. What I want, for one, is that if I need to change some functionality, I should have to touch as few packages as possible, and within those packages there should be as few classes as possible unrelated to the change (this is called the Common Closure Principle). Why? Because unrelated components localised close to the place of change are noise.
Or, to view it from another perspective, let's run a small hypothetical experiment. Let's assume that packages are units of release (even if they are not, the general idea is worth considering). Adding another implementation would require recompiling the full package ( (Compressor) <- (ZipCompressor, RarCompressor) ), and (ContentDownloader) too. This is bad, since no code in (ContentDownloader) has changed. And it wouldn't be enough to release just (ZipCompressor, RarCompressor); we would have to release the bigger package ( (Compressor) <- (ZipCompressor, RarCompressor) ) containing everything. Experiment ends.
Again, this is only hypothetical, because in Java packages are not units of release. But I like to find the general principles behind things at different scales, and I think structuring your deployable components (projects) or libraries has a lot in common with package structuring. After all, you can always decide that a submodel in your component has grown big enough to earn its place as a separate library (or embedded component). It's always a possibility, so keeping it in mind while designing the package structure (until it goes against other considerations deemed more important) can make later refactoring and maintenance easier.

I hope these unorganised ramblings have made some sense, and now we can try another approach.

Solution 4

 (ContentDownloader) -> (Compressor) <- (ZipCompressor, RarCompressor)
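
In plain Java packages, this could look something like the following sketch (the package names are my own invention, purely for illustration):

// app/downloader/ContentDownloader.java
package app.downloader;
import app.compressor.Compressor;
public class ContentDownloader { ... }

// app/compressor/Compressor.java
package app.compressor;
public interface Compressor { ... }

// app/compressor/impl/ZipCompressor.java (RarCompressor is analogous)
package app.compressor.impl;
import app.compressor.Compressor;
public class ZipCompressor implements Compressor { ... }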

Seems much better. Adding another implementation only requires recompiling (and releasing) (ZipCompressor, RarCompressor) and nothing else. We've reached the point I was heading towards, but there is actually another interesting variation.

Solution 5

 (ContentDownloader, Compressor) <- (ZipCompressor, RarCompressor)

This structure can be justified by pointing out that ContentDownloader directly depends on Compressor, so for the sake of package cohesion it might be a good idea to put them together. This is what Martin Fowler calls Separated Interface. After all, for the developer of the ContentDownloader the implementation of the Compressor is an irrelevant detail he doesn't even want to know about. But a code change in ContentDownloader would lead to recompiling (ZipCompressor, RarCompressor), so for now I'll drop this solution.

Drawing conclusions

What happened here is that, if we don't count Solution 0, we started with an initial state of

(ContentDownloader) -> (Compressor, ZipCompressor, RarCompressor)

and transformed it to

(ContentDownloader) -> (Compressor) <- (ZipCompressor, RarCompressor).

We had a depender and a dependee package. We extracted the visible part of the dependee package into a new package, and now both the depender package and the remainder of the original dependee package depend on it.
Doesn't it resemble something? If we replace the word "package" with "class" and "visible part" with "interface", what we get is an application of the DIP (Dependency Inversion Principle). An example with classes and interfaces:

UserService ------uses------> UserRepository <----implements----- MongoUserRepository
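
A minimal sketch of the diagram in code (User and UserId are placeholders for whatever your domain uses; the method name is my own):

class UserService {
        private final UserRepository userRepository;  // depends only on the abstraction
        public UserService(UserRepository userRepository) { this.userRepository = userRepository; }
        public User loadUser(UserId userId) { return userRepository.find(userId); }
}
interface UserRepository {
        User find(UserId userId);
}
class MongoUserRepository implements UserRepository {
        // the implementation depends on the abstraction, not the other way around
        @Override public User find(UserId userId) { ... }
}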

And (not so) surprisingly, there is an equivalent existing principle for packages, called the SDP (Stable Dependencies Principle), coined by the same Robert C. Martin who came up with the DIP (and the rest of SOLID). If you haven't heard of them yet, I strongly advise following the links above.

Back to the game: after reinventing the wheel, I'm quite content with the result. We have found a general principle that not only justifies the "rightness" (from at least one point of view) of the package structure we chose in the end, but might even give us some practical advice on how to achieve it. To put it in one sentence: if we have the package structure

(A) -> (B)

then move all the classes in B that are not referenced directly from A into a new package B2, rename B to B1, and the dependencies will look like

(A) -> (B1) <- (B2)

Of course this is a simplistic example with only 2 packages involved, but I think the general idea shines through. And of course there are a lot of other important factors, package cohesion for example, that we haven't really taken into consideration. In the following posts I want to explore those and investigate the "usefulness" of some package coupling metrics.



Friday, 25 October 2013

Represent your IDs as Value Objects

I'm a big fan of creating Value Object classes for ids. It might seem like over-engineering at first, so if you think so, please suspend your disbelief for a short time and let me try to show you three benefits.

Let's take the following example. In your RPG game a Player can choose an Avatar for herself from a bunch. Let's assume first that you haven't followed my advice and simply used primitives (or their autoboxed versions) for ids. It's easy to imagine that the code for this use case includes an application service (or facade if you like) which does everything application services are supposed to do, and delegates to a domain service.

In the Domain, all kinds of checks and fiddling could ensue, with the playerId and the avatarId being passed on and on to various objects and methods.

//application service - orchestrates between domain services and objects,
//authenticates, handles transactions, ...
class PlayerAccessoriesService {
 
        public void assignAvatarToPlayer(Integer playerId, Integer avatarId) {
                authenticationCheck();
                logging();
                transactionStart();
                playerService.assignAvatarToPlayer(playerId, avatarId);
                transactionEnd();
        }      
}
//domain service
class PlayerService {
        
        public void assignAvatarToPlayer(Integer playerId, Integer avatarId) {
                avatarRepository.checkAvatarIdIsValid(avatarId);
                Player player = playerRepository.find(playerId);
                player.setAvatar(avatarId);
                playerRepository.update(player);
        }             
}
 
interface PlayerRepository {
      Player find(Integer playerId);  
}
interface AvatarRepository {
      void checkAvatarIdIsValid(Integer avatarId);  
}
So let's see the various problems with this approach!

Problem 1

Now, since both ids are Integers, imagine how easy it is to make a small mistake and pass playerId instead of avatarId somewhere. I assure you, in fairly complex code with many moving parts, hours of debugging can follow before you detect this mistake. Introducing id classes like


class PlayerId {
        private final Integer value;
        public PlayerId(Integer value) { this.value = value; }
        public Integer getValue() { return this.value; }
}
eliminates this problem. The new method signatures would look like

class PlayerService {
        public void assignAvatarToPlayer(PlayerId playerId, AvatarId avatarId) { ... }             
}
The difference is somewhat similar to the one between static and dynamic typing. The transformation from primitives to Value Objects happens in the anti-corruption layer, wherever you may draw it (between the application and domain layers, or outside the application layer).
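
A side note: to behave as a proper Value Object (usable as a key in a Map, or compared by value in assertions), the id class should also define equality by its wrapped value. A minimal sketch:

class PlayerId {
        private final Integer value;
        public PlayerId(Integer value) { this.value = value; }
        public Integer getValue() { return this.value; }
        @Override public boolean equals(Object other) {
                // two PlayerIds are equal if they wrap equal values
                return other instanceof PlayerId && value.equals(((PlayerId) other).value);
        }
        @Override public int hashCode() { return value.hashCode(); }
}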

Problem 2

At some point it turns out that Integer is too limited to hold the playerIds, and the decision is made to convert to Long. Without a wrapper class, every single method that takes it as an Integer argument must change (see the code above). That can mean a pretty large number of changes. If it's wrapped in a Value Object, the only places we need to change are the ones where the value is actually unwrapped (at the boundaries of the anti-corruption layer, where it is transformed to/from the primitive).
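
To illustrate, a sketch of the migrated class; the change is confined to the Value Object itself, while every method signature that takes a PlayerId stays untouched:

class PlayerId {
        private final Long value;  // was Integer
        public PlayerId(Long value) { this.value = value; }
        public Long getValue() { return this.value; }
}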

Problem 3

Let's assume the client was sloppy and sent us null as the playerId. The code passes it around for a while before something (like the database) tries to actually use it, and then most probably blows up with some strange exception. If that happens deep enough (a long way from the entry point, the application service), it might prove tricky to locate the source of the problem. But if we add some validation to the Value Object's constructor and wrap the primitive value as soon as it steps inside our domain, like this
class PlayerId {
        private final Integer value;
        public PlayerId(Integer value) { 
                Validate.notNull(value, "Player Id must be non-null"); // e.g. org.apache.commons.lang3.Validate: object first, then message
                this.value = value; 
        }
        public Integer getValue() { return this.value; }
}
it will blow up immediately, making the problem very easy to locate. What we've actually done here is apply the Fail-fast principle.

Naming of interfaces and abstract classes

This is gonna be a short one. What naming conventions (if any) do you follow for interfaces, abstract classes and implementations? Sometimes I come across the C# way of prefixing interface names with 'I' and feel an instant dislike. I think that by referring to the type of the reference we are breaching encapsulation, and going against polymorphism as well. Let's see the following example

class UserLifecycleService {
        private IUserRepository userRepository;
        public void createUser(User user) {
                //do some stuff, logging, authentication,...
                userRepository.create(user);
                //do some stuff, send out an event or whatever
        }
}
So the repository is an interface. What if later the need arises to make it an abstract class instead? Or a concrete one? Or the other way around: it was concrete, then we realised that we are supposed to depend on abstractions? What's happening here is that we've exposed an implementation detail to the client of the repository, creating an unwanted coupling. If we change the type of our component, then we need to make a code change in the client (here the UserLifecycleService), too.
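
For contrast, a sketch of the convention I prefer: the interface gets the plain domain name and the qualifiers go into the implementation names (the Mongo and Jpa variants here are just invented examples), so the client stays agnostic about what kind of type it depends on:

class UserLifecycleService {
        private UserRepository userRepository;  // interface, abstract class or concrete class - the client doesn't care
}
interface UserRepository { void create(User user); }
class MongoUserRepository implements UserRepository { ... }
class JpaUserRepository implements UserRepository { ... }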
But since it seems to be a default convention in C# circles (and I've come across some cases in Java projects), I'd be interested in any counter-arguments.

Thursday, 24 October 2013

Testing different implementations of the Domain Repository interface

This post is about a simple but useful way of writing tests against different implementations of a Repository interface. Assume you have a PlayerRepository in your Domain (this is just an interface) and two implementations in the Infrastructure layer, InMemoryPlayerRepository and MongoPlayerRepository. The latter is used in production, the former mainly in tests.

interface PlayerRepository {
        boolean exists(PlayerId playerId); 
}
class InMemoryPlayerRepository implements PlayerRepository {
        @Override public boolean exists(PlayerId playerId) {  ...  }
}
class MongoPlayerRepository implements PlayerRepository {
        @Override public boolean exists(PlayerId playerId) {  ...  }
}
I used to maintain separate test suites for them, since MongoPlayerRepository required, for example, a running Mongo process and a running Spring application context, while the in-memory version did not. But most of the tests were identical, so it was simply duplicated code. The solution I'm using nowadays is to extract the tests into an abstract test template class and do the implementation-specific test configuration in the subclasses.

abstract class AbstractPlayerRepositoryTestTemplate {
        protected PlayerRepository testObj;
        @Test
        public void exists() { 
                // setup
                ...
                //assert
                assertTrue(testObj.exists(playerId));
        }
}
 
class InMemoryPlayerRepositoryTest extends AbstractPlayerRepositoryTestTemplate {
        @Before
        public void before() {
                testObj = new InMemoryPlayerRepository();
        }
}
 
@ContextConfiguration("classpath:applicationContext-test.xml")
@RunWith(SpringJUnit4ClassRunner.class)
class MongoPlayerRepositoryITest extends AbstractPlayerRepositoryTestTemplate {
        @Resource
        private MongoPlayerRepository mongoRepository;
        @Before
        public void before() {
                testObj = mongoRepository;
                mongoRepository.clear(); // clean-up method on the implementation, not part of the domain interface
        }
}
Of course you can use the approach for anything where there are multiple implementations of an interface; I just come across the need most frequently with Repositories.

Share your open source project with Sonatype

Two weeks ago the idea struck me that having a Maven plugin to check DDD layer violations and optionally break the build would prove very useful in our projects. Then, thanks to Kaloz, the idea became more ambitious, so I've decided to

1. put a few more design checks into the plugin
2. go open source.

After some googling I ended up configuring Sonatype Open Source Project Repository Hosting. This guide is a very good starting point; I'd only like to fill in a few gaps I felt were missing from the guidance.

1. Before creating a JIRA ticket, you have to own a domain, like tindalos.org, and your groupId will reflect it. A simple way to get a domain is through Go Daddy.

2. As the Guide points out, you should install GPG on your computer. It went very smoothly on Mac (as opposed to Windows 8).

3. Put the following snippets into your settings.xml
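
Roughly this - a sketch assuming the standard setup from the Guide (the server ids are the ones its parent pom expects, the credentials are your Sonatype JIRA ones, and the gpg profile is what the -Pgpg flag below activates):

<settings>
  <servers>
    <server>
      <id>sonatype-nexus-snapshots</id>
      <username>your-sonatype-jira-username</username>
      <password>your-sonatype-jira-password</password>
    </server>
    <server>
      <id>sonatype-nexus-staging</id>
      <username>your-sonatype-jira-username</username>
      <password>your-sonatype-jira-password</password>
    </server>
  </servers>
  <profiles>
    <profile>
      <id>gpg</id>
      <properties>
        <gpg.passphrase>your-gpg-passphrase</gpg.passphrase>
      </properties>
    </profile>
  </profiles>
</settings>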


4. The pom should look like
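
Something along these lines - a sketch assuming the oss-parent based setup the Guide describes (the artifactId and version are placeholders):

<project>
  <modelVersion>4.0.0</modelVersion>
  <!-- oss-parent wires in the Sonatype distributionManagement and release configuration -->
  <parent>
    <groupId>org.sonatype.oss</groupId>
    <artifactId>oss-parent</artifactId>
    <version>7</version>
  </parent>
  <groupId>org.tindalos</groupId>
  <artifactId>my-maven-plugin</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>maven-plugin</packaging>
  <!-- Central also requires name, description, url, licences, scm and developers sections -->
  ...
</project>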


The groupId should match the one you specified when creating the JIRA ticket, and it should somehow resemble the domain you own.

5. Run the following commands from the command line

mvn release:clean
mvn release:prepare -Pgpg
mvn release:perform -Pgpg

6. Once you have released your artifact with mvn release:perform, log in to Sonatype OSS. Find your staging repository, close it, then release it (the first time around I think you also have to Promote, and comment on your JIRA ticket).

That's it. In a short while you should see your artifacts in the Central Repo.