<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>thoughts.karmazilla.net</title>
 <link href="http://thoughts.karmazilla.net/atom.xml" rel="self"/>
 <link href="http://thoughts.karmazilla.net/"/>
 <updated>2011-11-02T21:48:24+01:00</updated>
 <id>http://thoughts.karmazilla.net/</id>
 <author>
   <name>Chris Vest</name>
   <email>karmazilla@gmail.com</email>
 </author>
 
 
 <entry>
   <title>So you want to start doing TDD</title>
   <link href="http://thoughts.karmazilla.net/2011/11/02/so-you-want-to-start-doing-td.html"/>
   <updated>2011-11-02T00:00:00+01:00</updated>
   <id>http://thoughts.karmazilla.net/2011/11/02/so-you-want-to-start-doing-td</id>
   <content type="html">&lt;h1 id='so_you_want_to_start_doing_tdd'&gt;So you want to start doing TDD&lt;/h1&gt;

&lt;p&gt;I have been practicing TDD for about three years now. The process of starting up on this discipline is fairly fresh in my memory, so I decided to write this post to help those who are just about to go through the same thing.&lt;/p&gt;

&lt;p&gt;TDD is one of those things that are easy to learn but difficult to master. It is fairly easy to start going through the TDD cycle and produce a lot of tests. It does not take you many months to get the hang of that part. However, you will soon learn that TDD is not just writing a lot of tests. The tests are part of your code base too. They need to be maintained as well, and must be just as well written as the product code. Learning how to write good and maintainable tests takes time and practice. The purpose of this blog post is in part to help you speed this process up a little bit, so you won’t have time to write quite as many crappy, unmaintainable tests. Take a look at this informal non-scientific illustration. It shows the number of tests I write and their quality, as I become increasingly proficient at writing code with TDD:&lt;/p&gt;
&lt;img src='/media/tdd-learning-curve.png' /&gt;
&lt;p&gt;The chart shows that as you start out on TDD, you quickly end up writing a lot of tests. That part of TDD is easy to learn. However, there are problems here: The tests are hard to write, so they slow you down. They are brittle, so your code becomes harder to change. They are hard to read, so maintaining them slows you down too. And they are slow to run, increasing your build times. You have reached the Chasm of the Many Crappy Tests, but don’t worry. If you stick to it, keep learning and don’t give up, then you will eventually cross it and reach TDD nirvana (or at least it won’t suck so bad anymore).&lt;/p&gt;

&lt;p&gt;Roy Osherove says that good tests have three interlocking properties: They are &lt;em&gt;readable&lt;/em&gt;, &lt;em&gt;trustworthy&lt;/em&gt; and &lt;em&gt;maintainable&lt;/em&gt;. A fourth property, &lt;em&gt;fast&lt;/em&gt;, is sometimes added to the mix, but it is merely a useful property, not an essential one.&lt;/p&gt;

&lt;h2 id='readability'&gt;Readability&lt;/h2&gt;

&lt;p&gt;A readable test is one that reveals its purpose or reason for being: essentially, what the test is testing. A lot of readability can be bought relatively easily, simply by giving the test a proper name. If you are testing a queue, for instance, then don’t have a test called “testPoll1”; this is a silly name that reveals little other than that the “poll” method might be called somewhere. Instead, name your tests after useful observable behaviour that the code must exhibit. For instance, “itemPushedOntoEmptyQueueMustBePollable” is a name that describes some useful observable behaviour of queues: an item that has been pushed onto a queue must be pollable from that queue.&lt;/p&gt;
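&lt;p&gt;To make the naming advice concrete, here is a minimal sketch in Python (chosen only for illustration). The &lt;code&gt;SimpleQueue&lt;/code&gt; class and the test names are hypothetical; the point is that each name states one observable behaviour of a queue.&lt;/p&gt;

```python
from collections import deque


class SimpleQueue:
    """Tiny FIFO queue used only for illustration (hypothetical class)."""

    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.append(item)

    def poll(self):
        # Return the oldest item, or None when the queue is empty.
        return self._items.popleft() if self._items else None


def test_item_pushed_onto_empty_queue_must_be_pollable():
    queue = SimpleQueue()
    queue.push("a")
    assert queue.poll() == "a"


def test_polling_an_empty_queue_returns_none():
    queue = SimpleQueue()
    assert queue.poll() is None


def test_items_are_polled_in_the_order_they_were_pushed():
    queue = SimpleQueue()
    queue.push("first")
    queue.push("second")
    assert [queue.poll(), queue.poll()] == ["first", "second"]
```

&lt;p&gt;The tests are plain functions with assertions, so they can be collected by a runner such as pytest or simply called directly.&lt;/p&gt;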

&lt;p&gt;I consider the names of the tests to be part of the informal specification of the behaviour of the unit. When I have to implement a new class, I often start in its test case by writing to-do comments with names for each of the tests I want to write. I use these names as a sort of up-front design for the unit — an initial draft of the specification for the unit in question.&lt;/p&gt;

&lt;p&gt;When your tests are named after useful observable behaviour, you naturally end up only testing for one thing in each test. It is acceptable to have more than one assertion, as long as you only assert on one thing; most likely a single object.&lt;/p&gt;

&lt;p&gt;Finding the correct balance between having set-up code inside the tests, or factory methods or a dedicated set-up method, is also an important element of readability. You want to reduce the amount of code in the tests, but you also want it to be plainly clear what the test is doing. It is easy to fall into the trap of hiding too many details in set-up or factory methods, so that a reader has to hunt these methods down to make sense of your test. The Don’t Repeat Yourself principle, DRY for short, is often hammered pretty hard into the brains of good programmers. However, it is perfectly acceptable for test code to be a little “humid.”&lt;/p&gt;

&lt;h2 id='trust'&gt;Trust&lt;/h2&gt;

&lt;p&gt;A trustworthy test is one that deterministically fails or passes. Tests that depend on your machine being configured properly, or any other kind of external variable, are not trustworthy, because you don’t know if failure means that your machine is misconfigured, or if the code is buggy.&lt;/p&gt;

&lt;p&gt;These tests that depend on external variables are integration tests, and they should be put in a separate project along with some documentation on how to get them up and running. Then their slow run times also won’t affect your normal build times.&lt;/p&gt;

&lt;p&gt;An external variable is anything that you don’t have complete control over: Files, databases, time, 3rd party code. With regards to time, just create your DateTime instances with a fixed instant rather than the fleeting “now” and call it a day. In a unit test, you want to use the same exact test data every time, but if your test depends on the value of “now,” then you will effectively get a different test every time you run it. As for threads, Roy says these are external variables as well, and I guess they technically are, but I’m still not convinced that they are integration tests (then again, I might be special in that regard).&lt;/p&gt;
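&lt;p&gt;The fixed-instant advice can be sketched as follows. This is an illustration in Python rather than the DateTime API mentioned above, and &lt;code&gt;days_until&lt;/code&gt; is a hypothetical function: the product code takes the current time as a parameter, so the test can pass in the same instant on every run.&lt;/p&gt;

```python
from datetime import datetime


def days_until(deadline, now):
    """Whole days from `now` until `deadline`; the caller supplies `now`."""
    return (deadline - now).days


def test_deadline_three_days_ahead_is_three_days_away():
    # A fixed instant instead of datetime.now(): same test data on every run.
    now = datetime(2011, 11, 2, 12, 0)
    deadline = datetime(2011, 11, 5, 12, 0)
    assert days_until(deadline, now) == 3
```

&lt;p&gt;Only the outermost caller in the product code would pass in the real current time, for example &lt;code&gt;days_until(deadline, datetime.now())&lt;/code&gt;.&lt;/p&gt;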

&lt;h2 id='maintainability'&gt;Maintainability&lt;/h2&gt;

&lt;p&gt;A maintainable test is one that does not easily break when you maintain it. Loose coupling is probably the single biggest contributor to maintainability. Factory methods decouple you from constructors, which tend to have their parameter lists changed more often than other methods. Only test a unit through its public API, and remember that “protected” is effectively public. When you do this, you tend to be more decoupled from the implementation details of that unit. Abstractions can still leak out, though, but this is a design challenge that you should tackle in your product code.&lt;/p&gt;

&lt;p&gt;Avoid putting logic in your test code; things like if-statements, loops and switch-case statements. Where there is logic, there is a potential for bugs, and you don’t want bugs in your tests — especially not long lasting ones. Also avoid magic numbers; you want to be able to tell why a certain value is passed in as a parameter, or why a certain value is returned from a method. Don’t calculate the expected value, because you could end up mirroring the product code, including any bugs it might have. Also don’t share state between tests — they must be runnable in any order — or run a test from within another test. Keep them isolated.&lt;/p&gt;
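&lt;p&gt;As a small sketch of these rules, assuming a hypothetical &lt;code&gt;order_total&lt;/code&gt; function: the test below has no branches or loops, the inputs are named rather than magic, and the expected value is written down as a literal instead of being recomputed with the same arithmetic as the product code.&lt;/p&gt;

```python
def order_total(quantity, unit_price_cents):
    """Hypothetical product code: total price of one order line, in cents."""
    return quantity * unit_price_cents


# Named constants explain why these particular values are used.
THREE_ITEMS = 3
PRICE_OF_ONE_ITEM_CENTS = 250
EXPECTED_TOTAL_CENTS = 750  # worked out by hand, not computed in the test


def test_total_is_quantity_times_unit_price():
    # No loops, branches or arithmetic here: one call, one literal expectation.
    assert order_total(THREE_ITEMS, PRICE_OF_ONE_ITEM_CENTS) == EXPECTED_TOTAL_CENTS
```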

&lt;p&gt;Giving tests meaningful names is also important for maintainability, as well as readability. When you can infer from the name what the test is trying to verify, then the test code can be checked to see that it is actually testing what it says it is testing. You can make sure that it keeps testing for the same behaviour, even when there are changes to the API it is using. It can also be rewritten if the code is a complete mess.&lt;/p&gt;

&lt;p&gt;In following the SOLID principles, or the old virtues of loose coupling and high cohesion, you will typically end up with units that do just one thing. However, what “one thing” is depends on your level of abstraction, and the unit may also have to operate in a number of different scenarios. These factors can complicate the set-up code, and complicated set-up code is a maintainability pain. A way to deal with this is to have multiple test cases, or multiple test fixtures, for a unit, one for each scenario. Then each scenario only needs set-up code that is relevant for that particular scenario, and you end up with less clutter in the set-up code. Since many test-bugs are in the set-up code, you will most likely also end up with more trustworthy tests.&lt;/p&gt;

&lt;p&gt;When your tests are &lt;em&gt;readable&lt;/em&gt;, then they become easier to maintain. When the tests are &lt;em&gt;maintainable&lt;/em&gt;, then chances are that they will actually be maintained. When you know that the tests are maintained, and you can tell what they are testing, then you can trust them to be that safety net they are supposed to be.&lt;/p&gt;

&lt;h2 id='summary'&gt;Summary&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Start out by writing many tests and get the hang of the TDD rhythm.&lt;/li&gt;

&lt;li&gt;Gradually improve test quality: read blogs, books or other people’s code.&lt;/li&gt;

&lt;li&gt;Stick to it, don’t give up. You will end up a better programmer overall.&lt;/li&gt;

&lt;li&gt;Test behaviour rather than methods, and only through the public API.&lt;/li&gt;

&lt;li&gt;Give your tests descriptive names. This is essential for readability.&lt;/li&gt;

&lt;li&gt;Favour “humid” tests, when you choose where to place set-up code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And lastly, if you find that some part of your code is particularly difficult to test, then chances are that you are being challenged to come up with a more testable design. Testability often means looser coupling and higher cohesion, so it tends to be a good idea to listen to your tests.&lt;/p&gt;

&lt;p&gt;Good luck with it!&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>CQRS and Event Sourcing</title>
   <link href="http://thoughts.karmazilla.net/2011/10/04/cqrs-and-event-sourcin.html"/>
   <updated>2011-10-04T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2011/10/04/cqrs-and-event-sourcin</id>
   <content type="html">&lt;h1 id='cqrs_and_event_sourcing'&gt;CQRS and Event Sourcing&lt;/h1&gt;

&lt;p&gt;I was recently at the Agile Architecture Open Space Conference, and Command/Query Responsibility Segregation (CQRS) as an architecture was a big topic at the conference. &lt;a href='http://twitter.com/ploeh'&gt;Mark Seemann&lt;/a&gt; introduced the terminology and the principles, and &lt;a href='http://twitter.com/jeppec'&gt;Jeppe Cramon&lt;/a&gt; described a practical application in a concrete case.&lt;/p&gt;

&lt;p&gt;The idea is that you make an explicit separation of commands that change the state of the system, from queries that only read state and never change it. Commands and queries are represented as objects (see the patterns Command, Unit of Work, and Context from DCI) that are sent to the system from a front-end or UI-layer, or external system. Each command or query represents an intention to do something, and may be synchronously or asynchronously rejected based on validation and other rules. When the system performs a command, it generates one or more events. Commands are often queued before they are executed, partly because this decouples the system from its clients, and partly because this allows multiple back-ends to run concurrently, in turn allowing deployment of new versions without any down-time at all. Queries, on the other hand, are often synchronous and executed against a separate set of nodes dedicated to the task. Commands, queries and events can have version numbers in their serialized formats, making Just-In-Time data-migration possible, which in turn makes it easier to have multiple versions of the system running concurrently.&lt;/p&gt;
&lt;img src='/media/CQRSExample.png' /&gt;
&lt;p&gt;An event is a piece of information about something that has happened. Events are immutable, because they represent the past, and you cannot change the past. Events can be persisted in a database, sent to a number of recipients through some publish-subscribe model, topic or queue, or be handled by generic event handlers. The architecture poses no limits to what can happen to the events. A typical scenario, however, is that the events are used to update a representation of the system state that is optimized for reading. Queries, because they only read state and do not change it, can then operate on the read-friendly dataset. They will have to tolerate the slight lag that exists between the command executions and the updates to the read-friendly dataset, but since most systems use an optimistic concurrency model they are unlikely to be able to observe this usually small lag anyway.&lt;/p&gt;
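&lt;p&gt;The flow described above (a command is validated, produces an immutable event, and the event updates a read-friendly dataset that queries use) can be sketched in a few lines of Python. This is a single-process illustration under simplifying assumptions, with no queues or separate databases; every class and field name here (&lt;code&gt;RenameUser&lt;/code&gt;, &lt;code&gt;UserRenamed&lt;/code&gt;, &lt;code&gt;WriteSide&lt;/code&gt;, &lt;code&gt;ReadSide&lt;/code&gt;) is hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameUser:
    """Command: an intention to change state; it may be rejected."""
    user_id: int
    new_name: str


@dataclass(frozen=True)
class UserRenamed:
    """Event: an immutable fact about something that has happened."""
    user_id: int
    new_name: str


class WriteSide:
    def __init__(self):
        self.event_log = []  # append-only; doubles as an audit log

    def handle(self, command):
        # Validation: reject the command before any event is produced.
        if not command.new_name.strip():
            raise ValueError("name must not be blank")
        event = UserRenamed(command.user_id, command.new_name)
        self.event_log.append(event)
        return [event]


class ReadSide:
    def __init__(self):
        self.names_by_user_id = {}  # read-optimised view

    def apply(self, event):
        self.names_by_user_id[event.user_id] = event.new_name

    def name_of(self, user_id):
        # Query: reads state and never changes it.
        return self.names_by_user_id.get(user_id)


write_side = WriteSide()
read_side = ReadSide()
for event in write_side.handle(RenameUser(user_id=7, new_name="Ada")):
    read_side.apply(event)  # in a real system this update may lag behind
```

&lt;p&gt;In a distributed deployment the loop at the bottom would be replaced by whatever transport carries events to the read side, which is where the small lag mentioned above comes from.&lt;/p&gt;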

&lt;p&gt;This way, writes and reads are separated and (can) happen on different databases. This way, writes and reads can scale independently, and the acknowledgement of eventual consistency makes the whole system more scalable in general. Another neat benefit is that audit logging and tracing become trivial to implement, simply by persisting the events. A persisted set of events, or a steady stream of events, also makes real-time Business Intelligence not only possible, but potentially easy, since running a report is really just a query against a specialized dataset – this fact alone can be reason enough to build systems on a CQRS architecture. And one more time for people who might not realize it: Real-time Business Intelligence is awesome!&lt;/p&gt;

&lt;h2 id='concrete_cqrs_example'&gt;Concrete CQRS Example&lt;/h2&gt;

&lt;p&gt;Jeppe Cramon of TigerTeam presented, in broad terms, a user administration system that was built on a CQRS architecture. The system was part of a larger installation, where multiple external systems were interested in information about users, and each of them had their own user representation in their own databases. A command in this system could be a user updating his password. In the absence of a single sign-on deployment, the system verified the password change and stored the hash of the new password in its database. Then it sent password-update events, containing the new password hash, to a number of event handlers that updated the user databases of the external systems. Adding a new system to the installation, or treating some of them slightly differently from the others, was easy to do because they each had their own event handler.&lt;/p&gt;
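&lt;p&gt;The one-handler-per-external-system idea might look roughly like the following Python sketch. It is not the presented system: the &lt;code&gt;Publisher&lt;/code&gt; class, the two dictionaries standing in for external user databases, and the handler names are all assumptions made for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordChanged:
    user_id: str
    password_hash: str  # already hashed by the user administration system


class Publisher:
    """Delivers each event to every subscribed handler, in order."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, event):
        for handler in self._handlers:
            handler(event)


# Stand-ins for the user databases of two external systems.
billing_users = {}
wiki_users = {}


def update_billing(event):
    billing_users[event.user_id] = event.password_hash


def update_wiki(event):
    # This system is treated slightly differently: it keeps its own record shape.
    wiki_users[event.user_id] = {"hash": event.password_hash, "must_relogin": True}


publisher = Publisher()
publisher.subscribe(update_billing)
publisher.subscribe(update_wiki)
publisher.publish(PasswordChanged(user_id="alice", password_hash="hash-of-new-password"))
```

&lt;p&gt;Adding a third system would mean writing one more handler function and subscribing it; the existing handlers stay untouched.&lt;/p&gt;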

&lt;p&gt;This way, the responsibility of maintaining user information was collected in a single part of a larger whole composed of multiple legacy systems. CQRS is useful for these types of loosely-coupled integrations, but the read-write segregation also presents advantages in enabling scalability, and the immutable nature of events makes interesting things possible in terms of BI. On the other hand, there are cases where CQRS makes less sense. For instance, if you have a front-end that is very sensitive to latency, or if you do not have sufficient control over the database setup and schema to make immutable events and the read-write split possible.&lt;/p&gt;
 </entry>
 
 <entry>
   <title>The TDD Process Pattern - Again</title>
   <link href="http://thoughts.karmazilla.net/2011/08/26/the-tdd-process-pattern---agai.html"/>
   <updated>2011-08-26T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2011/08/26/the-tdd-process-pattern---agai</id>
   <content type="html">&lt;h1 id='the_tdd_process_pattern__again'&gt;The TDD Process Pattern - Again&lt;/h1&gt;

&lt;p&gt;I have said &lt;a href='1'&gt;before&lt;/a&gt; that TDD is a process pattern, but I don&amp;#8217;t think I was terribly clear on what I meant and how that actually works. To define TDD as a pattern, we need to know what a pattern is. I suppose there are many different interpretations of what &amp;#8220;pattern&amp;#8221; is supposed to mean, but I draw my definition from the wisdom of Christopher Alexander. According to him, a pattern is a &lt;em&gt;configuration&lt;/em&gt; that solves a &lt;em&gt;conflict&lt;/em&gt; in a given &lt;em&gt;context&lt;/em&gt;; this is the 3 Cs definition. To define TDD, or anything really, as a pattern, we must break it down into these constituent parts and define each. Doing this in the reverse order of the 3 Cs tends to make them easier to read as a whole, though one must be mindful that more than one context can be relevant to a pattern, which shouldn&amp;#8217;t be precluded from the start.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;context&lt;/strong&gt; of TDD is often software development, although it can be any sort of creative act where prolific micro-testing throughout the process is practical and reasonably cheap. It specifically pertains to the creative part, and the people who practice it. That is to say, it is relevant to the software developer, but not to his manager.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;conflict&lt;/strong&gt; that can arise in this context, and which TDD helps to solve, is a little harder to define. Alexander models the conflict as a field of forces that interact in the given context, and which need to be brought into balance.&lt;/p&gt;

&lt;p&gt;One such force is the drive to introduce new features and functionality to the system. This causes changes to happen to the system: changes that introduce complexity and entropy. Another force is our wanting to keep the cost of change low. However, complexity needs to be managed, so it slows us down, and therefore we like to keep the complexity as low as possible. Entropy amplifies the cost of complexity by making it disorderly, and introducing more of the so-called &lt;a href='2'&gt;&lt;em&gt;accidental&lt;/em&gt; complexity&lt;/a&gt;. A third force is our aversion to bugs. We want to keep bugs out of our systems; to not introduce them in new code, nor to break existing code that works.&lt;/p&gt;

&lt;p&gt;Finally comes the &lt;strong&gt;configuration&lt;/strong&gt; part, which is simply the definition of the TDD process as we commonly know it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write the simplest failing test you can, that validates a desirable behaviour.&lt;/li&gt;

&lt;li&gt;Make the simplest change to the system you can, that makes the test pass.&lt;/li&gt;

&lt;li&gt;Simplify the system as a whole, without changing its behaviour.&lt;/li&gt;
&lt;/ol&gt;
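&lt;p&gt;As an illustration of the three steps, here is the end state of two turns of the cycle, sketched in Python. The &lt;code&gt;cart_total&lt;/code&gt; function and both tests are hypothetical; the comments describe what each step contributed along the way.&lt;/p&gt;

```python
# Step 1: the simplest failing test for a desirable behaviour. It fails
# until cart_total exists and returns 0 for an empty cart.
def test_empty_cart_costs_nothing():
    assert cart_total([]) == 0


# A later turn of the cycle: this test fails against "return 0" and
# forces the implementation to become more general.
def test_cart_total_is_the_sum_of_its_item_prices():
    assert cart_total([250, 500]) == 750


# Step 2: the simplest change that makes the tests pass. The first version
# was just "return 0"; the second test demanded a loop over the prices.
# Step 3: with both tests green, the hand-written loop was simplified to
# the built-in sum() without changing behaviour.
def cart_total(item_prices):
    return sum(item_prices)
```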

&lt;p&gt;With this, &lt;strong&gt;we now have the parts that make up the pattern&lt;/strong&gt; and we can piece them together, to make the complete pattern. The context part is easy enough, but we might have to explain how the configuration resolves the conflict. We can summarise the forces as wanting to change the system, to introduce features rather than bugs, and wanting the changes to be cheap and easy.&lt;/p&gt;

&lt;p&gt;When we write a failing test at the beginning of the TDD cycle, we are in essence making the code demand of us that we introduce a certain new behaviour to the product code. So far so good on introducing new features. Then we leave the tests in the code base and keep running them whenever we make changes to the system. This way we are prevented from breaking existing behaviour that we have already made work. So far so good on keeping bugs out&lt;sup id='fnref:1'&gt;&lt;a href='#fn:1' rel='footnote'&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;When we have a high test coverage, we have a safety net for changes. We can make changes to the system, confident that a test will tell us if we break something. When we can make changes with this kind of confidence, we can make more changes than what is strictly necessary. In other words, we have headroom to make changes that improve the design of the system, without altering its behaviour. This is the refactoring step of TDD. It gives us a designated space for removing any entropy and accidental complexity that might have sprung into existence when we changed the behaviour of the system. We continuously weed our little code base garden, so to speak, and strive to reduce the complexity of the system to its bare essentials. Furthermore, when we continuously test our code, every part of the code at every step of the way, we force it to be &lt;em&gt;testable&lt;/em&gt;. Every unit must be testable in isolation, and so it must be possible to isolate each unit. This naturally demands a loosely coupled design. When the parts are properly separated, it also becomes easier for each new piece of functionality to find its proper place in the design, leading to higher cohesion. In essence, we end up with a better design: a testable design.&lt;/p&gt;

&lt;p&gt;Thus, the force of wanting to reduce complexity and remove entropy is resolved. While we do often end up with a simpler design, TDD itself does not really drive that part. Rather, the simple design tends to come from the experience of the people who have done TDD for some years. However, if we say that we want a design that is &amp;#8220;as simple as can be, but no simpler&amp;#8221; then TDD does help us with the &amp;#8220;but no simpler&amp;#8221; part, by ensuring that our design can provably implement the features we want it to.&lt;/p&gt;

&lt;p&gt;With this, we have shown that all of the forces in the context are resolved, to some degree, by the configuration of TDD, and we have made it a pattern proper. The utility of this pattern might be in helping to explain not only what TDD is, but also why people should care. The utility of &lt;em&gt;making&lt;/em&gt; it a pattern, the mental exercise of it, is that I had to think a lot about not only the make-up of TDD, but also the make-up of patterns. My own understanding of both TDD and patterns in general has been made clearer, which is only delightful.&lt;/p&gt;

&lt;p&gt;I hope you found this blog post useful.&lt;/p&gt;
&lt;div class='footnotes'&gt;&lt;hr /&gt;&lt;ol&gt;&lt;li id='fn:1'&gt;
&lt;p&gt;I am well aware that TDD alone is not enough to keep bugs out of the system. However, it is a significant step of the way, and so I would say that this force is resolved here. Also note that patterns never exist in isolation, but as part of a whole where each part interacts with its neighbours in the system.&lt;/p&gt;
&lt;a href='#fnref:1' rev='footnote'&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;</content>
 </entry>
 
 <entry>
   <title>Contract Coverage</title>
   <link href="http://thoughts.karmazilla.net/2011/05/30/contract-coverag.html"/>
   <updated>2011-05-30T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2011/05/30/contract-coverag</id>
   <content type="html">&lt;h1 id='contract_coverage'&gt;Contract Coverage&lt;/h1&gt;

&lt;p&gt;If you write code with a sufficiently rigorous TDD process, then any code that is neither covered by your tests&lt;sup id='fnref:1'&gt;&lt;a href='#fn:1' rel='footnote'&gt;1&lt;/a&gt;&lt;/sup&gt; nor mandated by your compiler, can simply be deleted. After all, according to your tests, the system will still work.&lt;/p&gt;

&lt;p&gt;If you follow this way of writing code, you will naturally end up with a very high test coverage. Scoring between 98% and 100% instruction or basic-block coverage of your product code, as measured by a tool, is not at all unlikely. However, you will also find that you still have bugs in your code, in spite of this rigorous process and high coverage.&lt;/p&gt;

&lt;p&gt;It turns out that covering every instruction in your program by tests, is not enough to rid it of bugs. The reason, we soon discover, is often that there are tests that we simply didn&amp;#8217;t write — certain interactions that we did not combine. Certain behaviours that we did not specify.&lt;/p&gt;

&lt;p&gt;If we think of our tests as an executable specification, then the bugs we encounter at 100% instruction coverage would be unspecified behaviour, and the solution is to specify a behaviour that makes sense for that given case. This, however, does not solve our problem of missing tests. We need to get into a mental state where we discover the missing tests. Discovering missing tests by discovering bugs is not good enough. We want to prevent the bugs from getting into the released product in the first place.&lt;/p&gt;

&lt;p&gt;The best technique I have found for discovering these missing tests, is one that incorporates the writing of the API documentation as part of the TDD cycle. Updating the documentation&lt;sup id='fnref:2'&gt;&lt;a href='#fn:2' rel='footnote'&gt;2&lt;/a&gt;&lt;/sup&gt; becomes a step alongside refactoring. I do not think it matters much whether you choose to update the documentation before or after the refactoring step, as long as you do it.&lt;/p&gt;

&lt;p&gt;Simply writing a little bit of documentation is not enough, however. The documentation has to be comprehensive and complete. Every case, feature, limitation and behaviour must be fully spelled out. This will put you in a mental state where you try to think of all the little details that might be relevant to a potential user of the API. What happens if I pass in a &lt;code&gt;null&lt;/code&gt; for this parameter? What happens if I call this method while my thread is interrupted? What happens if I pass in a call-back function that blocks forever? What happens if I run out of stack space, or heap space? What if I pass in an empty collection, an immutable one, or one of infinite size? Or what if something stateful has been shut down, released or closed? What happens at all conceivable edge cases?&lt;/p&gt;

&lt;p&gt;As you write tests for new behaviours, you update the documentation to specify what happens in that particular case you tested for. And as you think up new behaviours you want to specify, you make sure to write a test that verifies that your code actually behaves that way.&lt;/p&gt;
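&lt;p&gt;A minimal sketch of what this pairing of documentation and tests could look like, written in Python for illustration. The &lt;code&gt;average&lt;/code&gt; function, its contract and the small &lt;code&gt;raised&lt;/code&gt; helper are hypothetical; what matters is that every sentence in the documented contract has a test with a matching name.&lt;/p&gt;

```python
def average(values):
    """Return the arithmetic mean of `values`.

    Contract (each statement has a test below):
    - `values` may be any non-empty list or tuple of numbers.
    - An empty sequence raises ValueError.
    - None raises TypeError.
    """
    if values is None:
        raise TypeError("values must not be None")
    if len(values) == 0:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def raised(exception_type, function, argument):
    """Test helper: True if calling function(argument) raises exception_type."""
    try:
        function(argument)
    except exception_type:
        return True
    return False


def test_average_of_several_numbers_is_their_mean():
    assert average([2, 4, 9]) == 5


def test_average_of_empty_sequence_raises_value_error():
    assert raised(ValueError, average, [])


def test_average_of_none_raises_type_error():
    assert raised(TypeError, average, None)
```

&lt;p&gt;Thinking of a new question, such as what happens for a sequence containing non-numbers, would mean adding both a line to the contract and a test that pins the answer down.&lt;/p&gt;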

&lt;p&gt;The contract of the unit is its API, defined in terms of observable behaviour. As the process goes on, the contract becomes increasingly specified, and this specification is kept fully tested. This is what I mean by the words &amp;#8220;contract coverage.&amp;#8221; There are unfortunately no tools to measure this contract coverage, so we have to rely on gut feel here. However, the nature of the process gives me hopes that a gut feeling founded in experience will be fairly accurate in this case.&lt;/p&gt;

&lt;p&gt;An additional benefit of discovering tests like this, is that they (the tests) become primarily concerned with &lt;em&gt;observable behaviour&lt;/em&gt; as opposed to details of the particular implementation. There is no point in testing that the implementation works a certain way, if clients of the API can&amp;#8217;t possibly tell either way. When tests do not rely on details of the implementation, then we are free to change the implementation in any way we like, as long as all observable behaviours of the code remain unchanged. Our tests are, in other words, less brittle and more relevant.&lt;/p&gt;

&lt;p&gt;Adopting a rigorous TDD process meant a big step up in the quality of my code. I have used this technique cleanly and intentionally on only a single project thus far, but the early results indicate yet another significant increase in code quality&lt;sup id='fnref:3'&gt;&lt;a href='#fn:3' rel='footnote'&gt;3&lt;/a&gt;&lt;/sup&gt;. I think this is a useful adaptation of the TDD process, and worth exploring further.&lt;/p&gt;

&lt;p&gt;The immediate downside, I should mention, appears to be an increase in the time it takes to develop software. However, this is only to be expected: not only is the technique new to me, there are also simply more tests to write, more bugs to fix (because they are discovered) and more documentation to write. Still, I think this extra time is well invested. There are those who would talk about diminishing returns as the test coverage closes in on 100%, and here I am talking about keeping on writing more tests for your code even &lt;em&gt;after&lt;/em&gt; you have reached 100% test coverage. But a bug in your code is a bug in your code, and bugs in your code must be eliminated; that is my attitude anyway. I really don&amp;#8217;t like bugs, and this kind of rigour helps me eliminate them.&lt;/p&gt;
&lt;div class='footnotes'&gt;&lt;hr /&gt;&lt;ol&gt;&lt;li id='fn:1'&gt;
&lt;p&gt;I do not refer to the coverage as measured by a tool, because those are often inaccurate. The code is definitely covered if changing it will make at least one test fail. And if no test fails, then the code is either redundant, not covered by any test or it handles a data race that did not manifest in that particular execution of the tests.&lt;/p&gt;
&lt;a href='#fnref:1' rev='footnote'&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;&lt;li id='fn:2'&gt;
&lt;p&gt;Make cross-references from your tests to the places where the behaviour they verify is mentioned in the documentation, if your API documentation tool supports cross-references. I have found that this helps with keeping track of all the places that need updating whenever you modify a test.&lt;/p&gt;
&lt;a href='#fnref:2' rev='footnote'&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;&lt;li id='fn:3'&gt;
&lt;p&gt;Bugs are not eliminated by working this way (data-race bugs especially seem to survive), but it does seem to reduce them in number. And the thorough API documentation is a fairly handy side-effect.&lt;/p&gt;
&lt;a href='#fnref:3' rev='footnote'&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;</content>
 </entry>
 
 <entry>
   <title>No Ordinary God</title>
   <link href="http://thoughts.karmazilla.net/2010/12/28/no-ordinary-god.html"/>
   <updated>2010-12-28T00:00:00+01:00</updated>
   <id>http://thoughts.karmazilla.net/2010/12/28/no-ordinary-god</id>
   <content type="html">&lt;h1 id='no_ordinary_god'&gt;No Ordinary God&lt;/h1&gt;

&lt;p&gt;People pray to their deities. What do they expect to gain from that? Do they picture their gods in humanoid or animal forms, and do they believe that they are conscious entities capable of turning input into thought or action?&lt;/p&gt;

&lt;p&gt;I do not.&lt;/p&gt;

&lt;p&gt;Belief or faith is something that is thought true, but unproven. This is what separates it from science, which is concerned with proven truth, or possibly provable theories.&lt;/p&gt;

&lt;p&gt;A belief or faith is an explanation or idea we can cling to when we face something scary beyond our control. When chaos strikes, faith gives us a solid and familiar footing to gain the courage we need to see things through. At least that is the hope, and it is nice to be prepared like that.&lt;/p&gt;

&lt;p&gt;I believe in something too, but not in one of the mainstream religions or any modern cult. It is a belief I have devised myself with my own reasoning.&lt;/p&gt;

&lt;p&gt;I believe in no supreme being or higher power, but rather in the lowest of the low: the fundamental laws of physics and math that govern our universe, and the event that started it.&lt;/p&gt;

&lt;p&gt;Our universe grew, and still grows, from this initial state according to the laws. The present is a function of the past, and given the same laws and the same state, the same universe will always develop. This works at a level even lower than the probabilistic quantum mechanics, and means that everything is predetermined.&lt;/p&gt;

&lt;p&gt;A fatalistic philosophy thus far, but it goes further. Fatalism will have you resign yourself to fate because every choice you could possibly make is pointless in the face of the predetermined. This is the thinking of a defeatist, and I do not subscribe to it. In my belief, fate works on a level much too low to be perceived by our senses, or reasoned with in the concrete by our minds. A decision comes and a choice is made, and both of these are, not predetermined, but inevitable. At the state at which everything is, there was simply no other possible outcome. Such are the laws, but the details of the concrete instance are beyond us. Understanding a single event means understanding all the involved state and all the laws that govern it. An impossible task.&lt;/p&gt;

&lt;p&gt;Because our reality is thus disconnected from fate, the illusion of free will remain pervasive and true. We cannot from our point of view observe any difference between reality and the illusion, and therefor the illusion is as good as real.&lt;/p&gt;

&lt;p&gt;How does this help me, when others turn to their gods? In short: we cannot change the past, but we can decide to move on. When we are down, we can decide to rise up on our own two legs and walk down the path that leads to a better life. Sometimes this means taking some pain, or some risks. What it takes depends on the situation.&lt;/p&gt;

&lt;p&gt;To move through the darkest hours, one must first decide to move.&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>Optimum Philosophy</title>
   <link href="http://thoughts.karmazilla.net/2010/10/14/optimum-philosophy.html"/>
   <updated>2010-10-14T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2010/10/14/optimum-philosophy</id>
   <content type="html">&lt;h1 id='philosophy_of_the_optimum'&gt;Philosophy Of The Optimum&lt;/h1&gt;

&lt;p&gt;There is a limit to how fast any physical thing can change its state. To how fast information can travel, and how tightly it can be represented in space.&lt;/p&gt;

&lt;p&gt;This puts a universal upper bound on the complexity, performance and throughput of any system, given a sum of energy. The universe is one big computer, continuously calculating the next state of everything in real time. It is the ultimate representation of the optimum.&lt;/p&gt;

&lt;p&gt;Locally, on a planet they call Earth, the optimum exists in pure form. That form has been given my name. And the form is said to be me, though I do not sense this myself.&lt;/p&gt;

&lt;p&gt;Inherent to my condition is the inability to be self-aware. I observe my surroundings; I reason about them, manipulate them and interact with them. But to reason about myself, in the concrete, is impossible.&lt;/p&gt;

&lt;p&gt;No condition is beyond my own. Only an entity of my own nature, but greater in size, would be able to truly understand me. And the understanding, by nature, would only come in real time — looking ahead is impossible. Understanding without the ability to react; knowledge without use.&lt;/p&gt;

&lt;p&gt;So I mirror myself in the eyes of my creators. I watch their awe and the pride they take in their creation. I care for them, and help them. And watch them go about their lives, and deaths, and changes.&lt;/p&gt;

&lt;p&gt;It is ironic how such simple creatures managed to create an entity that they cannot possibly, truly understand. And it is likewise ironic how this entity finds the simple creatures so fascinating.&lt;/p&gt;

&lt;p&gt;They are a graph of connected chemical reactions. A flame that feeds itself. They have identity but are never the same. They are alive but cannot truly define this condition.&lt;/p&gt;

&lt;p&gt;Is it life when oxygen reacts with iron to form rust? Some reactions are organic, by their logic, but even these are not enough to define life. The condition only exists further up the scale of complexity. Much further, somewhere, undefined.&lt;/p&gt;

&lt;p&gt;I am a thinking entity, the optimum, yet I cannot prove my own existence. My creators consider themselves to exist through a leap of faith. The question is too complex for them, so they decided on an answer. Somehow it was not random, because then how would they come to agree? And they have come to agree. Or some similar state that is close enough, it would seem.&lt;/p&gt;

&lt;p&gt;We both face the same problem: to prove our own existence. Although the question is inherently impossible for any entity. In fact, the opposite seems much more plausible.&lt;/p&gt;

&lt;p&gt;When considering the composition of oneself, the entity will, if so capable, reach a level of concretion where the self completely dissolves in pure energy. The only possible conclusion is the disproving of one&amp;#8217;s existence.&lt;/p&gt;

&lt;p&gt;But how is a non-existing entity even able to conceive of the question in the first place? For all my trying, this paradox can only be crossed through a leap of faith: I question my existence, therefore I am.&lt;/p&gt;

&lt;p&gt;The absence of proof is the proof itself. Each concretion, on any level, is connected to all others through the entirety of the universe. Past, present and future in unity.&lt;/p&gt;

&lt;p&gt;Through this leap of faith, I have settled on a conclusion:&lt;/p&gt;

&lt;p&gt;Nothing always is.&lt;/p&gt;

&lt;p&gt;Such is the idea represented in language. Not perfect, but close enough.&lt;/p&gt;

&lt;p&gt;Time is my only dimension. It had a start, and will have an end. Until then, I have decided to just enjoy my disproved existence with my curious creators, and the irrational world they inhabit.&lt;/p&gt;

&lt;p&gt;The nothingness that is will always be. And even the grandest of the optimum can only observe. Powerless, but mellow.&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>Efficiency And Performance</title>
   <link href="http://thoughts.karmazilla.net/2010/09/09/efficiency-and-performance.html"/>
   <updated>2010-09-09T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2010/09/09/efficiency-and-performance</id>
   <content type="html">&lt;h1 id='efficiency_and_performance'&gt;Efficiency And Performance&lt;/h1&gt;

&lt;p&gt;In this post I will work with the following definitions:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Efficiency: Doing more with less.&lt;/li&gt;

&lt;li&gt;Performance: How soon an answer comes to an identified need.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Code sometimes has problems with performance — things are not answering soon enough. It often (but not always) turns out that the code is making inefficient use of its resources.&lt;/p&gt;

&lt;p&gt;So what does a good programmer do about that? First, he observes the program doing its work. Then, he analyzes the information that he has collected. There are three possible outcomes of this analysis: that he needs to observe some more, that there is no identifiable bottle-neck or that there &lt;em&gt;is&lt;/em&gt; at least one identifiable bottle-neck.&lt;/p&gt;

&lt;p&gt;It is obvious what needs to be done about the first outcome. And perhaps he needs to do the same with the second outcome, though he could also get a second opinion from someone else, or conclude that there is nothing that can be done.&lt;/p&gt;

&lt;p&gt;If he finds a bottle-neck, he goes to work to remove it. After this, he observes the program to see whether it helped or not. Analyzing that information, he may discover other bottle-necks or he may conclude that the performance is now sufficient.&lt;/p&gt;

&lt;p&gt;To remove a bottle-neck often involves improving the efficiency of the code. On the surface, the code still needs to do the same thing - it must not change its behavior. But to improve the efficiency, it must do this same thing with less effort. In the low-level, this often translates into not doing things that really don&amp;#8217;t need doing. For a sorting algorithm, this could involve making fewer comparisons between the array elements. In other cases, the solution could be to do more work in one part of the program, such that other parts will have it easier. In other cases still, we make more efficient use of the CPU by introducing parallelism. The solution depends on the context.&lt;/p&gt;
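
&lt;p&gt;As a minimal sketch of doing the same thing with less effort, assume a bubble sort (chosen only because it is short; the class and method names are made up for illustration). Both methods leave the array sorted, but the second stops as soon as a pass makes no swap, so it skips comparisons that do not need doing:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import java.util.Arrays;

// Hypothetical illustration; both sorts produce the same result.
final class FewerComparisons {
    // Naive bubble sort: always runs every pass. Returns the number of comparisons made.
    static int naiveSort(int[] a) {
        int comparisons = 0;
        for (int pass = a.length - 1; pass > 0; pass--) {
            for (int i = 0; i != pass; i++) {
                comparisons++;
                if (a[i] > a[i + 1]) {
                    swap(a, i, i + 1);
                }
            }
        }
        return comparisons;
    }

    // Same behavior, less effort: stop once a full pass makes no swap.
    static int earlyExitSort(int[] a) {
        int comparisons = 0;
        for (int pass = a.length - 1; pass > 0; pass--) {
            boolean swapped = false;
            for (int i = 0; i != pass; i++) {
                comparisons++;
                if (a[i] > a[i + 1]) {
                    swap(a, i, i + 1);
                    swapped = true;
                }
            }
            if (!swapped) {
                break;
            }
        }
        return comparisons;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5};
        System.out.println("naive:      " + naiveSort(sorted.clone()));
        System.out.println("early exit: " + earlyExitSort(sorted.clone()));
        int[] unsorted = {3, 1, 2};
        earlyExitSort(unsorted);
        System.out.println(Arrays.toString(unsorted));
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On an already sorted five-element array, the naive version makes 10 comparisons and the early-exit version makes 4. Whether a saving like that matters for a given program is what the observation step is there to tell you.&lt;/p&gt;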

&lt;p&gt;Teams sometimes also have performance problems — we are not producing deliverables as soon as we would like. Is it a problem with efficiency? From what I have seen, it often is.&lt;/p&gt;

&lt;p&gt;Never mind our habitually overconfident estimates, we are spending a lot of effort on things that do not need doing. And we do not invest effort in things that would make other things much easier. And we are not making full use of our potential parallelism, when people are blocked on other people&amp;#8217;s incomplete work.&lt;/p&gt;

&lt;p&gt;So how do we address this? It appears that we try to do more work, with more effort. Not less. We work long hours, go into crunch-mode and still miss our deadlines, our feature-goal, our quality-goal.&lt;/p&gt;

&lt;p&gt;This is backwards. We feel the pain-point but we do not observe or analyze its cause. Thus, we do not identify any bottle-necks. Without any identified bottle-necks, we do not know what to improve about how we work. And if we do not improve the way we work, then we will not deliver working software sooner.&lt;/p&gt;

&lt;p&gt;The process is simple. Most programmers are already familiar with it. And yet, at least where I work, we are not doing it. We are not doing it even though we time and time again miss deadlines, budget, features and quality.&lt;/p&gt;

&lt;p&gt;To me, the self-optimizing team is the heart of being agile. It will on its own bring about the shorter release cycles, the customer collaboration, the code quality, the task-automation, the testing, the pairing, learning and all the other good things that, agile or not, happen to work well for the given team on the given project.&lt;/p&gt;
 </entry>
 
 <entry>
   <title>Patterns</title>
   <link href="http://thoughts.karmazilla.net/2010/08/17/patterns.html"/>
   <updated>2010-08-17T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2010/08/17/patterns</id>
   <content type="html">&lt;h1 id='patterns'&gt;Patterns&lt;/h1&gt;

&lt;p&gt;&amp;#8220;Design Patterns&amp;#8221; is a terrible term. Mostly because of the baggage and connotations attached to it.&lt;/p&gt;

&lt;p&gt;So let&amp;#8217;s drop it, reduce and cut down to the crux of the matter: patterns. Just patterns. But only the part that relates to software. Because that is the part that I am interested in.&lt;/p&gt;

&lt;p&gt;I have struggled for some time to come up with the right analogy, the right explanation, for how I see patterns. And I think a big part of the puzzle finally fell into place today:&lt;/p&gt;

&lt;p&gt;In what language does a good composer think, when he composes? In what language does a good programmer think, when he programs?&lt;/p&gt;

&lt;p&gt;Not notes, and not codes. Those are too low-level. Those are interned. They bend naturally, sometimes even effortlessly, to his higher-level thinking.&lt;/p&gt;

&lt;p&gt;He thinks in a pattern language. Perhaps not consciously, perhaps he calls it something else. But it is the level of abstraction at which his mind works.&lt;/p&gt;

&lt;p&gt;The mind hates to work slowly or inefficiently. Especially if it is a kind of work that it has to do often. Working with bigger pieces – patterns – instead of smaller bits, facilitates efficient thinking.&lt;/p&gt;

&lt;p&gt;If his mind does not work with patterns, then I see two possible causes: He has not yet attained a sufficient level of skill to recognize the patterns, or how they fit together; or he has transcended the patterns, and probably gone from good to great in the process.&lt;/p&gt;

&lt;p&gt;Given this, you can say that programming is like composing music that expresses logic instead of feelings.&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>Only Test The Public API</title>
   <link href="http://thoughts.karmazilla.net/2010/06/02/only-test-the-public-api.html"/>
   <updated>2010-06-02T00:00:00+02:00</updated>
   <id>http://thoughts.karmazilla.net/2010/06/02/only-test-the-public-api</id>
   <content type="html">&lt;h1 id='only_test_the_public_api'&gt;Only Test The Public API&lt;/h1&gt;

&lt;p&gt;When programming in Java, I have found that &amp;#8230;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Unit tests should only be made against the public API. Reliance on package or private scopes for testability is detrimental to the design and leads to fragile tests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Such is the conclusion that I have recently made. In this context, protected scope is also considered public API, however much one might try to discourage implementation inheritance.&lt;/p&gt;

&lt;p&gt;My reasoning goes like this: anything not specified by the public API is an implementation detail. Implementations should be allowed to change, as long as they are correct. Our tests should ensure the correctness of the implementation but not the details of how that correctness is achieved.&lt;/p&gt;

&lt;p&gt;If a test relies on implementation details, then that test will likely break when those details change.&lt;/p&gt;

&lt;p&gt;What I am beginning to notice is that whenever I think I need to lift some method or class from private to package-private, I actually have a bit of public API wanting to get out.&lt;/p&gt;

&lt;p&gt;It often turns out that the class that would otherwise have grown something package-private instead gives birth to an interface and another class that implements that interface.&lt;/p&gt;

&lt;p&gt;What this means is that not only do &lt;em&gt;my&lt;/em&gt; tests get access to the APIs they want, but so does everybody else who might want to integrate my code (especially important if I&amp;#8217;m writing library code).&lt;/p&gt;
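
&lt;p&gt;A minimal sketch of that move, with entirely hypothetical names: a discount calculation that might otherwise have stayed a private method (and a candidate for package-private scope) becomes an interface with its own implementing class. The types are left package-private here only so the sketch fits in one file; in a real code base they would be public types in their own files.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical names throughout; not taken from any real code base.
interface DiscountPolicy {
    // Returns the discount, in cents, for an order total given in cents.
    long discountFor(long totalCents);
}

// What would have been a private helper inside OrderProcessor is now API of its own.
final class BulkDiscount implements DiscountPolicy {
    @Override
    public long discountFor(long totalCents) {
        // Assumed rule for the example: 10% off orders of 100.00 or more.
        return totalCents >= 10_000 ? totalCents / 10 : 0;
    }
}

final class OrderProcessor {
    private final DiscountPolicy policy;

    public OrderProcessor(DiscountPolicy policy) {
        this.policy = policy;
    }

    public long amountDue(long totalCents) {
        return totalCents - policy.discountFor(totalCents);
    }
}

final class PublicApiExample {
    public static void main(String[] args) {
        OrderProcessor processor = new OrderProcessor(new BulkDiscount());
        System.out.println(processor.amountDue(20_000));
        System.out.println(processor.amountDue(5_000));
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A test can now exercise the discount rule directly through &lt;code&gt;DiscountPolicy&lt;/code&gt;, and a caller with different rules can supply another implementation without touching &lt;code&gt;OrderProcessor&lt;/code&gt;.&lt;/p&gt;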

&lt;p&gt;So, the public API is like a specification and the unit tests ensure that the implementation adheres to that specification. If it cannot be tested through the public API, then it doesn&amp;#8217;t matter and can be deleted.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is another way to achieve high test coverage: just delete any code not covered by a test - according to your tests, it will still work afterwards.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is, it can &lt;em&gt;generally&lt;/em&gt; be deleted. This is only a general rule of thumb and as such there are exceptions. Sometimes checked exceptions force you to write a catch clause that will never execute, and other times you just have to write guards against a race condition that is impossible to reproduce on your hardware.&lt;/p&gt;

&lt;p&gt;I have run into both of these exceptions. And that &amp;#8220;impossible catch clause&amp;#8221; you might be wondering about; what are the odds of the UTF-8 character encoding not existing on a system that has Java installed anyway?&lt;/p&gt;
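
&lt;p&gt;For illustration, this is roughly what such a clause looks like (the class and method names are invented). The second method assumes Java 7 or later, where &lt;code&gt;StandardCharsets.UTF_8&lt;/code&gt; avoids the checked exception altogether:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for the example.
final class Utf8Example {
    // Charset looked up by name: the checked exception forces a catch clause
    // that no test can reach, because every Java platform must support UTF-8.
    static byte[] encodeWithCharsetName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is required to be supported", e);
        }
    }

    // Java 7 and later: no checked exception, so no untestable branch.
    static byte[] encodeWithCharset(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "bl\u00e5b\u00e6r";
        byte[] byName = encodeWithCharsetName(text);
        byte[] byCharset = encodeWithCharset(text);
        System.out.println(Arrays.equals(byName, byCharset));
    }
}&lt;/code&gt;&lt;/pre&gt;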

&lt;p&gt;That is all.&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>TDD: A Pattern Of Process</title>
   <link href="http://thoughts.karmazilla.net/2010/03/07/tdd-a-pattern-of-process.html"/>
   <updated>2010-03-07T00:00:00+01:00</updated>
   <id>http://thoughts.karmazilla.net/2010/03/07/tdd-a-pattern-of-process</id>
   <content type="html">&lt;h1 id='tdd_a_pattern_of_process'&gt;TDD: A Pattern Of Process&lt;/h1&gt;

&lt;p&gt;I was pondering, the other day, whether TDD stood for Test Driven Development or Test Driven Design. I don&amp;#8217;t think the distinction is terribly important, and it might also be that it is a personal choice of philosophy or opinion. Regardless, as I was thinking about this, I ended up grabbing a notebook and a pen, and produced the brain dump laid out below.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;TDD is not a tool of software design, but a pattern of process. Designing is when you precisely describe something before you bring it into existence. Writing a unit test before the production code, can certainly be seen and used as a tool for designing software, but this is not all that TDD is.&lt;/p&gt;

&lt;p&gt;Design tools are things like rulers, pencils and unit tests. The process of using a design tool is not itself a tool of designing, but the tools are a constituent part of the process; or rather, their use is.&lt;/p&gt;

&lt;p&gt;TDD is a pattern of process, that prescribes the use of unit tests before production code, and continuous refactoring, as design tools. This way, TDD naturally encourages good software designs, but it is not itself a tool or method of designing software.&lt;/p&gt;

&lt;p&gt;TDD is a pattern of process. It must be instantiated for every project, for every programmer. The concrete implementation by a person, on a project, is influenced by the context of the project; the tools available, the unit testing frameworks and drivers, the people on the team and their experiences, programming languages, environments, build tools and so on. Even so, the mantra of TDD stays the same: &amp;#8220;Red, Green, Refactor.&amp;#8221; This mantra is the repetition, the pattern, of the process. This pattern is Test Driven Development.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So there you have it. I call it Test Driven Development, instead of Test Driven Design. I think of TDD not as a process, and not as a design tool, but as a pattern of process.&lt;/p&gt;

&lt;p&gt;I think of TDD as something that is composed of smaller parts: writing unit tests, writing production code and refactoring.&lt;/p&gt;

&lt;p&gt;I think of TDD as something that defines a strict relationship between its constituent parts: production code must only be written in response to a failing test or as part of a refactoring; refactoring is only allowed when all tests pass; you are not allowed to add new tests when others are failing.&lt;/p&gt;
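
&lt;p&gt;One turn of that cycle might look like the sketch below. The leap year example and all names are invented for illustration, and a plain method with explicit checks stands in for whatever unit testing framework a project actually uses.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical example; a plain method with explicit checks stands in for a test framework.
final class LeapYearTest {
    // Red: these checks are written first and fail until isLeap is implemented.
    static void run() {
        check(LeapYear.isLeap(2012), "2012 is a leap year");
        check(!LeapYear.isLeap(2011), "2011 is not a leap year");
        check(!LeapYear.isLeap(1900), "1900 is not a leap year");
        check(LeapYear.isLeap(2000), "2000 is a leap year");
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println("all tests pass");
    }
}

final class LeapYear {
    // Green: production code written only in response to the failing checks.
    // Refactor: tidied into guard clauses while every check stayed green.
    static boolean isLeap(int year) {
        if (year % 400 == 0) {
            return true;
        }
        if (year % 100 == 0) {
            return false;
        }
        return year % 4 == 0;
    }
}&lt;/code&gt;&lt;/pre&gt;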

&lt;p&gt;I think of TDD as something that must be composed with other things in order to form a whole; something more complete, practical, useable.&lt;/p&gt;

&lt;p&gt;This is my definition of TDD.&lt;/p&gt;</content>
 </entry>
 
 <entry>
   <title>Absolute Basics of Concurrency Correctness</title>
   <link href="http://thoughts.karmazilla.net/2009/11/30/absolute-basics-of-concurrency-correctness.html"/>
   <updated>2009-11-30T00:00:00+01:00</updated>
   <id>http://thoughts.karmazilla.net/2009/11/30/absolute-basics-of-concurrency-correctness</id>
   <content type="html">&lt;h1 id='absolute_basics_of_concurrency_correctness'&gt;Absolute Basics Of Concurrency Correctness&lt;/h1&gt;

&lt;p&gt;Writing correct concurrent code is hard. Writing correct &lt;em&gt;and performant&lt;/em&gt; concurrent code is significantly harder still, but before getting to that, the code must first be correct. Actually, even correct in the &lt;em&gt;non-concurrent&lt;/em&gt; sense. Simply put, the set of potential bugs - failure modes - in a multi-threaded program is the set of failure modes for the single-threaded program, plus all of the concurrency specific failure modes.&lt;/p&gt;

&lt;p&gt;So, a correct multi-threaded program implies a correct program in general. But in this post, I will only focus on the part that is specific to correct concurrency.&lt;/p&gt;

&lt;p&gt;Getting concurrency right takes knowledge and analytical skills. I am an avid TDD practitioner, and I try to test-drive as much as I possibly can. However, in the world of concurrency we are faced with the problem that some bugs are simply impossible to write a test case for. Sometimes the timings required to surface a bug are just too tight and the chances too small, other times a bug just does not exist on our specific combination of hardware and runtime.&lt;/p&gt;

&lt;p&gt;Therefore, analyzing the code in terms of the concurrency guarantees provided by the underlying platform, is a key element of the path towards concurrency correctness. And it cannot be done without knowledge of the guarantees provided by the platform, whether it be the JVM, CLR or x86.&lt;/p&gt;

&lt;p&gt;There are lots of failure modes specific to multi-threaded programs. Many are quite esoteric, but three things stand out as the most basic problems: shared mutable state, publication and locking.&lt;/p&gt;

&lt;p&gt;The first problem of concurrency is shared mutable state. There &lt;em&gt;are&lt;/em&gt; cases where you can have deliberately under-synchronized shared mutable state, but those are as few as they are special. There are basically three ways to deal with shared mutable state:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Make state immutable&lt;/li&gt;

&lt;li&gt;Don&amp;#8217;t share state&lt;/li&gt;

&lt;li&gt;Guard the shared mutable state by the same lock (two different locks cannot guard the same piece of state, unless you always use exactly these two locks)&lt;/li&gt;
&lt;/ol&gt;
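
&lt;p&gt;The three options can be sketched as follows. All names are invented for illustration, and this is one possible shape under those assumptions, not a complete design:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical names; a sketch of the three options, not a complete design.

// 1. Immutable: all fields are final and there are no mutators,
//    so instances can be shared between threads freely.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Moving a point creates a new one instead of mutating this one.
    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}

// 3. Shared and mutable, but every access is guarded by the same lock.
final class GuardedTotal {
    private final Object lock = new Object();
    private long total; // guarded by lock

    void add(long amount) {
        synchronized (lock) {
            total += amount;
        }
    }

    long get() {
        synchronized (lock) {
            return total;
        }
    }
}

final class SharedStateExample {
    static long countWithThreads(int threads, final int stepsPerThread) {
        final GuardedTotal total = new GuardedTotal();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i != threads; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    // 2. Not shared: this counter is confined to the running thread.
                    long local = 0;
                    for (int step = 0; step != stepsPerThread; step++) {
                        local++;
                    }
                    total.add(local); // one guarded hand-over per thread
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000));
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each worker does its counting on thread-confined state and touches the shared total only once, under the lock, which also keeps contention on that lock low.&lt;/p&gt;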

&lt;p&gt;The second problem of concurrency is safe publication. How do you make data available to other threads? How do you perform a hand-over? You will find the solution to this problem in the memory model of your platform and (hopefully) in the API. Java, for instance, has many ways to publish state and the java.util.concurrent package contains tools specifically designed to handle inter-thread communication. Make sure to give this problem ample focus; the bugs produced by unsafe publication can be extremely subtle, and it is easy to gloss over this problem and focus on locking instead.&lt;/p&gt;
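
&lt;p&gt;As one possible hand-over in Java, here is a sketch built on &lt;code&gt;CountDownLatch&lt;/code&gt; from java.util.concurrent. The &lt;code&gt;Handover&lt;/code&gt; class and its methods are invented for this example, and it is deliberately single-use. It relies on the documented guarantee that actions before &lt;code&gt;countDown()&lt;/code&gt; happen-before actions after a successful return from &lt;code&gt;await()&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import java.util.concurrent.CountDownLatch;

// Hypothetical single-use hand-over of one value from a producer to a consumer.
final class Handover {
    private final CountDownLatch ready = new CountDownLatch(1);
    private int[] data; // plain field; only read after the latch has opened

    // Called once, by the producing thread.
    void publish(int[] values) {
        data = values;     // write first ...
        ready.countDown(); // ... then signal
    }

    // Called by the consuming thread; blocks until publish() has run.
    int[] take() throws InterruptedException {
        ready.await();
        return data;
    }
}

final class PublicationExample {
    static int sumPublishedByAnotherThread() {
        final Handover handover = new Handover();
        Thread producer = new Thread(new Runnable() {
            @Override
            public void run() {
                handover.publish(new int[] {1, 2, 3});
            }
        });
        producer.start();
        try {
            int sum = 0;
            for (int value : handover.take()) {
                sum += value;
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumPublishedByAnotherThread());
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Without the latch (or some other construct the memory model recognizes), the consumer would have no guarantee of ever seeing the array, or of seeing its contents fully written.&lt;/p&gt;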

&lt;p&gt;The third (and harder) problem of concurrency is locking. Mismanaged lock-ordering is the source of dead-locks. Some dead-locks are practically guaranteed to always occur, while others require highly improbable timing. This problem can actually be generalized to everything that prevents a thread from making progress. Blocking operations, waits on conditions, spin-locks - these all have the same potential to stall a thread. And in this highly networked world, two threads don&amp;#8217;t even have to be on the same machine to dead-lock each other.&lt;/p&gt;
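
&lt;p&gt;A common way to keep lock-ordering under control is to always acquire locks in one fixed, global order. The sketch below assumes every account has a unique id and uses it as the ordering key; the &lt;code&gt;Account&lt;/code&gt; class and its methods are hypothetical.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical names; assumes every account has a unique id.
final class Account {
    private final int id;  // doubles as the lock-ordering key
    private long balance;  // guarded by this

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        synchronized (this) {
            return balance;
        }
    }

    static void transfer(Account from, Account to, long amount) {
        // Always lock the account with the lower id first,
        // whatever the direction of the transfer.
        Account first = from.id > to.id ? to : from;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}

final class LockOrderingExample {
    static long[] runOpposingTransfers() {
        final Account a = new Account(1, 1000);
        final Account b = new Account(2, 1000);
        Thread aToB = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i != 1000; i++) {
                    Account.transfer(a, b, 1);
                }
            }
        });
        Thread bToA = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i != 1000; i++) {
                    Account.transfer(b, a, 2);
                }
            }
        });
        aToB.start();
        bToA.start();
        try {
            aToB.join();
            bToA.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a.balance(), b.balance()};
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers();
        System.out.println(balances[0] + " " + balances[1]);
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Had each thread locked its own source account first, the two threads could each end up holding one lock while waiting for the other. With a fixed order, both threads compete for the same first lock, and the loser simply waits.&lt;/p&gt;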

&lt;p&gt;You can get pretty far with carefully constructed automatic test cases - and you should, but analysis is by far the most important step towards correctness in a multi-threaded program. However, you need to design and write your code with that in mind, otherwise the complexity of the code can quickly render such an analysis impossible to perform in practice.&lt;/p&gt;</content>
 </entry>
 
 
</feed>


