Usability testing of application programming interfaces

Application programming interfaces (APIs) provide building blocks that programmers compose into programs such as websites or games. An API is often read and used by people other than those who designed it. For programmers to feel powerful and work efficiently, APIs should be readable and enjoyable. Like any other tool, an API creates an experience for its users. How could we support API design so that a good user experience (UX), or in this case a good programmer experience (PX), is ensured?

I came up with a simple method for testing the usability of your code and the readability of the APIs you create:
– copy a portion of your code to an empty text file
– strip all comments
– preserve empty lines
– ask your friend, the tester, to guess vaguely what the code does
– listen to the guesses and learn
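The preparation steps above could be automated roughly as follows. This is a minimal sketch, not a robust tool: stripComments is a hypothetical helper name, it assumes JavaScript-style // and /* */ comments, and it does not understand regex literals or expressions nested in template strings.

```javascript
// Naive sketch: removes // and /* */ comments from JavaScript-like source.
// It tracks string literals so that "http://example.com" is left intact.
function stripComments(source) {
  let out = '';
  let i = 0;
  let quote = null; // current string delimiter, or null when outside a string

  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];

    if (quote) {
      // Inside a string: copy characters verbatim, honouring escapes.
      out += ch;
      if (ch === '\\' && i + 1 < source.length) {
        out += next;
        i += 2;
        continue;
      }
      if (ch === quote) quote = null;
      i += 1;
    } else if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch;
      out += ch;
      i += 1;
    } else if (ch === '/' && next === '/') {
      // Line comment: skip up to, but not including, the newline.
      while (i < source.length && source[i] !== '\n') i += 1;
    } else if (ch === '/' && next === '*') {
      // Block comment: skip to the closing */, keeping its newlines
      // so that empty lines and the vertical layout are preserved.
      i += 2;
      while (i < source.length && !(source[i] === '*' && source[i + 1] === '/')) {
        if (source[i] === '\n') out += '\n';
        i += 1;
      }
      i += 2;
    } else {
      out += ch;
      i += 1;
    }
  }

  // Trim trailing whitespace left behind by removed end-of-line comments.
  return out.split('\n').map((line) => line.trimEnd()).join('\n');
}
```

Lines that contained only a comment become empty lines, so the snippet keeps its original shape when it is handed to the tester.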

I conducted one test with this method on a software library for which I am designing an API. I prepared a code snippet of about 40 lines that used the hypothetical API. Then I showed it to my friend and listened to how he understood the code.

The test took about 20 minutes. The benefits were immediate. For example, my code snippet had the line const m = lib.move(tile1.atMid(), tile2.atMid()). My friend guessed that it moves a tile object somehow but was unsure why it assigned something to m. In reality, the code was intended to estimate a translation between the two tile objects so that the translation could be applied later to tile1 and possibly to other tiles too. On its own, it does not move anything; it only prepares a move. Even though I ran the test with only a single person, the mismatch between the intended and the assumed behaviour was clear enough for me to think of a better method name.
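A rename along the following lines could make the "prepare now, apply later" intent visible. The sketch is only an illustration: estimateTranslation, applyTranslation, and makeTile are made-up names with made-up implementations, not the ones in the actual library.

```javascript
// Hypothetical stand-ins, not the real library: points are {x, y} objects
// and a translation is a plain {dx, dy} object.
const lib = {
  // Named after what it returns, not after what might happen later.
  estimateTranslation(from, to) {
    return { dx: to.x - from.x, dy: to.y - from.y };
  },
};

function makeTile(x, y) {
  return {
    x,
    y,
    atMid() { return { x: this.x, y: this.y }; },
    applyTranslation(tr) { this.x += tr.dx; this.y += tr.dy; },
  };
}

const tile1 = makeTile(0, 0);
const tile2 = makeTile(30, 40);

// Estimating changes nothing yet...
const translation = lib.estimateTranslation(tile1.atMid(), tile2.atMid());
// ...the tile moves only when the translation is applied.
tile1.applyTranslation(translation);
```

With names like these, a reader has less reason to wonder why the first call assigns its result to a variable.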

Research exists on code readability testing. For example, see Code Readability Testing, an Empirical Study by T. Sedano (2016). To quote the abstract: “The programmers report that readability testing is worth their time. They observe increases in their ability to write readable code.” My own experience with the test I conducted was in line with that finding. It is hard to see how an API looks to others when you know it too well yourself.

My method focused on readability. But how to test for the experience? How do users feel? How to measure the enjoyability of lines of code? Could it be related to the question of how to assess the enjoyability of a good book or a music album?

I do not know how to test feelings towards code. However, a few ideas came to my mind. The rest of this article is mostly about those ideas. Whether they are good or feasible, I cannot say, so take them with a grain of salt.

Books and albums receive ratings and reviews. In my experience, written reviews often carry the emotions felt by the reviewer before, during, and after. However, proper reviews are tedious to write and thus not a feasible way to measure API quality. Maybe there could be a lighter way?

There is a programming joke circulating on the internet that claims that “the only valid measurement of code quality is WTFs per minute.” Could using only short words be a good approach? It seems simple and measurable.

In addition to words, emojis are a simple way to represent feelings. Could they expose the feelings evoked by code, and do it better than long reviews?

How would you guess what the following code does, expressed as a few emojis? What comes to your mind when you try to understand this code? Maybe ↔️↔️◼️? Or 💤💢💬?

const tile = t(tileEl)
const m = t.geom.move.translate(view.atMid(), tile.atMid())
view.move(m)

const s = t.geom.move.scale(view.atMid(), 0.9)
view.move(s)

const drag = tile.draggable()
drag.enable.translation()
drag.limitBy(view)

const viewdrag = view.draggable()
viewdrag.enable.scaleRotate(tile.atMid())
viewdrag.disable()

If emojis feel too minimal, what about an alternative measurement method that is also devoid of written language: ask a friend to highlight the parts of the code that she likes, maybe even with multiple colors. What do you think of the method?

As a wild thought, if such a survey method were powered by Amazon Mechanical Turk or a similar crowdsourcing service, a nice set of features would become possible. One example is a WTF heat map laid over the code that becomes more accurate over time. It could also be possible to A/B test pseudocode variants and individual lines until a winner is found with statistical significance. Could you even gather training data to measure code quality automatically via AI?
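To make the heat map idea a bit more concrete, here is a rough sketch of how the responses could be aggregated. Everything in it is an assumption for illustration: the function names wtfHeatMap and renderHeatMap, and the data shape in which each reader submits the 1-based line numbers that made them say WTF.

```javascript
// Hypothetical data shape: responses is an array with one entry per reader,
// each entry being an array of 1-based line numbers the reader flagged.
function wtfHeatMap(lineCount, responses) {
  const counts = new Array(lineCount).fill(0);
  for (const flaggedLines of responses) {
    // Count each reader at most once per line.
    for (const line of new Set(flaggedLines)) {
      if (line >= 1 && line <= lineCount) counts[line - 1] += 1;
    }
  }
  // Share of readers who flagged each line, between 0 and 1.
  return counts.map((c) => (responses.length ? c / responses.length : 0));
}

// Text-only rendering: up to five '#' marks in front of each line of code.
function renderHeatMap(codeLines, heat) {
  return codeLines
    .map((text, i) => `${'#'.repeat(Math.round(heat[i] * 5)).padEnd(5)} ${text}`)
    .join('\n');
}

const codeLines = [
  'const tile = t(tileEl)',
  'const m = t.geom.move.translate(view.atMid(), tile.atMid())',
  'view.move(m)',
];
// Four imaginary readers; the last one flagged nothing.
const responses = [[2], [2, 3], [1, 2, 2], []];
const heat = wtfHeatMap(codeLines.length, responses);
console.log(renderHeatMap(codeLines, heat));
```

As more readers respond, the shares settle, which is the sense in which such a map would become more accurate over time.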

These methods might have a shortcoming. Just as playing music is different from listening to it, writing code is a different experience from reading it. The former requires much deeper understanding than the latter. Therefore methods that measure code readability might fall short. Maybe it would be better to observe people who use the API. Observation is widely used in user experience studies. However, it might be difficult to observe people building real-world applications with an API that does not exist yet. Therefore observing programmers is probably more feasible in the later stages of API development.

Nevertheless, these were some of my thoughts on ensuring code quality, and especially the readability of the building blocks programmers write for other programmers, by which I mean APIs. Let me know if you know of any methods for measuring code readability or enjoyability.
