Correct Erlang usage mandates you do not write any kind of defensive code. This is called intentional programming. You write code for the intentional control flow path which you expect the code to take. And you don’t write any code for the paths which you think are not possible. Furthermore, you don’t write code for data flow which was not the intention of the program.

# It is an effect, silly

If an Erlang program goes wrong, it crashes. Say we are opening a file. We can guard the file open call like so:

    {ok, Fd} = file:open(Filename, [raw, binary, read, read_ahead]),

What happens if the file doesn’t exist? Well the process crashes. But note we did not have to write any code for that path. The default in Erlang is to crash when a match isn’t valid. We get a badmatch error with a reason as to why we could not open the file.
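To make the badmatch concrete, here is a minimal sketch. The module name open_demo and both function names are hypothetical and only serve the illustration. The first function is the intentional style from above; the second runs it in a monitored process so we can look at the exit reason another process would see.

```erlang
-module(open_demo).
-export([open_or_crash/1, observe_crash/1]).

%% Intentional style: only the success path is written down.
open_or_crash(Filename) ->
    {ok, Fd} = file:open(Filename, [raw, binary, read, read_ahead]),
    Fd.

%% Illustration only: run the call in its own process and return the
%% exit reason delivered to the monitoring process.
observe_crash(Filename) ->
    {Pid, Ref} = spawn_monitor(fun() -> open_or_crash(Filename) end),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.
```

For a missing file the reason has the shape {{badmatch, {error, enoent}}, Stacktrace}, so the cause of the failed open is carried along without any error-handling code in open_or_crash/1.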

A process crashing is not a problem. The program is still operating, and supervision (an important fault-tolerance concept in Erlang) will make sure that we try again in a little while. Say we have introduced a race condition on the file open, by accident. If it happens rarely, the program would still run, even if the file open fails from time to time.
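The "try again in a little while" part can be sketched with a tiny supervisor. Everything named here (retry_sup, start_worker/0, worker_pid/0, the crash message) is made up for the example, and the worker is a bare process rather than a real gen_server, so treat it as a sketch of the idea and not as production structure.

```erlang
-module(retry_sup).
-behaviour(supervisor).

-export([start_link/0, start_worker/0, worker_pid/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% One permanent worker, restarted on its own whenever it dies.
%% At most 5 restarts within 10 seconds before the supervisor gives up.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called in the supervisor process, so the new process is linked to it.
start_worker() ->
    {ok, proc_lib:spawn_link(fun worker_loop/0)}.

%% The worker has no error handling: on 'crash' it simply dies.
worker_loop() ->
    receive
        crash -> exit(simulated_failure);
        _Other -> worker_loop()
    end.

%% Helper for the demonstration: the pid of the current worker.
worker_pid() ->
    [{worker, Pid, worker, _Modules}] = supervisor:which_children(?MODULE),
    Pid.
```

After sending crash to the pid returned by worker_pid/0, a later call to worker_pid/0 returns a different pid: the supervisor has started a fresh worker, and the worker code never had to mention recovery.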

You will often see code that looks like:

    ok = foo(...),
    ok = bar(...),
    ok = ...

which then asserts that each of these calls went well, making sure code crashes if the control and data flow is not what is expected.
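A runnable version of such an assertion chain might look as follows. The module assert_chain and the function write_marker/1 are invented for the example; the file and filelib calls are from the standard library.

```erlang
-module(assert_chain).
-export([write_marker/1]).

%% Each step is matched against the result we intend to get.
%% Any other result crashes the process with a badmatch.
write_marker(Path) ->
    ok = filelib:ensure_dir(Path),
    ok = file:write_file(Path, <<"done\n">>),
    {ok, <<"done\n">>} = file:read_file(Path),
    ok.
```

If the directory cannot be created or the disk is full, the corresponding match fails and the crash reason names the exact return value that was not expected.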

Notice the complete lack of error handling. We don’t write

    case foo(...) of
        ok -> case bar(...) of ... end;
        {error, Reason} -> throw({error, Reason})
    end,

Nor do we fall into the trap of the Go programming language and write:

    res, err := foo(...)
    if err != nil {
        panic(...)
    }
    res2, err := bar(...)
    if err != nil {
        panic(...)
    }

because this is also plain silly, tedious and cumbersome to write.

The key is that the Erlang runtime gives us a crash effect we can invoke: the default is to crash the process if something goes wrong, and to have another process clean up. Good Erlang code abuses this fact as much as possible.

# Intentional?

Note the word intentional. In some cases, we do expect calls to fail. So we just handle it like everyone else would, but since we can emulate sum-types in Erlang, we can do better than languages with no concept of a sum-type:

    case file:open(Filename, [raw, read, binary]) of
        {ok, Fd} -> ...;
        {error, enoent} -> ...
    end,

Here we have written down the intention that the file might not exist. However:

• We only worry about nonexistence.
• We crash on eacces, which means an access error due to permissions.
• Likewise for eisdir, enotdir, enospc.
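The same intention can be written as a small complete function. This is a sketch: read_or_default and read/2 are hypothetical names, and it uses file:read_file/1 instead of file:open/2 to keep the example short.

```erlang
-module(read_or_default).
-export([read/2]).

%% The only failure we intend to handle is a missing file.
%% Any other {error, Reason} has no clause and crashes with case_clause.
read(Filename, Default) ->
    case file:read_file(Filename) of
        {ok, Bin} -> Bin;
        {error, enoent} -> Default
    end.
```

A permission problem or a directory in place of the file is not part of the intention, so the process dies and the crash report shows which error it was.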


# Why?

Leaner code, that’s why.

We can skip lots of defensive code, which often more than halves the code size of projects. There is much less code to maintain, so when we refactor, we need to manipulate less code as well.

Our code is not littered with things having nothing to do with the “normal” code flow. This makes it far easier to read code and determine what is going on.

Erlang process crashes give lots of information when something dies. For a proper OTP process, we get the State of the process before it died and the message that triggered the crash. In about 50% of all cases this dump is enough, and you can reproduce the error just by looking at it. In effect, this eliminates a lot of silly logging code.

# Data flow defensive programming

Another common way of messing up Erlang programs is to mangle incoming data through pattern matching. Stuff like the following:

    convert(I) when is_integer(I) -> I;
    convert(F) when is_float(F) -> round(F);
    convert(L) when is_list(L) -> list_to_integer(L).

The function will convert “anything” to an integer. Then you proceed to use it:

    process(Anything) ->
        I = convert(Anything),
        ...I...

The problem here is not with the process function, but with the call-sites of the process function. Each call-site has a different opinion on what data is being passed in this code. This leads to a situation where every subsystem handles conversions like these.

There are several disguises of this anti-pattern. Here is another smell:

    convert({X, Y}) -> {X, Y};
    convert(B) when is_binary(B) ->
        [X, Y] = binary:split(B, <<"-">>),
        {X, Y}.

This is stringified programming where all data are pushed into a string and then manually deconstructed at each caller. It leads to a lot of ugly code with little provision for extension later.

Rather than trying to handle different types, enforce the invariant early, at the API boundary:

    process(I) when is_integer(I) -> ...

And then never test for correctness inside your subsystem. The dialyzer is good at inferring the use of I as an integer. Littering your code with is_integer tests is not going to buy you anything. If something is wrong in your subsystem, the code will crash, and you can go handle the error.
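As a sketch of what "convert once at the edge, then trust the type" could look like: the module boundary and its functions are hypothetical, and the assumption is that the outside world hands us integers encoded as binaries.

```erlang
-module(boundary).
-export([handle_request/1, process/1]).

%% Edge of the subsystem: the one place that knows about the wire format.
%% binary_to_integer/1 raises badarg on malformed input, which is a crash.
handle_request(Bin) when is_binary(Bin) ->
    process(binary_to_integer(Bin)).

%% API of the subsystem: the invariant is enforced here, once.
process(I) when is_integer(I) ->
    double(I).

%% Internal code: no is_integer tests, the invariant already holds.
double(I) -> I * 2.
```

Calling boundary:process("42") fails with function_clause at the API, so the mistake shows up at the call-site that made it rather than deep inside the subsystem.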

There is something to be said about static typing here, which will force you out of this unityped world very easily. In a statically typed language, I could still obtain the same thing, but then I would have to define something along the lines of (* Standard ML code follows *)

    datatype anything = INT of int
                      | STRING of string
                      | REAL of real

and so on. This quickly becomes hard to write pattern matches for, so people only define the anything type if they really need it. (Gilad Bracha was partially right when he identified this as a run-time check on the value, but he omitted the fact that the programmer can decide to avoid paying for a costly runtime check all the time. Come again, Gilad ☺).

# The scourge of undefined

Another important smell is that of the undefined value. The story here is that undefined is often used to program an Option/Maybe monad. That is, we have the type

    -type option(A) :: undefined | {value, A}.

[For the static typists out there: Erlang does have a type system based on success types for figuring out errors, and the above is one such type definition]


It is straightforward to define reflection/reification into an exception-effect for these. Jakob Sievers's stdlib2 library already does this, and also defines the monadic helper called do (though the monad is of the Error-type rather than Option).
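To give a rough idea of what reflection and reification mean here, the following is a hand-written sketch for the option type. It is not the stdlib2 API; the module option_effect, the functions reflect/1 and reify/1, and the thrown atom option_undefined are all invented for the illustration.

```erlang
-module(option_effect).
-export([reflect/1, reify/1]).
-export_type([option/1]).

-type option(A) :: undefined | {value, A}.

%% reflect: turn an option value into an effect.
%% A present value is returned, an absent one becomes an exception.
-spec reflect(option(A)) -> A.
reflect({value, V}) -> V;
reflect(undefined) -> throw(option_undefined).

%% reify: run a computation and turn the effect back into an option value.
-spec reify(fun(() -> A)) -> option(A).
reify(Fun) ->
    try
        {value, Fun()}
    catch
        throw:option_undefined -> undefined
    end.
```

Inside the fun passed to reify/1, code can use reflect/1 on several options in a row and only deal with plain values; the first undefined aborts the whole computation and comes out as undefined at the end.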

But I’ve seen:

    -spec do_x(X) -> ty() | undefined
        when X :: undefined | integer().
    do_x(undefined) -> undefined;
    do_x(I) -> ...I....

Which leads to complicated code. You need to be 100% in control of which values can fail and which cannot. Constructions like the above silently pass undefined on. This has its uses, but be wary when you see code like this. The undefined value is essentially a NULL, and NULL was C.A.R. Hoare’s billion dollar mistake.

The problem is that the above code is nullable. The default in Erlang is that you never have NULL-like values. Introducing them again should be done sparingly. You will have to think long and hard, because once a value is nullable, it is up to you to check this all the time. This tends to make code convoluted and complicated. It is better to test such things up front and then leave them out of the main parts of the code base as much as possible.
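Testing up front could look like the sketch below. The module upfront, the option name port and the helper listen_args/1 are assumptions made for the example; proplists:get_value/2 is the standard library call that returns undefined for a missing key.

```erlang
-module(upfront).
-export([start/1]).

%% The one place where undefined is allowed to exist.
%% It is checked immediately and never passed further in.
start(Opts) ->
    case proplists:get_value(port, Opts) of
        undefined -> {error, missing_port};
        Port when is_integer(Port) -> {ok, listen_args(Port)}
    end.

%% Internal code: Port is always an integer here, no nullable check needed.
listen_args(Port) ->
    [{port, Port}, {backlog, 128}].
```

Everything behind start/1 can assume a real port number, so none of the inner functions need a clause for undefined.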

# “Open” data representations

Whenever you have a data structure, there is a set of modules which know about and operate on that data structure. If there is only a single module, you can emulate a common pattern from Standard ML or OCaml where the concrete data structure representation is abstract for most of the program and only a single module can operate on the abstract type.

This is not entirely true in Erlang, where anyone can introspect any data. But keeping the illusion is handy for maintainability.
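One way to keep up that illusion is an -opaque type with the record defined inside the module. The following is a minimal sketch; the module tally and its functions are hypothetical.

```erlang
-module(tally).
-export([new/0, increment/1, value/1]).
-export_type([t/0]).

%% The record lives in the module, not in a header file,
%% so no other module can pattern match on it by name.
-record(tally, {count = 0 :: non_neg_integer()}).

%% -opaque lets Dialyzer warn when other modules look inside t().
-opaque t() :: #tally{}.

-spec new() -> t().
new() -> #tally{}.

-spec increment(t()) -> t().
increment(#tally{count = N} = T) -> T#tally{count = N + 1}.

-spec value(t()) -> non_neg_integer().
value(#tally{count = N}) -> N.
```

Callers only go through new/0, increment/1 and value/1, so the record could later become a map or a plain integer without touching any call-site.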


The more modules that can manipulate a data structure, the harder it is to alter that data structure. Consider this when putting a record in a header file. There are two levels of possible creeping insanity:

• You put the record definition in a header file in src. In this case only the application itself can see the records, so they don’t leak out.
• You put the record definition in a header file in include. In this case the record can leak out of the application and often will.


A good example is the HTTP server cowboy, whose request object is manipulated through the cowboy_req module. This means the internal representation can change while keeping the rest of the world stable on the module API.


There are cases where it makes sense to export records. But think before doing so. If a record is manipulated by several modules, chances are that you can win a lot by re-thinking the structure of the program.

# The values ‘true’ and ‘false’ are of type atom()

As a final little nod, I see too much code looking like

    f(X, Y, true, false, true, true),

Which is hard to read. Since this is Erlang, you can just use a better name for the true and false values. Just pick an atom which makes sense and then produce that atom. It also has the advantage of catching more bugs early on if arguments get swapped by accident. Also note you can bind information to the result by passing tuples.

There is much to be said about the concept of boolean blindness, which in typical programs means relying too much on boolean() values. The problem is that if you get a true, say, you don’t know why it was true. You want evidence as to its truth, and this can be had by passing the evidence in a tuple. As an example, we can have a function like this:

    case api:resource_exists(ID) of
        true -> Resource = api:fetch_resource(ID), ...;
        false -> ...
    end.

But we could also write it in a more direct style:

    case api:fetch_resource(ID) of
        {ok, Resource} -> ...;
        not_found -> ...
    end.

which in the long run is less error prone. We can’t call fetch_resource by accident on a resource that does not exist, and if we look up the resource, we also get hold of the evidence of what the resource is. If we don’t really want to use the resource, we can just throw it away.

(Edit: I originally used the function name resource_exists above, but Richard Carlsson correctly pointed out that this is a misleading name, so I changed it to something better.)
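Returning to the advice about naming atoms instead of passing bare true and false, here is a small sketch. The module render, the function line/3 and the atoms upper, as_is, newline and no_newline are all invented for the example.

```erlang
-module(render).
-export([line/3]).

%% Each flag has its own pair of descriptive atoms instead of true/false.
-spec line(binary(), Case, Newline) -> binary()
    when Case :: upper | as_is,
         Newline :: newline | no_newline.
line(Text, Case, Newline) ->
    Cased = case Case of
                upper -> string:uppercase(Text);
                as_is -> Text
            end,
    case Newline of
        newline -> <<Cased/binary, "\n">>;
        no_newline -> Cased
    end.
```

A call such as render:line(Text, upper, newline) documents itself, and swapping the two flags crashes with a case_clause error at once. With two booleans the swapped call would have run silently with the wrong behaviour.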

# Closing remarks

Rules of thumb exist to be broken. So once in a while they must be broken. However, I hope you learnt something or had to stop and reflect on something if you happened to get here (unless you scrolled past all the interesting stuff).

I am also interested in pet peeves of yours, if I am missing some. The way to become a better programmer is to study the style of others.

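1. Defensive programming stresses guarding against errors before they happen: writing handling logic for every place in a program where an error could occur. See "defensive programming" and "defensive programming versus paranoid programming".

2. Intentional programming stresses concentrating on the things that carry out the intent. See Intentional_programming.

3. A data structure used to represent several different types. See: Tagged_union.

4. "success types" should be understood as "the types of the values a function returns when it executes successfully", that is, as opposed to the types of error returns.

5. "Option" here does not mean "optional"; it refers to the option type.

6. Reflection is a program's ability to access, inspect, and modify its own state or behavior at run time. See reflection.

7. Two functional languages (referring to Standard ML and OCaml).

8. Introspection is run-time type checking. See: introspection, type introspection, program introspection.

9. The author's point is that true and false are symmetric: wherever true is used, false could be used instead, so why true rather than false? The context therefore has to supply extra information explaining why true is used.

10. "scrolled past" means "scrolling down in the browser and thereby skipping some content".

11. "Pet-peeves" means "things one dislikes". The author may be referring to factional disputes over programming style, along the lines of "PHP is the best language in the world :)".

12. "style" here may mean "the coding style of excellent programmers", or it may mean "the coding style and working habits of excellent programmers".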
