Practical Go: Real World Advice For Writing Maintainable Go Programs
17-11-2022, 01:50 | Author: ThereseDesrocher | Category: Smileys
Thank you for joining me today. The story of how this workshop came about needs some exposition.
A few years ago I posted a tweet called "three rules for dating my code base". It was intended to be a short list of the things you should do for a healthy Go project. Tweet sizes being what they were at the time, this tweet was followed a day later with another three things, and I suspect I could have continued for the rest of the year.
Related to this, in 2015 I had dabbled with an agreement to write a book for O’Reilly. The project fizzled out after a few months for two reasons:
O’Reilly’s document preparation system was good, but not as flexible as I wanted. There were too many undefined steps between my words and the final book.
The unforgiving calculus of the number of hours a project like this would take meant if I failed to find the hundreds of hours necessary there would be no partial credit.
Both of these struck me as rather old world approaches to developing a book and the distinctly non-agile way that a complete manuscript was copy edited was not lost on me.
After the three rules tweets I began to think again about how I would like to see a book about Go written in a way that delivered continual benefit even if I were to drop out at an arbitrary point.
This workshop, the document you are reading, the articles on my blog, and the presentations I give are my answer to this problem. This document represents the best version of the material that exists today. As I present more and give more workshops, the material will grow; however, Go programmers can continually reap the benefits of this material without waiting for someone to declare arbitrarily that Practical Go is done.
Today this is a workshop style presentation. I’m going to dispense with the usual slide deck and we’ll work directly from this document, which you can take away with you today.
You can find the latest version of this presentation at.
License.
Introduction.
There are (probably) no correct answers.
100 years ago David Hilbert was deeply unhappy with the state of mathematics. Mathematics was beset with inconsistencies and paradoxes. Seeking closure, the German mathematician embarked on what became known as Hilbert’s Program, an attempt to reduce mathematics to a set of axiomatic proofs. From those foundations, mathematics could be reconstructed. Incontrovertibly, provably, correct.
Sadly for Hilbert and his compatriots, their work was undone in 1931 when Kurt Gödel published his incompleteness theorem. Gödel demonstrated that, paradoxically, any system powerful enough to formalise mathematics would be unable to resolve its own inconsistencies. Yet as fundamentally flawed as mathematics is, it has permitted us to understand the universe and the space between atoms.
A few years ago I met a man who told me that the Wright Brothers built the first successful aeroplane years before the laws of hydrodynamics were fully understood. They weren’t the first to fly, people had parachuted from balloons or glided from cliff tops, but they were the first to transition from ground to airborne.
This was 1903 and the theories of aerodynamics wouldn’t be developed for another 60 years, yet they still flew. How did they do this? How did the brothers succeed in flight without an understanding of the physics that underpins the principle of lift?
Wilbur and Orville didn’t understand the science of why they flew, but they did learn from previous attempts. Their success was based on the experience gained from all those who had failed before them. Said another way, the Kitty Hawk flew because the Wrights were the first to fail to build an airplane which wouldn’t fly.
My friend asserted that just because we don’t understand the laws that underlie the development of software that doesn’t mean they don’t exist and, perhaps more importantly, the search for those theoretical underpinnings should not preclude an experimental approach.
Some argue that computer science is not a science. Alan Kay famously called it a pop culture. And this was my friend’s point. It’s not that computer science is a pop culture, or sociology, it’s that the practice of programming is running ahead of the science. We may be reasonably certain that a practice works, but we won’t know why until later.
So, I put it to you that the practice of computer programming is running ahead of the scientific foundations on which it will ultimately be based. In a few hundred years we may have a solid scientific basis for computer programming, but right now all we have is empirical evidence.
Knowing when to stick to the code and when to bend the rules probably comes down to experience, gained from observation, experimentation, and research. What’s the difference between a rule and a guideline? I’d say, experience. The mastery of a topic, Malcolm Gladwell’s 10,000 hour yardstick, is equal parts observation and practice. You watch someone else do the thing, while concurrently you practice doing the thing.
I love writing software. I love discovering the patterns and structure that underlies design problems, but I do not pretend to represent the truth about the right way to program. Such a thing may not be knowable, but that should not preclude us from having a discussion based on experience.
With that said, these are my experiences, and the goal today is not to be prescriptive. This is a discussion, not a lecture.
A guiding principle.
If I’m going to talk about best practices for a programming language I need some framework by which to define what I mean by best.
Over the years that I’ve been thinking about this material I’ve tried to identify a common theme, an abstract that can summarise the tens of thousands of words I’ve written.
In forming this abstract many words have come and gone: simplicity, readability, clarity, productivity. Ultimately they are all synonyms for one word: maintainability.
If there was a quote that summarises the ethos of writing Practical Go code it would be this quote from Sandi Metz.
1. Declarations.
The first topic I’d like to discuss is declarations. A declaration declares an identifier. An identifier is a fancy word for a name: the name of a variable, the name of a function, the name of a method, the name of a type, the name of a package, and so on.
Names are important. Given the limited syntax of Go, the names we choose for things in our programs have an oversized impact on the readability of our programs. Readability is a defining quality of maintainable code, thus choosing good names is crucial to the maintainability of Go code.
1.1. Choose identifiers for clarity, not brevity.
Go is not a language that optimises for clever one liners. Go is not a language which optimises for the least number of lines in a program. We’re not optimising for the size of the source code on disk, nor how long it takes to type the program into an editor.
Rather, we want to optimise our code to be clear to the reader. Key to this clarity is the names we choose for identifiers in Go programs. To get technical, when I’m talking about naming, I’m talking about naming identifiers in Go programs. But that’s a bit lengthy, so let’s just call it naming from now on; you understand what I mean.
First, let’s set the ground rules. Anything in Go that is an identifier has a name. What are the things in Go we can name?
the name of a type, struct, or interface.
the name of a function or a method.
the name of a package.
the name of a constant.
the name of a variable, formal parameter, or return value.
Go programmers care about the name of things, a lot. Naming is key to readability, hence the names of identifiers used in your program are critical. Poorly chosen names contribute to a program that is harder to comprehend, and thus harder to maintain.
Although not defined by go fmt, the canonical Go style for naming things descends from its original authors and can be identified by the following observations:
The greater the distance between declaration and use, the more descriptive the name.
The more frequently the name is referenced in a particular place, the shorter the name.
These two observations interlock to form general advice. For example, the names of local variables should be shorter than the names of the formal parameters.
Let’s talk about the qualities of a good name:
A good name need not be the shortest possible, but a good name should waste no space on things which are extraneous. Good names have a high signal to noise ratio.
A good name is descriptive.
A good name should describe the application of a variable or constant, not their contents. A good name should describe the result of a function, or behaviour of a method, not their implementation. A good name should describe the purpose of a package, not its contents. The better the choice of a name, the more accurately it describes that which it identifies.
A good name should be predictable.
You should be able to infer the way a symbol will be used from its name alone. This is a function of choosing descriptive names, but it is also about following tradition. This is what Go programmers talk about when they say idiomatic (of which I shall have more to say tomorrow).
Let’s talk about each of these properties in depth.
Go traditionally enforces no line length restriction. With that said, all things considered, shorter identifiers are preferable to longer ones. If you find yourself arguing with your colleagues over wrapping a particularly long function signature, you may be able to avoid the argument by reducing the length of the identifiers in your program.
1.1.1. The larger the identifier’s scope, the larger its name.
Sometimes people criticise the Go style for recommending short variable names. As Rob Pike said, "Go programmers want the right length identifiers". [1]
The length of the identifier should be proportional to the distance between its declaration and use.
Andrew Gerrand suggests using longer identifiers to indicate to the reader things of higher importance.
From this we can draw some guidelines:
Short variable names work well when the distance between their declaration and last use is short. Short functions can have short identifiers.
Long variable names need to justify themselves; the longer they are the more value they need to provide. Lengthy bureaucratic names carry a low amount of signal compared to their weight on the page. Long functions shouldn’t have short parameter names as they will be declared a long way from where they are used.
Don’t include the name of your type in the name of your variable. It’s needless ceremony; constantly repeating the type of the variable does not make it more type safe.
Constants should describe the value they hold, not how that value is used.
Prefer single words for method, interface, and package identifiers.
Prefer single letter variables for loops and branches, single words for parameters and return values, multiple words for functions and package level declarations. It’s ok to use a shorter name in a short block inside a larger function.
Remember that the name of a package is part of the name the caller uses to refer to it, so make use of that.
Unless you’re embedding, the name of the field should describe its purpose, not its content.
Globals of all kinds deserve longer identifiers than locally scoped ones.
Let’s look at an example:
In this example, the range variable p is declared on line 10 and only referenced once, on the following line. p lives for a very short time on the page and in limited scope during the execution of the function. A reader who is interested in the effect values of p have on the program need only read the loop’s three lines.
By comparison people is declared in the function parameters, is live for the body of the function, and is referenced three times over seven lines. The same is true for sum and count, thus they justify their longer names. The reader has to scan a wider number of lines to locate them so they are given more distinctive names.
I could have chosen s for sum and c (or possibly n) for count but this would have reduced all the variables in the program to the same level of importance. I could have chosen p instead of people but that would have left the problem of what to call the for ... range iteration variable, and the singular person would look odd, as the loop iteration variable which lives for little time would have a longer name than the slice of values it was derived from.
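The naming choices described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical Person type and AverageAge function; neither name is confirmed by the text, and line positions will not match the numbers quoted above.

```go
package main

import "fmt"

// Person is a hypothetical record type used only for this sketch.
type Person struct {
	Name string
	Age  int
}

// AverageAge returns the average age of people, or 0 for an empty slice.
func AverageAge(people []Person) int {
	if len(people) == 0 {
		return 0
	}

	// count and sum live for the rest of the function, so they get
	// descriptive names.
	var count, sum int
	// p lives for three lines, so a single letter is enough.
	for _, p := range people {
		sum += p.Age
		count++
	}

	return sum / count
}

func main() {
	people := []Person{{Name: "Ann", Age: 30}, {Name: "Bob", Age: 40}}
	fmt.Println(AverageAge(people))
}
```

The short name p is only acceptable here because its declaration and its single use sit on adjacent lines.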
1.1.2. Context is key.
It’s important to recognise that most advice on naming is contextual. I like to say it is a guideline, not a rule.
What is the difference between two identifiers, i and index? We cannot say conclusively that one is better than another in all situations. For example, is
fundamentally more readable than.
I argue it is not, because it is likely the scope of i, and index for that matter, is limited to the body of the for loop and the extra verbosity of the latter adds little to comprehension of the program. A loop nested within a loop is inherently harder to comprehend regardless of the name of the loop induction variable.
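To make the comparison concrete, here is a small sketch of the two loop styles side by side. The helper names sumIndex and sumI are hypothetical and exist only so that both loops can be compiled and compared.

```go
package main

import "fmt"

// sumIndex uses the longer loop variable name.
func sumIndex(s []int) int {
	total := 0
	for index := 0; index < len(s); index++ {
		total += s[index]
	}
	return total
}

// sumI is identical apart from the name of the loop variable.
func sumI(s []int) int {
	total := 0
	for i := 0; i < len(s); i++ {
		total += s[i]
	}
	return total
}

func main() {
	s := []int{1, 2, 3}
	fmt.Println(sumIndex(s), sumI(s))
}
```

Both loop variables are confined to three lines, so the longer name buys the reader very little.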
However, which of these functions is more readable?
In this example, oid is an abbreviation for SNMP Object ID, so shortening it to o would mean programmers have to translate from the common notation that they read in documentation to the shorter notation in your code. Similarly, reducing index to i obscures what i stands for; in SNMP messages a sub value of each OID is called an Index.
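A sketch of the more descriptive signature might look like the following. The SNMP type, its values field, and the body of Fetch are assumptions made up for illustration; only the parameter names oid and index come from the discussion above.

```go
package main

import (
	"errors"
	"fmt"
)

// SNMP is a hypothetical client. values stands in for data that a real
// client would fetch from a remote agent.
type SNMP struct {
	values map[string][]int
}

// Fetch returns the value at position index under the object identified
// by oid. The parameter names match the vocabulary of SNMP documentation.
func (s *SNMP) Fetch(oid []int, index int) (int, error) {
	row, ok := s.values[fmt.Sprint(oid)]
	if !ok || index < 0 || index >= len(row) {
		return 0, errors.New("snmp: no such object")
	}
	return row[index], nil
}

func main() {
	s := &SNMP{values: map[string][]int{fmt.Sprint([]int{1, 3, 6}): {10, 20}}}
	v, err := s.Fetch([]int{1, 3, 6}, 1)
	fmt.Println(v, err)
}
```

Had the parameters been called o and i, a reader of the signature alone would have to guess what each []int and int represents.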
1.1.3. Use a predictable naming style.
Another property of a good name is it should be predictable. The reader should be able to predict the use of a name when they encounter it for the first time. When they encounter a common name, they should be able to assume it has not changed meanings since the last time they saw it. You could say that a good name should feel familiar.
For example, if your code passes around a database handle, then rather than a combination of d *sql.DB, dbase *sql.DB, DB *sql.DB, and database *sql.DB, consolidate on something like db *sql.DB and use it consistently across parameters, return values, local declarations, and potentially receivers. Doing so promotes familiarity; if you see a db, you know it’s a *sql.DB and that it has either been declared locally or provided for you by the caller.
Similar advice applies to method receivers; use the same receiver name for every method on that type. This makes it easier for the reader to internalise the use of the receiver across the methods of that type which may, occasionally, be defined across multiple files.
Finally, certain single letter variables have traditionally been associated with specific use cases. For example;
i, j, and k are commonly the loop induction variable for simple for loops. [2]
n is commonly associated with a counter or accumulator.
v is a common shorthand for a value in a generic encoding function, k is commonly used for the key of a map.
a and b are generic names for parameters comparing two variables of the same type.
x and y are generic names for local variables created for comparison, and s is often used as shorthand for parameters of type string.
Functions or methods that begin with Print traditionally take strings, or things that can be converted to strings, and print them as text.
The f suffix, as in Printf and Logf, indicates that the function takes a format string and a variable number of arguments to format according to that string.
Functions or methods that begin with Write traditionally take non-string values and write them out as binary data.
Collection variables (maps, slices, and arrays) should be pluralised.
As with the db example above, programmers expect i to be a loop induction variable. If you ensure that i is always a loop variable, and is not used in other contexts outside a for loop, then when readers encounter a variable called i, or j, they know that a loop is close by.
1.2. A variable’s name should describe its contents.
You should avoid naming your variables after their types for the same reason you don’t name your pets "dog" and "cat". You shouldn’t include the name of your type in your variable’s name for the same reason.
Consider this example:
What’s good about this declaration? We can see that it’s a map, and it has something to do with the *User type, that’s probably good. But usersMap is a map, and Go being a statically typed language won’t let us accidentally use it where a different type is required. The Map suffix is redundant from the point of view of the compiler. Hence the utility of the suffix is entirely down to whether we can prove it is of use to the reader.
Now, consider what happens if we were to declare other variables:
Now we have three map type variables in scope, usersMap, companiesMap, and productsMap, all mapping strings to different types. We know they are maps; it’s right there in their declaration. We also know that their map declarations prevent us from using one in place of another; the compiler will throw an error if we try to use companiesMap where code was expecting a map[string]*User. In this situation it’s clear that the Map suffix does not improve the clarity of the code, it’s just extra boilerplate to type.
Removing the suffix leaves us with the more concise and equally descriptive:
If we remove the suffix denoting the variable’s type from its name, usersMap becomes users, which is descriptive, but userMap would become user, which is misleading.
If users isn’t descriptive enough, then usersMap won’t be either.
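The before and after can be sketched like this. User, Company, and Product are placeholder types, and the describe helper is hypothetical; it exists only to show that the unsuffixed names read naturally at the point of use.

```go
package main

import "fmt"

// User, Company, and Product are placeholder types for this sketch.
type User struct{ Name string }
type Company struct{ Name string }
type Product struct{ Name string }

// describe reports how many entries each collection holds. The parameter
// types already say "map", so the names do not need to.
func describe(users map[string]*User, companies map[string]*Company, products map[string]*Product) string {
	return fmt.Sprintf("%d users, %d companies, %d products",
		len(users), len(companies), len(products))
}

func main() {
	// With the suffix: the name repeats what the declaration already says.
	var usersMap map[string]*User

	// Without the suffix: equally descriptive, less to read.
	users := map[string]*User{"alice": {Name: "Alice"}}
	companies := map[string]*Company{"acme": {Name: "Acme"}}
	products := map[string]*Product{"anvil": {Name: "Anvil"}}

	// users = companies // would not compile: the map types differ

	fmt.Println(len(usersMap), describe(users, companies, products))
}
```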
My suggestion is to avoid any suffix that resembles the type of the variable. This advice also applies to function parameters. For example:
Naming the *Config parameter config is redundant. We know it’s a *Config, it says so right there.
In this case consider conf, or maybe c, if the lifetime of the variable is short enough. If there is more than one *Config in scope at any one time then calling them conf1 and conf2 is less descriptive than calling them original and updated, as the latter are less likely to be mistaken for one another.
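As a rough sketch of this advice, assuming a hypothetical Config type with two fields and a hypothetical Changed helper:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Config is a hypothetical configuration type.
type Config struct {
	Name string
	Port int
}

// WriteConfig writes conf to w in a simple key=value form.
// The parameter is conf, not config; its type already says *Config.
func WriteConfig(w io.Writer, conf *Config) error {
	_, err := fmt.Fprintf(w, "name=%s\nport=%d\n", conf.Name, conf.Port)
	return err
}

// Changed reports whether updated differs from original. Names that
// describe each value's role are harder to mix up than conf1 and conf2.
func Changed(original, updated *Config) bool {
	return *original != *updated
}

func main() {
	original := &Config{Name: "api", Port: 8080}
	updated := &Config{Name: "api", Port: 9090}

	var sb strings.Builder
	if err := WriteConfig(&sb, updated); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Print(sb.String())
	fmt.Println(Changed(original, updated))
}
```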
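The name of an imported identifier includes its package name. For example the Context type in the context package will be known as context.Context. This makes it awkward to use context as a variable or type name in your package: a package level declaration named context conflicts with the import in that file, and a local variable or parameter named context shadows the package for the rest of its scope.
Inside that scope any reference to the package, such as a call to context.WithTimeout, will not compile. This is why the local declaration for context.Context types is traditionally ctx.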
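A minimal sketch of the ctx convention, assuming hypothetical WriteLog and logOnce functions. Because the parameter is named ctx, the context package remains usable inside the function body; a parameter named context would shadow it there.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// WriteLog is a hypothetical function that gives up if ctx is cancelled.
func WriteLog(ctx context.Context, message string) error {
	// The package name context is still visible here because the
	// parameter is called ctx.
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	if err := ctx.Err(); err != nil {
		return err
	}
	fmt.Println(message)
	return nil
}

// logOnce is a small hypothetical helper that supplies a background context.
func logOnce(message string) error {
	return WriteLog(context.Background(), message)
}

func main() {
	if err := logOnce("hello"); err != nil {
		fmt.Println("error:", err)
	}
}
```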
The name of the variable should describe its contents, not the type of its contents.
1.3. Use a consistent declaration style.
Where I live we have three levels of government; local, state and federal. It is universally accepted that this is one too many, however consensus on which level to eliminate is lacking. In much the same way, Go has at least six different ways to declare a variable:
This list does not include receivers, formal parameters and named return values. I’m sure there are more that I haven’t thought of.
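For illustration, here is one way to count six forms. The listing is an assumption of this sketch rather than the author’s list; in particular the grouped var ( ... ) form is my own addition to reach six.

```go
package main

import "fmt"

// declarations declares six variables, each a different way, and returns
// their sum so that every one of them is used.
func declarations() int {
	var a int = 1 // type and initialiser
	var b = 1     // initialiser only, type inferred
	var c int     // type only, starts as the zero value ...
	c = 1         // ... and is assigned later
	var d = int(1) // initialiser with an explicit conversion
	e := 1         // short variable declaration
	var (
		f = 1 // grouped declaration
	)
	return a + b + c + d + e + f
}

func main() {
	fmt.Println(declarations())
}
```

All six compile to the same thing, which is exactly why a convention is needed to make the choice carry meaning.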
This is something that Go’s designers recognise was probably a mistake, but it’s too late to change it now, and, they argue, the bigger problem is shadowing. With all these different ways of declaring a variable, how do we avoid each Go programmer choosing their own style?
In Go each variable has a purpose because each variable we declare has to be used within the same scope. Because Go infers types, it is uncommon to both declare a type and initialise a value; you usually do one or the other: declare without initialising, or initialise without declaring the type. Here is a suggestion for how to make the purpose of each declaration clear to the reader. This is the style I try to use where possible.
When declaring a variable that will be explicitly initialised later, use the var keyword.
The var acts as a clue to say that this variable has been deliberately declared as the zero value of the indicated type. You should use this form only when declaring variables that you want to be explicitly initialised to the type’s zero value.
This is advice consistent with the requirement to declare variables at the package level using var as opposed to the short declaration syntax. However, I’ll argue later that you shouldn’t be using package level variables at all.
When declaring and initialising a variable (that is to say we’re not letting the variable be implicitly initialised to its zero value) I recommend using the short variable declaration form. For example;
The lack of var prefix is a signal that this variable has been explicitly initialised. This makes it clear to the reader that the variable on the left hand side of the := is being deliberately initialised from the expression on the right.
I’ve also found that avoiding declaring the type of the variable, and instead inferring it from the right hand side of the assignment, makes refactoring easier in the future.
To explain why, let’s look at the previous example, but this time deliberately initialising each variable:
In the first and third examples, because in Go there are no automatic conversions from one type to another, the type on the left hand side of the assignment operator must be identical to the type on the right hand side. The compiler can infer the type of the variable being declared from the type on the right hand side, so the example can be written more concisely like this:
This leaves us with explicitly initialising players to 0, which is redundant because 0 is the zero value of players. So it’s better to make it clear that we’re going to use the zero value by instead writing.
What about the second statement? We cannot elide the type and write.
Because nil does not have a type. [3] Instead we have a choice, do we want the zero value for a slice?
or do we want to create a slice with zero elements?
If we wanted the latter then this is not the zero value for a slice so we should make it clear to the reader that we’re making this choice by using the short declaration form:
Which tells the reader that we have chosen to initialise things explicitly.
This brings us to the third declaration,
Which is both explicitly initialising a variable and introduces the uncommon use of the new keyword which some Go programmers dislike. If we apply our short declaration syntax recommendation then the statement becomes.
Which makes it clear that thing is explicitly initialised to the result of the expression new(Thing) (a pointer to a Thing) but still leaves us with the unusual use of new. We could address this by using the compact literal struct initialiser form,
Which does the same as new(Thing), hence why some Go programmers are upset by the duplication. However this means we’re explicitly initialising thing with a pointer to the literal Thing{}, which itself is the zero value for a Thing.
Instead we should recognise that thing is being declared as its zero value and use the address-of operator to pass the address of thing to json.Unmarshal.
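Putting the recommendations of this section together gives something like the sketch below. Thing and its JSON field are hypothetical. Note that json.Unmarshal takes a []byte, so to decode from a reader this sketch uses json.NewDecoder instead; that substitution is mine.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Thing is a hypothetical type; its JSON field is an assumption.
type Thing struct {
	Name string `json:"name"`
}

// decodeThing declares thing as its zero value with var, then passes its
// address to the decoder to be filled in.
func decodeThing(input string) (Thing, error) {
	var thing Thing
	err := json.NewDecoder(strings.NewReader(input)).Decode(&thing)
	return thing, err
}

func main() {
	var players int    // zero value: 0
	var things []Thing // zero value: a nil slice
	empty := []Thing{} // explicitly initialised: empty but not nil

	thing, err := decodeThing(`{"name":"widget"}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(players, things == nil, empty == nil, thing.Name)
}
```

The var lines announce "zero value on purpose"; the := lines announce "I chose this value".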
Of course, with any rule of thumb, there are exceptions. For example, sometimes two variables are closely related so writing.
would look odd. The declaration may be more readable like this.
However, maybe in this case min and max are really constants, and should be written as:
The clue is that min can be substituted for its zero value whereas max cannot without explicit initialisation.
When something is complicated, it should look complicated.
Here length may be used with a library which requires a specific numeric type, and this form makes it more explicit that length is declared to be uint32 than the short declaration form:
In the first example I’m deliberately breaking my rule by using the var declaration form with an explicit initialiser. This decision to vary from my usual form is a clue to the reader that something unusual is happening.
Although, again, length may actually be a constant masquerading as a variable. The clue is the requirement to explicitly type the number 0x80 whereas if it were a constant it could be inferred from the calling context.
In small blocks of declarations this style may look mildly inconsistent; this is probably acceptable given the other advice in this chapter.
When declaring a variable without initialisation, use the var syntax.
When declaring and explicitly initialising a variable, use := .
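The exceptions discussed above can be sketched as follows. The clamp function and the variable size are hypothetical; min, max, and length follow the names used in the text. Declaring min and max at package level shadows the built-in functions of the same name in newer Go versions, which is legal but worth knowing.

```go
package main

import "fmt"

// min and max never change, so they are written as constants rather than
// as a var and a := sitting awkwardly next to each other.
const min, max = 0, 1000

// clamp limits v to the range [min, max].
func clamp(v int) int {
	if v < min {
		return min
	}
	if v > max {
		return max
	}
	return v
}

func main() {
	// Deliberately unusual: var with both a type and an initialiser,
	// signalling that the exact type matters here.
	var length uint32 = 0x80

	// The short declaration form says the same thing less conspicuously.
	size := uint32(0x80)

	fmt.Println(clamp(-5), clamp(2000), length == size)
}
```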
1.3.1. Compromise for consistency.
The goal of software engineering is to produce maintainable code. Therefore you will likely spend most of your career working on projects of which you are not the sole author. My advice in this situation is: follow the local style. For example, if functions in the package use short variable names throughout, do not make it inconsistent by adding one that is lengthy.
Changing styles in the middle of a file is jarring. Uniformity, even if it’s not your preferred approach, is more valuable for maintenance over the long run than your personal preference. The rule I try to follow is: if it fits through go fmt then it’s usually not worth holding up a code review for.
This advice could also be written, "when in Rome, do as the Romans do". For example, here is a short piece of code that violates the rule of using short declarations only for explicit initialisation.
however the overall effect is more harmonious. All four variables are declared and initialised in a block using a regular syntax, vs the inconsistent declaration and initialisation of the latter.
However there are a few places where Go programmers forgo this advice. For example, network connections are often called conn:
because a network connection is usually live for long enough to justify a name longer than c . Of course, if there is more than one connection in play, rather than calling them conn1 and conn2 , a name that describes their respective roles is better. For example:
1.4. Avoid conflicts with the names of common local variables.
The import statement declares a new identifier at the package level (technically the file level, but files which do not import the identifiers they need will not compile, so the distinction is mostly academic).
Consider the problems naming a package that deals with file descriptors, fd . fd would be the natural identifier for a file descriptor value returned by some hypothetical fd.Open function.
However don’t think up a convoluted package name just to retain the use of a convenient identifier.
FD is a bad name for a package because you want it for the variable; it’s also a bad name for a type, for the same reason.
Don’t let an import statement steal the name of a common identifier for the name of a package.
1.5. An identifier’s name includes the name of its package.
An identifier’s name includes its package name. This means you should think about the name of your types, symbols, etc with their qualified name, package.Symbol.
A symbol’s name always includes the name of its package. A symbol’s name never includes the "full path" of its package, so applicationserver/v2/cache is just cache. /apis/meta/v1 isn’t the v1 package of the meta package for the api, it’s just v1, and potentially conflicts with all the other v1 packages you imported.
If you find you have a lot of packages that have the same name, you place the user in the position of having to rename your packages on import. This is undesirable as the name that symbols inside the file use to refer to your package is not the same as the name in the package’s declaration. Moving code between files is more laborious as goimports won’t work for you, and when reading a symbol’s name you have to remember the name you renamed the package to because it conflicted on import.
1.5.1. Reduce repetition.
You may find this repetition comforting. In general gophers find it redundant. [4]
Don’t name your interface type fooInterface, it’s repetitive. The compiler knows it’s an interface, you don’t have to continually remind it. For the same reason, don’t call your interface Ifoo, because the I is shorthand for interface which still stutters, but adds to the cognitive load because you have to read the I as "interface".
A symbol’s name, to its caller, includes the name of the symbol’s package.
While there may be many Buffer implementations, in the scope of this file’s imports, there are unlikely to be multiple bytes packages. So the name is unambiguous.
Redundancy is everywhere. Here is another example. Consider these two function declarations:
The former is the name parameter of a virtualhost, the latter is a mnemonic for virtualhost which, if you know it, you wouldn’t need to look up.
The name of a variable or constant is orthogonal to its type. Just as prefixes and suffixes such as.
are more subtle methods of hiding a type in the name of a variable of that type.
1.5.2. Avoid Prefixes unless required.
With the exception of its use within its own package, every public Go symbol is prefixed with the name of its package. To pick a contrived example, nothing in the http package should start with HTTP (or a derivation thereof) because to the user of that package, everything already starts with http.
Having said that, there is a growing preference for function prefixes which modify the operation of the function. For example, Must is commonly associated with a wrapper function which panics if the function it wraps is not successful.
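The standard library follows this convention in regexp.MustCompile and template.Must. A minimal sketch of the same pattern, using a hypothetical ParsePort function and its Must wrapper:

```go
package main

import (
	"fmt"
	"strconv"
)

// ParsePort is a hypothetical function that converts s to a TCP port number.
func ParsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("parse port %q: out of range", s)
	}
	return port, nil
}

// MustParsePort is like ParsePort but panics if s is not a valid port.
// The Must prefix warns the caller about the panic.
func MustParsePort(s string) int {
	port, err := ParsePort(s)
	if err != nil {
		panic(err)
	}
	return port
}

func main() {
	fmt.Println(MustParsePort("8080"))
}
```

Must wrappers are generally reserved for values known at program start, such as package level variables, where failing loudly is preferable to carrying an error around.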
1.6. Use the smallest scope possible.
Scope and shadowing in Go are tightly linked. The former is considered to be a powerful tool in avoiding bugs, the latter, to the uninitiated, is a source of bugs. However the solution to both is adopting a consistent style which will highlight possible errors due to the irregularity.
The goal of scoping variables tightly is to turn the accidental use of a variable into a compile error. For example, compare:
The former restricts the scope of the binding err to the block from the start of the if statement to the closing brace. If this check was moved higher or lower in the function it will continue to compile without issue.
Compare this to the other example which requires that err had not been previously declared. If you move these four lines later in the function, it is possible that some other method expects err to be declared and will just be using assignment.
which will cause two compile errors.
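The two styles being compared can be sketched like this. The step, scoped, and wide functions are hypothetical names invented for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical operation that fails when ok is false.
func step(ok bool) error {
	if !ok {
		return errors.New("step failed")
	}
	return nil
}

// scoped keeps each err inside the if statement that checks it, so each
// check can be moved or copied without affecting the others.
func scoped(first, second bool) error {
	if err := step(first); err != nil {
		return fmt.Errorf("first: %w", err)
	}
	// err is not in scope here, so it cannot be used by accident.
	if err := step(second); err != nil {
		return fmt.Errorf("second: %w", err)
	}
	return nil
}

// wide declares err once for the whole function. The second check relies
// on that earlier declaration, so the order of the statements matters.
func wide(first, second bool) error {
	err := step(first)
	if err != nil {
		return fmt.Errorf("first: %w", err)
	}
	err = step(second) // plain assignment: needs the declaration above
	if err != nil {
		return fmt.Errorf("second: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(scoped(true, false))
	fmt.Println(wide(true, false))
}
```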
Scoping variables via conditional blocks is convenient, but can cause shadowing issues with named returns and nested blocks.
This is the fault of using named returns and nested blocks, but still, the author must be aware of the complications.
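One way this goes wrong is sketched below. The parse, shadowed, and fixed functions are hypothetical. In shadowed, the := inside the loop declares a new err that hides the named result, so the error is silently lost.

```go
package main

import (
	"errors"
	"fmt"
)

// parse is a hypothetical helper: it returns the length of s, or an
// error if s is empty.
func parse(s string) (int, error) {
	if s == "" {
		return 0, errors.New("empty input")
	}
	return len(s), nil
}

// shadowed has a bug: n, err := declares a new err in the loop body.
func shadowed(inputs []string) (total int, err error) {
	for _, s := range inputs {
		n, err := parse(s) // shadows the named result err
		if err != nil {
			break
		}
		total += n
	}
	return total, err // the outer err, which is still nil
}

// fixed assigns to the named result instead of redeclaring it.
func fixed(inputs []string) (total int, err error) {
	for _, s := range inputs {
		var n int
		n, err = parse(s)
		if err != nil {
			break
		}
		total += n
	}
	return total, err
}

func main() {
	inputs := []string{"ab", ""}
	fmt.Println(shadowed(inputs))
	fmt.Println(fixed(inputs))
}
```

Both functions compile, which is what makes this kind of mistake easy to miss in review.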
I suggest that the greater the distance between declaration and use, the more descriptive the name given to the declaration.
The corollary of this advice suggests that variables, and by extension functions, types, and even packages, should be arranged to avoid the creation of unnecessarily verbose names.
2. Commentary.
Before we move on to larger items I want to spend a little time talking about comments.
Comments are very important to the readability of a Go program. Each comment should do one, and only one, of three things:
The comment should explain what the thing does.
The comment should explain how the thing does what it does.
The comment should explain why the thing is the way it is.
The first form is ideal for commentary on public symbols:
The second form is ideal for commentary inside a method:
The third form, the why, is unique as it does not displace the first two, but at the same time it’s not a replacement for the what, or the how. The why style of commentary exists to explain the external factors that drove the code you read on the page. Those factors rarely make sense taken out of context. The why style comment exists to provide that context.
In this example it may not be immediately clear what the effect of setting HealthyPanicThreshold to zero percent will do. The comment is needed to clarify that the value of 0 will disable the panic threshold behaviour.
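The three forms can be seen together in the sketch below. Percent, LBConfig, NewLBConfig, and Healthy are hypothetical stand-ins invented for illustration; only the HealthyPanicThreshold field name comes from the text.

```go
package main

import "fmt"

// Percent and LBConfig are hypothetical stand-ins for a load balancer's
// configuration types.
type Percent struct{ Value float64 }

type LBConfig struct {
	HealthyPanicThreshold *Percent
}

// NewLBConfig returns a configuration with panic mode disabled.
// (A "what" comment: it says what the public symbol does.)
func NewLBConfig() *LBConfig {
	return &LBConfig{
		// A "why" comment: a threshold of zero percent disables the
		// panic threshold behaviour, which the bare 0 does not convey.
		HealthyPanicThreshold: &Percent{Value: 0},
	}
}

// Healthy returns the number of true values in checks.
func Healthy(checks []bool) int {
	// A "how" comment: count in a single pass over the slice.
	var n int
	for _, ok := range checks {
		if ok {
			n++
		}
	}
	return n
}

func main() {
	conf := NewLBConfig()
	fmt.Println(conf.HealthyPanicThreshold.Value, Healthy([]bool{true, false, true}))
}
```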
Comments such as these record hard won battles for understanding deep in the business logic. When you have the opportunity to write them, be sure to include enough hints that the next reader can follow your research. Links to issues, design documents, RFCs, or specifications that provide more background are always helpful.
Comments on a method or function should describe the purpose of the function and potentially its arguments. The comment should be updated when the arguments change or when the purpose of the function changes (in which case so will its name); both the name and the arguments directly follow the comment.
Comments inside a function or method should be directly followed by the line or block they are associated with. Again, when the block changes, the comments should be reviewed.
2.1. Always document public symbols.
Because godoc is the documentation for your package, you should always add a comment for every public symbol (variable, constant, function, and method) declared in your package.
Here are two rules from the Google Style guide:
Any public function that is not both obvious and short must be commented.
Any function in a library must be commented regardless of length or complexity.
There is one exception to this rule; you don’t need to document methods that implement an interface. Specifically don’t do this:
This comment says nothing. It doesn’t tell you what the method does, in fact it’s worse, it tells you to go look somewhere else for the documentation. In this situation I suggest removing the comment entirely (although some linters disagree).
Here is an example from the io package.
Note that the LimitedReader declaration is directly preceded by the function that uses it, and the declaration of LimitedReader.Read follows the declaration of LimitedReader itself. Even though LimitedReader.Read has no documentation itself, it’s clear from context that it is an implementation of io.Reader.
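A self-contained sketch of that layout follows. It is modelled on io.LimitedReader rather than copied from the standard library; the readAtMost helper and the explicit (non-naked) return are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// LimitReader returns a Reader that reads from r
// but stops with EOF after n bytes.
// The underlying implementation is a *LimitedReader.
func LimitReader(r io.Reader, n int64) io.Reader { return &LimitedReader{r, n} }

// A LimitedReader reads from R but limits the amount of
// data returned to just N bytes. Each call to Read
// updates N to reflect the new amount remaining.
type LimitedReader struct {
	R io.Reader // underlying reader
	N int64     // max bytes remaining
}

// No comment here: the signature shows this is the io.Reader implementation.
func (l *LimitedReader) Read(p []byte) (int, error) {
	if l.N <= 0 {
		return 0, io.EOF
	}
	if int64(len(p)) > l.N {
		p = p[:l.N]
	}
	n, err := l.R.Read(p)
	l.N -= int64(n)
	return n, err
}

// readAtMost is a hypothetical helper that reads at most n bytes of s.
func readAtMost(s string, n int64) (string, error) {
	b, err := io.ReadAll(LimitReader(strings.NewReader(s), n))
	return string(b), err
}

func main() {
	got, err := readAtMost("practical go", 9)
	fmt.Println(got, err) // practical <nil>
}
```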
2.2. Comments on variables and constants should describe their contents.
I stated earlier that the name of a variable, or a constant, should describe its purpose. When you add a comment to a variable or constant, that comment should describe the variable’s contents, not the variable’s purpose.
In this example the comment describes why RandomNumber is assigned the value six, and where the six was derived from. The comment does not describe where RandomNumber will be used. This is deliberate: RandomNumber may be used many times by any package that references it. It is not possible to keep a record of all those uses at the site where RandomNumber is declared. Instead the name of the constant should be a guide to its appropriate use for potential users.
Here are some more examples:
In general use the untyped constant 100 is just the number one hundred. In the context of HTTP the number 100 is known as StatusContinue, as defined in RFC 7231, section 6.2.1. The comment included with that declaration helps the reader understand why 100 has special significance as an HTTP response code.
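As a rough sketch, the constants discussed above might be declared like this. The status codes mirror the style used in net/http; the exact wording of the RandomNumber comment is an assumption.

```go
package main

import "fmt"

// The comment describes the contents (where the six came from),
// not where the constant will be used.
const RandomNumber = 6 // determined from an unbiased die

// Each comment records where the value is defined, not who uses it.
const (
	StatusContinue           = 100 // RFC 7231, 6.2.1
	StatusSwitchingProtocols = 101 // RFC 7231, 6.2.2
	StatusOK                 = 200 // RFC 7231, 6.3.1
)

func main() {
	fmt.Println(RandomNumber, StatusContinue, StatusSwitchingProtocols, StatusOK)
}
```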
For variables without an initial value, the comment should describe who is responsible for initialising this variable.
This example comes from deep in the bowels of the Go compiler. Here, the comment lets the reader know that the dowidth function is responsible for maintaining the state of sizeCalculationDisabled.
The fact that this advice runs contrary to the previous advice that comments should not describe who uses them is a hint that dowidth and sizeCalculationDisabled are intimately entwined. The comment’s presence suggests a possible design weakness.
This is a tip from Kate Gregory. [5] Sometimes you’ll find a better name for a variable hiding in a comment.
The comment was added by the author because registry doesn’t explain enough about its purpose. It’s a registry, but a registry of what?
By renaming the variable to sqlDrivers it’s now clear that the purpose of this variable is to hold SQL drivers.
Now the comment is redundant and can be removed.
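A minimal sketch of that rename follows. The Driver interface and fakeDriver type are hypothetical stand-ins so the example compiles on its own.

```go
package main

import "fmt"

// Driver is a hypothetical stand-in for a database driver type.
type Driver interface {
	Name() string
}

// Before: the comment carries the information the name lacks.
//
//	// registry of SQL drivers
//	var registry = make(map[string]Driver)

// After: the name says what the map holds, so no comment is needed.
var sqlDrivers = make(map[string]Driver)

// fakeDriver is a hypothetical driver used only to populate the map.
type fakeDriver struct{}

func (fakeDriver) Name() string { return "fake" }

func main() {
	sqlDrivers["fake"] = fakeDriver{}
	fmt.Println(len(sqlDrivers), sqlDrivers["fake"].Name()) // 1 fake
}
```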
This advice also applies to comments within a function.
2.3. Comments on functions and methods should describe their purpose.
The comment on a function signature should describe what the function intends to do, not how it does it. If the name of the function is all the description it needs, even better. Similarly, comments should describe the inputs and outputs of a function without being overly prescriptive about how those should be used. Rather than describe the type of the return value, the function’s comment should describe the value’s meaning.
The description should be sufficient to write a unit test for the documented behaviour.
2.4. Don’t comment bad code, rewrite it.
Comments highlighting the grossness of a particular piece of code are not sufficient. If you encounter one of these comments, you should raise an issue as a reminder to refactor it later. It is okay to live with technical debt, as long as the amount of debt is known.
The tradition in the standard library is to annotate a TODO style comment with the username of the person who noticed it.
The username is not a promise that that person has committed to fixing the issue, but they may be the best person to ask when the time comes to address it. Other projects annotate TODOs with a date or an issue number, which is a beneficial tradition.
2.5. Rather than commenting a block of code, refactor it.
Functions should do one thing only. If you find yourself commenting a piece of code because it is unrelated to the rest of the function, consider extracting it into a function of its own.
In addition to being easier to comprehend, smaller functions are easier to test in isolation. Once you’ve isolated the orthogonal code into its own function, its name may be all the documentation required.
3. Style.
This section deals with matters of style.
3.1. Minimize use of vertical whitespace.
gofmt solved so many unproductive style wars; indentation, alignment, and so on are a thing of the past. Vertical whitespace, however, is still an open question.
This quote from the Google C++ style guide is most apt:
This is more a principle than a rule: don’t use blank lines when you don’t have to. In particular, don’t put more than one or two blank lines between functions, resist starting functions with a blank line, don’t end functions with a blank line, and be sparing with your use of blank lines. A blank line within a block of code serves like a paragraph break in prose: visually separating two thoughts.
The basic principle is: The more code that fits on one screen, the easier it is to follow and understand the control flow of the program. Use whitespace purposefully to provide separation in that flow.
Some rules of thumb to help when blank lines may be useful:
Blank lines at the beginning or end of a function do not help readability.
Blank lines inside a chain of if-else blocks may well help readability.
A blank line before a comment line usually helps readability — the introduction of a new comment suggests the start of a new thought, and the blank line makes it clear that the comment goes with the following thing instead of the preceding.
Just as you begin each paragraph (each new thought) with a line break, do so with a new set of statements in a function. This allows you to understand each part of a function, each set of statements. Ideally each function only has one set of statements, so no padding is necessary.
3.2. Prefer shorter functions.
Each function should be written in terms of a single level of abstraction. Ideally a function should do one, and only one, thing.
This should place an upper limit on the length of a function, which is beneficial because, besides being harder to read, longer functions are more likely to mix more than one idea. The required disentanglement must then be performed by the reader.
3.3. Return early rather than nesting deeply.
Every time you indent you add another precondition to the programmer’s stack, consuming one of the 7 ±2 slots in their short term memory.
Go does not use exceptions for control flow thus there is no requirement to deeply indent your code to provide a top level structure for try and catch blocks. Rather than the successful path nesting deeper and deeper to the right, Go code is written in a style where the success path continues down the screen as the function progresses. Mat Ryer calls this practice 'line of sight' coding. [6]
This is achieved by using guard clauses: conditional blocks which assert preconditions upon entering a function. Here is an example from the bytes package.
Upon entering UnreadRune the state of b.lastRead is checked and if the previous operation was not ReadRune an error is returned immediately. From there the rest of the function proceeds with the assertion that b.lastRead is greater than opInvalid.
Compare this to the same function written without a guard clause.
The body of the successful case, the most commonly executed, is nested inside the first if condition, and the successful exit condition, return nil, has to be discovered by careful matching of closing braces. The final line of the function now returns an error, and the reader must trace the execution of the function back to the matching opening brace to know when control will reach this point.
This is more error prone for the reader and the maintenance programmer, which is why Go prefers guard clauses and returning early on errors.
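The two shapes can be compared side by side in a small sketch. The buffer type below is a simplified, hypothetical stand-in for bytes.Buffer, keeping only the fields the method touches.

```go
package main

import (
	"errors"
	"fmt"
)

// readOp records the size of the most recent rune read, if any.
type readOp int8

const opInvalid readOp = 0 // the last operation was not a ReadRune

// buffer is a simplified, hypothetical stand-in for bytes.Buffer.
type buffer struct {
	off      int    // read offset
	lastRead readOp // size of the last rune read, or opInvalid
}

var errUnreadRune = errors.New("UnreadRune: previous operation was not a successful ReadRune")

// UnreadRune uses a guard clause: the failure case returns first and
// the success path continues down the page.
func (b *buffer) UnreadRune() error {
	if b.lastRead <= opInvalid {
		return errUnreadRune
	}
	if b.off >= int(b.lastRead) {
		b.off -= int(b.lastRead)
	}
	b.lastRead = opInvalid
	return nil
}

// unreadRuneNested is the same logic without a guard clause; the
// success path is nested and the error return is at the bottom.
func (b *buffer) unreadRuneNested() error {
	if b.lastRead > opInvalid {
		if b.off >= int(b.lastRead) {
			b.off -= int(b.lastRead)
		}
		b.lastRead = opInvalid
		return nil
	}
	return errUnreadRune
}

func main() {
	b := &buffer{off: 3, lastRead: 2}
	err := b.UnreadRune()
	fmt.Println(err, b.off) // <nil> 1
	err = b.UnreadRune()
	fmt.Println(err != nil) // true
}
```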
3.4. Make the zero value useful.
Every variable declaration, assuming no explicit initialiser is provided, will be automatically initialised to a value that matches the contents of zeroed memory. This is the value’s zero value. The type of the value determines its zero value; for numeric types it is zero, for string types it is "", for pointer types, nil, and the same for slices, maps, and channels.
This property of always setting a value to a known default is important for safety and correctness of your program and can make your Go programs simpler and more compact. This is what Go programmers talk about when they say "give your structs a useful zero value".
Consider the sync.Mutex type. sync.Mutex contains two unexported integer fields, representing the variable’s internal state. Thanks to the zero value those fields will be set to 0 whenever a sync.Mutex is declared. sync.Mutex has been deliberately written to take advantage of this property, making the type usable without explicit initialisation.
Another example of a type with a useful zero value is bytes.Buffer. You can declare a bytes.Buffer and start writing to it without explicit initialisation.
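Both properties can be seen in a short sketch. The counter type is a hypothetical example of a struct that inherits a useful zero value from the sync.Mutex it contains.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// counter is a hypothetical type whose zero value is ready to use:
// the mutex starts unlocked and the count starts at zero.
type counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds one to the counter.
func (c *counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count.
func (c *counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	var c counter // no constructor or Init call required
	c.Inc()

	var b bytes.Buffer // the zero value is an empty buffer
	b.WriteString("Hello, zero value")

	fmt.Println(c.Value(), b.String()) // 1 Hello, zero value
}
```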
A useful property of slices is that their zero value is nil. This makes sense if we look at the runtime’s (pseudo) definition of a slice header.
The zero value of this struct would imply len and cap have the value 0, and array, the pointer to memory holding the contents of the slice’s backing array, would be nil. This means unless you need to specify a size you don’t need to explicitly make a slice, you can just declare it.
var s []string is similar to the two commented lines above it, but not identical. It is possible to detect the difference between a slice value that is nil and a slice value that has zero length.
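A minimal sketch of the three declarations and the detectable difference between them:

```go
package main

import "fmt"

func main() {
	// s := make([]string, 0)
	// s := []string{}
	var s []string // nil slice: nothing is allocated until the first append

	fmt.Println(s == nil, len(s), cap(s)) // true 0 0

	s = append(s, "Hello", "world")
	fmt.Println(len(s), s) // 2 [Hello world]

	// An empty, non-nil slice has the same length but is not equal to nil.
	empty := []string{}
	fmt.Println(empty == nil, len(empty)) // false 0
}
```

For most code the distinction does not matter, since len, cap, append, and range all treat a nil slice as an empty one.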
A useful, albeit surprising, property of uninitialised pointer variables (nil pointers) is that you can call methods on values that are nil. This can be used to provide default values simply.
I despise configuration structs, a type whose only purpose is to provide facts (and they must be facts, not variables) to another type. Instead, figure out how to make the original type configurable. This often means making its zero value usable.
It is hard to prevent a Go value from being created, so work to make the zero value safe.
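One way this might look is sketched below. The Config type and the default path are hypothetical.

```go
package main

import "fmt"

// Config is a hypothetical configuration type.
type Config struct {
	path string
}

// Path returns the configured path, or a default when c is nil.
func (c *Config) Path() string {
	if c == nil {
		return "/usr/home"
	}
	return c.path
}

func main() {
	var c1 *Config // nil: Path falls back to the default
	c2 := &Config{path: "/export"}
	fmt.Println(c1.Path(), c2.Path()) // /usr/home /export
}
```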
If your type has no safe zero value, ensure that nobody else can construct it unsafely.
If you have a public type whose fields have no valid zero value, then that type should in fact be private.
3.5. Methods on a T vs methods on a *T.
Without exception, everything in Go is a copy. Fundamental to the understanding of Go are the three following axioms of Go values;
Every variable in Go is a value.
Every assignment is a copy.
Every formal parameter and return value is a copy.
We also know that method calls are just syntactic sugar for calling a function and passing the receiver as the first parameter. So, what are the rules for how the receiver should be passed, as a value or a pointer to that value?
Use a pointer receiver when the caller will change the receiver. This could also be written as: use a pointer receiver when the method is stateful.
The inverse is also true, use a value receiver when the receiver is immutable. One of the few standard library examples of this is the time.Time type.
K&D [gopl] points out that if some of your methods have pointer receivers, all your methods should have pointer receivers. The same logic applies in reverse if you really want one method to have a value receiver; all the others must follow suit.
One argument is to always declare all methods on *T as it avoids copying and is thus faster. However Go developers are acutely attuned to this sort of absolutist thinking and tend to reject it without further proof. After all, we pass around slice and string values, both of which are several words in length without a care, so a blanket rule that every method must be declared on a pointer receiver for performance reads like dogma.
In the end I’m left with the unhappy compromise of;
In general, use pointer receivers.
If you use a pointer receiver on one method, the rest should follow suit.
In practical terms, pointer receivers should be your go-to unless you are working on a specific type where you want to exploit the copying behaviour of value receivers. Methods on values should be used sparingly, and with great consideration.
3.5.1. Avoid naming your method’s receiver this, or self.
Not many people know this, but method notation, i.e. v.Method(), is actually syntactic sugar and Go also understands the de-sugared version of it: (T).Method(v). Naming the receiver like any other parameter reflects quite well that it is, in fact, just another parameter.
This also implies that the receiver argument inside a method may be nil. This is not the case with receivers in other languages.
Convention dictates that the receiver of a method be named as if it were an argument; using this or self is not considered idiomatic.
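A small sketch makes the de-sugared form concrete. The Greeter type is hypothetical; the (*Greeter).Greet syntax is a Go method expression.

```go
package main

import "fmt"

// Greeter is a hypothetical type used to illustrate method expressions.
type Greeter struct{ name string }

// Greet names its receiver g, like any other parameter.
func (g *Greeter) Greet() string {
	if g == nil {
		return "hello, nobody"
	}
	return "hello, " + g.name
}

func main() {
	g := &Greeter{name: "gopher"}
	fmt.Println(g.Greet())             // sugared form
	fmt.Println((*Greeter).Greet(g))   // de-sugared: the receiver is the first argument
	fmt.Println((*Greeter).Greet(nil)) // a nil receiver is passed like any other argument
}
```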
3.6. Function vs methods.
When should something be a method on a type vs a free function? I recommend using methods where state is retained, functions where it is not.
If the state retained is related to this first argument, the method is placed on that type.
If the first argument is a concrete value, and this is a public function, consider making it a method on the first value, especially if there is only one other parameter.
If the function is private, avoid making it a method.
Public functions are the way to communicate across packages, and interfaces are the mechanism to define behaviour across packages.
Pure functions are easier to test than methods because methods live on types and are inherently impure; they contain state via the receiver. The opposite is also true: if a method never mutates its receiver, should it be a method?
If you prefer a function to a method, which lets you add methods to interfaces, continue to place the receiver in the first formal parameter. As you wrap and abstract, parameters should move to the left, from the formal parameters, to the receiver, to a type embedded in the receiver.
Here is a rule of thumb that may guide you in deciding to use a method or a function. Methods for what they do, functions for what they return.
3.7. Avoid named return values.
Named return values permit the function’s author to;
Increase separation between declaration and use, which runs contrary to the previous suggestion and decreases readability, especially when the function or method is long.
Increase the risk of shadowing.
Enable the use of naked returns.
Each of these is a net negative on the readability of the function.
Named returned arguments introduce a discontinuity in the declaration of variables.
Named returns move the declaration to an unexpected location.
Named returns force you to declare all return parameters, or worse declare them _ .
In short, named return values are a symptom of a clever piece of code which should be reviewed with suspicion. If the method is in fact simple, then named return values are playing the short game of brevity over readability.
It’s my opinion that named return arguments should not be used unless required to provide something that could not reasonably be done another way. For example, to modify the return arguments in a defer block, where it is required to name return arguments to capture them.
What is clear is that this function is complex, and named return values are part of that complexity.
All things being equal, you should aim to write simple code, not clever code, and so should avoid designs that require named return values.
There is nothing you can do with named return values that you cannot do with a few more lines of code. Avoid them if possible.
3.8. Avoid naked returns.
Naked returns combine the declaration of a return value in the function declaration with an unspecified assignment somewhere in the body of the function. Everything about the use of naked returns admits a set of actions that hides bugs, in even small functions.
Naked returns are inconsistent; they make it look like the function or method returns no values, when in fact it does, as they were declared in the function signature.
Naked returns are often used inconsistently, especially in an error path where nil is returned explicitly, or the zero value of a named return value is used. Combined with early returns this results in multiple, sometimes conflicting, return statements.
3.9. Avoid incomplete initialisation.
"Use struct literal initialization to avoid invalid intermediate state. Inline struct declarations where possible." (Peter Bourgon [bourgon2016])
Where possible values should be completely initialised by construction, rather than by convention. We see examples of this failure at the most basic levels, for instance when declaring a value then overriding the default zero initialisation.
On the first line, userCount is in scope, but misleadingly holds the value 0. Only after countUsers has been called is userCount valid.
The incomplete initialisation pattern tends to show up in more complicated declarations.
In this example vhost is incomplete, it has not yet had a set of routes appended to it. Compare this with the following.
In the revised version, the routes slice is populated fully, then assigned to the &VirtualHost literal, noting that this literal is never given a name so cannot appear partially initialised.
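A self-contained sketch of both patterns follows. The Route and VirtualHost types, the countUsers helper, and the newVirtualHost constructor are all hypothetical names chosen to match the prose.

```go
package main

import "fmt"

// Route and VirtualHost are hypothetical types for illustration.
type Route struct{ Prefix string }

type VirtualHost struct {
	Name   string
	Routes []Route
}

// countUsers is a hypothetical helper.
func countUsers() int { return 3 }

// newVirtualHost populates the routes first, then constructs the
// value in a single literal, so no partially initialised VirtualHost
// is ever visible under a name.
func newVirtualHost(name string, prefixes []string) *VirtualHost {
	routes := make([]Route, 0, len(prefixes))
	for _, p := range prefixes {
		routes = append(routes, Route{Prefix: p})
	}
	return &VirtualHost{
		Name:   name,
		Routes: routes,
	}
}

func main() {
	// Avoid:
	//   var userCount int        // in scope, but misleadingly 0
	//   userCount = countUsers() // only valid from here on
	userCount := countUsers() // declared and initialised in one step

	vh := newVirtualHost("example.com", []string{"/", "/api"})
	fmt.Println(userCount, vh.Name, len(vh.Routes)) // 3 example.com 2
}
```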
Specifically avoid public Init functions.
How do you know if they’ve been called already?
What happens if they are called twice? Someone might try to use them to clean an object from sync.Pool or otherwise recycle it.
3.10. Avoid finalisation.
Go contains a finalisation facility that lets the programmer register a function to be run when no live references to the object remain. Finalisation’s siren song of garbage collection for non memory resources can be beguiling. Inherently the idea is sound; in some programs it can be difficult to identify the owner of a resource like a file, a lock, or a socket. Do not use it.
At one point in the Go runtime’s development there were serious discussions about making finalisation a noop; functions registered for finalisation would simply be ignored. Fortunately cooler heads prevailed, but had they not this would not have been a violation of the specification or the Go 1 guarantee. Runtime finalisation does not guarantee timely execution of finalisers; that can be delayed until after the program has exited, and still be compliant.
There is only one place in the entire standard library where finalisation is used; for variables of type *os.File. This use is at best a historical artifact. Given the serious problems with finalisation, and its near prohibition in the standard library, do not design your software to rely on timely finalisation.
Do not write programs whose correctness depends on finalisation; instead associate the lifetime of a resource with the lifetime of a goroutine.
4. Understanding nil.
nil is a curious beast. There is no nil type, nor can you alias another type to be nil; it is a predeclared identifier. nil can be assigned to a value, and values can be compared to nil, but nil cannot be compared to itself.
nil cannot be compared with itself for inequality.
nor with itself for equality.
however nil may appear on either side of a binary operation.
Given all these restrictions, nil sounds out of place in the orthogonal Go world. Why would such a concept exist? The answer is, while nil may appear inconsistent, it makes a lot of other interactions in Go simpler.
If you assign nil to a pointer the pointer will not point to anything.
If you assign nil to an interface, the interface will not contain anything.
If you assign nil to a slice, the slice will have zero len and zero cap and no longer point to an underlying array.
nil’s meaning, or its type, is fully determined by the static type of the variable it’s assigned to, for instance when you write a statement comparing a variable f of type *os.File with nil.
The rule of expressions dictates that all the operands in the expression must have the same type. We know the type of f, it is a *os.File, therefore we know that nil has been coerced from an ideal constant to an expression which evaluates to a value also of type *os.File.
Here is a more complicated example.
Again, the type of s is known, and as there are no conversions in the expression, the type of nil on the other side of the comparison must be the same, []string.
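The comparisons described in this section can be sketched as follows. The lines that do not compile are left as comments.

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// fmt.Println(nil != nil) // does not compile: nil cannot be compared with itself
	// fmt.Println(nil == nil) // does not compile either

	// nil takes its type from the other operand, here *os.File,
	// and may appear on either side of the comparison.
	var f *os.File
	fmt.Println(f == nil, nil == f) // true true

	// Here nil takes the type []string.
	var s []string
	fmt.Println(s == nil) // true
}
```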
4.1. Be wary of nil and interfaces.
As we saw above nil can be simple, or complicated, depending on how you reason about it. One area where nil is complicated, until you memorise the rule, is the dreaded typed nil.
prints false because the returned value has a type of os.Writer not nil .
If your method returns an interface type, be sure to always return nil explicitly. Assigning nil to a variable of a concrete type and returning that will convert it to a typed nil.
Always return an explicit nil, rather than a typed value containing nil.
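A minimal sketch of the typed nil problem, assuming io.Writer as the interface and *bytes.Buffer as the concrete type (both choices are for illustration only):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// typedNil returns a nil *bytes.Buffer wrapped in an io.Writer.
// The interface value carries a concrete type, so it is not equal to nil.
func typedNil() io.Writer {
	var buf *bytes.Buffer // nil pointer of a concrete type
	return buf
}

// explicitNil returns a nil interface value.
func explicitNil() io.Writer {
	return nil
}

func main() {
	fmt.Println(typedNil() == nil)    // false
	fmt.Println(explicitNil() == nil) // true
}
```

An interface value is only equal to nil when both its type and its value are unset, which is why the first comparison is false.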
4.2. Never use nil to indicate failure.
Go’s inclusion of nil tends to upset people who come from a Java, C#, or C++ background because they are traumatised by how nil operated in those languages.
In other languages, especially those that don’t support multiple return values, it is extremely common to return a nil-like value when a failure happens inside the method. On one hand this is eminently sensible: exceptions are overused, and most failures are hardly unexpected, so some mechanism of representing a failure that doesn’t warrant the four alarm fire of an exception is called for. Obviously this has some major downsides. As the flow of execution is not redirected to a catch block, this nil (or null, not naming any names) sentinel value now represents a silent failure condition.
Fortunately Go does not suffer from these drawbacks. This is for two main reasons:
Multiple return values don’t require the author of a function to overload the single return value with an error state.
A general prohibition on passing nil into and out of functions.
Never use nil to indicate a failure, only to indicate the absence of an error.
4.3. A nil receiver is a programming error.
When Go programmers discover that you can call a method on a nil receiver it generally blows their mind.
b is of type *Bar but is nil.
This is because a method in Go is just syntactic sugar for a function whose first parameter is the receiver.
Restated like this it is clear to see why passing a nil value for b is uneventful. However, a problem arises when the code attempts to access b or one of its fields.
a panic occurs here.
Faced with this realisation Go programmers are gripped with fear that someone could call their code’s methods on an accidental nil receiver. Their usual reaction is a creeping panic that they will have to pepper their code with nil checks like this to protect against this scenario.
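The behaviour described so far can be sketched as below. The Bar type matches the prose; the Message method and the panics helper are hypothetical additions that let the example demonstrate the panic without crashing.

```go
package main

import "fmt"

// Bar is a hypothetical type with one field.
type Bar struct{ message string }

// Whoops does not touch b, so calling it on a nil *Bar is uneventful.
func (b *Bar) Whoops() string { return "whoops" }

// Message dereferences b, so calling it on a nil *Bar panics.
func (b *Bar) Message() string { return b.message }

// panics reports whether calling f panics. The result is named so
// that the deferred function can set it after recovering.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	var b *Bar // b is of type *Bar but is nil

	fmt.Println(b.Whoops())       // whoops
	fmt.Println((*Bar).Whoops(b)) // whoops: the de-sugared form

	// Accessing a field through the nil receiver is where the panic occurs.
	fmt.Println(panics(func() { _ = b.Message() })) // true
}
```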
only execute the body of the method if b is not nil.
The solution is to realise that the check for a nil receiver before attempting the call is in the right place.
only call the method if b is not nil.
Rather than checking inside the method when it is too late, the check should be executed by the caller. But this is seen as unsatisfactory because it forces the check to every call site, rather than in one place, the receiver.
For argument’s sake let’s explore the options to effectively handle a nil receiver inside the method. What are the options for an author to handle this situation?
Given that calling a method on a nil receiver, except where the method was written to explicitly handle this behaviour (there are types in the stdlib that do this, but not many), is an unrecoverable programming error, a reasonable response would be to panic.
But given that dereferencing b to access the message field is going to panic anyway, apart from having control over the panic message, this seems to add little other than boilerplate.
The next option for the reporting problem may be to treat this coding error like any other non fatal error and return an error value.
return a descriptive error to the caller.
This has serious implications for the caller of any method.
Every method will have to return an error. Every. Method.
Every caller will have to check the error after a call to any method.
Every interface you define will have to include an error parameter so that an implementation can report it was called on a nil receiver.
Every interface you implement will have to provide you with an error return parameter.
We saw that option above, Whoops would print nothing if it was called on a nil receiver. This is perhaps the worst choice as now the operation will silently do nothing. Imagine trying to debug a complex failure in your application because some logic did not fire because it was passed a nil receiver?
Given there is no reasonable way for a method executed on a nil receiver to protect against this, the remaining option is to simply not worry about it. After all a nil receiver is a symptom of a bug that happened elsewhere in your code. The most likely cause was a failure to check the error from a previous call. That is the place where you should spend your efforts, not defensively trying to code around a failure to follow proper error handling.
Don’t check for a nil receiver. Employ high test coverage and vet/lint tools to spot unhandled error conditions resulting in nil receivers.
5. Interfaces.
Interfaces describe behaviour, types describe data. Interfaces are the key strategy for polymorphism and information hiding in Go.
But wait, aren’t interfaces types? Technically yes, but for the purpose of brevity, even though an interface is declared with the type keyword we’ll say that interface types are disjoint from the set of other types.
5.1. T vs *T for interfaces.
The method sets of a value of type T and type *T are different: the method set of *T includes the methods declared with receiver T, but not the reverse.
You may convert a value of T to *T with the address-of operator, &. Conversely values of *T can be converted to values of T with the dereference operator.
Sadly Go retains the syntactic difficulty between the *T declaration and the *T dereference, although the former is a type, and the latter is a value of the result of the expression *T .
Although the method sets of T and *T are different, the compiler will help you by automatically inserting the relevant conversion operation, assuming the value is addressable.
This allows the caller to operate as if the methods available on T and *T are a union, almost all of the time (save where addressability is not present, as with map elements).
However, when a value is assigned to an interface type this illusion is broken.
Dereference is easy; however address-of, converting a T into a *T, would take the address of the copy of the value stored in the interface’s value slot, not the original value before assignment.
Said another way, because everything in Go is a copy, there is no way to "wrap" an interface around an existing value; a copy must be taken. If the value is not a pointer type, then it is a copy of the original value that is placed inside the interface.
If your type implements an interface and has methods on its pointer (which almost all types do), then you should always use a pointer value when assigning to the interface.
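A short sketch of where the illusion holds and where it breaks follows. The Speaker interface and Dog type are hypothetical.

```go
package main

import "fmt"

// Speaker is a hypothetical interface.
type Speaker interface{ Speak() string }

// Dog is a hypothetical type with a pointer-receiver method.
type Dog struct{ name string }

func (d *Dog) Speak() string { return d.name + " says woof" }

func main() {
	d := Dog{name: "Rex"}
	fmt.Println(d.Speak()) // ok: d is addressable, so this is (&d).Speak()

	// var s Speaker = d // does not compile: Speak has a pointer receiver
	var s Speaker = &d // ok: Speak is in the method set of *Dog

	d.name = "Max"
	fmt.Println(s.Speak()) // Max says woof: s holds a pointer to d, not a copy

	// Map elements are not addressable, so the implicit & is unavailable:
	//   dogs := map[string]Dog{"a": {name: "Ace"}}
	//   dogs["a"].Speak() // does not compile
}
```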
6. API Design.
All of the suggestions I’ve made so far are just that, suggestions. These are the way I try to write Go, but I’m not going to push them too hard in code review.
However when it comes to reviewing APIs during code review, I am less forgiving. This is because everything I’ve talked about so far can be fixed without breaking backward compatibility. They are, for the most part, implementation details.
When it comes to the public API of a package, it pays to put considerable thought into the initial design, because changing that design later is going to be disruptive for people who are already using your API. Changing your public API forces the existing user base to dedicate engineering resources to upgrading across your API break. The larger the break, the more likely this task will be considered low impact, but high risk, and likely to be pushed off in light of other business priorities.
Go is a language designed for collaboration and composition. There is a strong delineation between what happens inside a package, privately, and what is exposed to callers of the package, publicly. What is the API of your package? The functions and the methods, obviously, but also:
The formal params.
The return values.
The methods on your interfaces.
The fields of your structure, including their order.
The errors, their contents and their types.
If it’s exported, it’s part of your public API.
6.1. Design APIs that are hard to misuse.
If you take away one thing from this section, it should be this advice from Josh Bloch. If an API is hard to use for simple things, then every invocation will look complicated. When the actual invocation of the API is complicated it will be less obvious and more likely to be overlooked.
Strive to make your APIs difficult to misuse by design.
6.2. Design APIs for their default use case.
The takeaway: libraries define concrete types, helpers, and free functions; clients define the behaviour they want with their own interfaces.
Good API design should make the default behaviour trivial to implement and non-trivial behaviour possible to implement. Each public function that takes an argument must have an obvious and defensible default behaviour.
A few years ago I gave a talk [7] about using functional options [8] to make APIs easier to use for their default case.
The gist of this talk was you should design your APIs for the common, or default, use case. Said another way, your API should not require the caller to provide parameters which they don’t care about. More than being hard to use, you place the user in the position of guessing reasonable values. If they are lucky, the values they YOLO’d have no impact. That’s if they are lucky.
If there is one unusual case, add a second constructor.
If there is more than one, consider a functional option set.
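As a rough sketch of the functional options idea, consider the following. The Server type, its fields, its default values, and the option names are all hypothetical; only the pattern itself comes from the talk referenced above.

```go
package main

import (
	"fmt"
	"time"
)

// Server is a hypothetical type configured via functional options.
type Server struct {
	addr    string
	timeout time.Duration
	maxConn int
}

// Option configures a Server.
type Option func(*Server)

// WithTimeout overrides the default timeout.
func WithTimeout(d time.Duration) Option {
	return func(s *Server) { s.timeout = d }
}

// WithMaxConn overrides the default connection limit.
func WithMaxConn(n int) Option {
	return func(s *Server) { s.maxConn = n }
}

// NewServer returns a Server with sensible defaults, then applies
// any options the caller chose to supply.
func NewServer(addr string, opts ...Option) *Server {
	s := &Server{addr: addr, timeout: 30 * time.Second, maxConn: 100}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	def := NewServer(":8080") // the default case needs no extra arguments
	custom := NewServer(":8080", WithTimeout(5*time.Second), WithMaxConn(10))

	fmt.Println(def.timeout, def.maxConn)       // 30s 100
	fmt.Println(custom.timeout, custom.maxConn) // 5s 10
}
```

The caller of the default case never has to guess a value for a parameter they don’t care about, and new options can be added later without breaking existing call sites.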
6.2.1.