Golang Mistakes: #2 Misusing Init() Functions

As Go developers, we find many built-in features handy in day-to-day coding. But we sometimes forget that using those features too extensively, or misusing them, can lead to pitfalls. Let's discuss one built-in feature of Go that can become a pitfall when misused: the init() function.

Let's first refresh our minds about what an init() function is, with a 101 class here. Then we can discuss when we should and shouldn't use it.

If anyone wants to learn more about this mistake and others please read the book 100 Go Mistakes and How to Avoid Them. I learned a lot from that book.

101 of init() function

An init function is a function used to initialize application state before the main function runs. It takes no arguments and returns no result. When a package is initialized, all of its constant and variable declarations are evaluated first. Then its init functions are executed.

Note: We can define more than one init function in a single package.

Here is an example of initializing the main package:

package main

import "fmt"

var a = func() int {
    fmt.Println("variables & consts first")
    return 0
}()

func init() {
    fmt.Println("init evaluated second")
}

func main() {
    fmt.Println("finally main")
}

After running the above code we will see the below results in the terminal.

// result
variables & consts first
init evaluated second
finally main

We can see that variables and constants are evaluated first, then the init function, and finally the main function.

Let's define two packages to understand the init function execution order more clearly. Suppose we have two packages, main and cache. The main package depends on the cache package, so it imports the cache package. Both packages have an init function.

Cache package

package cache

type ICache interface{
    Store(key, value interface{}) error
} 

type client struct {
    // redis for example
}

var myCacheClient client

// init will initialize the cache client & set it to a global variable
func init() {
    // ...
}

// NewCacheClient returns the package-level client set up by init
func NewCacheClient() ICache {
    return &myCacheClient
}

// Store will save the key & value in the cache client
func (c *client) Store(key, value interface{}) error {
    // use cache client store function to persist the key value
    return nil
}

Note: the above code is not a good practice, we all should avoid this pitfall. We'll explain why later in this post.

Main package

package main

import "cache"

// init will initialize the other dependent packages
func init() {
    // code
}

// main gets the cache client and stores a key & value in it
func main() {
    cacheClient := cache.NewCacheClient()
    err := cacheClient.Store("hello", "world")
    if err != nil {
        return
    }
}

As the example shows, both packages have an init function, so which one runs first? Because the main package depends on the cache package, the cache package's init function is executed first, followed by the init function of the main package, and then the main function itself.

Above we mentioned that we can define more than one init function per package. When init functions are spread across several files of a package, their execution order follows the order in which the files are presented to the compiler, which with the standard go tool is the alphabetical order of the file names. For example, if a package contains an a.go file and a b.go file and both have an init function, the a.go's init function is executed first.

Figure 1: Init Functions Execution Order

The init function of the cache package is executed first, then the init function of the main, and finally the main function.

We shouldn't rely on the ordering of init functions based on the alphabetical order of file names within a package. Doing so is dangerous because source files can be renamed by anyone, potentially changing the execution order. We can also define multiple init functions within the same source file. For example, this code is perfectly valid:

package main

import "fmt"

func init() {
    fmt.Println("init 1")
}

func init() {
    fmt.Println("init 2")
}

func main() {
    fmt.Println("main")
}

The output shows the order of execution:

init 1
init 2
main

So when should we use init functions? One use case is importing a package for its side effects. Many third-party packages rely on this. For example, Viper, a very popular configuration management package, uses this feature.

Init as Side Effects

Sometimes a package doesn't use another package's exported identifiers directly, but that other package must still be initialized before ours can work. In that case, we import the other package purely for its side effects. Let's look at a code example for better understanding.

package main

import _ "queue"

func main() {

}

Here we use the blank identifier (_) to import the queue package for its side effects. This means queue is initialized even though the main package doesn't use it directly. If the queue package has an init function, it is executed before the main package's init function.
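Since queue is only a placeholder package, here is a minimal runnable sketch of the same idea using the standard library instead. It assumes net/http/pprof as the side-effect package, because that package registers its HTTP handlers on http.DefaultServeMux from its own init function. The helper name pprofRegistered is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"

	// Imported only for its side effect: the package's init function
	// registers the /debug/pprof/ handlers on http.DefaultServeMux.
	_ "net/http/pprof"
)

// pprofRegistered is a hypothetical helper for this sketch. It asks the
// default mux which pattern would serve a pprof URL. An empty pattern
// means no handler was registered for that path.
func pprofRegistered() bool {
	req := httptest.NewRequest(http.MethodGet, "/debug/pprof/cmdline", nil)
	_, pattern := http.DefaultServeMux.Handler(req)
	return pattern != ""
}

func main() {
	// main never calls anything from net/http/pprof directly,
	// yet the handlers are already in place when main starts.
	fmt.Println("pprof handlers registered:", pprofRegistered())
}
```

If the blank import is removed, nothing registers those routes and the lookup should come back with an empty pattern.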

Invoking Init function

In Go, we're not allowed to invoke the init function directly. Let's check the code block below:

package main

func init() {
    // some code
}

func main() {
    init()
}

If we try to run the above code, the Go compiler will throw a compilation error saying that init is undefined (undefined: init).

That covers the 101 of init functions in Go: how they work and in which order they are invoked.
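If the same setup logic has to run again later, one common workaround is to move it into an ordinary named function and have init call it. The sketch below assumes a hypothetical helper called loadDefaults and a hypothetical package-level variable greeting. It is one possible arrangement, not the only one.

```go
package main

import "fmt"

// greeting is a hypothetical piece of package state used for illustration.
var greeting string

// loadDefaults is a hypothetical helper that holds the actual setup logic.
// Unlike init, it is a regular function and can be called explicitly.
func loadDefaults() {
	greeting = "hello"
}

func init() {
	// init only delegates to the named function.
	loadDefaults()
}

func main() {
	greeting = "changed"
	// Calling init() here would not compile, but calling loadDefaults() is fine.
	loadDefaults()
	fmt.Println(greeting) // prints "hello"
}
```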

When not to use init function

Creating a database connection pool

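package store // hypothetical package name for this excerpt

import (
    "database/sql"
    "log"
    "os"

    _ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

var db *sql.DB

func init() {
    dataSourceName := os.Getenv("MYSQL_DATA_SOURCE_NAME")
    d, err := sql.Open("mysql", dataSourceName)
    if err != nil {
        log.Panic(err)
    }
    err = d.Ping()
    if err != nil {
        log.Panic(err)
    }
    db = d
}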

In the above example, we're using an init function to initialize a MySQL connection pool and assign the DB instance to a global variable. As a reviewer of this code, what drawbacks would you point out?

Let's list down the drawbacks first

  1. Limited error management
  2. Testing
  3. Assigning to a global variable

Let's go through each one briefly.

1. Limited error management

As we already know, the init function doesn't return an error. So the only way to signal an error is to panic, which stops the application. In this example, that may not seem too severe, since the application would be useless without a database connection. Even so, the decision to panic shouldn't necessarily be made by the package itself. Perhaps a caller would have preferred implementing a retry or using a fallback mechanism. Opening the database within an init function prevents client packages from implementing their own error-handling logic. In that sense, the init function limits our error management.

2. Testing

If we add tests to this file, the init function will be executed before running the test cases, which isn’t necessarily what we want (for example, if we add unit tests on a utility function that doesn’t require this connection to be created). Therefore, the init function in the above example complicates writing unit tests.
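To make this concrete, here is a minimal sketch in which a hypothetical flag setupRan stands in for the expensive connection setup, and a hypothetical utility function double needs no setup at all. The point is that init has already run by the time any other code in the package is reached, whether that code needs it or not.

```go
package main

import "fmt"

// setupRan is a hypothetical stand-in for an expensive side effect,
// such as opening a database connection.
var setupRan bool

func init() {
	// This runs before main, and also before any test in this package.
	setupRan = true
}

// double is a utility function that doesn't depend on the setup.
func double(n int) int {
	return n * 2
}

func main() {
	// Only the utility function is used, but init has run regardless.
	fmt.Println(double(21), setupRan) // prints "42 true"
}
```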

3. Assigning to a global variable

The above example requires assigning the database connection pool to a global variable. Global variables have some severe drawbacks. For example, any function within the package can alter a global variable, and unit tests become more complicated because a function that depends on a global variable is no longer isolated. In most cases, we should favor encapsulating a variable rather than keeping it global. For these reasons, the previous initialization should probably be handled as part of a plain old function like so:

func NewDbClient(dsn string) (*sql.DB, error) {
    sqlDb, err := sql.Open("mysql", dsn)
    if err != nil {
        return nil, err
    }
    if err = sqlDb.Ping(); err != nil {
        return nil, err
    }
    sqlDb.SetConnMaxLifetime(15 * time.Second)
    sqlDb.SetMaxIdleConns(10)
    sqlDb.SetMaxOpenConns(10)     

    return sqlDb, nil
}

Once the caller receives the error, if any, it decides what to do with it, which gives the caller more flexibility over error handling. With the new NewDbClient function, the drawbacks mentioned above are all addressed. Here's how:

  1. The responsibility of error handling is left up to the caller.
  2. It’s possible to create an integration test to check that this function works.
  3. The connection pool is encapsulated within the function.

Having said that, one might conclude that we should never use init functions, right?

Wrong. There are cases where init functions make our lives much easier. For example, the official Go blog uses an init function to set up the static HTTP routes:

package main

import (
    "net/http"
    "strings"
    "time"

    "golang.org/x/tools/blog"
    "golang.org/x/website/content/static"

    _ "golang.org/x/tools/playground"
)

const hostname = "blog.golang.org" // default hostname for blog server

var config = blog.Config{
    Hostname:     hostname,
    BaseURL:      "https://" + hostname,
    GodocURL:     "https://golang.org",
    HomeArticles: 5,  // articles to display on the home page
    FeedArticles: 10, // articles to include in Atom and JSON feeds
    PlayEnabled:  true,
    FeedTitle:    "The Go Programming Language Blog",
}

func init() {
    // Redirect "/blog/" to "/", because the menu bar link is to "/blog/"
    // but we're serving from the root.
    redirect := func(w http.ResponseWriter, r *http.Request) {
        http.Redirect(w, r, "/", http.StatusFound)
    }
    http.HandleFunc("/blog", redirect)
    http.HandleFunc("/blog/", redirect)

    // Keep these static file handlers in sync with app.yaml.
    static := http.FileServer(http.Dir("static"))
    http.Handle("/favicon.ico", static)
    http.Handle("/fonts.css", static)
    http.Handle("/fonts/", static)

    http.Handle("/lib/godoc/", http.StripPrefix("/lib/godoc/", http.HandlerFunc(staticHandler)))
}

func staticHandler(w http.ResponseWriter, r *http.Request) {
    name := r.URL.Path
    b, ok := static.Files[name]
    if !ok {
        http.NotFound(w, r)
        return
    }
    http.ServeContent(w, r, name, time.Time{}, strings.NewReader(b))
}

In this Go blog example, the init function cannot fail (http.HandleFunc and http.Handle panic only in cases such as a nil handler or a pattern that is registered twice, neither of which applies here). Meanwhile, there's no need to create any global variables, and the function will not impact possible unit tests. Therefore, this code snippet provides a good example of where init functions can be helpful.

In summary, we saw that init functions can lead to some issues:

  1. They can limit error management.
  2. They can complicate how to implement tests (for example, an external dependency must be set up, which may not be necessary for the scope of unit tests).
  3. If the initialization requires us to set a state, that has to be done through global variables.

We should be cautious with init functions. They can be helpful in some situations, however, such as defining static configuration, as we saw in the above section. Otherwise, and in most cases, we should handle initializations through ad hoc functions.