Understanding Golang Project Structure

Go is an interesting programming language, with a nice focus on keeping it minimal. In that sense, the classic Go workspace setup seems like a bit of an anomaly, requiring seemingly complex project structure. It sort of makes sense, once you understand it. But first you need to understand it. That is what this post aims at.

I start with defining the elements of a classic Go workspace setup. In recent Golang versions, the Go Modules are a new way to set up a workspace, which I find can be used to simplify it a bit. However, even with the modules, the classic structure applies. Although it is not strictly necessary to use the classic structure, it is useful to understand.

Go Packages

The base of Go project structure is the concept of a Go package. Coming from years of Java indoctrination, with Python sprinkled on top, I found the Go package nuances quite confusing.

Package Path vs Name

A Go package reference consists of two parts: a package path and a package name. The package path is used to import the code, the package name to refer to it. The package name is typically the last element of the import path, but that is not a strict requirement. Example:

package main

import "github.com/mukatee/helloexample/hellopkg"

func main() {
	hellopkg.Hello()
}

The above code declares that it belongs to a package named main. Let's say this code is in a file in the directory github.com/mukatee/helloexample. Go only allows files from a single package in a single directory, so with this definition all Go files in the directory github.com/mukatee/helloexample would need to declare themselves as package main. The package name main is special in Go, as it marks the entry point of a binary program.

The above code also imports a package in the path github.com/mukatee/helloexample/hellopkg. This is the package path being imported. In the classic Go project structure, the package path practically refers to a directory structure. The typical package name for this import would be hellopkg, matching the last part of the package path used for the import.

For example, consider that the path github.com/mukatee/helloexample/hellopkg contains a file named somefile.go, so the full path is github.com/mukatee/helloexample/hellopkg/somefile.go. It contains the following code:

package hellopkg

import "fmt"

func Hello() {
	fmt.Println("hello from hellopkg.Hello")
}

The code from this package is referenced in this case (in the main.go file above) as hellopkg.Hello(), or more generally as packagename.functionname(). In this example it is quite understandable, since the package name matches the final element of the package path (hellopkg). However, it is possible to make this much more confusing. Consider if everything else was as before, but the code in somefile.go were the following:

package anotherpkg

import "fmt"

func Hello() {
	fmt.Println("hello from anotherpkg.Hello")
}

To run code from this package, now named anotherpkg, we would instead need to have the main.go contain the following:

package main

import "github.com/mukatee/helloexample/hellopkg"

func main() {
	anotherpkg.Hello()
}

This is how you write yourself some job security. There is no way to know where anotherpkg comes from by reading the above code alone. This is why it is strongly recommended to keep the last package path element the same as the package name. Of course, the standard main package needed to run a Go program is an immediate anomaly, but let's not go there.
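To see the mismatch without setting up a workspace, here is a minimal sketch that reads the package clause of a source file using the standard go/parser package and compares it with the last element of the directory path. The helper name declaredPackageName and the in-memory file contents are made up for this illustration. The file path is the hypothetical one from the example above.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"path"
)

// declaredPackageName parses only the package clause of a Go source file
// and returns the package name declared there.
func declaredPackageName(filename, src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.PackageClauseOnly)
	if err != nil {
		return "", err
	}
	return f.Name.Name, nil
}

func main() {
	// Hypothetical location and contents, mirroring the example above.
	filename := "github.com/mukatee/helloexample/hellopkg/somefile.go"
	src := "package anotherpkg\n\nfunc Hello() {}\n"

	name, err := declaredPackageName(filename, src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// The directory says hellopkg, the source says anotherpkg.
	fmt.Println("last path element:", path.Base(path.Dir(filename)))
	fmt.Println("declared package name:", name)
}
```

The name comes from the package clause inside the file, not from the directory it sits in, which is exactly why the two can drift apart.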

Finally, you can set the package name explicitly when importing, regardless of the package name declared inside the files:

package main

import hlo "github.com/mukatee/helloexample/hellopkg"

func main() {
	hlo.Hello()
}

In the above code, hlo is the alias given to the package imported from the path github.com/mukatee/helloexample/hellopkg. After this aliased import, it is possible to refer to the code imported from this path as hlo.Hello() regardless of whether the package name given inside the files in the path is hellopkg, anotherpkg, or anything else. This is similar to how you might write import pandas as pd in Python.
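A case where aliasing is hard to avoid is when two import paths end in the same element. The sketch below uses two standard library packages, crypto/rand and math/rand, which both declare the package name rand. The aliases crand and mrand and the helper function names are just choices made for this example.

```go
package main

import (
	crand "crypto/rand" // package name is rand, referred to as crand here
	"fmt"
	mrand "math/rand" // package name is also rand, referred to as mrand here
)

// secureBytes returns n bytes read from the cryptographic random source.
func secureBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := crand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

// seededInt returns a reproducible pseudo-random number in [0, 100).
func seededInt(seed int64) int {
	return mrand.New(mrand.NewSource(seed)).Intn(100)
}

func main() {
	b, err := secureBytes(4)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Printf("crand bytes: %x\n", b)
	fmt.Println("mrand value:", seededInt(42))
}
```

Without the aliases, both imports would try to claim the name rand in the same file and the compiler would reject it.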

GOROOT

GOROOT refers to the directory path where the Go compiler and related tools are installed. It is a bit like JAVA_HOME for Java Virtual Machines.

I generally prefer to set this up myself, even though each system likely comes with some default. For example, last time I was setting up Go on a Linux machine, the Go toolkit was installable with the Snap package manager. However, Snap warns about all the system changes the Go installer would do. Something about Snap classic mode, and some other scary sounding stuff the install might do. To avoid this, I just downloaded the official Go package, extracted it, and linked it to the path.

Extraction, and symbolic linking (on Ubuntu 20.04):

cd ~
mkdir exampledir
cd exampledir
mkdir golang_1_15
cd golang_1_15
#assume the following go install package was downloaded from the official site already:
tar -xzvf ~/Downloads/go1.15.3.linux-amd64.tar.gz
cd ..
ln -s golang_1_15/go go

The above shell script would extract the Go binaries into the directory ~/exampledir/golang_1_15/go (the last go part comes from the archive). It also creates a symbolic link from ~/exampledir/go to ~/exampledir/golang_1_15/go. This is just so I can point the GOROOT to ~/exampledir/go, and just change the symbolic link if I want to try a different Go version and/or upgrade the Go binaries.
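To check where such a link really points, a small Go sketch can follow it with filepath.EvalSymlinks from the standard library. The code below rebuilds the same directory names under a temporary directory instead of touching the home directory. The function names resolveLink and demoLayout are invented for this illustration, and the sketch assumes a system where creating symbolic links is allowed (such as the Ubuntu setup above).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveLink follows symbolic links and returns the real path.
func resolveLink(link string) (string, error) {
	return filepath.EvalSymlinks(link)
}

// demoLayout recreates exampledir/golang_1_15/go plus the exampledir/go link
// in a temporary directory, and returns the resolved link and the real target.
func demoLayout() (string, string, error) {
	base, err := os.MkdirTemp("", "exampledir")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(base)

	// The temporary directory itself may sit behind a symlink, so resolve it.
	base, err = filepath.EvalSymlinks(base)
	if err != nil {
		return "", "", err
	}
	target := filepath.Join(base, "golang_1_15", "go")
	if err := os.MkdirAll(target, 0755); err != nil {
		return "", "", err
	}
	// Relative link target, like `ln -s golang_1_15/go go`.
	link := filepath.Join(base, "go")
	if err := os.Symlink(filepath.Join("golang_1_15", "go"), link); err != nil {
		return "", "", err
	}
	resolved, err := resolveLink(link)
	return resolved, target, err
}

func main() {
	resolved, target, err := demoLayout()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("link resolves to:", resolved)
	fmt.Println("real directory:  ", target)
}
```

Upgrading then means extracting a new version next to the old one and repointing the link, while GOROOT keeps pointing at the same path.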

The final step is to point GOROOT to the symbolic link, by adding the following to ~/.profile file, or whatever startup script your shell uses:

export GOROOT=/home/myusername/exampledir/go
export PATH=$PATH:$GOROOT/bin

After this, the go binaries are available in the shell, the libraries are found by the Go toolchain, and my Goland IDE found the Go toolchain on the path as well. Brilliant.

GOPATH

While GOROOT specifies the toolkit path, GOPATH specifies the Go workspace directory. GOPATH defaults to ~/go, and it is not strictly required to define it. When using the new Go modules, it is also easier to skip this definition, as the package path can be defined in the module definition. More on that later.

The part I find confusing for GOPATH is that the workspace directory it defines is not for a specific project, but intended for all Go projects. The actual project files are then to be put in a specific location in the GOPATH. It is possible to set up multiple workspaces, but this is generally not advised. Rather, it is suggested to use one, and put the different projects under specific subdirectories. That seems all the same, since even if you use multiple workspaces, you still have to put your projects in the same subdirectories. I will illustrate why I find it confusing with an example.

The workspace defined by GOPATH includes all the source code files, compiled binaries, etc. for all the projects in the workspace. The directory structure is generally described as:

|-bin
|-pkg
|-src
  |-github.com/mukatee/helloexample/hellopkg
    |-somefile.go
    |-somefile_test.go

These directories are:

  • bin: compiled and executable binaries go here
  • pkg: precompiled binary components, used to build the actual binaries for bin. It also serves as some sort of intermediate cache for the go get tool, and likely for other similar purposes. I generally ignore it, since it is mainly for the Go tools.
  • src: source code goes here
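As a small sketch of this layout, the following program lists the three directories for every workspace in the GOPATH that the toolchain would use. build.Default.GOPATH comes from the standard go/build package and includes the default value when the variable is unset. The function name workspaceDirs is made up for this example.

```go
package main

import (
	"fmt"
	"go/build"
	"path/filepath"
)

// workspaceDirs returns the bin, pkg and src directories of every
// workspace listed in a GOPATH value (entries are separated like PATH).
func workspaceDirs(gopath string) []string {
	var dirs []string
	for _, ws := range filepath.SplitList(gopath) {
		for _, sub := range []string{"bin", "pkg", "src"} {
			dirs = append(dirs, filepath.Join(ws, sub))
		}
	}
	return dirs
}

func main() {
	// Prints nothing if no GOPATH can be determined at all.
	for _, d := range workspaceDirs(build.Default.GOPATH) {
		fmt.Println(d)
	}
}
```

The directories are only what the layout expects. The program does not check whether they exist on disk.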

The above directory layout does not seem too confusing as such. However, if I look at what it means if I checkout code from multiple projects on github (or anywhere else) into the same workspace, the result is something like this:

|-bin
|-pkg
|-src
  |-github.com/mukatee/helloexample
    |-.git
    |-README.md
    |-main.go
    |-hellopkg
      |-somefile.go
      |-somefile_test.go
  |-github.com/randomuser/randompeertopeer
    |-.git
    |-README.md
    |-main.go
    |-netwrk
      |-node.go
      |-peernetwork.go

And in the above is my confusion. Instead of checking out my projects into the root of the workspace, I first need to create their package path directories under the workspace. Then clone the git repo in there (or copy project files if no git). In the above example, this workspace would have two projects under it, with the following package paths:

  • github.com/mukatee/helloexample
  • github.com/randomuser/randompeertopeer

To set this up, I would need to go to github, check the projects, figure out their package paths, create these paths as a folder structure under the workspace, and then, finally, clone the project under that directory. After this, Go will find it. There is also the go get tool to download and install specific packages from github (and likely elsewhere) into the correct place on the GOPATH. However, this does not clone the git repo for me, nor explain to me how my own code and repository should be placed there along with other projects. For that, I need to write posts like this so I get around to figuring it out 🙂
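The mapping from package path to directory in the steps above is mechanical, so it can be sketched in a few lines of Go. The function name sourceDir and the workspace root /home/myusername/go are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sourceDir returns where a package with the given import path is expected
// to live in a classic workspace: <GOPATH>/src/<import path>.
func sourceDir(gopath, importPath string) string {
	return filepath.Join(gopath, "src", filepath.FromSlash(importPath))
}

func main() {
	gopath := "/home/myusername/go" // hypothetical workspace root
	for _, p := range []string{
		"github.com/mukatee/helloexample",
		"github.com/randomuser/randompeertopeer",
	} {
		// This is the directory to create before cloning the repo into it.
		fmt.Println(sourceDir(gopath, p))
	}
}
```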

This workspace structure is especially confusing for me, since it seems to force all Go projects on Github to hold their main.go at the project root level along with everything else you put in your repo, including your documentation and whatever else. I find many Go projects also end up hosting pretty much all their code at the root level of the repo. This easily turns the top directory into one big mess when I try to look at the code.

Again, this is what I find to be really confusing about it all. With Go modules it is at least a bit more clear. But still, there is much legacy Go out there, and one has to be able to use and understand those as needed. And even for using Go modules I find I am much better off if I understand this general Go workspace setup and structure.

Go Modules

Go modules are really simple to set up. You just create a go.mod file in the root of your project directory. This file is also very simple in itself. Here is a basic example:

module github.com/mukatee/helloexample

go 1.15

Or if that is too much to type, the go mod init command can do it for you:

go mod init github.com/mukatee/helloexample

That creates the go.mod file for you, all 3 lines of it.

The above go.mod example defines the module path on the first line. The code itself can be in any directory. When using Go modules, it no longer matters whether it is on the GOPATH. The line go 1.15 defines the language version used for the project.
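To make the two directives concrete, here is a deliberately simplified sketch that pulls the module path and Go version out of go.mod contents. It only understands single-line module and go directives like the ones in the example above. The function name parseGoMod is made up for this illustration, and real tooling would more likely use the golang.org/x/mod/modfile package, which handles the full syntax.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseGoMod extracts the module path and Go version from go.mod contents.
// Simplification: only two-word "module <path>" and "go <version>" lines
// are recognized, everything else is ignored.
func parseGoMod(contents string) (modulePath, goVersion string) {
	sc := bufio.NewScanner(strings.NewReader(contents))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		switch fields[0] {
		case "module":
			modulePath = fields[1]
		case "go":
			goVersion = fields[1]
		}
	}
	return modulePath, goVersion
}

func main() {
	mod, ver := parseGoMod("module github.com/mukatee/helloexample\n\ngo 1.15\n")
	fmt.Println("module path:", mod)
	fmt.Println("go version:", ver)
}
```

The module path found here is the prefix for every package path in the project, so hellopkg is imported as the module path plus /hellopkg no matter where the directory sits on disk.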

The official documentation still recommends using the whole GOPATH workspace definition as before. However, even with GOPATH undefined everything works if go.mod is there:

$ go build -v -o hellobin
$ ./hellobin
hello from hellopkg.Hello

In the above, I am specifying the output file with the -o option. If GOPATH is not set, it will default to ~/go. Thus if I have the above project with the go.mod definition, and run the standard Go binary build command go install on it, it will generate the output binary file in ~/go/bin/helloexample. The -o option just defines a different output path in this case.
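The install location described above can be sketched as a simple path computation. This assumes a main package at the module root, a single-entry GOPATH, no GOBIN override, no version suffix at the end of the module path, and a non-Windows system (no .exe suffix). The function name installTarget and the home directory are made up for this example.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// installTarget estimates where `go install` places the binary:
// <GOPATH>/bin/<last element of the module path>.
func installTarget(gopath, modulePath string) string {
	return filepath.Join(gopath, "bin", path.Base(modulePath))
}

func main() {
	gopath := "/home/myusername/go" // hypothetical default GOPATH (~/go)
	fmt.Println(installTarget(gopath, "github.com/mukatee/helloexample"))
}
```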

Seems a bit overly complicated, when I am just used to slapping my projects into some random workdir as I please. But I guess having some standardized layout has its benefits. Just took me a moment to figure it out. Hope this helps someone. It certainly helped me look all of this up properly by writing this post.

Conclusions

This post describes the general layout of the classic Go project. While I would use the new module structure for projects whenever possible, I sometimes download projects from Github to play with, and I am sure many corporations have various legacy policies that require this classic architecture. There is a lot more information and nuance to it, but I encourage everyone to look it up themselves when they come to that bridge.

The package path vs package name system is something that really confused me coming from other programming languages. It is not too bad once you understand it. But most things get easy once you master them, eh. Achieving the mastery is the hard part. I cannot say if the Go project structure is confusing, or if I am just loaded with too much legacy from other environments (Java, Python, ..).

There is much good to Go in my opinion, and the modules system helps fix many of the general issues already. I have written a few small projects in Go, and look forward to trying some more. Sometimes the aim for simplicity can require some bloat in Go, and I am not sure if the project structure falls into that category. In any case, good luck working with Go, even if you don’t need it. I do 🙂
