Pointers are a fundamental part of Go. They let us work with memory and data structures at a low level, without needing to know the details of more abstract data structures.
This article covers Go pointers, their role in stack and heap allocation, the trade-off between immutability and efficiency, and pointer types (primitives and variables).
Everybody uses pointers occasionally, but how well do we really know them? What is taking place behind the scenes? In this post, we'll talk about pointers and how they can improve program performance at the expense of some safety. By the end, you should be able to explain how pointers fit into all of this, what happens when a function is called, and the difference between heap and stack allocations.
Let's start by defining a variable in Go; the idea carries over to programming in general. First, what is a variable? A variable is a container for storing a value. We can think of it as a box that has three things:
- a name
- a type
- a value
It is stored somewhere in memory.
It resembles putting a box in a warehouse in many ways. The value is in the box. We gave that box a name and a type as well. We added an address to that box as well. The location of the box inside the warehouse will be indicated by this address. Therefore, if we know the address, we can quickly locate and retrieve the box if we need it.
Let's see this in Go:
package main

import "fmt"

func main() {
	var foo, bar int = 23, 42
	fmt.Println(foo, bar)   // prints the values: 23 42
	fmt.Println(&foo, &bar) // prints the addresses
}
Easy, right? The first statement in main declares two variables, foo and bar, of type int with the values 23 and 42. The two Println calls then print their values and their addresses to the console.
Quick note: & can be read as "address of". Every variable is given an address, and with that address we can locate it in memory. If we assign the address to a pointer, as below, the address becomes the value of the pointer:
func main() {
var foo, bar int = 23, 42
p := &foo
q := &bar
fmt.Println(p, q) // will print the address of foo, bar
// 0xc00001c0a8 0xc00001c0b0
}
p and q hold the addresses of the foo and bar variables. Here we're using Go's short variable declaration (:=).
In the picture above, we can see how p holds the address of foo.
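As a small check of what p actually is, the sketch below prints the types of foo and p. The helper typeName is a made-up name for this illustration; it only wraps fmt's %T verb.

```go
package main

import "fmt"

// typeName is a hypothetical helper for this sketch: it returns the
// dynamic type of v as a string, using fmt's %T verb.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	foo := 23
	p := &foo

	fmt.Println(typeName(foo)) // int
	fmt.Println(typeName(p))   // *int
	fmt.Println(p == &foo)     // true: p holds the address of foo
}
```

The *int in the output is the pointer type: a pointer whose base type is int.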
func main() {
var foo int = 23
p := &foo
fmt.Println(*p)
// any guess
}
Printing *p prints the value stored at that address, which is 23, the value of foo we defined above.
Here * can be a little confusing at first, as it can be used in two ways:
- Before a type (*int)
- Before a variable (*p)
Before a type: in *int, the whole thing becomes a type. It is a pointer type with int as its base type.
Before a variable: in *p, the * acts as an operator that returns the value p is pointing to. That's why printing *p prints 23: it's the value of the variable p points to. This is also called dereferencing. So we can say that the value of p is the address of foo, and *p is the value at that address, which is the value of foo.
So what happens if we assign a new value to *p? Any guess?
func main() {
var foo int = 23
p := &foo
fmt.Println(*p) // 23
*p = 42
fmt.Println(*p)
// any guess
}
Yes, it'll be 42. Because p points to foo, foo itself is now 42 as well.
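To make the two roles of * concrete, here is a minimal sketch. The helper name setThroughPointer is made up for this illustration and is not part of the examples in this post.

```go
package main

import "fmt"

// setThroughPointer is a hypothetical helper for this sketch.
// In the parameter list, *int is a type: "pointer to int".
func setThroughPointer(p *int, v int) {
	// Here * is the dereference operator: store v in the int p points to.
	*p = v
}

func main() {
	foo := 23

	var p *int            // * before a type: p is declared as a pointer to int
	fmt.Println(p == nil) // true: the zero value of a pointer is nil

	p = &foo
	setThroughPointer(p, 42)

	fmt.Println(*p, foo) // 42 42: both expressions read the same memory
}
```

Note that dereferencing p while it is still nil would panic at run time, so a pointer should be given an address before *p is used.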
func main() {
var foo, bar int = 23, 3600
p := &foo
fmt.Println(*p) // 23
p = &bar
*p = *p / 36
fmt.Println(bar)
// any guess
}
Any guess what will happen to the value of the bar variable?
Quick note: we can assign &bar to p because p's type is *int and bar is an int. If bar had any other type, the assignment would be rejected with a compile-time type error, not a run-time error.
Because p now points to bar, the operation on *p modifies bar, so the program prints 100.
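The type rule from the quick note can be sketched as follows. The helper divideInPlace and the variable ratio are assumptions added for this illustration; the rejected assignment is left as a comment so that the program still compiles.

```go
package main

import "fmt"

// divideInPlace is a hypothetical helper: it divides the int that p
// points to by d and stores the result back at the same address.
func divideInPlace(p *int, d int) {
	*p = *p / d
}

func main() {
	bar := 3600
	ratio := 1.5 // a float64

	p := &bar // fine: &bar has type *int
	divideInPlace(p, 36)
	fmt.Println(bar) // 100

	// p = &ratio // does not compile: &ratio is a *float64, not a *int
	fmt.Println(ratio) // 1.5
}
```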
So why do we need pointers anyway?
Good question, right? If we just want to modify bar's value, we can just modify bar directly, right? So why bother?
Well, it's efficient to store a value in one place and access it from multiple places.
Let's understand this with an example.
Suppose we have four different functions, and all of them want to access bar and modify it, so bar will be modified in multiple places. Sharing one variable through pointers lets every function work on the same value, and for large values it is more efficient than giving each function its own local copy.
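A minimal sketch of that situation is shown below. The four function names and runAll are invented for this example; the only point is that every function receives the same address and therefore updates the same variable.

```go
package main

import "fmt"

// Four hypothetical functions that all update the same int through a pointer.
func addOne(p *int)   { *p += 1 }
func double(p *int)   { *p *= 2 }
func subThree(p *int) { *p -= 3 }
func square(p *int)   { *p *= *p }

// runAll applies the four functions, in order, to the value p points to.
func runAll(p *int) {
	addOne(p)
	double(p)
	subThree(p)
	square(p)
}

func main() {
	bar := 4
	runAll(&bar)
	fmt.Println(bar) // ((4+1)*2-3)^2 = 49
}
```

For a single int the copy would be cheap anyway; the saving matters more when the shared value is a large struct or array.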
To understand this more clearly, we first need to understand memory allocations.
Memory Allocations
When we run a Go program, a goroutine is created to run main, and each goroutine gets its own stack of memory.
You may ask, what is a goroutine?
What is a goroutine?
A goroutine is an independent path of execution. We can also think of it as a very lightweight thread that is managed by the Go runtime.
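For readers who have not used goroutines yet, here is a small sketch, assuming a made-up helper called sumInGoroutines. It starts one goroutine per value with the go keyword and waits for all of them with sync.WaitGroup; a sync.Mutex protects the shared total.

```go
package main

import (
	"fmt"
	"sync"
)

// sumInGoroutines is a hypothetical helper: it starts one goroutine per
// input value and adds the values into a shared total.
func sumInGoroutines(values []int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, v := range values {
		wg.Add(1)
		go func(n int) { // each goroutine runs on its own stack
			defer wg.Done()
			mu.Lock()
			total += n
			mu.Unlock()
		}(v)
	}
	wg.Wait() // block until every goroutine has finished
	return total
}

func main() {
	fmt.Println(sumInGoroutines([]int{1, 2, 3})) // 6
}
```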
Let's go back to the topic. Whenever a goroutine makes a function call, a part of its stack is allocated for that call; we call that part a frame. Let's see this in an example for a better understanding:
func main() {
a := 6
AddN(a)
}
// AddN will add n to the result and print its address and value
func AddN(n int) {
r := 0
r += n
fmt.Println(&r, r)
}
Here we define two functions, main and AddN. When we run the main function, it gets a frame on the stack. The frame of the currently running function is called the active frame.
Once main is running, it reaches the call to AddN. As soon as we call AddN, another frame is allocated on the stack, and the goroutine now operates only within that new frame. Through ordinary variables it cannot reach into other frames. This is advantageous because isolating each frame gives us a form of immutability: there is less chance that variables will be changed unexpectedly elsewhere in the program. So the common question arises: how can we access the variable a inside the active frame? The straightforward answer is that we can't. Instead, the value of a is copied into the new active frame, where it is called n. We can modify n, add to it, print it, and do whatever we want with it, but because we're making the changes inside the active frame, nothing outside this frame changes. The mutation only happens inside this isolated frame.
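The copy can be observed directly. In the sketch below, addTen is a hypothetical function (not the AddN from above) that changes its parameter; the caller's variable is unaffected.

```go
package main

import "fmt"

// addTen is a hypothetical function using value semantics:
// n is a copy of the caller's argument and lives in addTen's own frame.
func addTen(n int) int {
	n += 10 // changes only the local copy
	return n
}

func main() {
	a := 6
	r := addTen(a)
	fmt.Println(a, r) // 6 16: a in main's frame is untouched
}
```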
Can you guess the catch? We need to copy the arguments each time we make a function call, which is not always efficient. Also, when the AddN call ends and the active frame goes back to main, a will still be 6. But what if we want to change a itself in the main function, and get our hands on a rather than a copy of it? This is where pointers come in.
Let's write a new function that uses a pointer, so that it can modify the variable a in main by saying: go and change the value at that specific address.
func main() {
a := 6
squareAdd(&a)
}
// squareAdd squares the value p points to and prints the address and the new value
func squareAdd(p *int) {
*p *= *p
fmt.Println(p, *p)
}
We're passing an address instead of a plain integer, so the input parameter, which we call p, has the type *int (read "star int"). The star here is not the dereferencing operator we discussed above; *int as a whole is a single type. We want to square the value at that address, so we put a star in front of p to say "the value at p", which in this example is a. Then we print p, which is an address, and the value p points to, by writing *p.
So when we call this function, we need to pass in an address, not a value. What we pass is &a, because & means we're passing the address of a.
Let's see what happens on the stack when we call the squareAdd function. Instead of copying a, we copy the address of a and assign it to the pointer p in the new frame. That pointer points across the frame boundary, and this is how code in the currently active frame can modify a, which lives in main's frame, by using *p.
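One way to convince ourselves that p really points back into main's frame is to compare addresses. The function squareAt below is a made-up variant of squareAdd that returns the address it worked on.

```go
package main

import "fmt"

// squareAt squares the int p points to and returns the address it used.
// The name is made up for this sketch; it mirrors squareAdd above.
func squareAt(p *int) *int {
	*p *= *p
	return p
}

func main() {
	a := 6
	used := squareAt(&a)

	fmt.Println(a)          // 36
	fmt.Println(used == &a) // true: p held the address of main's a
}
```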
After the function call finishes, we return to main, and everything in the finished frame becomes invalid. If we make another function call, this space will be reused, and Go will initialize the variables of the new frame to their zero values so that we don't accidentally use garbage values.
We'll explain the garbage collector in detail in a future post. Stay tuned for that. Now let's continue...
When we use value semantics, as in the AddN example above, there's no way for a to be mutated by the callee. But when we use pointer semantics, we need to be careful, because there is more possibility for a variable to be mutated in a way we didn't intend.
When we use pointer semantics, we give up the safety of immutability for more efficiency. Now we understand how pointers work in functions and how they can affect variables on the stack. The final topic we need to understand is the heap. Let's talk about it.
Heaps
The heap discussed here is not the data structure we know from CS 101; it is a separate thing altogether: a region of memory. The heap has to be cleaned up by the garbage collector, whereas the stack is self-cleaning. To understand the difference between the heap and the stack, we will compare returning a value with returning a pointer. Let's define an example to see this more clearly...
Return value
package main

import "fmt"

type person struct {
	name string
	age  uint
}

// NewPerson builds a person and returns it by value (a copy).
func NewPerson() person {
	p := person{
		name: "dummy person",
		age:  60,
	}
	fmt.Println("new person --> ", p)
	return p
}

func main() {
	fmt.Println("main --> ", NewPerson())
}
Return Pointer
package main

import "fmt"

type person struct {
	name string
	age  uint
}

// NewPerson builds a person and returns a pointer to it.
func NewPerson() *person {
	p := person{
		name: "dummy person",
		age:  60,
	}
	fmt.Println("new person --> ", &p)
	return &p
}

func main() {
	fmt.Println("main --> ", NewPerson())
}
The two code blocks above are almost identical, with one exception: the first returns a value from the NewPerson function, and the second returns a pointer.
NewPerson initializes a person struct with dummy values and then returns it. After that, we call NewPerson from the main function and print the result.
What happens behind the scenes is this: the Go runtime makes the main function's frame the active frame on the stack. Then, when we call NewPerson, a new frame is created on the stack, p is allocated in it, and its fields are set. Because the NewPerson frame is isolated, we cannot hand p itself to the main function; instead, a copy of it is made and passed to main's frame. That's what happens when we return a value.
But instead of returning a value, let's return the address of p, as in the second example above. The function still works the same way as before, but instead of copying the value, it copies the address of p into the main function's frame. Here we can notice something important, and at the same time something weird: once NewPerson finishes executing, its frame becomes invalid, so the address we copied into main's frame would be useless; we wouldn't know what it points to in memory. That would be a huge problem, and this is where the heap helps us.
Note: this heap is not the same as the heap data structure studied in CS 101. They share the same name but are completely different things.
The compiler analyzes the code (this is called escape analysis), concludes that there would be a problem, and places p on the heap instead of the stack. NewPerson then returns the address of p on the heap, and on return that address is copied into the frame of the main function. So now we can still access p through that address.
In the example above, we print &p in NewPerson and the returned pointer in main to check that they refer to the same person. Note that fmt.Println prints a pointer to a struct as &{dummy person 60} rather than as a raw address; to see the address itself, use fmt.Printf with the %p verb. So our problem is solved, but at the cost of a heap allocation, which adds work for the garbage collector and can cost us performance.
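To compare the actual addresses, a variant of the pointer example could look like the sketch below. The lowercase name newPerson and the use of fmt.Printf with %p are choices made for this illustration, not part of the original example.

```go
package main

import "fmt"

type person struct {
	name string
	age  uint
}

// newPerson returns the address of a local variable. Because p outlives
// the call, the compiler's escape analysis allocates it on the heap.
func newPerson() *person {
	p := person{name: "dummy person", age: 60}
	fmt.Printf("newPerson --> %p\n", &p)
	return &p
}

func main() {
	p := newPerson()
	fmt.Printf("main      --> %p\n", p) // same address as printed in newPerson
	fmt.Println(p.name, p.age)          // dummy person 60
}
```

Building with go build -gcflags=-m asks the compiler to report its escape analysis decisions; for code like this it should report that p was moved to the heap.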
Conclusion
Go pointers can be a great way to gain efficiency in a codebase, but in doing so we have to think about the garbage collector, because values that escape to the heap give it more work to clean up. If we want immutability instead, we can use value semantics, which keeps data on the stack. The stack cleans itself: when a function finishes, its frame is simply discarded along with everything inside it, and the space is reused by the frame of the next function call. So we need to understand both the stack and the heap: the more things we put on the heap, the more the garbage collector has to free once we no longer use them, and that can affect performance.