Go's range clause (funcmain.com)
82 points by peterarmitage on June 22, 2013 | 33 comments


I wrote this after finding a bug in some of my code. Basically, I was iterating over a slice of inputs, processing each one. On occasion this processing would reveal new inputs to test, so I appended them to the slice.

    for i := range input {
        // process the current element; a non-nil result is a new input to test
        if newValue := test(input[i]); newValue != nil {
            input = append(input, newValue)
        }
    }
Of course, this doesn't work, and I figured out (and appreciated) why by reading Go's spec.

Hopefully this brief guide will be of use to someone.


First off, good write up, I'm sure others will find it useful.

To your specific issue, I thought it was good programming etiquette to never modify an object you're iterating over, regardless of how the language handles such a thing. Were my instructors too strict? Is this a common idiom in other environments?


Thank you.

"modify an object" means "change the number of elements", since you obviously want to be able to manipulate the individual elements of a container as you iterate over them. The object here is the container, not the elements themselves.

I can't comment on other languages, but I'd say that guideline is a little too strict for Go.

The classic implementation of Breadth First Search involves iterating over a queue as you fill it.

"don't modify the RHS while in a range clause" would be a more suitable guideline for Go. Note that it's subtly different from "iterating over" - indeed, the answer to my bug was to iterate without using the range clause:

    for i := 0; i < len(input); i++ {
        // len(input) is re-read each iteration, so appended inputs get visited
        if newValue := test(input[i]); newValue != nil {
            input = append(input, newValue)
        }
    }
I now appreciate the difference between this and the range clause - the length is evaluated every iteration this way. The range clause evaluates it once, at the beginning - rule (1).
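To make the difference concrete, here is a minimal sketch of both loops side by side. The `test` function and the int inputs are hypothetical stand-ins (the original code isn't shown in full); the only point is how many elements each loop visits.

```go
package main

import "fmt"

// test is a hypothetical stand-in for the processing step described above:
// small values "reveal" a new input, larger ones do not.
func test(v int) (int, bool) {
	if v < 10 {
		return v * 10, true
	}
	return 0, false
}

// visitWithRange counts the elements a range loop visits while the slice
// is appended to inside the loop body.
func visitWithRange(input []int) (int, []int) {
	visited := 0
	for i := range input { // range expression is evaluated once, up front
		visited++
		if nv, ok := test(input[i]); ok {
			input = append(input, nv)
		}
	}
	return visited, input
}

// visitWithIndex does the same with a three-clause loop, whose condition
// re-reads len(input) on every iteration.
func visitWithIndex(input []int) (int, []int) {
	visited := 0
	for i := 0; i < len(input); i++ {
		visited++
		if nv, ok := test(input[i]); ok {
			input = append(input, nv)
		}
	}
	return visited, input
}

func main() {
	n, out := visitWithRange([]int{1, 2, 3})
	fmt.Println(n, out) // 3 [1 2 3 10 20 30]
	n, out = visitWithIndex([]int{1, 2, 3})
	fmt.Println(n, out) // 6 [1 2 3 10 20 30]
}
```

Both loops end with the same slice, but only the index-based loop actually processes the appended elements.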


After thinking a bit about your examples it makes me appreciate the keyword: the range clause's strictness guarantees iterating only on a certain `range` (self-duh) hence why it's not just called `iterate`.

I found myself making simple mistakes by assuming that range reading on a synchronous channel would cause the goroutine that is sending to the channel to become active. Instead, I wanted to use a for-select statement or a buffered channel because a length guarantee couldn't be made (or so I assume).
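For reference, a small sketch of ranging over an unbuffered channel. There is no length to evaluate up front here: the loop simply receives until the channel is closed, so the sender has to be running in its own goroutine and has to close the channel when done. The `produce` and `collect` names are made up for this example.

```go
package main

import "fmt"

// produce sends n values on an unbuffered channel from its own goroutine
// and closes the channel afterwards so a range loop over it can finish.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i // blocks until the receiver is ready
		}
	}()
	return ch
}

// collect ranges over the channel until it is closed.
func collect(ch <-chan int) []int {
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(produce(3))) // [0 1 2]
}
```

If the sender never closes the channel, the range loop blocks forever once the values run out.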


The GP's example is equivalent to pushing to the tail of a queue while you're consuming its head, which is a relatively common (and safe) pattern. It looks like the Go designers made this pattern a bit harder to express, in favour of making the general case a bit harder to mess up.


> the Go designers made this pattern a bit harder to express

GP can just do an old-school `for` loop without a `range` clause (generally everyone learns about `for` before learning about `range`) and this immediately becomes incredibly easy to express.
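As an illustration of the queue pattern with a plain `for` loop, here is a rough breadth-first search sketch. The graph representation (a map of adjacency lists) and the node names are assumptions chosen for the example.

```go
package main

import "fmt"

// bfs returns the nodes reachable from start in breadth-first order.
// The queue is consumed by index while new nodes are appended to its tail.
func bfs(graph map[string][]string, start string) []string {
	seen := map[string]bool{start: true}
	queue := []string{start}
	for i := 0; i < len(queue); i++ { // len(queue) grows as we append
		for _, next := range graph[queue[i]] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return queue
}

func main() {
	graph := map[string][]string{
		"a": {"b", "c"},
		"b": {"d"},
		"c": {"d"},
	}
	fmt.Println(bfs(graph, "a")) // [a b c d]
}
```

The inner `range` is fine because `graph[queue[i]]` is not modified while it is being ranged over; only the outer loop needs to see the growing length.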


I'm just dabbling in Go now, and this was really helpful. There wasn't a ton of concise writing on this topic that I found. Thanks again.


Thanks for writing this, very informative. You say that adding to a channel's buffer to allow you to append to it while you're reading is a code smell. Would you say the same about just spinning off a goroutine to do the append? It seems to get around the problem pretty well, though it's pretty much the same thing. Just wondering what you think, because I've used this pattern before.
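One way the goroutine-per-send pattern might look is sketched below. This is not taken from the article; `drain` and the "values below 10 produce more work" rule are invented for the example. The `sync.WaitGroup` counts outstanding items so the channel can be closed once nothing is left to process.

```go
package main

import (
	"fmt"
	"sync"
)

// drain reads work items from an unbuffered channel and, for small values,
// feeds a follow-up item back in from a separate goroutine so the reader
// never blocks on its own send.
func drain(seed []int) []int {
	work := make(chan int)
	var pending sync.WaitGroup // items sent or about to be sent, not yet processed

	pending.Add(len(seed))
	go func() {
		for _, s := range seed {
			work <- s
		}
	}()
	go func() {
		pending.Wait() // every item has been processed
		close(work)    // lets the range loop below terminate
	}()

	var seen []int
	for v := range work {
		seen = append(seen, v)
		if v < 10 {
			pending.Add(1) // register the new item before finishing this one
			go func(n int) { work <- n }(v * 10)
		}
		pending.Done()
	}
	return seen
}

func main() {
	fmt.Println(len(drain([]int{1, 2, 3}))) // 6
}
```

The order in which items arrive is not deterministic, which is one reason a slice plus an index loop is often the simpler choice when no real concurrency is needed.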


Another gotcha with for loops: scoping is a little weird.

    xs := []int{1, 2, 3}
    for _, x := range xs {
        go func() {
            fmt.Println(x)
        }()
    }
That prints `3 3 3` not `1 2 3`. You can fix it like this:

    for i := range xs {
        x := xs[i]
        go func() {
            fmt.Println(x)
        }()
    }
Which seems like it ought to be the default behavior.


This seems to me to be the same class of error as those that aren't so much a Go thing as a very common gotcha for closures; you're placing x in the scope of the for loop and passing it into a series of anonymous functions, so it gets closed over and referenced as a common variable between each of those functions; thus before each goroutine has a chance to run the loop has completed and x == 3. You'll find this behaviour in any language with closures.

In the second example you're allocating a new local variable on each iteration, so each individual value gets closed over separately. That's probably not what you'd want usually, hence that not being default behaviour.
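The two cases can be reproduced without goroutines, which makes the result deterministic. In this sketch the shared variable is declared outside the loop on purpose, so it behaves the same on every Go version; note that Go 1.22 later changed `for ... :=` loops to create a fresh variable per iteration, so the original `3 3 3` example behaves differently on newer toolchains. The function names are made up for the example.

```go
package main

import "fmt"

// callAll invokes each closure and collects the results.
func callAll(funcs []func() int) []int {
	out := make([]int, 0, len(funcs))
	for _, f := range funcs {
		out = append(out, f())
	}
	return out
}

// sharedVariable closes over a single x that the loop keeps reassigning.
func sharedVariable(xs []int) []int {
	var funcs []func() int
	var x int             // declared once, outside the loop
	for _, x = range xs { // plain assignment: every closure sees the same x
		funcs = append(funcs, func() int { return x })
	}
	return callAll(funcs)
}

// perIterationCopy closes over a new variable on each iteration.
func perIterationCopy(xs []int) []int {
	var funcs []func() int
	for i := range xs {
		x := xs[i] // fresh variable per iteration
		funcs = append(funcs, func() int { return x })
	}
	return callAll(funcs)
}

func main() {
	fmt.Println(sharedVariable([]int{1, 2, 3}))   // [3 3 3]
	fmt.Println(perIterationCopy([]int{1, 2, 3})) // [1 2 3]
}
```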


Right. I'm just saying I think the scope ought to be different. The 'x' in the loop should be a new variable each time because it's not really 'a common variable between each of those functions'.

In the

    for i := 0; i < 10; i++ {}
case it's definitely more clear that i should be the same thing between iterations. (So you can tinker with i inside the loop) It just seems like they could've done something different for the 'range' for loop.

JavaScript is plagued with this same problem (though it's even worse because it doesn't even follow { } blocks)


After I wrote my reply I played around with some C# and found to my surprise that foreach does in fact provide a new variable to be captured on each iteration, so clearly this is a design decision that varies between languages.


Not all languages with closures work this way. In Objective-C blocks, the default is to capture locals by value, so this sort of error is less likely. In C++11 lambdas, you have to specify the capture type as well.

Capturing variables by value has both safety and performance benefits in a multithreaded world, and it's unfortunate that Go chose not to do that.


Not sure why you were downvoted! That's interesting. I can see why, this seems a rather common gotcha.


This really isn't a gotcha. This happens in every language with closures, except Java. People complain about Java's insistence on labeling every closed-over variable "final", but it sure does nip this problem in the bud.


I was curious if a simple fix for the bug would be wrapping a `defer` statement around the anonymous function. It seems only arguments are evaluated when deferring, not blocks like I had assumed.

http://play.golang.org/p/Ah9SQrSKw4
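A small sketch of that `defer` behaviour (not the code behind the playground link, which isn't reproduced here): arguments to a deferred call are evaluated at the `defer` statement, while the body of a deferred closure reads its captured variables only when it finally runs. `deferDemo` is a name invented for the example.

```go
package main

import "fmt"

// deferDemo records what each deferred call observes for x.
func deferDemo() (log []string) {
	x := 1
	// The argument x is evaluated here, so this call receives v == 1.
	defer func(v int) { log = append(log, fmt.Sprintf("arg=%d", v)) }(x)
	// The closure body runs at function exit and reads x at that point.
	defer func() { log = append(log, fmt.Sprintf("closure=%d", x)) }()
	x = 2
	return nil
}

func main() {
	// Deferred calls run in last-in, first-out order.
	fmt.Println(deferDemo()) // [closure=2 arg=1]
}
```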


Maybe something like the following would be more explicit:

    for _, x := range xs {
        // x is passed as an argument, so each goroutine gets its own copy
        go func(x int) {
            fmt.Println(x)
        }(x)
    }
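A fuller, runnable version of the same idea might look like this. The `squares` function and the use of `sync.WaitGroup` to wait for the goroutines are additions for the sake of a complete example; each goroutine writes to its own slot so the result is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// squares computes x*x for every element in its own goroutine, passing
// the loop values in as arguments instead of closing over them.
func squares(xs []int) []int {
	out := make([]int, len(xs))
	var wg sync.WaitGroup
	for i, x := range xs {
		wg.Add(1)
		go func(i, x int) { // parameters shadow the loop variables
			defer wg.Done()
			out[i] = x * x // each goroutine writes a distinct element
		}(i, x)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(squares([]int{1, 2, 3})) // [1 4 9]
}
```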


Yeah that kind of approach is typically how you resolve this issue, CoffeeScript even has a neat wrapper around (function([args...]) { ... })([args...]) of do (args...) -> ... to make this less painful.

Interesting how the Go syntax makes it easy to explicitly pass in arguments. That is very nice.


I dislike the fact that the value in range over a slice is a copy, rather than an alias, to the slice value.

It seems to me to be strictly less useful than the alternative (aliasing).

And I also don't really buy the argument that "it is a normal assignment and so has to copy", since:

    i, v := range s
isn't a normal assignment. It has special rules to do with looping. Having the additional rule that v aliases to the entry seems to me to be a full win (too late to change now I guess).


The reason it's right is that if it was done by aliasing, it would create a weird trap variable, an invisible pointer dereference which would modify something else.

    a := 1
    b := []int{2, 3, 4}
    for i, c := range b {
        // lots more code
        a = 5
        c = 6
        // now a is 5 and c is 6 as you'd expect
        // but also by magic, b[i] is 6
    }
    // now by magic, b is {6,6,6}


It's not just "too late to change now", it's also proper behavior.

Go code is meant to be concise:

- once a Go developer has learned that the above is a copy (which she will learn very early on), she won't ever forget, it's just too fundamental

- to modify the slice, just modify the slice and skip the copy with underscore:

    for i, _ := range mySlice { mySlice[i] = "foo" }
There we go. Syntax simplicity (without exotic compiler flags or prep directives) and conciseness fully preserved.


By the way, you don't need a _ assignment there. You can just do i := range mySlice.
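To put the copy semantics discussed in this subthread into runnable form, here is a short sketch. The function names are invented; the pointer variant is one way to get alias-like access when you want it.

```go
package main

import "fmt"

// doubleViaCopy assigns to the range value, which is a per-iteration copy,
// so the slice itself is left untouched. It returns the last local value.
func doubleViaCopy(s []int) int {
	last := 0
	for _, v := range s {
		v *= 2 // changes only the local copy
		last = v
	}
	return last
}

// doubleViaIndex writes through the index, which does modify the slice.
func doubleViaIndex(s []int) {
	for i := range s {
		s[i] *= 2
	}
}

// doubleViaPointer takes the element's address to get an explicit alias.
func doubleViaPointer(s []int) {
	for i := range s {
		p := &s[i]
		*p *= 2
	}
}

func main() {
	s := []int{1, 2, 3}
	fmt.Println(doubleViaCopy(s), s) // 6 [1 2 3]
	doubleViaIndex(s)
	fmt.Println(s) // [2 4 6]
	doubleViaPointer(s)
	fmt.Println(s) // [4 8 12]
}
```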


Which I still do constantly, then get a compile error when what I thought was something else is actually an int.


That's a good reason to always include both parts of the LHS, because if that "something else" actually is an int, then you'll have a nasty bug on your hands (and not a compiler-catchable one at that).
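A sketch of the silent version of this bug, assuming a slice of ints: with a single variable on the left, `range` yields indices, and since those are also ints the compiler has nothing to complain about.

```go
package main

import "fmt"

// sumIndices looks like it sums the elements, but a single range variable
// receives the index, so it actually sums 0, 1, 2, ...
func sumIndices(xs []int) int {
	total := 0
	for v := range xs {
		total += v
	}
	return total
}

// sumValues names both variables and discards the index explicitly.
func sumValues(xs []int) int {
	total := 0
	for _, v := range xs {
		total += v
	}
	return total
}

func main() {
	xs := []int{10, 20, 30}
	fmt.Println(sumIndices(xs)) // 3
	fmt.Println(sumValues(xs))  // 60
}
```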


This catches people out, and is something that some people on the Go team have expressed regret over - unfortunately it is too late to change due to backwards compatibility promises.

http://youtu.be/p9VUCp98ay4?t=22m18s

http://golang.org/doc/go1compat.html


The Go Team regrets defining the scope of the range variables as the for statement. I am not aware of any regrets regarding the alias issue.

The range variable scope is the one big gotcha that's missing from the article. See http://golang.org/doc/faq#closures_and_goroutines for one discussion of the issue.


Sorry, I misread your comment, this isn't relevant.


Though I like the copy behavior more than the reference one I wish there was an option to turn on aliasing. Kinda like in C++11's for(:) where you can get both - a copy or a reference.

Having strict control over reference/value semantics is a feature I don't want to miss in a systems language.


Yep, I suppose this would be the preferable way to deal with this whole situation, although it might confuse new Go programmers and it may produce nasty bugs which are rather hard to find (especially in a larger code base, obviously).


Does Go have a generic iterator interface? When I looked through the docs range seemed the closest, but all the examples seemed tied to an array or an array of the keys of a dictionary.

For example Python has an iterator protocol and language support, and Java has Iterator/Iterable with language support.
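At the time of this thread, `range` only worked on arrays, slices, strings, maps and channels, so custom iteration was usually done with a channel or a closure. Below is a minimal closure-based sketch; `countdown` and its `(value, ok)` convention are assumptions for illustration, not a standard interface. Later releases (Go 1.23) added range-over-function iterators and the `iter` package, which are not used here.

```go
package main

import "fmt"

// countdown returns a pull-style iterator: each call yields the next value
// and reports false once the sequence is exhausted.
func countdown(from int) func() (int, bool) {
	n := from
	return func() (int, bool) {
		if n <= 0 {
			return 0, false
		}
		v := n
		n--
		return v, true
	}
}

func main() {
	next := countdown(3)
	for v, ok := next(); ok; v, ok = next() {
		fmt.Println(v) // prints 3, then 2, then 1
	}
}
```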


Strangely, the thing I miss most is

    for i in range(100):
except with 100 being, normally, not a constant.


really? what's wrong with:

    for i := 0; i < 100; i++ { }
a little more going on... but not much.


The first one is eight tokens, of which at least three are necessary: 100, for, and i; let's say four. The second one is 13 tokens. That means it has more than twice as much noise to distract you from the signal, and to get right when you write the code. As a result, variations like these require more attention to notice when you're reading the code:

    for i = 0; i < 100; i++ { }
    for i := 1; i < 100; i++ { }
    for i := 0; j < 100; i++ { }
    for i := 0; i <= 100; i++ { }
    for i := 0; i++; i < 100 { }
The first has no counterpart in Python (where the 8-token version comes from) since Python always has that bug. The others are:

    for i in range(1, 100):
    i=0; while j < 100: i += 1; ...
    for i in range(101):
    raise TypeError
In short, every bit of extraneous information you put into your code distracts you from the relevant information, and that extraneous information is something else you can get wrong. Golang does a lot better at this than C does, but it could do better still.

In some cases, where Golang is noisier than Python or Ruby, it's because the extra redundancy is there to catch errors or encourage you to handle failures properly. This is not one of those cases.
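For what it's worth, a terser counting loop was already possible by ranging over a slice of empty structs, which occupy no storage per element. This is a workaround sketch rather than a recommended idiom, and `sumBelow` is a made-up name; Go 1.22 later added `for i := range n` over integers directly.

```go
package main

import "fmt"

// sumBelow adds up 0..n-1 using range over a slice of zero-size elements
// instead of a three-clause loop. n must not be negative, or make panics.
func sumBelow(n int) int {
	total := 0
	for i := range make([]struct{}, n) {
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumBelow(100)) // 4950
}
```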



