Skip to content

Fix Issue 16659 - Clarify mutating while iterating rules #1704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 40 additions & 0 deletions spec/hash-map.dd
Original file line number Diff line number Diff line change
Expand Up @@ -305,6 +305,46 @@ $(H3 $(LNAME2 construction_and_ref_semantic, Construction and Reference Semantic
assert(aa[2] == 2); // now both refer to the same instance
------

$(H3 $(LNAME2 mutation_while_iteration, Mutation during Iteration))

$(P There's no guarantee that $(I removing) an entry doesn't
trigger a rehashing and that as a consequence the index variable
isn't invalidated and still points to the same, remaining elements
of the ongoing iteration. Similarly $(I inserting) a new key isn't
guaranteed to maintain the same iteration. $(I Modifying) a key needs to
be considered as a removal and insertion and thus can trigger a rehashing
as well.
Copy link
Member

@quickfur quickfur Jun 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: modifying an AA key in-place is illegal, because the current implementation will get into an invalid state, and in all likelihood you will not be able to find the key again when doing AA lookups, though it will still appear when iterating over the AA. This invalid state may also lead to entries with duplicate keys in the AA.

Even outside the considerations of the current implementation, any AA implementation's correctness is keyed on the key being immutable after you add it to the AA; breaking this contract will cause malfunctions. We did try to limit this problem a few years ago by making the compiler automatically convert AA keys to const. Unfortunately, this is not sufficient, because the caller may still hold mutable references to the key, and thus cause havoc when the key is subsequently modified. Ideally, the AA key type should be immutable, but this was not done for various reasons that I can't recall now.

You can see this effect for yourself, in fact:

void main() {
        int[const(char)[]] aa;
        char[] key1 = "abc".dup;
        
        // Trick the compiler into accepting this key into the AA. Seems dmd
        // git HEAD is smart enough to reject adding key1 directly to aa.
        const(char)[] keyref = key1;
        
        aa[keyref] = 123;
        key1[] = "def"[]; // overwrite key referenced by AA entry
        //assert(("abc" in aa) !is null); // fails
        //assert(("def" in aa) !is null); // *also* fails

        aa["abc"] = 456;  // duplicate key
        //assert(aa.length == 1);       // fails(!)

        assert(aa["abc"] == 456); // OK
        aa.remove("abc"); 
        
        //assert(aa.length == 0); // fails
        key1[] = "abc"[]; // mutate original key back to its "correct" value
        assert(aa["abc"] == 123); // it remembers its original value!
}

As you can see, the AA exhibits very strange (and buggy) behaviour when you mutate a key in-place.

TLDR, mutating an AA key is very bad, even worse than inserting/deleting keys while iterating, and should never, ever be done.

However, it is legal to $(I change) the associated value during an iteration.
$(I Key lookup) (`key in arr`) is also guaranteed not to cause a rehashing.
)

------
unittest
{
import std.stdio : writeln;
int count = 0;

int[string] hash = ["a":5, "b":2, "c": 7, "d": 9, "e": 1];
foreach (key, ref value; hash)
{
if (value < 5)
hash.remove(key); // DANGEROUS: removal might cause a rehashing
else
{
// DANGEROUS: insertion might cause a rehashing
// For example, with the current implementation
// the following can be observed:
hash["_"] = 0; // _not_ seen during this iteration
hash["foo"] = 0; // seen during this iteration
value++;
}
hash.writeln;
count++;
}
assert(count == 6);
assert(hash == ["c":8, "a":6, "d":10, "foo": 0, "_": 0]);
}
------

$(H3 $(LNAME2 properties, Properties))

Expand Down