
Conversation

@pixelherodev
Contributor

Rationale for this change

This avoids an unnecessary heap allocation, reducing GC pressure and improving the latency and throughput of anything that processes messages.

What changes are included in this PR?

fieldMetadata now returns the flatbuf.FieldNode that it loads from metadata directly, instead of a pointer to it. The three callers that passed the pointer through also now return the FieldNode directly.
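
As a rough illustration of the shape of this change - the names below are hypothetical stand-ins, not the actual arrow-go internals:

package main

import "fmt"

// fieldNode stands in for flatbuf.FieldNode: a small, fixed-size struct.
type fieldNode struct {
	length    int64
	nullCount int64
}

// Before: returning a pointer forces the struct to escape to the heap,
// costing one allocation (and later GC work) per call.
func fieldMetadataPtr() *fieldNode {
	n := fieldNode{length: 100}
	return &n
}

// After: returning by value keeps the struct on the stack or in
// registers; no allocation at all.
func fieldMetadataVal() fieldNode {
	return fieldNode{length: 100}
}

func main() {
	fmt.Println(fieldMetadataPtr().length, fieldMetadataVal().length)
}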

Are these changes tested?

Yes.

Are there any user-facing changes?

No; the pointer was not exposed anyway.

@pixelherodev
Contributor Author

By the way, I've got at least another half dozen optimizations coming. Most are small, non-behavioral changes like this one. If it would be easier, I could send a single larger PR instead?

@zeroshade
Member

A single larger PR (within reason) would be fine.

Ideally, could you also include any relevant go test -bench runs which show the improved performance?

@pixelherodev
Contributor Author

> A single larger PR (within reason) would be fine.
>
> Ideally, could you also include any relevant go test -bench runs which show the improved performance?

Unfortunately, most of this has been measured against a larger binary - a database - where the wins show up as reductions in profile times. I can include screenshots of the flame graph portions that are relevant to Arrow, if that helps? I'm likely not authorized to share the profiles themselves, though, as they also contain proprietary symbols.

I can write up fresh benchmarks?

@pixelherodev
Contributor Author

Also, for optimizations like this one, short-lived benchmarks don't always catch the improvement: the actual change in runtime per invocation is pretty trivial. The real win is that it reduces the number of live objects on the heap, which makes every GC pass a little faster and slightly reduces how often the GC runs...

@zeroshade
Member

I'd be surprised if running the existing benchmarks with go test -bench=. -benchmem doesn't show the reduction in allocations and such at a minimum.
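
For reference, the B/op and allocs/op columns come from running benchmarks with the -benchmem flag, or from calling b.ReportAllocs() inside the benchmark itself. A minimal, hypothetical example (none of these names are from the arrow-go suite):

package demo

import "testing"

type node struct{ a, b int64 }

// noinline keeps escape analysis from eliding the allocation in this toy.
//
//go:noinline
func loadPtr() *node { n := node{1, 2}; return &n }

func BenchmarkLoadPtr(b *testing.B) {
	b.ReportAllocs() // same effect as passing -benchmem on the command line
	for i := 0; i < b.N; i++ {
		_ = loadPtr()
	}
}

Run with go test -bench=. -benchmem; the allocs/op column should report 1 for this pointer-returning version.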

@pixelherodev
Contributor Author

Hadn't realized there was an existing benchmark suite, will do!

@pixelherodev
Contributor Author

Before:

goos: linux
goarch: amd64
pkg: github.com/apache/arrow-go/v18/arrow/ipc
cpu: AMD Ryzen 9 7900X3D 12-Core Processor          
BenchmarkIPC/Writer/codec=plain-24         	 203998	     7155 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 171043	     6329 ns/op	   6131 B/op	     83 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   4206	   316075 ns/op	2338796 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  33127	    30308 ns/op	  25878 B/op	    194 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	  86971	    14408 ns/op	  13341 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	  95056	    12548 ns/op	  10285 B/op	    124 allocs/op
PASS
ok  	github.com/apache/arrow-go/v18/arrow/ipc	8.449s

After:

goos: linux
goarch: amd64
pkg: github.com/apache/arrow-go/v18/arrow/ipc
cpu: AMD Ryzen 9 7900X3D 12-Core Processor          
BenchmarkIPC/Writer/codec=plain-24         	 211380	     6277 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 172264	     6650 ns/op	   6067 B/op	     81 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   3980	   345964 ns/op	2338709 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  42043	    28065 ns/op	  25815 B/op	    192 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	 113084	    11949 ns/op	  12757 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	  94621	    12256 ns/op	  10070 B/op	    122 allocs/op
PASS
ok  	github.com/apache/arrow-go/v18/arrow/ipc	8.533s

Writer code should be unaffected; I think it's just inter-run variance (possibly running on a different CCD?).

@pixelherodev
Contributor Author

pixelherodev commented Oct 23, 2025

CI failure looks like flakiness, since it's passing on the Ubuntu image?

@pixelherodev
Contributor Author

PASS
ok  	github.com/apache/arrow-go/v18/arrow/flight	1.046s

That test is passing locally.

@pixelherodev
Contributor Author

With that last commit as well:

BenchmarkIPC/Writer/codec=plain-24         	 223406	     5536 ns/op	   8160 B/op	    91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 236199	     4820 ns/op	   6065 B/op	     78 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   3708	   325071 ns/op	2338712 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  44013	    25784 ns/op	  25808 B/op	    189 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	 101749	    11613 ns/op	  12890 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	 129542	    10651 ns/op	  10081 B/op	    119 allocs/op

Noam Preil added 16 commits October 23, 2025 16:59

  • This avoids an unnecessary heap allocation, reducing GC pressure and improving the latency and throughput of anything that processes messages.
  • This is a drive-by patch; the test was failing for the other optimization, and I noticed this when I went to take a look and figured I'd just send the patch.
  • The Table() method is unused, so it's fine to embed the table as is.
  • Drops another allocation per ipc/Message, improves throughput further.
  • This is a bit bloated, but I wasn't sure if the API for NewMessageReader and NewReaderFromMessageReader could be changed to take in the config struct directly. It would be a lot cleaner to expose both APIs with Config directly... This drops another allocation from every call to NewReader, which - if a new reader must be created for every API call - can be noticeable...
  • In branch 1, shift is defined as next-addr. We then slice buf with shift as the offset, and with size+shift as length and capacity. In branch 2, addr == next, and thus shift == 0. size+shift == size. The slice statements are thus equivalent; there's no reason to branch here (see the sketch after this list).
  • Before: cannotInlineFunction: complexity 148 exceeds budget. After: canInlineFunction (cost: 68).
  • Removes four allocations from every NewReader call.
  • The meta object in messageReader is a wrapper around flatbuf.Message, which is static. There's no allocator usage. As such, we can reuse the memory.Buffer heap object to at least avoid re-allocating it on each call, since it never really changes.
  • Ideally, we'd reuse the underlying allocation instead of freeing it and reallocating on each call, but that seems to be exposing a bug in the buffering logic. Still, this is a few more allocations gone...
  • Avoid branching in Message()
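
A quick sketch of the slice-equivalence argument from the commit message above (buf, addr, next, and size are hypothetical stand-ins for the real variables):

package main

import "fmt"

func main() {
	buf := make([]byte, 64)
	addr, next, size := 8, 8, 16 // the addr == next case

	// Branch 1 (general case): slice with shift as the offset, and
	// size+shift as both length and capacity.
	shift := next - addr
	a := buf[shift : size+shift : size+shift]

	// Branch 2 (addr == next): shift is 0, so branch 1 reduces to
	// buf[0:size:size] - exactly this slice. The branch is redundant.
	b := buf[0:size:size]

	fmt.Println(len(a) == len(b), cap(a) == cap(b)) // true true
}

The inlining figures in a later message ("complexity 148 exceeds budget" vs. "cost: 68") are the kind of diagnostics the Go compiler reports when built with -gcflags=-m.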
@pixelherodev
Contributor Author

....hmmm, I'd assumed that, since the Message is invalidated on each call to Next(), nothing would try to hold on to the underlying buffers. Some of the failures suggest that may be false?

@pixelherodev
Contributor Author

// Code generated by the FlatBuffers compiler. DO NOT EDIT.

.... So I only just noticed this message. Sigh.

I'll have to send a patch to the generation even further upstream, I'm guessing? 😓

@zeroshade
Member

> I'll have to send a patch to the generation even further upstream, I'm guessing? 😓

yup. Did you re-generate the flatbuffer code for this?

@pixelherodev
Contributor Author

> I'll have to send a patch to the generation even further upstream, I'm guessing? 😓
>
> yup. Did you re-generate the flatbuffer code for this?

I tested it with a patch to flatc - the problem is that some of these changes work by changing the interfaces that are generated, which is a breaking change. I'm not sure they'll want those changes, and they don't seem to look at patches much anyway :/

diff --git a/src/idl_gen_go.cpp b/src/idl_gen_go.cpp
index f4a03cec..0c2b14fc 100644
--- a/src/idl_gen_go.cpp
+++ b/src/idl_gen_go.cpp
@@ -186,7 +186,7 @@ class GoGenerator : public BaseGenerator {
   // Most field accessors need to retrieve and test the field offset first,
   // this is the prefix code for that.
   std::string OffsetPrefix(const FieldDef& field) {
-    return "{\n\to := flatbuffers.UOffsetT(rcv._tab.Offset(" +
+    return "{\n\to := flatbuffers.UOffsetT(rcv.Offset(" +
            NumToString(field.value.offset) + "))\n\tif o != 0 {\n";
   }
 
@@ -198,7 +198,7 @@ class GoGenerator : public BaseGenerator {
 
     // _ is reserved in flatbuffers field names, so no chance of name
     // conflict:
-    code += "_tab ";
+    code += "";
     code += struct_def.fixed ? "flatbuffers.Struct" : "flatbuffers.Table";
     code += "\n}\n\n";
   }
@@ -323,9 +323,9 @@ class GoGenerator : public BaseGenerator {
 
     for (int i = 0; i < 2; i++) {
       code += "func Get" + size_prefix[i] + "RootAs" + struct_type;
-      code += "(buf []byte, offset flatbuffers.UOffsetT) ";
-      code += "*" + struct_type + "";
-      code += " {\n";
+      code += "(buf []byte, offset flatbuffers.UOffsetT) (x ";
+      code += struct_type + "";
+      code += ") {\n";
       if (i == 0) {
         code += "\tn := flatbuffers.GetUOffsetT(buf[offset:])\n";
       } else {
@@ -333,11 +333,10 @@ class GoGenerator : public BaseGenerator {
             "\tn := "
             "flatbuffers.GetUOffsetT(buf[offset+flatbuffers.SizeUint32:])\n";
       }
-      code += "\tx := &" + struct_type + "{}\n";
       if (i == 0) {
-        code += "\tx.Init(buf, n+offset)\n";
+        code += "\tx.Table = flatbuffers.Table{Bytes: buf, Pos: n+offset}\n";
       } else {
-        code += "\tx.Init(buf, n+offset+flatbuffers.SizeUint32)\n";
+        code += "\tx.Table = flatbuffers.Table{Bytes: buf, Pos: n+offset+flatbuffers.SizeUint32}\n";
       }
       code += "\treturn x\n";
       code += "}\n\n";
@@ -371,24 +370,8 @@ class GoGenerator : public BaseGenerator {
     GenReceiver(struct_def, code_ptr);
     code += " Init(buf []byte, i flatbuffers.UOffsetT) ";
     code += "{\n";
-    code += "\trcv._tab.Bytes = buf\n";
-    code += "\trcv._tab.Pos = i\n";
-    code += "}\n\n";
-  }
-
-  // Implement the table accessor
-  void GenTableAccessor(const StructDef& struct_def, std::string* code_ptr) {
-    std::string& code = *code_ptr;
-
-    GenReceiver(struct_def, code_ptr);
-    code += " Table() flatbuffers.Table ";
-    code += "{\n";
-
-    if (struct_def.fixed) {
-      code += "\treturn rcv._tab.Table\n";
-    } else {
-      code += "\treturn rcv._tab\n";
-    }
+    code += "\trcv.Bytes = buf\n";
+    code += "\trcv.Pos = i\n";
     code += "}\n\n";
   }
 
@@ -400,7 +383,7 @@ class GoGenerator : public BaseGenerator {
     GenReceiver(struct_def, code_ptr);
     code += " " + namer_.Function(field) + "Length(";
     code += ") int " + OffsetPrefix(field);
-    code += "\t\treturn rcv._tab.VectorLen(o)\n\t}\n";
+    code += "\t\treturn rcv.VectorLen(o)\n\t}\n";
     code += "\treturn 0\n}\n\n";
   }
 
@@ -412,7 +395,7 @@ class GoGenerator : public BaseGenerator {
     GenReceiver(struct_def, code_ptr);
     code += " " + namer_.Function(field) + "Bytes(";
     code += ") []byte " + OffsetPrefix(field);
-    code += "\t\treturn rcv._tab.ByteVector(o + rcv._tab.Pos)\n\t}\n";
+    code += "\t\treturn rcv.ByteVector(o + rcv.Pos)\n\t}\n";
     code += "\treturn nil\n}\n\n";
   }
 
@@ -426,7 +409,7 @@ class GoGenerator : public BaseGenerator {
     code += "() " + TypeName(field) + " {\n";
     code += "\treturn " +
             CastToEnum(field.value.type,
-                       getter + "(rcv._tab.Pos + flatbuffers.UOffsetT(" +
+                       getter + "(rcv.Pos + flatbuffers.UOffsetT(" +
                            NumToString(field.value.offset) + "))");
     code += "\n}\n";
   }
@@ -445,7 +428,7 @@ class GoGenerator : public BaseGenerator {
     } else {
       code += "\t\treturn ";
     }
-    code += CastToEnum(field.value.type, getter + "(o + rcv._tab.Pos)");
+    code += CastToEnum(field.value.type, getter + "(o + rcv.Pos)");
     if (field.IsScalarOptional()) {
       code += "\n\t\treturn &v";
     }
@@ -467,7 +450,7 @@ class GoGenerator : public BaseGenerator {
     code += "\tif obj == nil {\n";
     code += "\t\tobj = new(" + TypeName(field) + ")\n";
     code += "\t}\n";
-    code += "\tobj.Init(rcv._tab.Bytes, rcv._tab.Pos+";
+    code += "\tobj.Init(rcv.Bytes, rcv.Pos+";
     code += NumToString(field.value.offset) + ")";
     code += "\n\treturn obj\n";
     code += "}\n";
@@ -480,19 +463,14 @@ class GoGenerator : public BaseGenerator {
     std::string& code = *code_ptr;
     GenReceiver(struct_def, code_ptr);
     code += " " + namer_.Function(field);
-    code += "(obj *";
-    code += TypeName(field);
-    code += ") *" + TypeName(field) + " " + OffsetPrefix(field);
+    code += "() (obj " + TypeName(field) + ", ok bool) " + OffsetPrefix(field);
     if (field.value.type.struct_def->fixed) {
-      code += "\t\tx := o + rcv._tab.Pos\n";
+      code += "\t\tx := o + rcv.Pos\n";
     } else {
-      code += "\t\tx := rcv._tab.Indirect(o + rcv._tab.Pos)\n";
+      code += "\t\tx := rcv.Indirect(o + rcv.Pos)\n";
     }
-    code += "\t\tif obj == nil {\n";
-    code += "\t\t\tobj = new(" + TypeName(field) + ")\n";
-    code += "\t\t}\n";
-    code += "\t\tobj.Init(rcv._tab.Bytes, x)\n";
-    code += "\t\treturn obj\n\t}\n\treturn nil\n";
+    code += "\t\tobj.Init(rcv.Bytes, x)\n\t\tok = true\n";
+    code += "\t}\n\treturn\n";
     code += "}\n\n";
   }
 
@@ -504,7 +482,7 @@ class GoGenerator : public BaseGenerator {
     code += " " + namer_.Function(field);
     code += "() " + TypeName(field) + " ";
     code += OffsetPrefix(field) + "\t\treturn " + GenGetter(field.value.type);
-    code += "(o + rcv._tab.Pos)\n\t}\n\treturn nil\n";
+    code += "(o + rcv.Pos)\n\t}\n\treturn nil\n";
     code += "}\n\n";
   }
 
@@ -532,13 +510,13 @@ class GoGenerator : public BaseGenerator {
     code += " " + namer_.Function(field);
     code += "(obj *" + TypeName(field);
     code += ", j int) bool " + OffsetPrefix(field);
-    code += "\t\tx := rcv._tab.Vector(o)\n";
+    code += "\t\tx := rcv.Vector(o)\n";
     code += "\t\tx += flatbuffers.UOffsetT(j) * ";
     code += NumToString(InlineSize(vectortype)) + "\n";
     if (!(vectortype.struct_def->fixed)) {
-      code += "\t\tx = rcv._tab.Indirect(x)\n";
+      code += "\t\tx = rcv.Indirect(x)\n";
     }
-    code += "\t\tobj.Init(rcv._tab.Bytes, x)\n";
+    code += "\t\tobj.Init(rcv.Bytes, x)\n";
     code += "\t\treturn true\n\t}\n";
     code += "\treturn false\n";
     code += "}\n\n";
@@ -566,9 +544,9 @@ class GoGenerator : public BaseGenerator {
     code += "(obj *" + TypeName(field);
     code += ", key " + NativeType(key_field.value.type) + ") bool " +
             OffsetPrefix(field);
-    code += "\t\tx := rcv._tab.Vector(o)\n";
+    code += "\t\tx := rcv.Vector(o)\n";
     code += "\t\treturn ";
-    code += "obj.LookupByKey(key, x, rcv._tab.Bytes)\n";
+    code += "obj.LookupByKey(key, x, rcv.Bytes)\n";
     code += "\t}\n";
     code += "\treturn false\n";
     code += "}\n\n";
@@ -585,7 +563,7 @@ class GoGenerator : public BaseGenerator {
     code += " " + namer_.Function(field);
     code += "(j int) " + TypeName(field) + " ";
     code += OffsetPrefix(field);
-    code += "\t\ta := rcv._tab.Vector(o)\n";
+    code += "\t\ta := rcv.Vector(o)\n";
     code += "\t\treturn " +
             CastToEnum(field.value.type,
                        GenGetter(field.value.type) +
@@ -806,12 +784,12 @@ class GoGenerator : public BaseGenerator {
                                  const FieldDef& field, std::string* code_ptr) {
     std::string& code = *code_ptr;
     std::string setter =
-        "rcv._tab.Mutate" + namer_.Method(GenTypeBasic(field.value.type));
+        "rcv.Mutate" + namer_.Method(GenTypeBasic(field.value.type));
     GenReceiver(struct_def, code_ptr);
     code += " Mutate" + namer_.Function(field);
     code +=
         "(n " + GenTypeGet(field.value.type) + ") bool {\n\treturn " + setter;
-    code += "(rcv._tab.Pos+flatbuffers.UOffsetT(";
+    code += "(rcv.Pos+flatbuffers.UOffsetT(";
     code += NumToString(field.value.offset) + "), ";
     code += CastToBaseType(field.value.type, "n") + ")\n}\n\n";
   }
@@ -820,7 +798,7 @@ class GoGenerator : public BaseGenerator {
   void MutateScalarFieldOfTable(const StructDef& struct_def,
                                 const FieldDef& field, std::string* code_ptr) {
     std::string& code = *code_ptr;
-    std::string setter = "rcv._tab.Mutate" +
+    std::string setter = "rcv.Mutate" +
                          namer_.Method(GenTypeBasic(field.value.type)) + "Slot";
     GenReceiver(struct_def, code_ptr);
     code += " Mutate" + namer_.Function(field);
@@ -837,12 +815,12 @@ class GoGenerator : public BaseGenerator {
     std::string& code = *code_ptr;
     auto vectortype = field.value.type.VectorType();
     std::string setter =
-        "rcv._tab.Mutate" + namer_.Method(GenTypeBasic(vectortype));
+        "rcv.Mutate" + namer_.Method(GenTypeBasic(vectortype));
     GenReceiver(struct_def, code_ptr);
     code += " Mutate" + namer_.Function(field);
     code += "(j int, n " + TypeName(field) + ") bool ";
     code += OffsetPrefix(field);
-    code += "\t\ta := rcv._tab.Vector(o)\n";
+    code += "\t\ta := rcv.Vector(o)\n";
     code += "\t\treturn " + setter + "(";
     code += "a+flatbuffers.UOffsetT(j*";
     code += NumToString(InlineSize(vectortype)) + "), ";
@@ -907,8 +885,6 @@ class GoGenerator : public BaseGenerator {
     // Generate the Init method that sets the field in a pre-existing
     // accessor object. This is to allow object reuse.
     InitializeExisting(struct_def, code_ptr);
-    // Generate _tab accessor
-    GenTableAccessor(struct_def, code_ptr);
 
     // Generate struct fields accessors
     for (auto it = struct_def.fields.vec.begin();
@@ -1282,13 +1258,13 @@ class GoGenerator : public BaseGenerator {
         code +=
             "\tt." + field_field + " = rcv." + field_field + "(nil).UnPack()\n";
       } else if (field.value.type.base_type == BASE_TYPE_UNION) {
-        const std::string field_table = field_var + "Table";
-        code += "\t" + field_table + " := flatbuffers.Table{}\n";
+        const std::string fielde = field_var + "Table";
+        code += "\t" + fielde + " := flatbuffers.Table{}\n";
         code +=
-            "\tif rcv." + namer_.Method(field) + "(&" + field_table + ") {\n";
+            "\tif rcv." + namer_.Method(field) + "(&" + fielde + ") {\n";
         code += "\t\tt." + field_field + " = rcv." +
                 namer_.Method(field.name + UnionTypeFieldSuffix()) +
-                "().UnPack(" + field_table + ")\n";
+                "().UnPack(" + fielde + ")\n";
         code += "\t}\n";
       } else {
         FLATBUFFERS_ASSERT(0);
@@ -1399,13 +1375,13 @@ class GoGenerator : public BaseGenerator {
   std::string GenGetter(const Type& type) {
     switch (type.base_type) {
       case BASE_TYPE_STRING:
-        return "rcv._tab.ByteVector";
+        return "rcv.ByteVector";
       case BASE_TYPE_UNION:
-        return "rcv._tab.Union";
+        return "rcv.Union";
       case BASE_TYPE_VECTOR:
         return GenGetter(type.VectorType());
       default:
-        return "rcv._tab.Get" + namer_.Function(GenTypeBasic(type));
+        return "rcv.Get" + namer_.Function(GenTypeBasic(type));
     }
   }
 

Regenerated the files with that patch to flatc and updated the usage slightly to conform, and that's another alloc per NewReader and another alloc per Message call gone...
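
The net effect of that flatc patch on the generated Go code is roughly the following - a simplified, hypothetical before/after; the real generated types carry many more accessors:

package flatbuf

import flatbuffers "github.com/google/flatbuffers/go"

// Before: the table lives in a named _tab field, and root accessors
// return *FieldNodeOld, forcing a heap allocation per lookup.
type FieldNodeOld struct {
	_tab flatbuffers.Struct
}

func (rcv *FieldNodeOld) Length() int64 {
	return rcv._tab.GetInt64(rcv._tab.Pos + flatbuffers.UOffsetT(0))
}

// After: the table is embedded, its methods and fields are promoted, and
// the type can be returned and passed by value with no allocation.
type FieldNode struct {
	flatbuffers.Struct
}

func (rcv *FieldNode) Length() int64 {
	return rcv.GetInt64(rcv.Pos + flatbuffers.UOffsetT(0))
}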

@pixelherodev
Contributor Author

...ah. The regenerated files are missing the license headers.

@pixelherodev
Contributor Author

Before:

BenchmarkIPC/Writer/codec=plain-24         	 203998	     7155 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 171043	     6329 ns/op	   6131 B/op	     83 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   4206	   316075 ns/op	2338796 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  33127	    30308 ns/op	  25878 B/op	    194 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	  86971	    14408 ns/op	  13341 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	  95056	    12548 ns/op	  10285 B/op	    124 allocs/op

Current:

BenchmarkIPC/Writer/codec=plain-24         	 200580	     5762 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 274100	     4364 ns/op	   5442 B/op	     62 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   2731	   369465 ns/op	2338754 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  47542	    25236 ns/op	  25122 B/op	    171 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	 103572	    10842 ns/op	  12689 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	 137494	     8809 ns/op	   9299 B/op	    101 allocs/op

@pixelherodev
Contributor Author

I also haven't been able to look into the arrow/flight failure, as that test has never passed for me locally anyway...

@pixelherodev
Contributor Author

I have more optimizations locally, but unfortunately most of them deliberately break functionality (e.g. arrow/flight). Going to take a look at profiles of the benchmarks and try to improve this as much as I can without those :/

@pixelherodev
Contributor Author

pixelherodev commented Oct 27, 2025

I think I'm going to split this into a few small PRs, if that makes sense:

  • Regenerate the flatbuf package without any changes from me (it's out of sync with the arrow/ repo)
    • As a follow-up to that, regenerate it with patches to remove unnecessary allocations, document in the headers that the generated code has been modified to remove allocations, and make the changes to e.g. initFB and our usage of it to avoid heap escapes
  • Separately, the internal changes to the IPC code that just clean up buffer management (preallocation / reuse)
  • Separately, the miscellaneous small changes that don't fit elsewhere

That okay?

@zeroshade
Member

Splitting it up sounds good

