broken_broken_ 9 hours ago

I love reading optimization stories so thank you!

So essentially the code got faster by switching to smarter regexp engine that can transform (hopefully) the regexp into a FSM. Cool, but what about replacing the regexp with straightforward parsing code written manually? That way it is statically known that this function is fast all of the time.

At least that’s what I would do in a natively compiled language with access to simd, not sure if ruby has access to it and can detect the best variant at runtime.

And I’d argue plain code is simpler to understand, debug, troubleshoot than a complex regexp.

The regexp engine part reminds me of writing scalar code, relying on the compiler doing autovectorization to make it fast. And what very often happens is that the new compiler version changed their heuristics and now there is no autovectorization that kicks in and the performance falls off a cliff . It’s a risky bet.