"You Don't Need Kafka, Just Use Postgres" Considered Harmful

39 points by ingve 18 hours ago

ekjhgkejhgk 17 hours ago

"Considered harmful" considered harmful.

noxs 11 hours ago

At a certain point this will become ""Considered harmful" considered harmful" considered harmful.

wewewedxfgdf 15 hours ago

It lends an academic pomposity that demands respect.

I'd argue that if you are in the position where you legitimately NEED Kafka, you hopefully also know what you're doing. You're outside the audience for the "just use Postgres" crowd. That said, if you're in a startup with a few thousand users, "just use Postgres" is still solid advice.

threatofrain 16 hours ago

If you need some kind of event streaming system there are other choices which have less DevOps burden, such as just using any particular cloud's proprietary or managed offerings. I've seen two companies on NATS so far, and I'm trying it out for size myself as well.
There are plenty of choices between PSQL & Kafka. It's not like you take one step north and you're in the "oh no you better know what you're doing" territory.
- strken 16 hours ago
  
  The problem with taking one step north and leaving the border of Postgres is what you lose, not the direct ops burden.
  Postgres land is a comfy place filled with transactions across all your data at once, one backup solution that you (hopefully) have had running for months or years and has been thoroughly tested, and ACID compliance. You have a single host, probably, which means that you are neither Available nor Partition-tolerant, but at least you are Consistent.
  The moment you expand beyond a single database host you now have a distributed system, and woe unto you if you don't understand what that means.
  - 112233 9 hours ago
    
    Current "one host" options are in ridiculous territory - 256 core CPUs with terabytes of RAM and storage in the 100 GB/s range. A decade ago that much needed a few racks.
  - threatofrain 15 hours ago
    
    If you wanted such simplicity then nothing is stopping you from running single-node NATS or even just Redis. You always had all the simplicity and consistency you wanted.
    
    strken 15 hours ago
    
    The problem is that now you use Postgres for 95% of your system, and also Redis or NATS, which means you lose the ability to atomically commit changes to your database and send a message in one transaction.
    You can work around this, but to the best of my knowledge you can't have consistency (between your existing Postgres database and your separate queue or event log) and simplicity.
    
    hbogert 6 hours ago
    
    Indeed, only eventual consistency. The article approaches this subject and mentions the use of the outbox pattern and/or using tools like Debezium.
  - cultofmetatron 15 hours ago
    
    well said. I've been working on my startup. We are profitable in part because I spend my time focused on building new features and improving our reliability instead of chasing after all the idiosyncratic bugs that come with distributed systems.
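For readers following the outbox discussion in this subthread, here is a minimal sketch of the idea. It uses SQLite from the Python standard library as a stand-in for Postgres so that it runs anywhere; the table names (`orders`, `outbox`) and the functions `place_order` and `relay` are made up for illustration and are not from the article. The business row and the message row are written in one transaction, and a separate relay publishes the pending messages afterwards.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one business table plus an outbox table.
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn, item, fail=False):
    # "with conn" commits on success and rolls back on an exception,
    # so the order row and the outbox row land together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        payload = json.dumps(
            {"type": "order_placed", "order_id": cur.lastrowid, "item": item}
        )
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
        if fail:
            raise RuntimeError("simulated crash before commit")


def relay(conn, publish):
    # Publish pending messages, then mark each as published.
    # A crash between publish() and the UPDATE causes a re-send later,
    # so consumers see at-least-once delivery and must tolerate duplicates.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    place_order(conn, "book")
    relay(conn, print)  # print stands in for a real broker client
```

The relay here polls the table. A CDC tool such as Debezium replaces the polling loop by reading the database's change log instead, but the consistency argument is the same: the message only exists if the transaction committed.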

ryandvm 2 hours ago

In my experience, "you don't need *MQ, just use Kafka" is a way worse problem.

Trying to explain the distinction between an event streaming platform and a distributed message queue to your enterprise architect is an exercise that no one should have to go through.

wewewedxfgdf 15 hours ago

There are many, many ways to make a message queue these days. All the main SQL databases can act as a queue: everything from Postgres to MS SQL Server to MySQL to Oracle to SQLite, through to custom applications like Kafka. For the most part they are all more or less valid. It's not all about Postgres.

Take the approach that appeals to you and feel happy about it without big open source telling you "you're holding it wrong!"
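As a rough illustration of a SQL table acting as a queue, here is a minimal sketch using SQLite from the Python standard library. The `jobs` table and the functions `enqueue`, `claim`, and `ack` are invented for this example. SQLite serializes writers, so `BEGIN IMMEDIATE` is enough to stop two workers claiming the same row; on Postgres the usual approach is `SELECT ... FOR UPDATE SKIP LOCKED` instead, which this sketch does not show.

```python
import sqlite3


def open_queue(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            body TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'ready'
        )
    """)
    return conn


def enqueue(conn, body):
    conn.execute("INSERT INTO jobs (body) VALUES (?)", (body,))


def claim(conn):
    # Take the write lock up front so the SELECT and the UPDATE
    # are atomic with respect to other workers.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, body FROM jobs WHERE status = 'ready' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET status = 'claimed' WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
        return row  # (id, body), or None when the queue is empty
    except Exception:
        conn.execute("ROLLBACK")
        raise


def ack(conn, job_id):
    # Remove a job once the worker has finished it.
    conn.execute("DELETE FROM jobs WHERE id = ? AND status = 'claimed'", (job_id,))


if __name__ == "__main__":
    q = open_queue()
    enqueue(q, "send welcome email")
    job = claim(q)
    if job is not None:
        ack(q, job[0])
```

A real setup would also need a timeout that returns stuck `claimed` jobs to `ready` when a worker dies; that is left out to keep the sketch short.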

brettgriffin 16 hours ago

> Looking to make it to the front page of HackerNews?

Nailed it. I read the original post earlier this week and was very impressed with its technical detail. But the point of the post was incongruent with the post's title. But the post got way more attention because of that title.

But if you think about the effort it took to write that post, the title was a really good bet on ROI.

tacticus 14 hours ago

> > Looking to make it to the front page of HackerNews?
> Nailed it.
Worked for the Confluent marketing fluff as well.

scottcodie 16 hours ago

One thing the other blog post missed and this post misses too is that you don't need Kafka to use Debezium with Postgres. This gives you a pretty seamless onramp to event streaming tools as you scale.

gunnarmorling 6 hours ago

Are you referring to using Debezium embedded as a library? If so, yes, it absolutely has its place; for instance, it's used by Flink CDC. There are pros and cons to either way of running Debezium. I'm seeing embedded Debezium a lot for in-app use cases, for instance cache invalidation. Going through Kafka allows for replay and setting up multiple independent consumers for the same change event stream.
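To make "replay" and "multiple independent consumers" concrete, here is a toy in-memory model. It is not Kafka's or Debezium's API; the class `EventLog` and its methods are invented for illustration. The point is only that a log keeps events after they are read and tracks a separate offset per consumer, whereas a queue hands each message to one reader and then drops it.

```python
class EventLog:
    """Append-only log with one read offset per named consumer (toy model)."""

    def __init__(self):
        self._events = []
        self._offsets = {}

    def append(self, event):
        # Events are never removed; the return value is the event's offset.
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, consumer, max_events=10):
        # Each consumer advances its own offset and does not affect the others.
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        # Replay: move a consumer back (or forward) to any retained offset.
        if not 0 <= offset <= len(self._events):
            raise ValueError("offset out of range")
        self._offsets[consumer] = offset


if __name__ == "__main__":
    log = EventLog()
    for change in ("insert:1", "update:1", "delete:1"):
        log.append(change)
    print(log.poll("cache-invalidator"))
    print(log.poll("search-indexer", max_events=2))
    log.seek("cache-invalidator", 0)  # rebuild the cache from the start
    print(log.poll("cache-invalidator"))
```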

philipwhiuk 17 hours ago

> Named a Java Champion, I enjoy speaking at conferences, for instance at QCon, JavaOne, Red Hat Summit, JavaZone, JavaLand and Kafka Summit.

candiddevmike 16 hours ago

That tracks, I feel like Kafka is overrepresented in the Java codebases I've seen TBH.

oompydoompy74 14 hours ago

Insufferable tone aside, I really dislike the “right tool for the job” argument. The correct tool is the one that is handy and gets the job done. Has the author never encountered a Swiss Army Knife?

jauntywundrkind 14 hours ago

I'm more interested in the "You don't need Kafka the product, when we have this Kafka protocol compatible alternative". Kafka is more than a product: it's become a standard, with many many implementations. I'd love to see wider coverage of the alternatives. RedPanda, StreamNative Ursa, OSO, Aiven, many others.

hactually 17 hours ago

isn't Kafka old news at this point?

LinkedIn have moved onto Northguard... but no GitHub yet

AceJohnny2 15 hours ago

so you mean that Kafka is boring, functional and stable?
https://boringtechnology.club/

rubenvanwyk 11 hours ago

Also wish there was more information available about Northguard.

atoav 17 hours ago

Could we please just agree not to use this "considered harmful" phrase to describe advice where the answer is "depends"? This kinda makes the author seem like he has lost the ability to consider what software is out there. That he is working for Kafka doesn't help.

Example: Someone writes software that could use something simple like SQLite, and they switched to Postgres for performance reasons. Now, unless what Kafka brings is the core reason they switched to Postgres, not pulling in another dependency and adding another piece to the puzzle can be a totally legitimate engineering decision. And that renders the "considered harmful" utterly ridiculous.

Use a system like Kafka if you need what it brings (a distributed event streaming platform). If that isn't what you need or a very simple Postgres solution suffices, go for that. Maybe you need event streaming but distributing it is overkill. Maybe you just need some sort of queue. Who knows? Not the author of this post.

gunnarmorling 6 hours ago

> the answer is "depends"?
Indeed that is the point I am trying to make in the article. Postgres oftentimes absolutely is the right tool to use, and oftentimes it's not. The thing I'm advocating to be wary of is "if all you have is a hammer...". This is to what "considered harmful" refers.

blindriver 17 hours ago

""You don't need Kafka" considered harmful by employees of Kafka."

redhale 16 hours ago

Yes. Setting aside the specific merits of the argument, this blog post should really have a disclaimer somewhere that the author works for Confluent, a major managed Kafka service provider. Perhaps that makes him an expert on this topic, but it should still be disclosed!
> Managed services make running Kafka a very uneventful experience (pun intended) and should be the first choice
Confluent, you say?
- gunnarmorling 16 hours ago
  
  > this blog post should really have a disclaimer somewhere that the author works for Confluent
  Good idea; this is stated in the bio on my web site, but I've just added the same info again to the end of the post.
  - redhale 15 hours ago
    
    Fair point.
    It might be worth adding a more direct call-out to posts like this one. Many may not go as far as reading the Bio page. That may be on them technically speaking, but still.
    In any case, thank you for writing and sharing your considered opinion!
    
    gunnarmorling 8 hours ago
    
    Thank you, appreciate it!
- blindriver 16 hours ago
  
  Confluent isn't just "a major managed Kafka service provider." The founders of Confluent created Kafka and they and their employees/former employees dominate the PMC committee for Kafka, meaning they control the direction of Kafka. Confluent is Kafka.
  The author is an employee of Confluent/Kafka, so because his paycheck and equity grant depend on it and on the CFLT stock price, obviously whatever he writes is going to be heavily slanted in favor of Kafka. This isn't something that is a footnote at the bottom, it should be right up at the front.

pheggs 16 hours ago

employee of Confluent.
I think that shouldn't matter but I still have a lot to disagree with the article.
feels like overengineering has become the standard for some people, and I quite dislike it personally.