Show HN: I built an automated AI lab that generates and publishes inventions
Hi HN – For the last year, I’ve been building Unpatentable.org, an AI “innovation lab” that does one thing: generate new inventions and immediately publish them so they can’t be patented later.
The engine runs continuously; the public side is now a library of novel AI-generated inventions (900 so far) across domains like energy, life sciences, robotics, space tech, and more.
Each innovation is documented in three separate reports totaling 120+ pages:
a main report (problem, mechanism, constraints)
an implementation guide (a step-by-step how-to)
a societal/industry impact overview
We publish them on the site, anchor PDFs on the Arweave blockchain for immutable timestamps, assign metadata for discoverability, and place them into the USPTO prior art archive. The research papers are structured to meet or exceed international criteria for defensive disclosures (public, enabling, novel, time-stamped, non-confidential, specific, reproducible by a skilled practitioner). They are intended to be treated as prior art blueprints that enable others to build from them.
The engine itself isn’t a product; it's more of a publisher. That said, we do grant engine access to thinkers and inventors who want to solve particular challenges and open-source the solutions. We’re exploring sponsorships where organizations fund tracks like “wildfire resilience” or “decentralized compute,” and the output is a stream of open, unpatentable innovations in that domain. Monetizing the project has been an afterthought. I built this because I believe information wants to be free, and that in the coming age of AI, humanity is at risk of losing shared knowledge to well-funded corporate patent trolls. Published innovations are free forever, for everyone on Earth.
Side note: for human inventors, there’s also an Unpatent tool (/unpatent) that lets you upload your own write-up and have it published as prior art (USPTO-linked prior art archive + Arweave + search indexing) for a flat fee. That’s separate from the AI engine, but built on the same plumbing.
You can browse the public innovation library here: https://unpatentable.org/innovation
I'll add some detail in the comments below and try to address a few anticipated questions. Feedback, critique, and suggestions are very welcome.
More technical detail for the curious:
The engine itself is a collaborative multi-agent pipeline stitched together mostly in Python. At a high level:
Problem-hunting agents are given domain/subdomain pairs and are tasked with identifying the most impactful problems humanity faces in each area. Challenges are expanded into structured problem statements.
Domain agents (energy, digital systems, life-sciences-adjacent, etc.) work collaboratively to propose and refine seed candidate mechanisms and architectures - working from first principles within gaps discovered during systematic searches of patent archives, academic papers, and the open internet.
Refinement loops force agent interaction to resolve contradictions, tighten constraints, and generate something physically, economically, and organizationally plausible.
Report agents turn the resulting graph into three separate documents: main report, implementation guide, and societal/industry impact report.
A guardian layer tries to filter out obvious garbage and anything in categories we simply do not want to publish (biohazards, weapons, high-risk surveillance, etc.).
Right now it runs on a mixture of frontier models behind an abstraction layer so we can swap individual agent models without rewriting the pipeline. Model selection is important, but perhaps less so than the framework and choreography: prompt schemas, roles, collisions, constraints, and consistency checks contribute most to the end result.
Every published innovation is a shared vision among at least 18 individual AI agents.
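To make the choreography concrete, here is a minimal sketch of how roles and a swappable model layer could be wired together. Every name here (Agent, call_model, the "frontier-*" model ids) is an illustrative stand-in, not the actual codebase:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str    # e.g. "problem_hunter", "domain_expert", "report_writer"
    model: str   # model id, swappable behind the abstraction layer

    def run(self, prompt: str) -> str:
        return call_model(self.model, self.role, prompt)

def call_model(model: str, role: str, prompt: str) -> str:
    # Placeholder: the real layer dispatches to whichever frontier model
    # is currently assigned to this role.
    return f"[{role}@{model}] response to: {prompt[:40]}"

def pipeline(domain: str, subdomain: str) -> dict:
    hunter = Agent("problem_hunter", "frontier-a")
    critics = [Agent("domain_expert", "frontier-b") for _ in range(3)]
    writer = Agent("report_writer", "frontier-c")

    problem = hunter.run(f"Most impactful open problem in {domain}/{subdomain}")
    draft = critics[0].run(f"Propose a mechanism for: {problem}")
    # Refinement loop: critics must resolve contradictions and tighten
    # constraints before anything reaches the report stage.
    for critic in critics[1:]:
        draft = critic.run(f"Critique and tighten constraints: {draft}")
    return {
        "main_report": writer.run(f"Main report: {draft}"),
        "implementation_guide": writer.run(f"Implementation guide: {draft}"),
        "impact_report": writer.run(f"Societal/industry impact: {draft}"),
    }
```

Swapping a model for one role is then a one-line change, which is the point of keeping the abstraction layer between the choreography and the providers.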
On prior art, legal reality, and the Unpatent tool:
A few clarifications from the IP side, since this always comes up.
This is not legal advice.
“Prior Art” is messy in practice. What counts, and how heavily it weighs, varies by jurisdiction and fact pattern.
What we can control is novelty, public availability, timestamps, persistence, and how closely the disclosures resemble the things examiners actually read.
For the AI-generated innovations, I worked with a patent lawyer to make each package look like a reasonably “enabling” disclosure: a clear statement of the problem, the core idea, constraints, and at least a sketched path to implementation. Some will inevitably be too hand-wavy to block a very narrow claim; others will be overly broad. The goal is to increase the probability that examiners and litigators have something concrete to point at.
Separately from the lab, the Unpatent tool exists for humans who already have their own write-ups and want to publish them as prior art:
You upload a document or paste text, add some metadata, pay a flat fee.
The system publishes to a prior-art archive that is explicitly consumed by the USPTO, and anchors the same content on Arweave for timestamping.
The result is a public, timestamped disclosure that can be cited as the equivalent of a printed publication.
That path is meant for cases where someone has a real, specific invention and wants a cleaner, more obvious defensive disclosure than hoping a blog post gets indexed and noticed.
I am very interested in hearing from more patent practitioners on how this is perceived, and how to make both the AI-generated disclosures and the Unpatent tool more useful in the real world.
Why a one-way lab and why sponsorships instead of a SaaS product:
A few people have asked why I am not just offering this as a “GPT for inventions”: a product that anyone can prompt.
Three reasons (this is my current reasoning, subject to change):
Safety and incentives. If the engine were exposed as a generic invention API, there would be strong incentives to push it into areas I do not want to automate or publish (offense-heavy domains, surveillance optimizations, etc.). Keeping it as a one-way lab allows steering towards domains with positive human impact.
Mission alignment. The point of this project is to create a massive public library of prior art that is hard to appropriate and easy to build on. Charging per-generation access to the engine pushes in the opposite direction. I would rather keep the core engine pointed at “public good” domains and let people use the outputs freely.
Funding reality. There are real running costs (models, infra, time), so it cannot be self-funded forever. Demand for an AI invention tool that gives away the secret sauce is generally low in the public sector. The model we are exploring is domain sponsorship:
A sponsor funds a track (or tracks), for example “wildfire resilience” or “low-cost diagnostics”.
We configure and run cohorts of the engine focused on that track.
The output still goes into the public library and through all disclosure channels with clear notice that the track was sponsored; there are no exclusive rights.
That feels like a better fit for what this is trying to be: infrastructure for open innovation rather than another IP-wrapping startup.
If you have opinions on better funding structures for this kind of thing, I would genuinely like to hear them. I am trying to avoid the failure mode where something starts as “for the commons” and drifts into enclosure because of financial pressure.
What is an example invention your lab has come up with that you think is novel? Bonus points if you also think that innovation is meaningful.
Good question, and honestly it's very difficult to choose from the 900 innovations generated so far, but I'll give it a shot with one that domain experts have provided robust feedback on...
I’ll give the concrete example, and then explain how we think about “novelty,” because that word means different things depending on whether you’re talking about academia, patents, or engineering.
Example: One invention the engine produced (back in June) is called Distributed Edge-Coupled Wildfire Physics Network (DECWPN). Link here: https://unpatentable.org/innovation/distributed-edge-coupled...
The idea was to break wildfire simulation into many small local physics domains, each solved on cheap edge-compute nodes (fuel load --> heat transfer --> local flow), with nodes exchanging boundary conditions in real time. Instead of one massive simulation that takes hours, you get a multiscale mesh that can update fast enough to approach sub-minute guidance.
It’s not new physics, but the architecture is very interesting (and novel): a combination of multiscale coupling + edge compute + rugged sensors arranged as a “stitched” field solver. I personally hadn’t seen that configuration before, and the domain experts we shared it with concluded that it’s robust, new, and directionally plausible.
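To make the coupling pattern concrete, here is a toy sketch of the decompose/solve/exchange idea: a 1-D heat field split into local domains, each stepped independently, with halo cells swapped as boundary conditions the way the disclosure's edge nodes trade boundary conditions. This is illustrative only, not code from the DECWPN disclosure:

```python
import numpy as np

def step(u: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """One explicit finite-difference diffusion step on a local domain."""
    new = u.copy()
    new[1:-1] = u[1:-1] + alpha * (u[2:] - 2 * u[1:-1] + u[:-2])
    return new

# Split a 100-cell global field into 4 local domains with one-cell halos.
global_field = np.zeros(100)
global_field[45:55] = 1.0  # hot spot ("ignition")
domains = [global_field[i*25 - (i > 0) : (i+1)*25 + (i < 3)].copy()
           for i in range(4)]

for _ in range(50):
    domains = [step(d) for d in domains]           # local solves, parallelizable
    for left, right in zip(domains, domains[1:]):  # boundary-condition exchange
        left[-1] = right[1]   # right neighbor's first interior cell
        right[0] = left[-2]   # left neighbor's last interior cell
# Each domain now holds its slice of the diffused field, stitched together
# purely through the halo exchange.
```

In the real system each "domain" would be a rugged edge node running a richer fuel/heat/flow solve, but the stitching logic is the same shape.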
But is it “novel” in a legal or academic sense? Well, that brings me to the second bit:
We don't claim novelty in the strong “this is scientifically unprecedented” sense because that’s not our role, and it’s not something anyone can assert responsibly. We're not inventing or solving new physics, we're working from first principles of known physics.
But we do check for novelty in the defensive disclosure sense:
Every invention report includes a reference section listing any related patents, related academic papers, prior mechanisms, and adjacent methods.
The engine checks whether the specific configuration, mechanism, or architecture appears in the clusters of relevant prior art we surface.
We’re not doing a patent-examiner-grade, state-of-the-art search (because that requires access we don’t have), but we do run a structured “is this described anywhere obvious?” pass.
If something is obviously described already, the system pivots or refines.
If not, we proceed, and the references are included so a reader can see the conceptual neighborhood the idea sits in.
This isn’t a novelty guarantee (no one can give that), but it does filter out obvious rehashes and ensures each disclosure is at least not trivially duplicated from known literature.
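For what it's worth, the gating logic of such a pass boils down to a similarity threshold. A minimal sketch, assuming the candidate mechanism and the surfaced prior-art abstracts have already been embedded with some embedding model; the function names and the 0.9 threshold are illustrative, not our production values:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def obvious_duplicate(candidate: np.ndarray,
                      prior_art: list[np.ndarray],
                      threshold: float = 0.9) -> tuple[bool, float]:
    """Flag the candidate if any surfaced prior-art vector is too close."""
    best = max((cosine(candidate, v) for v in prior_art), default=0.0)
    return best >= threshold, best

# If flagged, the engine pivots or refines; otherwise it proceeds and the
# near-neighbors become the reference section of the report.
```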
And importantly: For defensive disclosure purposes, documenting the idea clearly, timestamping it, and making it public is what matters. Even if an invention later turns out to have partial overlap with something obscure, the published disclosure still functions as citable prior art.
Now let me try to claim the bonus points:
For each invention, the third document we produce (the industry/societal impact report) tries to quantify, or at least articulate:
who would use it
what it could change
what barriers exist
which industries it touches
where it fits in current trends
what potential global value could be unlocked
This is less about hype and more about giving context so readers can judge whether it’s a direction they want to explore.
For the wildfire system, for example, the meaningful part for me is simple: anything that moves the needle on early prediction timelines is meaningful, even incrementally.
Our problem-seeking agents in the pipeline are dialed in on choosing impactful human problems to solve in all domains - so in this sense, all of our innovations are hopefully "meaningful" as buildable, open-source solutions to large-scale human problems.
On the publishing side:
We render PDFs with consistent front-matter, metadata, and internal IDs.
Hashes and PDFs are anchored on Arweave for timestamping and persistence.
A WordPress front end and some custom templates serve the human-readable library.
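The anchoring step itself is small. A minimal sketch, where the hashing is real standard-library code but upload_to_arweave is a hypothetical stand-in for whatever Arweave client you actually use:

```python
import hashlib
import pathlib

def anchor_pdf(pdf_path: str, innovation_id: str) -> dict:
    data = pathlib.Path(pdf_path).read_bytes()
    manifest = {
        "id": innovation_id,
        "filename": pathlib.Path(pdf_path).name,
        "sha256": hashlib.sha256(data).hexdigest(),  # content fingerprint
    }
    # Hypothetical helper: a real version would call an Arweave client/SDK.
    manifest["arweave_tx"] = upload_to_arweave(data, tags=manifest)
    return manifest

def upload_to_arweave(data: bytes, tags: dict) -> str:
    # Stand-in so the sketch runs; replace with a real upload.
    return "tx_" + hashlib.sha256(data).hexdigest()[:16]
```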
Throughput is currently set at ~4 innovations per day, mainly constrained by API cost and some deliberately conservative safety checks. There is a scheduler that manages queues, retries, and backpressure so the thing can run 24/7 without me babysitting it.
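The scheduler pattern is the standard bounded-queue-plus-backoff loop. A toy version, with illustrative names and retry counts:

```python
import queue
import time

jobs = queue.Queue(maxsize=8)  # bounded queue = backpressure on enqueuers

def worker(run_job, max_retries: int = 3) -> None:
    while True:
        job = jobs.get()                  # blocks until work is available
        for attempt in range(max_retries):
            try:
                run_job(job)              # one innovation through the pipeline
                break
            except Exception:
                time.sleep(2 ** attempt)  # exponential backoff, then retry
        jobs.task_done()
```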
Happy to go deeper on any of the above if there is interest.
On AI Hallucinations:
People have asked about AI hallucinations, so here’s the honest version.
TL;DR: Hallucinations sometimes happen. But they don’t break the usefulness or validity of the disclosures, and the system is designed so that incorrect details don’t invalidate the core idea. Modern models hallucinate less than they used to, and the disclosure format is resilient to the remaining noise.
More detail:
Hallucination rates are dropping over time. Across the last two model generations, we’ve seen a very noticeable decline in fabrications when using structured, constrained, multi-agent prompting. It’s not perfect, but the trend is clearly improving.
We don’t treat the AI as an oracle. The engine is more like a tireless research assistant than a truth machine. The supervisor + refinement loop forces internal consistency, so the output tends to be coherent even if some details aren’t experimentally validated.
Incorrect details don’t invalidate prior art. Prior art doesn’t have to be flawless to count. In practice, examiners and courts often rely on the portions that are correct, even in papers or disclosures that contain errors. What matters is that the idea, mechanism, and implementation pathway are publicly taught. A disclosure can be partially wrong and still function fully as prior art.
We run internal filters to catch the worst issues. There are consistency checks, cross-agent critiques, domain sanity filters, and safety blocklists. These reduce (but don’t 100% eliminate) hallucinations. The goal is to keep the output in a usable, conceptually solid range.
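As a flavor of what those filters look like, the final publish gate reduces to something like the sketch below; the category names and the supermajority threshold are illustrative, not the production values:

```python
BLOCKED_CATEGORIES = {"biohazard", "weapons", "high_risk_surveillance"}

def may_publish(draft_category: str, critic_signoffs: list[bool]) -> bool:
    """Blocklist check first, then a supermajority of critic agents."""
    if draft_category in BLOCKED_CATEGORIES:
        return False
    return bool(critic_signoffs) and (
        sum(critic_signoffs) >= 0.75 * len(critic_signoffs)
    )
```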
The point isn’t to assert truth; it’s to make ideas public. These aren’t meant to be validated research papers. They’re structured disclosures designed to teach a concept clearly enough that a practitioner can understand and build on it. This is the bar for defensive publication, not empirical proof.
Valid portions still create value. Even if a paragraph is off, the rest of the disclosure still stands. And if anything truly incorrect is identified, we treat that as feedback for improving the filtering and refinement loops.
So yes, hallucinations exist. But they don’t undermine the purpose of making ideas public, timestamped, findable, and structurally useful as prior art. And as the models improve, the signal-to-noise ratio keeps improving too.
Happy to dig deeper into how we evaluate or mitigate this if anyone wants specifics.