Why Local-First Matters for Patent Discovery
Before you file a patent, your invention is a trade secret. The moment you upload source code or invention details to a cloud AI service, you introduce risk across three independent legal frameworks.
Most AI patent tools operate on already-filed patents: searching, analyzing, and visualizing public databases. That work is downstream, where confidentiality is not a concern. The problem is upstream: unfiled inventions, draft disclosures, and source code containing novel concepts that have no filing date.
This is where architecture matters. Local-first scanning eliminates all three risks by ensuring that invention data never leaves the developer's machine.
Last updated: March 2026. This page is informational only and not legal advice. Consult a patent attorney for your specific situation.
Three Legal Frameworks at Risk
Each framework independently creates liability when pre-filing invention data is processed in the cloud. Every patent attorney we spoke to raised at least one of them unprompted.
Trade Secret Exposure
DTSA / State Equivalents
Trade secret protection requires "reasonable measures" to maintain secrecy. Uploading unfiled invention details to third-party cloud services with broad terms of service may undermine that standard.
Foreign Filing License
35 U.S.C. 184
Inventions made in the U.S. generally must be filed domestically first (or a foreign filing license obtained). Processing code on servers in unknown or foreign jurisdictions creates compliance risk, and most cloud AI providers do not guarantee data residency.
Attorney-Client Privilege
Heppner v. Agilent (SDNY 2026)
Consumer AI tools whose terms permit training on inputs may constitute a waiver of privilege. Enterprise agreements with no-training clauses mitigate this risk; consumer tools offer no such protection.
The Prior Art Trap
There is a fourth risk that operates independently of the three legal frameworks above: your own cloud disclosure becoming prior art against your own patent application.
In the United States, 35 U.S.C. 102 provides a one-year grace period for your own disclosures. But that grace period has limits. If your invention details are stored on third-party servers, used for model training, or become reconstructible by other users, the clock starts ticking. Documented cases exist of shared AI conversations appearing in search engine results.
United States
1-year grace period
Under 35 U.S.C. 102, you have 12 months from your own public disclosure to file. But any unintended leak to a cloud service starts that clock, and the grace period does not protect against third-party disclosures derived from your input.
Europe, China, Japan, and Most of the World
Absolute novelty. No grace period.
Any public disclosure anywhere before filing destroys patentability. There is no forgiveness. If invention details transmitted to a cloud AI become accessible to anyone outside your organization, international patent rights may be permanently lost.
Cloud AI providers store inputs for safety monitoring, debugging, and in many cases model improvement. Even with opt-out settings, enforcement is uncertain. The conservative approach: assume that anything uploaded to a cloud LLM could become accessible, and treat that as a potential disclosure event.
What Patent Attorneys Told Us
We spoke with three patent attorneys about AI tools in their practice. All three independently identified confidentiality as the primary barrier to adoption.
Keep the documents on local machines. Sandbox approach. Don't store patent documents in cloud unless you know exactly where those servers are.
Patent attorney, boutique firm with enterprise patent experience. His primary concern was data residency and foreign filing compliance. Enterprise clients have strict data governance policies, and any tool that touches pre-filing invention data needs to operate within those policies.
I talk to patent attorney friends. 'Are you using AI tools?' Answer is always no.
Patent attorney, formerly Perkins Coie and Microsoft in-house. The reason is not capability skepticism. Attorneys will not risk disclosing unfiled inventions to any cloud provider whose terms of service permit training on inputs. He also introduced the "time exposure" concept: risk is not just about where data goes, but how long it sits unprotected.
Zero of her clients have approved any AI tool for patent work.
Partner at an Am Law firm, 12 years of patent practice. The barrier is not technical sophistication. There is no audit process to verify that cloud AI providers do not train on your data, and no contractual assurance sufficient for pre-filing invention documents. She was specifically interested in local-first architecture when we described it.
Why Most AI Patent Tools Fail This Test
The current landscape of AI patent tools primarily operates on already-filed patent data. They search, analyze, and visualize existing patent databases. This is downstream work where confidentiality is not a concern because the patents are already public.
The problem is upstream: unfiled inventions, draft disclosures, and source code containing novel concepts. This is where cloud processing creates the most risk.
| Tool | Operates On | Architecture | Pre-Filing Safe? |
|---|---|---|---|
| Patsnap | Filed patent databases | Cloud | N/A (downstream) |
| IPRally | Filed patent databases | Cloud | N/A (downstream) |
| Cypris | Filed patent databases | Cloud | N/A (downstream) |
| ChatGPT / Claude | Any text input | Cloud | No |
| Azure OpenAI (Enterprise) | Any text input | Cloud (contracted) | Partial |
| ObviouslyNot Scanner | Source code | Local | Yes |
Consumer AI tools are especially problematic for upstream patent work: terms of service typically permit training on inputs, there is no guaranteed data residency, no contractual confidentiality sufficient for pre-filing documents, and no audit trail for compliance verification.
Enterprise AI agreements with no-training clauses improve the picture but still involve data transmission to third-party infrastructure. The gap in the market is clear: to our knowledge, no other tool scans source code for patentable concepts, and none does it locally.
How Local-First Scanning Works
The ObviouslyNot scanner runs entirely on your machine. No cloud. No API keys. No internet connection required after the initial model download.
Download the binary
Single executable for macOS, Linux, or Windows. Published on GitHub. No account required.
Install Ollama
Provides the local LLM runtime. One-time setup, roughly five minutes.
Run the scan
Point the scanner at any local directory. It reads source files, analyzes patterns, and identifies potentially patentable concepts.
Review results
Scored patent concepts with evidence linking back to specific files and code lines. Everything stays on your machine.
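The scan step above can be sketched as a minimal local loop. This is an illustrative sketch only, not the ObviouslyNot scanner's actual implementation: the file extensions and keyword markers are hypothetical stand-ins for the local LLM analysis that Ollama provides. The point it demonstrates is architectural, everything happens in-process with no network calls.

```python
import os

# Hypothetical stand-in markers; the real scanner delegates
# concept identification to a local LLM rather than keywords.
INTERESTING = ("def ", "class ", "cache", "encrypt")

def scan_directory(root):
    """Walk a local directory and collect candidate concepts.

    All reads and analysis happen on this machine: no uploads,
    no API keys, no disclosure event.
    """
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".py", ".go", ".rs", ".ts")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                for lineno, line in enumerate(f, start=1):
                    if any(marker in line for marker in INTERESTING):
                        # Evidence links back to a specific file and line.
                        findings.append({
                            "file": path,
                            "line": lineno,
                            "evidence": line.strip(),
                        })
    return findings
```

In the real workflow, each candidate would then be passed to the local model for scoring; the evidence structure (file, line, excerpt) is what lets the final disclosure trace every concept back to source.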
What the scanner does and does not do
The scanner is a discovery and structuring tool. It identifies potentially patentable concepts in your code, scores them, links evidence to specific files, and formats the output as a technical disclosure. It is useful for expanding keywords and technical terminology, mapping invention features, clustering related concepts, and comparing candidate ideas against each other.
It does not make final legal judgments. Novelty assessment, obviousness analysis, claim strategy, and filing decisions still require a qualified patent professional reviewing original sources. The scanner accelerates the path to that conversation. It does not replace it.
What Local Processing Resolves
Each risk maps to a specific architectural decision. The table below shows how local processing resolves what cloud processing introduces.
| Legal Framework | Cloud Processing Risk | Local Processing |
|---|---|---|
| Trade secret (DTSA) | Upload may undermine "reasonable measures" | Data never leaves machine. Secrecy maintained. |
| Foreign filing (35 USC 184) | Unknown server jurisdiction | No data transmission. No jurisdiction issue. |
| Privilege (Heppner) | Training-permissive ToS may waive privilege | No third-party involvement. Privilege preserved. |
| Prior art (35 USC 102) | Cloud disclosure may start grace period or destroy international novelty | No disclosure event. No novelty risk in any jurisdiction. |
| Audit trail | Cloud outputs are hard to reproduce, validate, or audit | Full local record of what was analyzed and how conclusions were reached. |
| Time exposure | Days or weeks of cloud storage pre-filing | Minutes from scan to disclosure. Minimal window. |
The audit trail matters more than it seems. For important IP decisions, companies need a defensible record of what was searched, what sources were used, and how conclusions were reached. Cloud AI outputs can change between model versions with no notice. Local processing gives you a reproducible, verifiable chain of custody from scan to disclosure.
Speed-to-File as a Security Feature
Security is not just about where data sits. It is about how long it sits unprotected.
The ideal workflow compresses the exposure window to near zero:
Scan codebase locally
Review discovered concepts
Structure into disclosure or PPA
File with USPTO
From invention identification to filed application in under 24 hours. The scanner does not just protect privacy during analysis. It accelerates the entire pathway from "I might have something patentable" to "I have a filing date."
A practical approach: if you are unsure whether something is patentable, file a provisional patent application first. A PPA costs $65-$325 and gives you 12 months of "patent pending" protection. Once filed, you can use cloud tools more safely on the filed disclosure because it already has a priority date. The scanner helps you identify what to file before you need to decide how.
But My Code Is Already on GitHub
This is the most common pushback. If the source code is already public on GitHub, why does local scanning matter?
The answer: source code on GitHub is raw implementation. It does not identify what is novel or patentable. The scanner creates a fundamentally different document, one that maps inventive concepts, identifies claim boundaries, scores novelty, and structures information for patent filing.
Source Code on GitHub
- Raw implementation
- No identification of what is novel
- No claim boundaries
- No strategic filing information
The haystack
Scanner Output
- Identifies what the inventor believes to be novel
- Structures claims and defines protection scope
- Includes prior art analysis and strategic decisions
- Triggers foreign filing and export control frameworks
The map showing which needles are valuable
A GitHub repo is a haystack. The scanner's output is a map showing exactly which needles are valuable. That map deserves different security treatment than the haystack.
Frequently Asked Questions
If my code is on GitHub, why does local scanning matter?
Source code is implementation. The scanner output identifies what is novel, maps claim boundaries, and structures filing information. That document is more sensitive than the code itself because it explicitly identifies inventive concepts and reveals patent strategy.
Does local-first mean the results are worse?
No. The local scanner uses the same analysis pipeline. The LLM runs locally through Ollama, and the quality of concept identification and evidence linking is comparable to cloud processing.
What about the foreign filing license requirement?
Under 35 U.S.C. 184, inventions made in the U.S. must be filed domestically first. Processing invention data on foreign servers creates compliance questions. Local processing avoids this entirely because data never leaves your machine.
What about international patents?
Most countries outside the United States apply absolute novelty rules: any public disclosure before filing destroys patentability with no grace period. If invention details uploaded to a cloud AI service become accessible to third parties, international patent rights may be permanently lost. Local processing avoids this risk entirely because there is no disclosure event.
Can I use cloud AI tools for patent work that is already filed?
Yes. Once a patent application is filed with the USPTO, the risk profile changes. Prior art searches, office action analysis, and portfolio management on already-public patents do not carry the same confidentiality concerns. The risks described here apply specifically to pre-filing work.
How does local scanning handle large codebases?
The scanner processes repositories of any size by analyzing files incrementally. Processing time scales with codebase size. For typical repositories, a full scan takes minutes.
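Incremental processing can be sketched as two generators: one that yields files lazily so memory use stays flat regardless of repository size, and one that splits each file into fixed-size chunks. This is an illustrative sketch under stated assumptions, not the scanner's implementation; the extension list and the 200-line chunk size are hypothetical values standing in for whatever fits the local model's context window.

```python
import os

def iter_source_files(root, exts=(".py", ".go", ".rs", ".ts")):
    """Yield source file paths one at a time, so memory use does
    not grow with repository size."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                yield os.path.join(dirpath, name)

def chunk_lines(path, chunk_size=200):
    """Split a file into fixed-size line chunks sized to fit a
    local model's context window (chunk_size is an assumption)."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        lines = f.readlines()
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]
```

Each chunk would be analyzed independently and the findings merged, which is why total processing time scales linearly with codebase size.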
Related Resources
From ObviouslyNot
- Patent Documents and the Cloud: What Developers Need to Know
- The PPA as a Stock Option on Your IP
- The Three-Level Disclosure Framework
- Download the Patent Scanner