Why We Built on pgvector Instead of a Dedicated Vector Database
foundation4.ai stores every vector in PostgreSQL. Not in Pinecone. Not in Weaviate. Not in Qdrant. In the same PostgreSQL instance that holds the documents, the metadata, the classification hierarchies, the permission boundaries, and the version history. One database for the entire knowledge layer.
That's not a compromise we arrived at reluctantly — it's the architectural foundation the platform is built on, and the reason foundation4.ai can deliver capabilities that split-system architectures struggle to match. Temporal versioning where vectors and documents stay transactionally consistent. Classification-enforced access control that operates at the query level, not as a post-retrieval filter. Metadata predicates and vector similarity composed in a single query plan. And a hybrid search roadmap — combining semantic similarity with full-text keyword retrieval and cross-encoder re-ranking — that becomes dramatically simpler when vectors and text live in the same database engine.
The AI infrastructure industry spent the last three years telling teams they need a dedicated vector database. For the workloads foundation4.ai serves — defense, intelligence, regulated enterprise, any organization that takes data sovereignty seriously — that advice creates more problems than it solves. This post explains why.
The Two-System Problem
A production knowledge retrieval platform doesn't just store vectors. It stores documents with classification hierarchies, metadata fields, version histories, permission boundaries, and temporal state. When a user queries the system, the platform must embed the query, search for similar vectors, filter by classification and metadata, enforce access control, and return results that respect the document's current version — all in a single request path.
If vectors live in a dedicated vector database and everything else lives in PostgreSQL, every one of those operations becomes a coordination problem. The vector database returns candidate fragment IDs ranked by similarity. Then the application must round-trip to PostgreSQL to check whether each candidate's parent document is still active (not expired, not deleted), whether the requesting API key has permission to access that classification subtree, whether the metadata matches the query filters, and whether the document version is current as of the requested point in time.
That coordination isn't just slow — it's architecturally fragile. You're now maintaining consistency between two systems that have different transaction models, different failure modes, and different backup schedules. A vector database doesn't know about PostgreSQL's ACID guarantees. PostgreSQL doesn't know about the vector database's eventual consistency model. The seam between them becomes the source of bugs that are hard to reproduce, hard to diagnose, and hard to explain to a security auditor.
The right answer is no seam at all.
What pgvector Actually Gives You
pgvector is a PostgreSQL extension that adds vector data types and similarity search operators directly to the database. You store embeddings as a column on a table, create an index (HNSW or IVFFlat), and query with standard SQL operators that compute cosine similarity, Euclidean distance, or inner product. The vectors live alongside every other column — classification, metadata, version, timestamps, permission flags — in the same row, the same table, the same transaction.
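As a concrete sketch of that model (the table and column names here are illustrative assumptions, not foundation4.ai's actual schema):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE fragments (
    id             bigserial PRIMARY KEY,
    document_id    bigint NOT NULL,
    classification text   NOT NULL,
    metadata       jsonb  NOT NULL DEFAULT '{}',
    valid_from     timestamptz NOT NULL DEFAULT now(),
    valid_to       timestamptz,        -- NULL while the fragment is current
    embedding      vector(1536)        -- dimension depends on the embedding model
);

-- HNSW index over cosine distance
CREATE INDEX ON fragments USING hnsw (embedding vector_cosine_ops);

-- Ten nearest fragments; <=> is pgvector's cosine-distance operator
-- ($1 is the query embedding, passed as a parameter)
SELECT id, embedding <=> $1::vector AS distance
FROM fragments
ORDER BY distance
LIMIT 10;
```

pgvector also provides `<->` for Euclidean distance and `<#>` for negative inner product, so the distance function is chosen per query rather than fixed at index time for the table as a whole.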
This colocation is the foundation of everything else foundation4.ai does well.
Metadata filtering happens in the same query. When an agent searches for fragments, the WHERE clause combines vector similarity with classification scoping, metadata filters, version checks, and permission enforcement in a single SQL statement. There's no post-filtering, no round-trip to a second system, no application-level join logic. PostgreSQL's query planner handles it. The result set is correct by construction — not by hoping two systems agree.
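A sketch of what that composed query looks like (again with an illustrative schema):

```sql
-- Similarity search, classification scope, metadata filter, and
-- version check in one statement -- nothing filtered after the fact.
SELECT f.id, f.document_id,
       f.embedding <=> $1::vector AS distance
FROM fragments f
WHERE f.classification LIKE 'secret/operations%'  -- classification subtree
  AND f.metadata @> '{"status": "active"}'        -- jsonb metadata predicate
  AND f.valid_to IS NULL                          -- current version only
ORDER BY distance
LIMIT 20;
```

The planner can choose to drive this from the HNSW index and apply the relational predicates as it scans candidates, which is precisely the composition a split architecture has to reimplement in application code.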
ACID compliance covers vectors and documents together. When a document is updated, the old version's fragments can be expired and the new version's fragments inserted in a single transaction. There's no window where the vector index contains embeddings for a document that PostgreSQL considers deleted, or vice versa. Point-in-time queries — foundation4.ai's as_of parameter, which lets you ask "what did the knowledge base look like at timestamp X?" — work because PostgreSQL's temporal data model applies to vectors exactly as it does to any other column.
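The version swap described above is one transaction (sketch, illustrative schema):

```sql
BEGIN;

-- Expire the current version's fragments
UPDATE fragments
SET valid_to = now()
WHERE document_id = 42 AND valid_to IS NULL;

-- Insert the new version's fragments ($1 is an embedding parameter)
INSERT INTO fragments (document_id, classification, metadata, embedding)
VALUES (42, 'secret/operations', '{"status": "active"}', $1::vector);

COMMIT;
-- No intermediate state is ever visible: concurrent readers see either
-- the old fragments or the new ones, never a mixture.
```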
Backup and recovery is one operation. A single pg_dump or streaming replication setup captures vectors, documents, metadata, classifications, permissions, and version history together. There's no reconciliation step after a restore. There's no "the vector database recovered to 3:00 AM but PostgreSQL recovered to 3:05 AM" scenario. For organizations that require auditable disaster recovery — and in defense, intelligence, and regulated enterprise, that's not optional — this matters more than raw query speed.
Access control is enforced at the data layer. foundation4.ai's classification-based permission system restricts which API keys can access which document subtrees. Because vectors and classifications live in the same database, that restriction is enforced in the same query that performs the similarity search. An API key scoped to secret/operations physically cannot retrieve vectors classified under top-secret/programs — not because the application filters them out after retrieval, but because the query never sees them. For environments operating under security classification guides, this distinction between "filtered after the fact" and "never accessed" is the difference between compliant and non-compliant.
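PostgreSQL's row-level security is one mechanism for exactly this kind of query-level enforcement. The post doesn't specify foundation4.ai's implementation, so treat this as a sketch of the general technique, not the platform's actual code:

```sql
-- RLS applies to roles without the BYPASSRLS attribute.
ALTER TABLE fragments ENABLE ROW LEVEL SECURITY;

-- Rows outside the caller's classification subtree are invisible
-- to every query, including similarity search.
CREATE POLICY classification_scope ON fragments
    USING (classification LIKE current_setting('app.scope') || '%');

-- The application sets the caller's scope per session or transaction:
SET app.scope = 'secret/operations';
```

With a policy like this, a similarity search scoped to `secret/operations` never reads rows under `top-secret/programs` at all, which is the "never accessed" guarantee rather than the "filtered after the fact" one.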
The Performance Question
The most common objection to pgvector is performance. Dedicated vector databases are purpose-built for similarity search, and at extreme scale — hundreds of millions of vectors, thousands of queries per second — they can outperform PostgreSQL on raw vector operations. That's a real technical fact, and we don't dispute it.
But performance benchmarks for isolated vector operations don't reflect how a production enterprise system actually works. In a real deployment, almost no query is a pure vector search. Nearly every query combines similarity search with at least one metadata filter, a classification scope, and a permission check. This is exactly where PostgreSQL excels: combining vector similarity with relational predicates in a single query plan is something it does natively, while dedicated vector databases either don't support it at all or approximate it through costly post-filtering that wastes work on results that will be discarded.
Consider a concrete example. An intelligence analyst executes a query against secret/operations with metadata filters for a specific mission program and active document status. In foundation4.ai, that becomes a single SQL query: the HNSW index produces vector similarity candidates, the WHERE clause eliminates fragments outside the classification subtree and metadata scope, and the result set respects document versioning — all before a single row is returned to the application. With a dedicated vector database, the similarity search runs blind to all of those constraints. The application retrieves a broad set of candidates, then filters against PostgreSQL for classification, permissions, metadata, and version status. The vector database did work retrieving fragments the application will discard. PostgreSQL did work answering questions that could have been part of the original query. Both systems burned cycles that a unified architecture avoids.
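A sketch of that analyst query as a single statement (schema and the program name `MISSION_X` are illustrative assumptions; `$1` is the query embedding, `$2` the as-of timestamp):

```sql
SELECT f.id, f.document_id,
       f.embedding <=> $1::vector AS distance
FROM fragments f
WHERE f.classification LIKE 'secret/operations%'        -- classification subtree
  AND f.metadata @> '{"program": "MISSION_X",
                      "status": "active"}'              -- mission + status filters
  AND f.valid_from <= $2                                -- point-in-time ...
  AND (f.valid_to IS NULL OR f.valid_to > $2)           -- ... version check
ORDER BY distance
LIMIT 20;
```

Every constraint participates in one query plan, so no candidate is retrieved only to be thrown away by a second system.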
pgvector supports two index types, and the choice between them matters. HNSW (Hierarchical Navigable Small World) is the right default for most workloads: it delivers high recall at high throughput, handles incremental inserts without rebuilding, and scales logarithmically with dataset size. In benchmarks, HNSW achieves recall rates above 99% with query throughput an order of magnitude higher than the alternative. IVFFlat (Inverted File with Flat compression) builds faster and uses less memory, making it useful for development environments or workloads where index rebuild time matters more than peak query speed. foundation4.ai uses HNSW in production, where its incremental update capability aligns with the platform's continuous ingestion model — documents arrive via NATS queues around the clock, and the index absorbs new fragments without a rebuild step.
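The two index types are created with different tuning knobs (the parameter values shown are pgvector's documented defaults and rules of thumb, not foundation4.ai's production settings):

```sql
-- HNSW: slower build, incremental inserts, higher recall and throughput.
-- m and ef_construction are build-time knobs (pgvector defaults: 16, 64).
CREATE INDEX ON fragments USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- IVFFlat: faster build, lower memory; best built after data is loaded,
-- with lists roughly rows/1000 for corpora up to about a million rows.
CREATE INDEX ON fragments USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 1000);

-- HNSW's query-time recall/speed trade-off (default 40):
SET hnsw.ef_search = 100;
```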
At the corpus sizes typical of organizational knowledge bases — thousands to low millions of documents, producing tens of millions of embedded fragments — HNSW delivers sub-200ms search latency consistently. foundation4.ai's target deployments — a defense program's mission documentation, a law firm's contract corpus, a hospital system's clinical knowledge base, an enterprise's internal documentation — produce vector counts measured in millions, not billions. At this scale, pgvector's performance is not a compromise. It's more than sufficient, and the operational simplicity it buys is worth more than the marginal latency improvement a dedicated system might offer on a synthetic benchmark that ignores metadata, permissions, and versioning entirely.
The Hybrid Search Advantage
Vector similarity is powerful, but it's not the only retrieval strategy that matters. Semantic search excels at matching intent — a query about "unauthorized access attempts" will retrieve documents about "security breaches" even when the exact words don't overlap. But pure vector search has a well-known blind spot: it can miss exact terminology that matters. A query for "CVE-2024-3094" needs to match that precise string, not a semantically similar concept. A legal team searching for a specific contract clause number needs keyword precision, not approximate meaning.
The answer is hybrid search — combining vector similarity with traditional full-text keyword retrieval, then using a re-ranking model to score and merge the results. This is where PostgreSQL's architecture becomes a compounding advantage.
PostgreSQL has shipped production-grade full-text search for over two decades. The tsvector and tsquery types, GIN indexes, ranking functions, phrase matching, language-aware stemming — these aren't bolted-on features. They're deeply integrated capabilities with their own query planner support, their own index types, and their own performance characteristics that have been refined across hundreds of PostgreSQL releases. With pgvector installed alongside, the same database engine that performs vector similarity search also performs keyword retrieval — in the same query, against the same data, within the same transaction.
This colocation matters enormously for hybrid search. In a split-system architecture, hybrid retrieval requires two separate queries to two separate systems (a vector search to the vector database, a keyword search to a text search engine or relational database), followed by application-level result merging and re-ranking. That's three systems coordinating on a single user query — with all the latency, consistency, and failure-mode complexity that entails.
In PostgreSQL, the hybrid query is one query. The vector similarity scores and the full-text relevance scores can be computed in the same query plan, filtered by the same classification and metadata predicates, and passed to a re-ranking step that operates on a unified candidate set. The database does the work that a split architecture forces the application to do — and it does it with the consistency guarantees that PostgreSQL provides natively.
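One way to sketch that single hybrid query, assuming an illustrative schema with a `tsvector` column `tsv` maintained over the fragment text and a simple weighted score fusion (reciprocal rank fusion is a common alternative):

```sql
WITH semantic AS (
    SELECT id, 1 - (embedding <=> $1::vector) AS vec_score
    FROM fragments
    WHERE valid_to IS NULL
    ORDER BY embedding <=> $1::vector
    LIMIT 50
),
keyword AS (
    SELECT id, ts_rank_cd(tsv, websearch_to_tsquery('english', $2)) AS kw_score
    FROM fragments
    WHERE valid_to IS NULL
      AND tsv @@ websearch_to_tsquery('english', $2)
    ORDER BY kw_score DESC
    LIMIT 50
)
SELECT id,
       COALESCE(s.vec_score, 0) * 0.7
     + COALESCE(k.kw_score, 0) * 0.3 AS score
FROM semantic s
FULL OUTER JOIN keyword k USING (id)
ORDER BY score DESC
LIMIT 20;
```

Both candidate sets come from the same tables under the same predicates, so the merged list is consistent by construction before any re-ranking model sees it.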
foundation4.ai's hybrid search capability is on the near-term roadmap, and the architectural decision to build on PostgreSQL is what makes it practical. We don't need to integrate a third system. We don't need to build a coordination layer. We don't need to reconcile result sets from different engines with different scoring models. We need to compose capabilities that already exist in the same database — and that's exactly the kind of problem PostgreSQL was built to solve.
What We Didn't Have to Build
The underappreciated benefit of choosing pgvector is everything we didn't have to build around it.
We didn't have to build a synchronization layer between a vector database and a relational database. We didn't have to design a reconciliation process for when those systems disagree after a partial failure. We didn't have to implement distributed transactions across two storage engines with different consistency models. We didn't have to maintain two backup strategies, two monitoring dashboards, two scaling plans, two security audit surfaces, two sets of access credentials, two patch cycles.
We didn't have to explain to a CISO why sensitive document vectors are stored in a separate system with a different access control model than the documents themselves. We didn't have to convince an accreditation authority that the data boundary is sound when classified content flows between two independently managed datastores. We didn't have to answer the inevitable question from an inspector general or compliance auditor: "If the vector database and the relational database disagree about which documents exist, which one is authoritative?"
Instead, we built foundation4.ai's temporal versioning — full document version history with point-in-time queries — directly on PostgreSQL's strengths. We built the classification hierarchy and permission system as database-level constraints that apply to vectors and documents identically. We built metadata filtering as SQL predicates that compose naturally with vector similarity operators. We built the as_of parameter as a temporal query against the same tables that store embeddings. We built the taxonomy system — parent-child relationships between metadata values, resolved at query time via the $teq operator — as standard relational lookups that execute in the same query plan as the vector search.
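The taxonomy resolution described above maps naturally onto a recursive CTE. This is a sketch of the general relational pattern, with illustrative table names, not foundation4.ai's actual `$teq` implementation:

```sql
-- Resolve a metadata value and all of its descendants, then use the
-- subtree as a filter in the same plan as the vector search.
WITH RECURSIVE subtree AS (
    SELECT value FROM taxonomy WHERE value = $1      -- the queried value
    UNION ALL
    SELECT t.value
    FROM taxonomy t
    JOIN subtree s ON t.parent = s.value             -- walk down to children
)
SELECT f.id
FROM fragments f
WHERE f.metadata->>'category' IN (SELECT value FROM subtree)
ORDER BY f.embedding <=> $2::vector
LIMIT 20;
```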
Every one of those features would have been harder, slower, and less reliable if vectors lived in a different system. Some of them — particularly temporal versioning with vector consistency, and classification-enforced access control over similarity search — would have been architecturally impractical.
The Operational Argument
For the teams that deploy foundation4.ai — infrastructure engineers at defense contractors, DevOps teams at enterprise IT organizations, platform engineers at government agencies — PostgreSQL is a known quantity. They know how to monitor it with Prometheus. They know how to scale it with read replicas. They know how to back it up, how to restore it, how to tune it, how to secure it. The operational playbook for PostgreSQL in production has been written over three decades and is battle-tested at every scale that matters.
A dedicated vector database is none of those things. It's a new operational surface with its own deployment model, its own monitoring requirements, its own failure modes, and its own talent pool. Finding a PostgreSQL DBA takes days. Finding someone with deep operational experience running Weaviate or Qdrant in a classified environment is a very different proposition.
When foundation4.ai deploys on Kubernetes — PostgreSQL, NATS, Redis, Prometheus, all in one namespace via Helm — the entire stack is composed of components that enterprise infrastructure teams already understand. Adding a dedicated vector database would increase the operational surface area of every deployment without delivering a benefit that matters at the scale these deployments operate.
The data sovereignty dimension reinforces this. Many dedicated vector databases are cloud-native services — Pinecone is SaaS-only, and even self-hostable options like Weaviate and Qdrant add a separate system that must be secured, audited, and kept within the network boundary. For air-gapped deployments — a SCIF, a classified mission network, a HIPAA enclave — every additional system is another component that must be validated for the environment, another container image to pre-pull, another configuration surface to lock down. PostgreSQL is already there. It's already validated. It's already understood by the authorization body. pgvector adds vector capability to that existing, trusted component without expanding the security perimeter at all.
When a Dedicated Vector Database Makes Sense
We're not arguing that dedicated vector databases have no place. If you're building a consumer-scale recommendation engine over billions of items, a real-time fraud detection system processing thousands of similarity queries per second, or a search product where pure vector throughput is the primary bottleneck — the engineering trade-offs shift. Purpose-built systems can deliver higher throughput at extreme vector counts, and the operational overhead of a second system is justified by the performance requirements.
But that's not what foundation4.ai is built for. foundation4.ai is secure knowledge infrastructure for organizations that need their proprietary data searchable, their access controls enforced, their version history auditable, and their entire stack running inside their own boundary. For that workload, the question isn't whether pgvector is fast enough; it is. The question is whether the additional operational complexity of a dedicated vector database delivers enough value to justify the cost. For our customers, it doesn't.
One Database, One Truth
The decision to build on pgvector wasn't a shortcut. It was a deliberate architectural choice that makes foundation4.ai simpler to deploy, easier to secure, cheaper to operate, and more reliable under the conditions that enterprise and mission-critical environments actually impose.
The AI infrastructure landscape is still young, and architectural decisions made now will compound for years. Teams that adopt a dedicated vector database today are signing up for the long-term operational cost of keeping two systems synchronized, two security surfaces hardened, and two failure modes understood — indefinitely. Teams that build on PostgreSQL with pgvector inherit three decades of operational maturity, a global talent pool, and an extension ecosystem that continues to improve. pgvector's HNSW implementation is faster and more capable today than it was a year ago, and that trajectory shows no sign of slowing.
Vectors, documents, metadata, classifications, permissions, version history, temporal state — all in one database, all in one transaction, all under one backup, all behind one access control model. That's not a limitation. That's the architecture.
Deploy foundation4.ai on your infrastructure — one database, zero data exposure: foundation4.ai