Deploying a PostgreSQL cluster with the pgvector extension

Introduction

pgvector is an open-source extension for PostgreSQL that enables storing and searching over machine learning-generated embeddings. It provides several capabilities that allow users to identify both exact and approximate nearest neighbors, making it a powerful tool for applications such as search, recommendation, and anomaly detection.

Key features of pgvector include:

  • Support for various embedding types: Pgvector stores embeddings as vectors of a declared dimension, regardless of whether they were generated from text, images, or audio.
  • Efficient nearest neighbor search: Pgvector provides efficient algorithms for finding the nearest neighbors of a given query embedding, both exact and approximate.
  • Integration with PostgreSQL: Pgvector seamlessly integrates with PostgreSQL, allowing users to leverage existing PostgreSQL features such as indexing and querying.
  • Scalability: Pgvector is designed to scale to large datasets and can be used in production environments.
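
As a rough illustration of exact versus approximate search, the sketch below assumes the vector extension has already been created in the current database. The table `items`, its 3-dimensional `embedding` column, and the sample values are hypothetical, and the `lists` value is an arbitrary placeholder rather than a tuning recommendation.

```sql
-- Hypothetical table: each row holds a 3-dimensional embedding.
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    embedding vector(3)
);

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]'), ('[7,8,9]');

-- Exact nearest neighbor search: with no vector index, PostgreSQL
-- compares the query vector against every row and sorts by
-- L2 (Euclidean) distance, which is what the <-> operator computes.
SELECT id, embedding
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 2;

-- Approximate search: an IVFFlat index trades some recall for speed.
-- It is normally built once the table holds a representative amount of data.
CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
```

On a table this small the index brings no benefit; it is shown only to make the exact/approximate distinction concrete. Besides `<->`, pgvector also provides `<#>` (negative inner product) and `<=>` (cosine distance).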

Pgvector is useful in a range of scenarios. Some common use cases are:

  • Search: Pgvector can power similarity search over different kinds of data, such as product search, image search, and text search.
  • Recommendation: Pgvector can be used to generate recommendations for users based on their past behavior or preferences.
  • Anomaly detection: Pgvector can detect anomalies in data by identifying points significantly different from the rest of the data.
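
The recommendation and anomaly detection cases can be sketched with plain SQL. The example assumes the vector extension is already created; the `products` table, its rows, the reference vector, and the distance threshold of 0.5 are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical table of product embeddings (3 dimensions for brevity).
CREATE TABLE products (
    id bigserial PRIMARY KEY,
    name text,
    embedding vector(3)
);

INSERT INTO products (name, embedding) VALUES
    ('kettle',  '[0.9,0.1,0.0]'),
    ('teapot',  '[0.8,0.2,0.1]'),
    ('toaster', '[0.1,0.9,0.2]');

-- Recommendation: products most similar to product 1,
-- ranked by cosine distance (<=>), excluding product 1 itself.
SELECT id, name
FROM products
WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM products WHERE id = 1)
LIMIT 2;

-- Anomaly detection: rows whose L2 distance (<->) from a reference
-- point exceeds a chosen threshold.
SELECT id, name
FROM products
WHERE embedding <-> '[0.85,0.15,0.05]' > 0.5;
```

In practice the embeddings would come from a machine learning model and have far more dimensions, and the threshold would need to be chosen from the data.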

Overall, pgvector is a powerful and versatile tool for working with embeddings in PostgreSQL. It is a valuable addition to the PostgreSQL ecosystem and can be used to build a wide range of intelligent applications.

ClusterControl supports pgvector: the extension can be enabled in the PostgreSQL deployment wizard through an additional ‘extensions’ step. Currently, pgvector is the only choice. We plan to add more PostgreSQL extensions in future releases.

How to enable pgvector on a PostgreSQL cluster in ClusterControl

Currently, pgvector can be enabled at cluster deployment time. First, pick the deployment option.

Select PostgreSQL 15, as this is the version for which ClusterControl supports extensions.

Then follow the remaining steps of the deployment process: pick a name for the cluster, define SSH access and credentials, and choose the nodes that the cluster should be deployed on.

Once this is done, you will be presented with a new step in which you can enable extensions.

Click on pgvector and note the instructions shown for creating the extension on the database side. That is all: review the summary of your choices and deploy the cluster.

Once the cluster is deployed, execute the following statement, as per the instructions:

CREATE EXTENSION vector;

The extension is then ready to use.
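
To confirm that the statement took effect, a quick check along these lines may help. It relies only on the standard `pg_extension` catalog and pgvector's `<->` distance operator. Note that PostgreSQL extensions are created per database, so the statement has to be run in each database where vectors will be used.

```sql
-- Confirm that the extension is registered in the current database
-- and see which version was installed.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';

-- Smoke test: L2 distance between two literal vectors.
SELECT '[1,2,3]'::vector <-> '[4,5,6]'::vector AS l2_distance;
```

If the first query returns no rows, the extension has not been created in the database you are connected to.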
