
mongot Sizing Quickstart

For most general use cases, a small or medium High-CPU node gives your mongot deployment a healthy balance of resources and is an effective starting point for common search workloads.

For more precise resource provisioning tailored to a specific workload, review the related sizing pages. They offer guidance for mission-critical applications and for cases where higher-fidelity optimization is required.

Note

Sizing resources for search and vector search workloads is an iterative process. These examples represent a starting point, but advanced considerations and measurements may be required for sizing a specific workload.

Workload Classes

mongot deployments fall into two classes:

  • Low-CPU (suitable for lower data volumes and vector search)
  • High-CPU (suitable for higher data volumes and full-text search)

Use the following guidance to select a starting configuration that matches your application's needs.

Low-CPU Workloads

The low-CPU archetype is ideal for vector search applications or low data volumes where memory is prioritized over raw CPU power. These nodes typically have an 8:1 RAM-to-CPU ratio. A key factor in determining the appropriate size category is an estimate of your expected total vector size. To see reference vector size ranges, refer to the table in the Select a starting size step of the Introduction.
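The expected total vector size mentioned above can be approximated from the vector count and embedding dimensionality. The following sketch assumes float32 embeddings (4 bytes per dimension); the function name and the example figures are illustrative, and the actual index also carries graph and metadata overhead beyond the raw vector data.

```python
def estimate_total_vector_size_gb(num_vectors: int, dimensions: int,
                                  bytes_per_dim: int = 4) -> float:
    """Rough estimate of raw vector data size in GB.

    Assumes float32 embeddings (4 bytes per dimension) by default.
    The real index is somewhat larger due to graph structures and metadata.
    """
    return num_vectors * dimensions * bytes_per_dim / (1024 ** 3)

# Example: 10 million 768-dimensional float32 embeddings
size_gb = estimate_total_vector_size_gb(10_000_000, 768)
print(f"{size_gb:.1f} GB")  # roughly 28.6 GB of raw vector data
```

An estimate in the tens of gigabytes would point toward the Medium memory range in the table below, since vector search performs best when the index fits in RAM.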

The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in Low-CPU deployments:

Workload Size Category   Default Memory (GB)   Default Storage (GB)   CPU Cores
Small                    8 - 16                50 - 100               1 - 2
Medium                   32 - 64               200 - 400              4 - 8
Large                    128 - 256             800 - 1600             16 - 32

Additional considerations:

  • Small: Suitable for initial testing or very small vector search applications.
  • Medium: Suitable for growing vector search use cases or moderate data volumes.
  • Large: Suitable for substantial vector search applications or other large workloads that are memory-bound rather than CPU-bound.

High-CPU Workloads

The High-CPU archetype is designed for general-purpose full-text search workloads where queries are more CPU-intensive. These nodes typically have a 2:1 RAM-to-CPU ratio. Key factors in determining the appropriate size category include the required throughput (QPS) and the expected indexing load. The volume of inserts can serve as a proxy for indexing load. More inserts generally indicate a higher level of indexing activity. To see reference QPS ranges, refer to the table in the Select a starting size step of the Introduction.

The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in High-CPU deployments:

Workload Size Category   Default Memory (GB)   Default Storage (GB)   CPU Cores
Small                    4 - 8                 100 - 200              2 - 4
Medium                   16 - 32               400 - 800              8 - 16
Large                    64 - 96               1600 - 2400            32 - 48

Additional considerations:

  • Small: A starting point for full-text search with moderate query rates. A minimum setup of two small nodes (4 CPUs total) supports roughly 40 QPS.
  • Medium: Suitable for more active full-text search applications with higher query throughput.
  • Large: Suitable for demanding full-text search, heavy indexing, or substantial query workloads.
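The rough figure above (two small nodes with 4 CPUs total supporting about 40 QPS) implies roughly 10 QPS per CPU core. As an illustrative back-of-the-envelope check, assuming that ratio holds for your queries, you can translate a target throughput into a starting CPU count:

```python
import math

# Derived from the rough figure above: 2 small nodes, 4 CPUs total, ~40 QPS.
# Real throughput varies widely with query complexity and index size.
QPS_PER_CPU = 10

def estimate_cpus_for_qps(target_qps: float, qps_per_cpu: float = QPS_PER_CPU) -> int:
    """Back-of-the-envelope CPU count for a target query throughput."""
    return math.ceil(target_qps / qps_per_cpu)

print(estimate_cpus_for_qps(150))  # 15 CPUs, which falls in the Medium range
```

Treat this only as a first pass for picking a size category; measure actual throughput with representative queries before settling on a configuration.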

Considerations for Large Vector Search Workloads

Vector search is a key focus area for AI applications. Modern techniques like automatic binary quantization are shifting the primary resource constraint from RAM to storage: the quantized index needs far less memory, so total storage capacity becomes the limiting factor.

In these cases, consider a Low-CPU class node with a large amount of storage available. The extra storage holds both the full-fidelity source vector embeddings and their quantized copies. Aligning resources to the workload in this way lets you build and scale modern AI applications efficiently and economically.
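The storage needed for this pattern can be sketched as the sum of the full-fidelity vectors and their binary-quantized copies. The figures below are generic illustrations, not product-specific formulas: float32 embeddings take 4 bytes per dimension, and binary quantization stores roughly 1 bit per dimension, plus index overhead not modeled here.

```python
def estimate_quantized_storage_gb(num_vectors: int, dimensions: int) -> float:
    """Estimate storage for full-fidelity float32 vectors plus their
    binary-quantized copies (1 bit per dimension).

    Illustrative only: actual on-disk size depends on the index
    format and metadata overhead.
    """
    full_fidelity = num_vectors * dimensions * 4   # 4 bytes per float32 dimension
    quantized = num_vectors * dimensions / 8       # 1 bit per dimension
    return (full_fidelity + quantized) / (1024 ** 3)

# Example: 50 million 1024-dimensional embeddings
print(f"{estimate_quantized_storage_gb(50_000_000, 1024):.0f} GB")
```

Note that the quantized copy adds only about 3% on top of the full-fidelity data; the dominant cost remains the source embeddings kept on disk for rescoring.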