To quickly provide your mongot deployment with a healthy balance of resources for most general use cases, a small or medium High-CPU node is often an effective starting point. This configuration provides a solid foundation for common search workloads.
For more precise resource provisioning tailored to a specific workload, review the following pages:
- Deployment Architecture Patterns
- Resource Allocation Considerations
- Hardware Considerations
These pages offer guidance for mission-critical applications or cases where higher-fidelity optimization is required.
Note
Sizing resources for search and vector search workloads is an iterative process. These examples represent a starting point, but advanced considerations and measurements may be required for sizing a specific workload.
Workload Classes
mongot deployments fall into two classes:
- Low-CPU (suitable for lower data volumes and vector search)
- High-CPU (suitable for higher data volumes and full-text search)
Use the following guidance to select a starting configuration that matches your application's needs.
Low-CPU Workloads
The low-CPU archetype is ideal for vector search applications or low data volumes where memory is prioritized over raw CPU power. These nodes typically have an 8:1 RAM-to-CPU ratio. A key factor in determining the appropriate size category is an estimate of your expected total vector size. To see reference vector size ranges, refer to the table in the Select a starting size step of the Introduction.
The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in Low-CPU deployments:
| Size | Memory (GB) | Storage (GB) | CPU Cores |
| --- | --- | --- | --- |
| Small | 8 - 16 | 50 - 100 | 1 - 2 |
| Medium | 32 - 64 | 200 - 400 | 4 - 8 |
| Large | 128 - 256 | 800 - 1600 | 16 - 32 |
Additional considerations:
- Small: Suitable for initial testing or very small vector search applications.
- Medium: Suitable for growing vector search use cases or moderate data volumes.
- Large: Suitable for substantial vector search applications or larger low-CPU workloads.
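The expected total vector size mentioned above can be roughly estimated from the vector count and dimensionality. The following sketch illustrates that arithmetic; the function names and category thresholds are illustrative assumptions loosely based on the memory column of the table above, not official limits.

```python
# Rough estimate of total full-fidelity vector size, as one input to
# choosing a Low-CPU size category. Thresholds are illustrative only.

def estimated_vector_size_gb(num_vectors, dimensions, bytes_per_component=4):
    """Size of the raw vectors alone, assuming float32 components."""
    return num_vectors * dimensions * bytes_per_component / 1024**3

def suggest_low_cpu_size(vector_size_gb):
    # Thresholds loosely mirror the memory ranges in the table above.
    if vector_size_gb <= 16:
        return "Small"
    if vector_size_gb <= 64:
        return "Medium"
    return "Large"

# Example: 10 million 768-dimensional float32 embeddings.
size_gb = estimated_vector_size_gb(num_vectors=10_000_000, dimensions=768)
print(f"{size_gb:.1f} GB -> {suggest_low_cpu_size(size_gb)}")
```

This estimate covers only the raw embeddings; index structures and replication add overhead on top of it.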
High-CPU Workloads
The High-CPU archetype is designed for general-purpose full-text search workloads where queries are more CPU-intensive. These nodes typically have a 2:1 RAM-to-CPU ratio. Key factors in determining the appropriate size category include the required throughput (QPS) and the expected indexing load. The volume of inserts can serve as a proxy for indexing load: more inserts generally indicate a higher level of indexing activity. To see reference QPS ranges, refer to the table in the Select a starting size step of the Introduction.
The following table shows recommendations for memory, storage, and CPU cores based on your expected workload in High-CPU deployments:
| Size | Memory (GB) | Storage (GB) | CPU Cores |
| --- | --- | --- | --- |
| Small | 4 - 8 | 100 - 200 | 2 - 4 |
| Medium | 16 - 32 | 400 - 800 | 8 - 16 |
| Large | 64 - 96 | 1600 - 2400 | 32 - 48 |
Additional considerations:
- Small: A starting point for full-text search with moderate query rates. A minimum setup of two small nodes (4 CPUs total) supports roughly 40 QPS.
- Medium: Suitable for more active full-text search applications with higher query throughput.
- Large: Suitable for demanding full-text search, heavy indexing, or substantial query workloads.
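The reference point above (two small nodes, 4 CPUs total, roughly 40 QPS) implies about 10 QPS per CPU. The sketch below extrapolates from that figure; the per-CPU rate is an assumption derived from that single example and will vary significantly with query complexity, index size, and concurrency.

```python
# Back-of-the-envelope QPS capacity for a High-CPU deployment, extrapolating
# from the reference point in the text: 2 small nodes x 2 CPUs ~= 40 QPS.
# The ~10 QPS-per-CPU figure is an assumption, not a guarantee.

QPS_PER_CPU = 40 / 4  # derived from the two-small-node example above

def estimated_qps(nodes, cpus_per_node, qps_per_cpu=QPS_PER_CPU):
    """Rough aggregate query throughput across all nodes."""
    return nodes * cpus_per_node * qps_per_cpu

# Example: two Medium-class nodes with 8 CPUs each.
print(estimated_qps(nodes=2, cpus_per_node=8))
```

Treat the result as a starting estimate to validate with load testing, not a capacity commitment.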
Considerations for Large Vector Search Workloads
Vector search is a key focus area for AI applications. Modern techniques like automatic binary quantization are shifting the primary resource constraint from RAM to storage, making indexes more storage-constrained than memory-constrained.
In these cases, consider a Low-CPU class node with a large amount of storage available. The large storage supports both the full-fidelity vector embeddings and the quantized version of the source vector embeddings. This alignment of resources to workload ensures you can build and scale modern AI applications efficiently and economically.
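Because storage must hold both the full-fidelity embeddings and their quantized counterpart, it helps to estimate both terms together. The sketch below assumes float32 source vectors and binary quantization at roughly 1 bit per dimension (about a 32x reduction); exact on-disk overheads vary, so this is a rough lower bound, not an exact figure.

```python
# Rough storage estimate for a binary-quantized vector index: disk holds
# both the full-fidelity float32 embeddings and the quantized index at
# ~1 bit per dimension. Index metadata and overhead are not included.

def storage_estimate_gb(num_vectors, dimensions):
    full_fidelity = num_vectors * dimensions * 4   # float32: 4 bytes/component
    quantized = num_vectors * dimensions / 8       # binary: 1 bit/component
    return (full_fidelity + quantized) / 1024**3

# Example: 50 million 1536-dimensional embeddings.
print(f"{storage_estimate_gb(50_000_000, 1536):.0f} GB")
```

Note that the quantized index adds only about 3% on top of the full-fidelity vectors, which is why storage capacity, rather than RAM, becomes the dominant sizing factor.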