New Gcore accounts start with a zero quota for Everywhere Inference. A quota increase is required before creating deployments. The Quotas section has two pages:
  • Quotas Viewer — view current resource limits and submit quota increase requests.
  • Quotas Request History — track the status of submitted requests.

Quota increase request

The Quotas Viewer lists all available resource types alongside their current usage and quota limits. Requests are processed within 15 minutes.

Step 1. Quotas Viewer

In the Gcore Customer Portal, navigate to Everywhere Inference > Quotas > Quotas Viewer. The page shows all available quota resources and their current usage:
  • Inference CPU Millicore Count — total vCPU millicores allocated across all deployments.
  • Inference GPU A100 Count — number of A100 GPUs.
  • Inference GPU H100 Count — number of H100 GPUs.
  • Inference GPU L40S Count — number of L40S GPUs.
  • Inference Instances Count — total number of deployments.
Figure: Account Quotas page showing available resources and the Request form.

Step 2. Resource selection

Click a resource row to add it to the Selected resources panel on the right. Use the + and − buttons in the panel to set the requested quota value for each resource. Each pod in each region consumes one GPU slot and the corresponding CPU millicores, so to size the request, multiply the per-pod values by the total number of pods across all regions.
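
The sizing rule above can be sketched as a short calculation. This is an illustrative helper only, not part of any Gcore tooling: the names `RegionPlan` and `required_quota`, the region labels, and the 4000-millicore per-pod figure are assumptions made for the example. The one-GPU-per-pod default follows the rule stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionPlan:
    """Planned capacity for one region (hypothetical structure)."""
    name: str      # illustrative region label, not a real Gcore region ID
    max_pods: int  # highest pod count the deployment may scale to here


def required_quota(regions, cpu_millicores_per_pod, gpus_per_pod=1):
    """Multiply per-pod values by the total pod count across all regions."""
    total_pods = sum(region.max_pods for region in regions)
    return {
        "total_pods": total_pods,
        "cpu_millicores": total_pods * cpu_millicores_per_pod,
        "gpus": total_pods * gpus_per_pod,
    }


# Assumed example: 3 pods in one region, 2 in another, 4000 millicores per pod.
plan = [RegionPlan("region-a", 3), RegionPlan("region-b", 2)]
quota = required_quota(plan, cpu_millicores_per_pod=4000)
print(quota)  # {'total_pods': 5, 'cpu_millicores': 20000, 'gpus': 5}
```

With these assumed inputs, the request would ask for at least 20000 CPU millicores and 5 GPUs of the chosen type. Sizing against the autoscaling maximum rather than the current pod count leaves room for scale-up without a second request.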

Step 3. Request submission

In the Request panel, enter a description of the use case, then click Send request. After the quota is updated, new deployments can be created and autoscaling limits on existing ones can be modified. Request status is visible in Quotas > Quotas Request History.