Errors and Retries
The SDK maps every YDB server status code to a Python exception class. All exceptions
inherit from Error, so you can catch the whole hierarchy with a single
except ydb.Error clause, or handle individual error types precisely.
Exception Hierarchy
ydb.Error
├── ydb.BadRequest — malformed query or invalid argument
├── ydb.Unauthorized — authentication succeeded but operation not permitted
├── ydb.Unauthenticated — authentication failed
├── ydb.InternalError — server-side internal error (usually transient)
├── ydb.Aborted — transaction aborted due to a conflict; safe to retry
├── ydb.Unavailable — service temporarily unavailable; safe to retry
├── ydb.Overloaded — server is overloaded; retry with backoff
├── ydb.SchemeError — schema-related error (e.g. table not found, already exists)
├── ydb.GenericError — unclassified server error
├── ydb.Timeout — server-side deadline exceeded
├── ydb.BadSession — session is invalid or expired; create a new one
├── ydb.PreconditionFailed — operation precondition not met
├── ydb.AlreadyExists — object already exists (DDL)
├── ydb.NotFound — object not found
├── ydb.SessionExpired — session TTL expired
├── ydb.Cancelled — operation was cancelled
├── ydb.Undetermined — outcome is unknown (e.g. network lost before ack)
├── ydb.Unsupported — operation not supported by this server version
├── ydb.SessionBusy — session is executing another request
├── ydb.ExternalError — error in an external data source
├── ydb.TruncatedResponseError — result set was truncated by the server
├── ydb.SessionPoolEmpty — all sessions are busy; pool exhausted
├── ydb.SessionPoolClosed — session pool has been stopped
├── ydb.ConnectionError — base class for transport-level errors
│ ├── ydb.ConnectionFailure — could not establish connection
│ ├── ydb.ConnectionLost — connection dropped mid-request
│ └── ydb.Unimplemented — server does not support this RPC
└── ydb.DeadlineExceed — client-side deadline exceeded
Catching Errors
import ydb
try:
pool.execute_with_retries("SELECT * FROM users")
except ydb.SchemeError:
print("table does not exist")
except ydb.Unauthorized:
print("access denied")
except ydb.Unavailable:
print("service temporarily unavailable")
except ydb.Error as e:
print(f"other YDB error: {e}")
Each exception exposes:
str(e)— human-readable message from the server.e.issues— list of structuredIssueMessageobjects withmessage,issue_code, andseverity.e.status— theydb.StatusCodeenum value.
try:
pool.execute_with_retries("BAD QUERY")
except ydb.Error as e:
print(e.status) # e.g. StatusCode.BAD_REQUEST
print(e.message)
if e.issues:
for issue in e.issues:
print(issue.message, issue.severity)
Retriable vs Non-Retriable Errors
Not every error is worth retrying. The SDK classifies errors into three groups:
Always retriable (fast backoff):
Unavailable— service is temporarily unavailable.ClientInternalError— internal SDK error.SessionExpired— session TTL passed; the SDK opens a new one.NotFound— by default retried (configurable viaretry_not_found).Cancelled— only whenretry_cancelled=Trueis set.
Retriable with slow backoff:
Aborted— transaction conflict; the whole transaction must be replayed.BadSession— session is invalid; the SDK acquires a new one.Overloaded— server under heavy load; back off and try again.SessionPoolEmpty— all pool sessions busy; wait and retry.ConnectionError/ConnectionLost— network issues; reconnect and retry.
Retriable only for idempotent operations (slow backoff):
Undetermined— outcome unknown; only safe to retry if the operation is idempotent (i.e. repeating it has the same effect as running it once). Setidempotent=TrueinRetrySettingsto enable.
Never retried:
BadRequest,Unauthorized,Unauthenticated,SchemeError,AlreadyExists,Unsupported,Timeout,PreconditionFailed,ExternalError— these indicate a problem with the query, credentials, or schema that won’t resolve by retrying.
RetrySettings
All pool methods that perform retries (execute_with_retries, retry_operation_sync,
retry_tx_sync) accept an optional RetrySettings object:
import ydb
retry = ydb.RetrySettings(
max_retries=5, # max retry attempts (default: 10)
idempotent=False, # set True to also retry Undetermined errors
retry_cancelled=False, # set True to retry Cancelled errors
)
pool.execute_with_retries("SELECT 1", retry_settings=retry)
BackoffSettings
RetrySettings uses two separate backoff curves:
fast backoff — used for errors expected to clear quickly (
Unavailable,SessionExpired, etc.).slow backoff — used for errors where the server needs more breathing room (
Overloaded,Aborted, connection failures, etc.).
Both are instances of BackoffSettings:
fast = ydb.BackoffSettings(
ceiling=10, # exponent cap: max wait slot = 2^ceiling * slot_duration
slot_duration=0.005, # base time unit in seconds (default: 5 ms)
uncertain_ratio=0.5, # fraction of the window that is randomised (jitter)
)
slow = ydb.BackoffSettings(
ceiling=6,
slot_duration=1.0, # 1 second base (default)
uncertain_ratio=0.5,
)
retry = ydb.RetrySettings(
max_retries=10,
fast_backoff_settings=fast,
slow_backoff_settings=slow,
)
The actual sleep duration for retry n is:
slots = 2 ^ min(n, ceiling)
max_ms = slots * slot_duration * 1000
sleep = max_ms * (random() * uncertain_ratio + (1 - uncertain_ratio)) / 1000
Error Callback
To log or instrument every retry attempt, pass on_ydb_error_callback:
import logging
logger = logging.getLogger(__name__)
def log_retry(err: ydb.Error):
logger.warning("YDB error, will retry: %s", err)
retry = ydb.RetrySettings(
max_retries=10,
on_ydb_error_callback=log_retry,
)
pool.execute_with_retries("SELECT 1", retry_settings=retry)
@ydb_retry Decorator
For functions that run outside a session pool, use the @ydb.ydb_retry decorator.
It wraps both synchronous and asynchronous functions:
import ydb
@ydb.ydb_retry(max_retries=5, idempotent=True)
def fetch_user(driver: ydb.Driver, user_id: int):
with ydb.QuerySessionPool(driver) as pool:
result = pool.execute_with_retries(
"SELECT name FROM users WHERE id = $id",
parameters={"$id": user_id},
)
return result[0].rows[0]["name"]
# Async version — the decorator detects coroutines automatically:
@ydb.ydb_retry(max_retries=5, idempotent=True)
async def fetch_user_async(driver: ydb.aio.Driver, user_id: int):
async with ydb.aio.QuerySessionPool(driver) as pool:
result = await pool.execute_with_retries(
"SELECT name FROM users WHERE id = $id",
parameters={"$id": user_id},
)
return result[0].rows[0]["name"]
Note
The decorator retries the entire function on failure, not just the individual
query. Only use idempotent=True when repeating the full function body is safe.
Handling Undetermined
Undetermined means the network was lost before the server could confirm whether
the operation succeeded or failed. The server may have applied the write — or not.
For read-only queries this is always safe to retry. For writes, only retry if
you can tolerate duplicates or the query is naturally idempotent (e.g. UPSERT):
retry = ydb.RetrySettings(
max_retries=10,
idempotent=True, # enables retry on Undetermined
)
# Safe: UPSERT is idempotent
pool.execute_with_retries(
"UPSERT INTO events (id, data) VALUES (42, 'payload')",
retry_settings=retry,
)
# Unsafe: INSERT will fail with AlreadyExists on the second attempt —
# but that error is not retriable, so at worst you get an exception.
pool.execute_with_retries(
"INSERT INTO events (id, data) VALUES (42, 'payload')",
retry_settings=retry,
)
Common Patterns
Fail fast on connection errors during startup:
try:
driver.wait(timeout=5, fail_fast=True)
except TimeoutError:
raise SystemExit("Could not reach YDB — check endpoint and credentials")
Distinguish schema errors (fix the code) from transient errors (retry):
try:
pool.execute_with_retries("SELECT * FROM nonexistent_table")
except ydb.SchemeError as e:
raise RuntimeError(f"Schema problem: {e}") from e
except ydb.Error:
pass # handled by retry logic inside execute_with_retries