Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for quota metric 'Administrative requests' and limit 'Administrative requests per minute' #104

Closed
hadim opened this issue Aug 22, 2021 · 9 comments · Fixed by googleapis/python-spanner#543
Assignees
Labels
api: spanner Issues related to the googleapis/python-spanner-sqlalchemy API.

Comments

@hadim
Copy link

hadim commented Aug 22, 2021

I am running into some quota limitations when doing a high number of read requests in parallel:

google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for quota metric 'Administrative requests' and limit 'Administrative requests per minute' of service 'spanner.googleapis.com' for consumer 'project_number:xxxx'.

The request is quite similar to a simple SELECT * FROM table1 as t1 JOIN table2 as t2 ON t1.id = t2.id.

The error says Administrative requests per minute which seems weird to me since it has nothing to do with an admin action right?

Is it something related to python-spanner-sqlalchemy?

From https://cloud.google.com/spanner/quotas#administrative_limits, it seems like the limit can't be modified.

This issue is quite limiting for us at the moment to scale our workload with Spanner.

@hadim
Copy link
Author

hadim commented Aug 22, 2021

I also tried to look at the GCP > IAM & Admin > Quotas limits for Spanner but the admin one can't be modified.

@hadim
Copy link
Author

hadim commented Aug 23, 2021

Here is a more complete traceback:

Process LokyProcess-4:
Traceback (most recent call last):
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 67, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/grpc/_channel.py", line 923, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.RESOURCE_EXHAUSTED
	details = "Quota exceeded for quota metric 'Administrative requests' and limit 'Administrative requests per minute per user' of service 'spanner.googleapis.com' for consumer 'project_number:363164121308'."
	debug_error_string = "{"created":"@1629719711.845320792","description":"Error received from peer ipv4:172.217.13.170:443","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"Quota exceeded for quota metric 'Administrative requests' and limit 'Administrative requests per minute per user' of service 'spanner.googleapis.com' for consumer 'project_number:363164121308'.","grpc_status":8}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 431, in _process_worker
    r = call_item()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 285, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/datamol/utils/jobs.py", line 67, in _run
    return fn(args, **fn_kwargs)
  File "/home/hadim/Drive/Documents/valence/Platform/Libs/circus/circus/db/utils.py", line 144, in _query_fn
    mols = query.all()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3373, in all
    return list(self)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3556, in _execute_and_instances
    conn = self._get_bind_args(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3571, in _get_bind_args
    return fn(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3550, in _connection_from_session
    conn = self.session.connection(**kw)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1142, in connection
    return self._connection_for_bind(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1150, in _connection_for_bind
    return self.transaction._connection_for_bind(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 433, in _connection_for_bind
    conn = bind._contextual_connect()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2302, in _contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
    return fn()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 364, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
    rec = pool._do_get()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
    self._dec_overflow()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.raise_(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
    return self._create_connection()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
    return _ConnectionRecord(self)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
    self.__connect(first_connect_check=True)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.raise_(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
    connection = pool._invoke_creator(self)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 508, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/cloud/spanner_dbapi/connection.py", line 425, in connect
    if not database.exists():
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/cloud/spanner_v1/database.py", line 400, in exists
    api.get_database_ddl(database=self.name, metadata=metadata)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/cloud/spanner_admin_database_v1/services/database_admin/client.py", line 961, in get_database_ddl
    response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/api_core/gapic_v1/method.py", line 145, in __call__
    return wrapped_func(*args, **kwargs)
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/api_core/retry.py", line 286, in retry_wrapped_func
    return retry_target(
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/api_core/retry.py", line 189, in retry_target
    return target()
  File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 69, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for quota metric 'Administrative requests' and limit 'Administrative requests per minute per user' of service 'spanner.googleapis.com' for consumer 'project_number:xxxx'.

Looking at it it seems that during connect() in spanner_dbapi/connection.py, the lib is doing an admin call to check whether the database exists:

database = instance.database(database_id, pool=pool)
    if not database.exists():  # <<<- admin call to Spanner
        raise ValueError("database '%s' does not exist." % database_id)

    conn = Connection(instance, database)
    if pool is not None:
        conn._own_pool = False

    return conn

and this is probably what triggers the admin quota limit.

I tried to reproduce the same error using python-spanner only and the same SQL query:

 with database.snapshot() as snapshot:
        results = snapshot.execute_sql()

But I don't see the error.

This makes me think that somehow, python-spanner-sqlalchemy is making too many admin calls (database.exists()) while it's not needed and then triggers an admin quota limit when doing many SQL calls in parallel.

@hadim
Copy link
Author

hadim commented Aug 23, 2021

When commenting:

database = instance.database(database_id, pool=pool)
# if not database.exists():  # <<<- admin call to Spanner
#     raise ValueError("database '%s' does not exist." % database_id)

I don't see the error anymore.

@hadim
Copy link
Author

hadim commented Aug 23, 2021

In case it's useful for someone else, here is a monkey patch that works for me:

def _patch_spanner_connect():
    """Prevent admin quot limits.
    See https://github.com/cloudspannerecosystem/python-spanner-sqlalchemy/issues/104
    """

    from google.cloud import spanner_dbapi
    from google.cloud.spanner_dbapi import Connection

    from google.api_core.gapic_v1.client_info import ClientInfo
    from google.cloud import spanner_v1 as spanner

    from google.cloud.spanner_dbapi.version import DEFAULT_USER_AGENT
    from google.cloud.spanner_dbapi.version import PY_VERSION

    def _patched_connect(
        instance_id,
        database_id,
        project=None,
        credentials=None,
        pool=None,
        user_agent=None,
    ):
        client_info = ClientInfo(user_agent=user_agent or DEFAULT_USER_AGENT, python_version=PY_VERSION)

        if isinstance(credentials, str):
            client = spanner.Client.from_service_account_json(
                credentials, project=project, client_info=client_info
            )
        else:
            client = spanner.Client(project=project, credentials=credentials, client_info=client_info)

        instance = client.instance(instance_id)
        if not instance.exists():
            raise ValueError("instance '%s' does not exist." % instance_id)

        database = instance.database(database_id, pool=pool)

        # NOTE(hadim): monkey patch to remove Spanner admin calls.
        # if not database.exists():
        #     raise ValueError("database '%s' does not exist." % database_id)

        conn = Connection(instance, database)
        if pool is not None:
            conn._own_pool = False

        return conn

    spanner_dbapi.connect = _patched_connect

@IlyaFaer IlyaFaer self-assigned this Aug 26, 2021
@IlyaFaer
Copy link
Contributor

@hadim, thank you very much for your investigation!

I think you're right, Database.exists() method uses an admin operation to check if the database exists. Working on a solution.

@IlyaFaer
Copy link
Contributor

IlyaFaer commented Aug 26, 2021

I've taken a look at the Database class, and I don't see what can be used instead of exists():

  • reload() does the very same backend call
  • state attribute is not filled after Database object instantiation
  • is_ready() uses the state property

Seems to me the best way to resolve the problem is to move instance and database existence check into a separate Connection method. Thus, on connection initiation there will be no backend calls. If one wants to check the connection, they can explicitly call, let's say Connection.validate() method, which'll do exists() calls and check if the given instance and database are reachable.

As an alternative, we probably can use a simple select from the given database in order to check that it's is reachable.

In any case I think some kind of existance check should be there.

@olavloite, what do you think?
I see that in Go we have IsValid() method, but it checks only if the connection is closed. What about Java?

@olavloite
Copy link
Contributor

Java (JDBC) does not really do a check like this, so if the user connects to a non-existing database, the error will be delayed until the first request that is done. I guess the closest we have in any of the other connection APIs is the Ping method in the go-sql driver: https://github.com/cloudspannerecosystem/go-sql-spanner/blob/0ae00ada6de5118dbfb75e566cd57e0907acd207/driver.go#L216

That method executes a SELECT 1 statement to verify that the connection is (still) valid. I think that that could be a reasonable solution for dbapi as well. Such a statement will return a Database not found error if the database does not exist, and removes the need for doing an admin call. It would also remove the need to call instance.exists() in the connect method (which current probably also does an admin call).

@IlyaFaer
Copy link
Contributor

@olavloite, I suppose in Spanner terms ping is mostly considered as a mechanism to keep sessions alive (like PingingPool). Honestly, I first thought about Go's Ping() in the same way. Anyway, it doesn't look like it changes things very much...

Then, manual method with SELECT 1. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the googleapis/python-spanner-sqlalchemy API.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants