Skip to content

Improve performance of async execution #142

Open
@Cito

Description

@Cito

When executing a query against a schema that contains async resolvers, then the executor method creates some overhead by defining and calling async functions (async def async_... inside methods in execute.py).

We should check whether pre-defining these functions or static methods of the ExecutionContext could reduce this overhead and boost performance. See also benchmarks provided in #141.

Activity

jkimbo

jkimbo commented on Oct 14, 2021

@jkimbo
Member

@Cito would you be open to splitting the ExecutionContext class into sync and async versions? That way we can drop most of the is_awaitable checks and simplify the code significantly.

Cito

Cito commented on Oct 15, 2021

@Cito
MemberAuthor

We should consider all options, but the API should be backward compatible, to not cause another turmoil in the GraphQL-Python ecoystem, and the code and API should not deviate too much from GraphQL.js so that it's easier to keep up with the development there.

Currently, the ExecutionContext can deal with mixed resolvers (some sync, and some async). This is similar to GraphQL.js, but it is more complicated to implement in Python because contrary to JavaScript you can't await something non-awaitable here.

What we could do is create optimized versions (possibly subclasses) of the ExecutionContext for the two cases when all resolvers (including type resolvers) are sync or all are async. The schema could get an attribute that tells whether it is sync, async or mixed, which could be determined automatically via introspection at schema build time. If possible, the executor could then call the optimized version. This way it would be fully backward compatible and transparent to the caller, and always as performant as possible.

If we do this, we should strive for as little code duplication as possible (just as much as needed to solve the performance issues).

Before experimenting with such a big (internal) change, we should first check how much we can gain by predefining the async functions as explained above. Maybe this already suffices. For optimization, in the case of a purely sync or async schema, you can also pass an is_awaitable function that always returns either False or True as. This would avoid the costly real checks every time (see also #54 regarding the is_awaitable function and the optimization that has been already done here).

Using a profiler like Yappi should also give insights into where things can be further optimized.

added
investigateNeeds investigaton or experimentation
optimizationCode optimizations and performance issues
on Oct 15, 2021
jkimbo

jkimbo commented on Oct 15, 2021

@jkimbo
Member

We should consider all options, but the API should be backward compatible, to not cause another turmoil in the GraphQL-Python ecoystem, and the code and API should not deviate too much from GraphQL.js so that it's easier to keep up with the development there.

Absolutely!

Before experimenting with such a big (internal) change, we should first check how much we can gain by predefining the async functions as explained above. Maybe this already suffices.

Agreed. I can try and put together a PR to see how much it would improve things.

For optimization, in the case of a purely sync or async schema, you can also pass an is_awaitable function that always returns either False or True as.

I think this already happens with the sync execution pathway but I don't think it's possible to always return True for async schema's because the default resolver is still sync.

Cito

Cito commented on Oct 15, 2021

@Cito
MemberAuthor

I think this already happens with the sync execution pathway but I don't think it's possible to always return True for async schema's because the default resolver is still sync.

Good points. Regarding the default resolvers, we could either use async default resolvers in this case or we could simply make a case distinction in the async execute_field method when there is no field resolver, and similarly for the type resolver in complete_abstract_value.

Cito

Cito commented on Dec 28, 2021

@Cito
MemberAuthor

See also #101 for performance optimization.

Cito

Cito commented on Dec 28, 2021

@Cito
MemberAuthor

For the records: In #142 the locally defined async methods were moved to the class level, but this did not make things better. The next step could be to create an execution context that assumes every resolver is async and that is optimized for that case, and see if that can improve performance significantly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

help wantedExtra attention is neededinvestigateNeeds investigaton or experimentationoptimizationCode optimizations and performance issues

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions

    Improve performance of async execution · Issue #142 · graphql-python/graphql-core