惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

美团技术团队
T
Troy Hunt's Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
S
Schneier on Security
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
Cisco Talos Blog
Cisco Talos Blog
AWS News Blog
AWS News Blog
NISL@THU
NISL@THU
The Hacker News
The Hacker News
Know Your Adversary
Know Your Adversary
L
Lohrmann on Cybersecurity
SecWiki News
SecWiki News
S
Security Affairs
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Help Net Security
Help Net Security
L
LINUX DO - 热门话题
Application and Cybersecurity Blog
Application and Cybersecurity Blog
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
I
Intezer
S
Secure Thoughts
罗磊的独立博客
Attack and Defense Labs
Attack and Defense Labs
G
GRAHAM CLULEY
Exploit-DB.com RSS Feed
Exploit-DB.com RSS Feed
博客园_首页
Cyberwarzone
Cyberwarzone
IT之家
IT之家
T
Threatpost
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
The Cloudflare Blog
博客园 - 叶小钗
Cloudbric
Cloudbric
量子位
Scott Helme
Scott Helme
N
News | PayPal Newsroom
L
LINUX DO - 最新话题
O
OpenAI News
C
Cyber Attacks, Cyber Crime and Cyber Security
Security Archives - TechRepublic
Security Archives - TechRepublic
C
Cybersecurity and Infrastructure Security Agency CISA
J
Java Code Geeks
有赞技术团队
有赞技术团队
月光博客
月光博客
大猫的无限游戏
大猫的无限游戏
W
WeLiveSecurity
宝玉的分享
宝玉的分享
P
Privacy International News Feed
A
Arctic Wolf
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
云风的 BLOG
云风的 BLOG

Supabase Blog

AI Agents Know About Supabase. They Don't Always Use It Right. Custom OIDC Providers for Supabase Auth 100,000 GitHub stars Supabase docs over SSH Navigating Regional Network Blocks Supabase Joins the Stripe Projects Developer Preview Log Drains: Now available on Pro Supabase Storage: major performance, security, and reliability updates Supabase incident on February 12, 2026 Hydra joins Supabase X / Twitter OAuth 2.0 is now available for Supabase Auth BKND joins Supabase Supabase is now an official Claude connector Supabase PrivateLink is now available Introducing: Postgres Best Practices When to use Read Replicas vs. bigger compute Introducing TRAE SOLO integration with Supabase Supabase Security Retro: 2025 Sync Stripe Data to Your Supabase Database in One Click Building ChatGPT Apps with Supabase Edge Functions and mcp-use Own Your Observability: Supabase Metrics API Introducing iceberg-js: A JavaScript Client for Apache Iceberg Introducing Supabase for Platforms Adding Async Streaming to Postgres Foreign Data Wrappers Build "Sign in with Your App" using Supabase Auth Introducing Seven New Email Templates for Supabase Auth The new Supabase power for Kiro Introducing Supabase ETL Introducing Analytics Buckets Introducing Vector Buckets Snap, Inc. Launches Snap Cloud, Powered by Supabase Triplit joins Supabase Supabase Series E 1000 Y Combinator Founders Choose Supabase gm 👋 web3, welcome aboard to Sign in with Web3 (Solana, Ethereum) Announcing the Supabase Remote MCP Server Enterprise speed, enterprise standards with Bolt Cloud + Supabase PostgREST 13 Lovable Cloud + Supabase: The Default Platform for AI Builders Processing large jobs with Edge Functions, Cron, and Queues Defense in Depth for MCP Servers OrioleDB Patent: now freely available to the Postgres community Supabase Launch Week 15 Hackathon Winner Announcement The Vibe Coder's Guide to Supabase Environments Testing for Vibe Coders: From Zero to Production Confidence The Vibe Coding Master Checklist Vibe Coding: Best Practices for Prompting Supabase Auth: Build vs. Buy Top 10 Launches of Launch Week 15 Supabase Launch Week 15 Hackathon Storage: 10x Larger Uploads, 3x Cheaper Cached Egress, and 2x Egress Quota Persistent Storage and 97% Faster Cold Starts for Edge Functions Algolia Connector for Supabase New Observability Features in Supabase Improved Security Controls and A New Home for Security Introducing Branching 2.0 Stripe-To-Postgres Sync Engine as standalone Library Supabase Analytics Buckets with Iceberg Support Create a Supabase backend using Figma Make Introducing JWT Signing Keys Supabase UI: Platform Kit Build a Personalized AI Assistant with Postgres Announcing Multigres: Vitess for Postgres Building on open table formats Open Data Standards: Postgres, OTel, and Iceberg Simplifying back-end complexity with Supabase Data APIs PostgreSQL Event Triggers without superuser access Top 10 Launches of Launch Week 14 Supabase MCP Server Data API Routes to Nearest Read Replica Declarative Schemas for Simpler Database Management Realtime: Broadcast from Database Keeping Tabs on What's New in Supabase Studio Edge Functions: Deploy from the Dashboard + Deno 2.1 Automatic Embeddings in Postgres Introducing the Supabase UI Library Supabase Auth: Bring Your Own Clerk Postgres Language Server: Initial Release Migrating from Fauna to Supabase Migrating from the MongoDB Data API to Supabase Dedicated Poolers Postgres as a Graph Database: (Ab)using pgRouting AI Hackathon at Y Combinator Calendars in Postgres using Foreign Data Wrappers Supabase Launch Week 13 Hackathon Winners How to Hack the Base! Running Durable Workflows in Postgres using DBOS database.build v2: Bring-your-own-LLM Restore to a New Project Hack the Base! with Supabase Top 10 Launches of Launch Week 13 Supabase Queues High Performance Disk Supabase Cron Supabase CLI v2: Config as Code Supabase Edge Functions: Introducing Background Tasks, Ephemeral Storage, and WebSockets Supabase AI Assistant v2 OrioleDB Public Alpha Executing Dynamic JavaScript Code on Supabase with Edge Functions ClickHouse Partnership, improved Postgres Replication, and Disk Management
How pg_graphql works
Raminder Singh · 2024-01-24 · via Supabase Blog

How pg_graphql works

Supabase’s GraphQL API is powered by pg_graphql. In this post, we will look at the internal workings of pg_graphql. Since it is an extension written in the Rust programming language, familiar with Rust will help - although it’s not a requirement to understand this post.

This article will give you a deeper understanding of pg_graphql , helping you to:

  • Make design decisions about how to use GraphQL in your application.
  • Understand how a GraphQL query is processed internally by pg_graphql.
  • Understand, or even fix, simple bugs and implement beginner-friendly features in pg_graphql.

pg_graphql is a Postgres extension that reads the SQL schema in a database and exposes it as a GraphQL schema. The GraphQL interface is made available through a SQL function graphql.resolve(...) which allows any programming language to use GraphQL without any additional servers, processes, or libraries. It is also possible to call the graphql.resolve(...) function from PostgREST, or any other HTTP proxy, to safely expose the GraphQL API via HTTP/S.

When a client sends a GraphQL request to pg_graphql it gets a response back. But do you know what happens inside pg_graphql to serve that request? Let’s take a look at the life of a GraphQL request.

The entry point of a request is the graphql.resolve(query, ..) function. This function calls the graphql._internal_resolve(query) function which is written in Rust and where the real magic starts. The first line in that function uses the graphql-parser crate to parse the GraphQL query into a tree structure. Let’s see how that works.

Parsing converts a query string into an abstract syntax tree (AST). This is a two-step process, first, a tokenizer converts the input string into tokens, and then a parser organizes the tokens into nodes of the AST.

Tokenizer#

A tokenizer (aka a lexer) reads the query string and spits out tokens in the language. For example, take a look at the following query:


_10

query {

_10

bookCollection {

_10

edges {

_10

node {

_10

id

_10

}

_10

}

_10

}

_10

}


It will be turned into the following tokens by the tokenizer: query, {, bookCollection, {, edges, {, node, {, id, }, }, } and }. How does the tokenizer know where a token starts and ends? The tokenizer relies on lexical grammar to figure out token boundaries. It looks at the next character in the text to first find the kind of token to expect and then uses the grammar rules for that token to find where it ends. For example, the first character in our example is q which means it must be a Name because its grammar looks like this:

The grammar tells the tokenizer that if the next character is a Letter or an underscore then it is the start of a Name. And when the tokenizer finds a character that is not a NameContinue it ends the token. In our example, the Name token ends before the first whitespace after query.

Note that the tokenizer’s job is to just produce valid lexical tokens, even if those tokens do not make a valid GraphQL query. For example, the tokenizer will happily produce the tokens }, query, and { for an input string } query {. It is the parser’s job to reject these sequences of tokens as invalid.

There are also tokens that the tokenizer ignores. There’s good reason, for example, to ignore whitespace because it allows you to format your code as you please. But a quirk of the lexical structure of GraphQL is that it also ignores commas. This means, you can stick a comma just about anywhere and the query would still be valid. E.g. the last example can also be rewritten as:


_10

query, {,

_10

bookCollection, {,

_10

edges, {,

_10

node, {,

_10

id,

_10

},

_10

},

_10

},

_10

},


It’s possible you haven’t heard that before. We suggest not abusing the comma; use it thoughtfully to write queries that are easy to read.

Parser#

The list of tokens produced by the tokenizer are consumed by the parser to generate the AST. The parsing grammar dictates how the parser makes sense of the tokens. Since pg_graphql needs to execute the query, it expects an ExecutableDocument. So the parser tries to parse an ExecutableDocument which is defined in the grammar like this:

An ExecutableDocument contains a list of ExecutableDefinitions which is defined like this:

An ExecutableDefinition can be either an OperationDefinition or a FragmentDefinition. Which one should the parser try to parse? Similar to how the tokenizer looks at the next character, the parser can look at the next token to know which definition lies next. The next token in our example is query. Can query appear at the beginning of OperationDefinition? Let’s check its definition:

The parser again has two choices since OperationDefinition can either be an OperationType Name(opt) VariableDefinitions(opt) Directives(opt) SelectionSet or a SelectionSet. So which of those two can start with query? The first one starts with OperationType which is defined like this:

Et voilà! The query token can start an OperationType, which means the parser now knows that it has to parse the first arm of the OperationDefinition. Which in turn means that it is going to parse an OperationDefinition arm of the ExecutableDefinition. Can’t the other arm of the definition also start with query? No, the language designers designed the grammar to avoid such ambiguities.

Now that the parser knows it will parse OperationType Name(opt) VariableDefinitions(opt) Directives(opt) SelectionSet, it skips the query token and looks at the next token, which is {. A { It can’t appear in the beginning of Name(opt), VariableDefinitions(opt) or Directives(opt), but it can start a SelectionSet:

So the parser skips past the { token and then tries to parse a list of Selections. The parser parses the rest of the input string using the same process. It rejects an invalid list of tokens like }, query and { because no grammar rule starts with an }. For our example query, the parser generates the following AST:


_40

Document {

_40

definitions: [Operation(Query(Query {

_40

name: None,

_40

variable_definitions: [],

_40

directives: [],

_40

selection_set: SelectionSet {

_40

items: [Field(Field {

_40

alias: None,

_40

name: "bookCollection",

_40

arguments: [],

_40

directives: [],

_40

selection_set: SelectionSet {

_40

items: [Field(Field {

_40

alias: None,

_40

name: "edges",

_40

arguments: [],

_40

directives: [],

_40

selection_set: SelectionSet {

_40

items: [Field(Field {

_40

alias: None,

_40

name: "node",

_40

arguments: [],

_40

directives: [],

_40

selection_set: SelectionSet {

_40

items: [Field(Field {

_40

alias: None,

_40

name: "id",

_40

arguments: [],

_40

directives: [],

_40

selection_set: SelectionSet { items: [] },

_40

})],

_40

},

_40

})],

_40

},

_40

})],

_40

},

_40

})],

_40

},

_40

}))],

_40

}


An AST in graphql-parser is just Rust structs and enums. For example, the Document is a struct:


_10

#[derive(Debug, Clone, PartialEq)]

_10

pub struct Document<'a, T: Text<'a>> {

_10

pub definitions: Vec<Definition<'a, T>>,

_10

}


A Definition is an enum:


_10

#[derive(Debug, Clone, PartialEq)]

_10

pub enum Definition<'a, T: Text<'a>> {

_10

Operation(OperationDefinition<'a, T>),

_10

Fragment(FragmentDefinition<'a, T>),

_10

}


However, it is not enough to parse the query into an AST. Why? Similar to how a valid list of tokens produced by the tokenizer doesn’t mean it will produce a valid AST, a valid AST doesn’t mean that the AST can be executed by pg_graphql. So before execution, pg_graphql validates the AST.

To understand why a valid AST doesn’t mean that the query can be executed take the following example. Here the query produced a valid AST but it is still invalid because there can't be a named operation together with an anonymous operation in a GraphQL query:


_19

{

_19

bookCollection {

_19

edges {

_19

node {

_19

id

_19

}

_19

}

_19

}

_19

}

_19

_19

query getBookNames {

_19

bookCollection {

_19

edges {

_19

node {

_19

name

_19

}

_19

}

_19

}

_19

}


The above was just one example of the kind of validations performed. The GraphQL spec defines many other types of validations like:

pg_graphql performs these validations and returns errors if they fail. For example:

Some of the validations need to know the types in the reflected GraphQL schema. For example, for the field selection validation, the validation code must know which fields are defined on a GraphQL object. This information is contained in the reflected GraphQL schema.

pg_graphql builds GraphQL schema by reading information from many system catalogs. The load_sql_context function reads this information into a Context object by running the query in the load_sql_context.sql file. load_sql_context.sql loads information about enums, composite and non-composite types, foreign keys, schemas, tables and functions.

It might look like the context is loaded for each GraphQL query, but this function is memoized for performance by the cached attribute. This means it will only be called again if its input argument _config of type Config changes from the last time it was called.Config has three parts:

  1. Current search path. It is a list of schemas that are searched in order when looking up a database object.
  2. Current role. It is the role under which any SQL statements will be executed.
  3. Current schema version. It is the current version of the database’s schema. The version is updated by a couple of triggers that detect changes in schema.

It makes sense to reload the schema if any Config fields change because they can potentially alter the results of running transpiled SQL statements.

The Context object is wrapped in a __Schema object which is used to not only serve introspection queries but also provide information to run validations and for transpilation to SQL. For example, take a look at how tables are converted into<table_name>Collections. This code adds Collection objects to the QueryType object by iterating over tables and adding a <table_name>Collectionfield. Notice howQueryType's schema field is used extensively throughout this code snippet.

Transpilation is a two-step process. First, builder objects are constructed from the AST and __Schema, and then the builder objects are converted into SQL. A builder object contains all the information needed to produce a SQL query.

For example, when a table is added as a collection object to the QueryType object, its type_ field is set to __Type::Connection(_). This type is then checked in the resolve_selection_set function and converted to an appropriate builder type. A builder implements the QueryEntrypoint trait. QueryEntrypoint has only one required method named to_sql_entrypoint which the ConnectionBuilder implements by calling its to_sql method. The to_sql method contains the meat of the logic to generate SQL.

An important aspect of the SQL generation code is how it calls quote_ident and quote_literal functions to avoid SQL injection. Without them, a caller could potentially send a specially crafted input to execute arbitrary SQL code.

The generated SQL code is then run in the execute method of QueryEntrypoint trait. The transpiled queries return a jsonb object which is deserialized into a pgrx::JsonB object. Since pgrx::JsonB is a wrapper over a serde_json::Value object, it is trivial for the code to unwrap it and return this as a JSON response to the client.

In this post, we looked at how pg_graphql processes a GraphQL request. A request goes through the steps of tokenization, parsing, validation, transpilation, and execution. We looked in detail at the actions pg_graphql performs in each step. This knowledge should equip you to understand how pg_graphql works internally and help you make more informed decisions about how you can better use GraphQL APIs. If you feel ambitious, you can also start contributing to pg_graphql which we always welcome.