Skip to content

Fix domain over composite arrays#772

Open
greg-rychlewski wants to merge 3 commits into
elixir-ecto:masterfrom
greg-rychlewski:fix_domain
Open

Fix domain over composite arrays#772
greg-rychlewski wants to merge 3 commits into
elixir-ecto:masterfrom
greg-rychlewski:fix_domain

Conversation

@greg-rychlewski

@greg-rychlewski greg-rychlewski commented Jun 20, 2026

Copy link
Copy Markdown
Member

There was another gap in the way we were handling domains. This one is kind of weird but I'll do my best to paint a clear picture. This fix is for when the domain has a base type that is an array of a composite type. For example this

CREATE TYPE composite_test AS (a text, b integer);
CREATE DOMAIN composite_array_domain AS composite_test[] CHECK (array_length(VALUE, 1) > 0);

When you try to encode a parameter for this you get the error "Postgrex has no clue how to encode this because we don't know the right extension" (see here). So why does this happen:

  1. In the first bootstrap query we filter composites and arrays of composites out for efficiency (see here)
  2. Our domain type does get bootstrapped as an array type (send = array_send) and its element type is the composite type that was not bootstrapped.
  3. When we encode the parameter we go to fetch the type info and we see it's an array and then we say ok so what type is in the array and find a composite that has not been bootstrapped. So it errors and says Postgrex doesn't know what to do with this.

Then you might ask how are composites/arrays of composites working. And basically they get reloaded the first time they are found missing. So the fix here is to basically do the same thing: take domains with a base composite array type out of the original bootstrap so they can be reloaded.

The reason this isn't working right now is because it's only if the "top" type is missing that we reload. If a sub type is missing it just raises. And in our case we bootstrapped the top type (domain) but not the subtype (composite array)

@greg-rychlewski

greg-rychlewski commented Jun 20, 2026

Copy link
Copy Markdown
Member Author

I think if we want to do this 100% properly we need to reload sub-types. But that is a pretty big change. We're only handling one level of domain nesting...i.e. you can define a new domain over an old domain. But that is consistent with how we were dealing with domain over arrays here for a long time:

{typelem, join_domain} =
      if version >= {9, 0, 0} do
        {"coalesce(d.typelem, t.typelem)", "LEFT JOIN pg_type AS d ON t.typbasetype = d.oid"}

@greg-rychlewski

Copy link
Copy Markdown
Member Author

I think if we want to do this 100% properly we need to reload sub-types.

Or instead of this if we had some kind of way to know if we are looking at a domain defined over a domain and trace the pointers to the end of the road.

@josevalim

Copy link
Copy Markdown
Member

Doesn't this tie back to the previous discussion that composite types are hard to support for several reasons, one because tables are composite types, and that would make bootstrap much more expensive, and another because they can be updated, which means future encodings can fail unless we start polling/watching for changes? (see #353)

@greg-rychlewski

greg-rychlewski commented Jun 21, 2026

Copy link
Copy Markdown
Member Author

Reading over that issue I think there are some overlapping ideas but maybe not exactly the same as this. He seems to be pointing out these two specific things:

  • For named composites the underlying types can change
  • For anonymous composites we might not know about a subtype and crash (similar to this issue)

The main difference I see is that anonymous composites basically can't fix the issue without trying to fix subtypes at query time because they are not in pg_type. But domains are in pg_type so you could fix it the same way we handle composite and arrays of composites. Which is basically ignore the "top" type at bootstrap time and take advantage of the existing reload functionality.

My last two messages also touch a different issue in the handling of domains. Even without composites our bootstrap assumes that when we have a domain that is an array of a normal type that we can find the array element type in one join. But if you have something like this that's true:

CREATE DOMAIN d1 AS int[];
CREATE DOMAIN d2 AS d1;

Basically you would nee two joins or you would crash. And this can go on for arbitrary levels. I haven't solved this issue in this PR but my gut feeling is there is something that can be done about it by treating domains as pointers to the base type.

@greg-rychlewski

Copy link
Copy Markdown
Member Author

by treating domains as pointers to the base type

But actually if we did something like this then this PR wouldn't be necessary. Because it would point to the composite array which would be missing and then the top level reload would kick in

@greg-rychlewski

greg-rychlewski commented Jun 21, 2026

Copy link
Copy Markdown
Member Author

Would you mind if I try a PR where basically for domains we store the resolved extension to use in the types table? It might not be possible but my feeling is that there is a chance there is a nice way to do it. And if there is we don’t need this PR. It would also let us remove one of the joins from our existing bootstrap query trying to get the domain array element t

@josevalim

Copy link
Copy Markdown
Member

I don’t mind at all. The only concern is inflating the types in memory. I don’t fully remember all the details but there was something about tables also being types. We may need to make it opt-in.

@greg-rychlewski

greg-rychlewski commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

@josevalim I actually discovered more than I anticipated trying to dig into this. But I think the result is pretty nice and actually let's us handle more than I originally thought.

There are basically two main changes:

  1. When we encounter a domain type try to find the type that is eventually resolved to when bootstrapping. May be arbitrary levels of nesting
  2. In the past when we bootstrapped the types we conflated two different things: (1) a necessary sub-type was not bootstrapped and (2) the type was bootstrapped but there are no extensions matching the type.

For (2) I believe these should be separated because they are not the same thing. Missing types can be fixed by reloading and they can be fixed to arbitrary levels of nesting, i.e. a parent type and one or more of its sub types might be missing.

For example something that already works before this change is missing arrays of composites. Neither the array type nor the underlying element type are bootstrapped. The code here essentially ensures they are bootstrapped from bottom up so that they can find the extension properly:

defp reload_row({type_infos, oids, missing, current}, values, types) do

So basically all PR is doing is extending what we do for composite arrays to other things that have the same problem. This had a few other side effects:

  • Now we can also handle things like ranges over composites. In the past the composite would not be bootstrapped and this would raise a "postgres can't handle this type" exception
  • We could get rid of the hacks we were doing for domains in the bootstrap query. Essentially we merged composites and arrays into domains (up to one level of nesting). But now we can just handle them naturally.

The thing that is not perfect is that we essentially have to remove some types that we pulled from the bootstrap query. But I don't see any way around that because with arbitrary levels of nesting you can't really detect which ones to leave out in SQL without it being a massively complex query.

Let me know what you think.

@josevalim

Copy link
Copy Markdown
Member

@greg-rychlewski excellent work! The only remaining question is: does this affect how long the query takes and how much data we store? If it doesn't have a significant impact, we can definitely ship it. If you have an existing database you can connect to and measure, even better!!

@greg-rychlewski

greg-rychlewski commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

Yeah definitely I can spin something up and do some tests on the old and the new.

I can give a few high level replies to your questions before as well:

  1. You meant how long the bootstrap query takes right? If so then this change actually makes it simpler. We don't have to join pg_type onto itself anymore. In the past we did this to try to merge some of the underlying type information into the domain. But now we are treating domains as pointers and don't have to conflate fields
  2. If you mean the data we store in the types table then we are not adding any new columns or changing anything about what was already stored. But we are storing more in the sense that now we handle arbitrary levels of nesting for domains. And to make the extension resolution faster we essentially copy the base type's extension into each of the individual domains table record.

For 2 there are a few things to think about:

  1. Instead of copying the base type's extension into all of the domains in the chain we could essentially store pointers in them and use that to get to the base type's extension from the table. This would add more work to each query though.
  2. Instead of making a chain of N domains someone could have made N domains of depth 1 all pointing to the same base type and we wouldn't be able to do anything about it now or with our old code
  3. It's similar to how nesting already works in the types table. For example if you have an array over a composite over a composite the parent type would essentially store the whole extension chain. And each subtype would store its own part of the extension chain and lower. Rather than having them point to other places in the ets table.
  4. I don't know for sure but I imagine the number of domains people are defining is not crazy. There is some evidence of that in the sense that we are not really handling it completely but no one has complained much over the years or raised issues about deeply nested domains.

@greg-rychlewski

Copy link
Copy Markdown
Member Author

I guess the TLDR of the above is there is probably only one serious decision to make. For domains do we store a pointer to the base type's table entry or store the full extension. And then the main considerations I can think of:

  1. If we store a pointer then for domains we have 2 ets lookups for every query instead of 1
  2. If we store the extension it is more copies of the same thing but it's similar to how nesting works today already and there might not be a tonne of domains anyway

@josevalim

Copy link
Copy Markdown
Member

👍 for flattenning it all so the query lookup/building is faster. If the query is faster even better, let's just be double sure the flattening is not leading to much larger tables!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants