Fix domain over composite arrays#772
Conversation
|
I think if we want to do this 100% properly we need to reload sub-types. But that is a pretty big change. We're only handling one level of domain nesting...i.e. you can define a new domain over an old domain. But that is consistent with how we were dealing with domain over arrays here for a long time: {typelem, join_domain} =
if version >= {9, 0, 0} do
{"coalesce(d.typelem, t.typelem)", "LEFT JOIN pg_type AS d ON t.typbasetype = d.oid"}
|
Or instead of this if we had some kind of way to know if we are looking at a domain defined over a domain and trace the pointers to the end of the road. |
|
Doesn't this tie back to the previous discussion that composite types are hard to support for several reasons, one because tables are composite types, and that would make bootstrap much more expensive, and another because they can be updated, which means future encodings can fail unless we start polling/watching for changes? (see #353) |
|
Reading over that issue I think there are some overlapping ideas but maybe not exactly the same as this. He seems to be pointing out these two specific things:
The main difference I see is that anonymous composites basically can't fix the issue without trying to fix subtypes at query time because they are not in pg_type. But domains are in pg_type so you could fix it the same way we handle composite and arrays of composites. Which is basically ignore the "top" type at bootstrap time and take advantage of the existing reload functionality. My last two messages also touch a different issue in the handling of domains. Even without composites our bootstrap assumes that when we have a domain that is an array of a normal type that we can find the array element type in one join. But if you have something like this that's true: CREATE DOMAIN d1 AS int[];
CREATE DOMAIN d2 AS d1;Basically you would nee two joins or you would crash. And this can go on for arbitrary levels. I haven't solved this issue in this PR but my gut feeling is there is something that can be done about it by treating domains as pointers to the base type. |
But actually if we did something like this then this PR wouldn't be necessary. Because it would point to the composite array which would be missing and then the top level reload would kick in |
|
Would you mind if I try a PR where basically for domains we store the resolved extension to use in the types table? It might not be possible but my feeling is that there is a chance there is a nice way to do it. And if there is we don’t need this PR. It would also let us remove one of the joins from our existing bootstrap query trying to get the domain array element t |
|
I don’t mind at all. The only concern is inflating the types in memory. I don’t fully remember all the details but there was something about tables also being types. We may need to make it opt-in. |
919933a to
6a18148
Compare
|
@josevalim I actually discovered more than I anticipated trying to dig into this. But I think the result is pretty nice and actually let's us handle more than I originally thought. There are basically two main changes:
For (2) I believe these should be separated because they are not the same thing. Missing types can be fixed by reloading and they can be fixed to arbitrary levels of nesting, i.e. a parent type and one or more of its sub types might be missing. For example something that already works before this change is missing arrays of composites. Neither the array type nor the underlying element type are bootstrapped. The code here essentially ensures they are bootstrapped from bottom up so that they can find the extension properly: postgrex/lib/postgrex/protocol.ex Line 2008 in c482554 So basically all PR is doing is extending what we do for composite arrays to other things that have the same problem. This had a few other side effects:
The thing that is not perfect is that we essentially have to remove some types that we pulled from the bootstrap query. But I don't see any way around that because with arbitrary levels of nesting you can't really detect which ones to leave out in SQL without it being a massively complex query. Let me know what you think. |
|
@greg-rychlewski excellent work! The only remaining question is: does this affect how long the query takes and how much data we store? If it doesn't have a significant impact, we can definitely ship it. If you have an existing database you can connect to and measure, even better!! |
|
Yeah definitely I can spin something up and do some tests on the old and the new. I can give a few high level replies to your questions before as well:
For 2 there are a few things to think about:
|
|
I guess the TLDR of the above is there is probably only one serious decision to make. For domains do we store a pointer to the base type's table entry or store the full extension. And then the main considerations I can think of:
|
|
👍 for flattenning it all so the query lookup/building is faster. If the query is faster even better, let's just be double sure the flattening is not leading to much larger tables! |
There was another gap in the way we were handling domains. This one is kind of weird but I'll do my best to paint a clear picture. This fix is for when the domain has a base type that is an array of a composite type. For example this
When you try to encode a parameter for this you get the error "Postgrex has no clue how to encode this because we don't know the right extension" (see here). So why does this happen:
Then you might ask how are composites/arrays of composites working. And basically they get reloaded the first time they are found missing. So the fix here is to basically do the same thing: take domains with a base composite array type out of the original bootstrap so they can be reloaded.
The reason this isn't working right now is because it's only if the "top" type is missing that we reload. If a sub type is missing it just raises. And in our case we bootstrapped the top type (domain) but not the subtype (composite array)