Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query returns incorrect records in associations when the foreign key is omitted from select #3246

Closed
nicnilov opened this issue Mar 14, 2020 · 11 comments

Comments

@nicnilov
Copy link

nicnilov commented Mar 14, 2020

Not sure if this was already considered or fixed but it seems to be pretty important to be worth checking.

Environment

  • Elixir version: Elixir 1.8.2 (compiled with Erlang/OTP 21)
  • Database and version: PostgreSQL 10.12
  • Ecto version: 2.2.12
  • Database adapter and version: postgrex 0.13.5
  • Operating system: elementary OS 5.1.2 Hera

Current behavior

Consider a query which returns records with two levels deep of associations and uses a select: clause. If the select: does not include the relevant foreign key field, Ecto silently accepts the query and returns the result where parent records have incorrect children (and grandchildren) attached. The children get attached to parents in such a way that all parents receive the same single child which happened to be returned first.

When the resultset does not include the foreign key field, Ecto, naturally, cannot associate the records correctly (which it should know), but it silently proceeds nevertheless, and returns an inconsistent result.

This does not happen when there is no select: clause because then all fields are included and the result reconstruction proceeds correctly. This also does not happen in a simpler parent-child query where the children are simply not included in the result.

Schema

  +-------------------+          +-----------------+          +------------------+
  | Patient           | 1      1 | User            | 1      n | Document         |
  |                   +--------->+                 +--------->+                  |
  +-------------------+          +-----------------+          +------------------+

  schema "patients" do
    field :first_name, :string
    field :middle_name, :string
    field :last_name, :string

    belongs_to :user, User
  end

  schema "users" do
    field :email, :string

    has_one :patient, Patient, on_delete: :delete_all
    has_many :documents, Document, on_delete: :delete_all
  end

  schema "documents" do
    field :filename, :string

    belongs_to :user, User
  end

No select in the query

  # Note that the query starts from the Patient and not from the User,
  # that is in the `belongs_to` -> `has_one` -> `has_many` order.
  # This is dictated by the realities of the schema and may or may not
  # be relevant to the issue.

  def patient_summary_query do
    documents_query = from d in Document

    from p in Patient,
      join: p_u in assoc(p, :user),
      preload: [
        user: {
          p_u,
          documents: ^documents_query
        }
      ]
  end

SELECT p0."id", p0."first_name", p0."middle_name", p0."last_name", p0."user_id", u1."id", u1."email" FROM "patients" AS p0 INNER JOIN "users" AS u1 ON u1."id" = p0."user_id" []
SELECT d0."id", d0."filename", d0."user_id", d0."user_id" FROM "documents" AS d0 WHERE (d0."user_id" = ANY($1)) ORDER BY d0."user_id" [[24, 23, 1, 21, 2]]

The result is correct with all the associations in their right places:

[
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Jerrod",
    id: 1,
    last_name: "Gislason",
    middle_name: "Ansel",
    user: %User{
      documents: [
        %Document{
          __meta__: #Ecto.Schema.Metadata<:loaded, "documents">,
          filename: "test1.jpg",
          id: 1,
        },
        %Document{
          __meta__: #Ecto.Schema.Metadata<:loaded, "documents">,
          filename: "test1.jpg",
          id: 2,
        },
        %Document{
          __meta__: #Ecto.Schema.Metadata<:loaded, "documents">,
          filename: "test1.jpg",
          id: 3,
        }
      ],
      id: 1,
      ...
    },
    user_id: 1
  },
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Patient3",
    id: 4,
    last_name: "Test",
    middle_name: "3",
    user: %User{
      documents: [],
      id: 23,
      ...
    },
    user_id: 23
  },
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Patient4",
    id: 5,
    last_name: "Test",
    middle_name: "4",
    user: %User{
      documents: [],
      id: 24,
      ...
    },
    user_id: 24
  }
]

Select skips the foreign key

  def patient_summary_query do
    documents_query = from d in Document

    from p in Patient,
      join: p_u in assoc(p, :user),
      preload: [
        user: {
          p_u,
          documents: ^documents_query
        }
      ],
      select: [
        :id,
        :first_name,
        :middle_name,
        :last_name,
        user: [
          :id,
          :email,
          documents: [
            :id,
            :filename
          ]
        ]
      ]
  end

SELECT p0."id", p0."first_name", p0."middle_name", p0."last_name", u1."id", u1."email" FROM "patients" AS p0 INNER JOIN "users" AS u1 ON u1."id" = p0."user_id" []
SELECT d0."id", d0."user_id", d0."filename", d0."user_id" FROM "documents" AS d0 WHERE (d0."user_id" = ANY($1)) ORDER BY d0."user_id" [[24, 23, 1, 21, 2]]

Note how the result lists the user_id: nil but the user is in fact attached, the same one to every patient:

[
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Jerrod",
    id: 1,
    last_name: "Gislason",
    middle_name: "Ansel",
    user: %User{
      documents: [],
      id: 24,
      ...
    },
    user_id: nil
  },
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Patient3",
    id: 4,
    last_name: "Test",
    middle_name: "3",
    user: %User{
      documents: [],
      id: 24,
      ...
    },
    user_id: nil
  },
  %Patient{
    __meta__: #Ecto.Schema.Metadata<:loaded, "patients">,
    first_name: "Patient4",
    id: 5,
    last_name: "Test",
    middle_name: "4",
    user: %User{
      documents: [],
      id: 24,
      ...
    }, 
    user_id: nil
  }
]

A query with the correct select clause

  def patient_summary_query do
    documents_query = from d in Document

    from p in Patient,
      join: p_u in assoc(p, :user),
      preload: [
        user: {
          p_u,
          documents: ^documents_query
        }
      ],
      select: [
        :id,
        :first_name,
        :middle_name,
        :last_name,
        :user_id,          # <- added
        user: [
          :id,
          :email,
          documents: [
            :id,
            :filename
          ]
        ]
      ]
  end

Expected behavior

It would be nice if Ecto did not attach associated records when there is not enough data to do so. Ideally, it would also be nice to have a warning that the field on which it attempts to match the records is not specified in the select.

@josevalim
Copy link
Member

Unfortunately this is really hard to verify in practice because all the preloader sees is that fields are nil, and it is completely fine for them to be nil as it means they have no associated data. We do document you need to be careful to select all relevant data when using custom queries - it is an advanced feature and it requires attention when using it. In any case, thanks for opening the issue!

@nicnilov
Copy link
Author

nicnilov commented Mar 14, 2020

@josevalim I understand the detection for the purpose of warning may be infeasible, but Ecto does return inconsistent data, which is a different story.

@josevalim
Copy link
Member

In an attempt to track why data gets unrelated I was able to introduce a warning. The secret was looking at the data, and not in trying to validate the query!

@nicnilov
Copy link
Author

@josevalim Excellent! Thanks!

@gfodor
Copy link

gfodor commented Apr 18, 2020

hey @josevalim, we are seeing this error occur in our phoenix service, and are not using select: in our queries. it seems non-deterministic. we're doing a chained preload of a few associations, deep, our preload tree looks like (where letters are record types)

A
  belongs_to(B)
  belongs_to(C)
    belongs_to(B)

The error seems to be getting raised for the first level B reference. I'm not sure, but seems possible the fact the record type B is involved at two levels of the heirarchy might be an edge case. Haven't been able to repro in a test case yet, but might be obvious to an ecto expert what is going on. For now, we're going to roll back to previous version since this doesn't seem like a real problem ecto is raising, but the error mis-firing. (Though maybe it is surfacing a non-deterministic bug!)

gfodor added a commit to Hubs-Foundation/reticulum that referenced this issue Apr 18, 2020
@josevalim
Copy link
Member

Hi @gfodor! 👋

I can see the repo where you see this failure is public. Can you please give me a commit SHA and a command to run that reproduces the failure? Then I can take a look at it and cut a new Ecto version.

@gfodor
Copy link

gfodor commented Apr 19, 2020 via email

@technicalcapt
Copy link

technicalcapt commented Apr 23, 2020

Hi @josevalim 👋, just want to follow up the issue @gfodor mentioned above.

Our case is when building a nested FormData from changeset.

# Document has_many :attachments ( in document changeset, we do preload :attachments)

changeset = Document.changeset(%Document{attachments: [%Attachment{}]})
# <- And it will error out
(RuntimeError) association `attachments` for `Document` has a loaded value but its association key `id` is nil. This usually means `id` was not selected in a query

How should we handle this?

@josevalim
Copy link
Member

@technicalcapt does it come with a default attachment for view purposes? So you have something to show by default? Maybe you can do that only in your new/edit actions and now when you actually persist your changeset? I.e. not on create/update.

@josevalim
Copy link
Member

In any case, I have migrated this to a warning, so everyone has a easier time migrating to it and can fix things without a broken app. Thanks for the input everyone!

@technicalcapt
Copy link

@josevalim unfortunately viewing is not the case. We wanted to create the form dynamically with two level nested from changeset.
I will investigate more about this and yeah a warning will work.
Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants