Specifying the Related Entities to Include in the Query Results

This blog post covers how to use Include() method when fetching the related entities along with the parent entity being queried. The class diagram of the example covered in this tutorial is given below:

1

How Does DbSet Work?

In Entity Framework, when you fetch records from the database by using the DbSet variable of a certain entity, only records from the corresponding table will be retrieved to the server memory. For example, let’s assume that we have the following DbContext class declaration which includes the DbSet type Transfers collection.

 
public class FootballDbContext : DbContext 
{ 
    public FootballDbContext(): base("FootballDatabase") { } 
    
     public DbSet<Transfer> Transfers { get; set; } 
} 

When we use this DbSet collection variable to fetch records from the database, only the rows from the Transfers table will be retrieved:

List<Transfer> myTransfers = db.Transfers.List();

Behind the scenes, based on the statement above, Entity Framework will actually create the following SQL query:

SELECT * FROM Transfers

which will return the following data set:

2

As you may see the output above, this query will return all the records from the Transfers table. Then, Entity Framework will convert these raw transfer records into instances of Transfer, which will be added to the myTransfers list as you may see below:

List <Transfer> myTransfers = {
    new Transfer() 
    { 
         TransferId=1, StartYear=2017, TeamIdFK=5, PlayerIdFK=8, 
         Team=NULL, Player=NULL
    },
    new Transfer() 
    { 
         TransferId=2, StartYear=2012, TeamIdFK=5, PlayerIdFK=8, 
         Team=NULL, Player=NULL
    }, 
    new Transfer() 
    { 
         TransferId=3, StartYear=2011, TeamIdFK=5, PlayerIdFK=8, 
         Team=NULL, Player=NULL
    },
    new Transfer() 
    { 
         TransferId=4, StartYear=2015, TeamIdFK=5, PlayerIdFK=8, 
         Team=NULL, Player=NULL
    }
} 

CASE 1: RETRIEVING THE  TEAM AND PLAYER RECORDS ALONG WITH TRANSFERS

In the previous section, you may notice that the Team and Player properties of the transfers were NULL. This is because we read records only from the Transfers table which has no additional information about the associated team and player except the foreign key ids (TeamIdFK, and PlayerIdFK). If we also want to retrieve the team and player records associated with the transfer records, we need to access the records in the Teams and Players table.

If we want to fetch the Team and Player records associated with the Transfer records, then we need to tell Entity Framework about our intention with the help of the Include() method. A lambda expression is used in the Include() method that should point to a Navigation property referencing the related entity. This related entity should be the one that is to be retrieved from the database along with the Parent entity. In our case, the Include() method should have a lambda expression that points to Team (or Player) navigation property because we want the related team (or player)  record loaded with the transfer records. Therefore, to have Team and Player records included in the result dataset along with the Transfer records, we will need the following statement:

List<Transfer> myTransfers = db.Transfers
                               .Include(tr => tr.Team)
                               .Include(tr => tr.Player)
                               .ToList();

We appended two separate Include() methods to the Transfers DbSet collection since we want both Team and Player records fetched from the database along with the Transfers.

Here is the SQL statement generated after Include() methods are included:

SELECT * FROM Transfers
JOIN Players ON Players.PlayerId = Transfers.PlayerIdFK 
JOIN Teams ON Teams.TeamId = Transfers.TeamIdFK 

Include() method indeed corresponds to JOIN keyword in the SQL statement because the associated records requested with Include() method actually exist in separate tables, and JOIN statement is used to read associated data existing in different tables by merging all records into a single dataset.

Since we have two Include() methods, the SQL query generated by Entity Framework has two JOIN keywords, merging Transfers table with Teams and Players table. Here is how the output dataset would look like after the SELECT query defined above executed:

5

The output above is actually composed of the three tables joined together: Players, Transfers, and Teams. You may notice that some players seem to be duplicated. Actually, they are not. This is a repetition resulting from having multiple transfer records per player. For example, Nani has been involved in 2 transfers (2012-2014 and 2000-2004). For each of these transfer record, the player information needs to be displayed even though both transfer involves the same player.

The entity framework will map the output dataset (obtained above) to the appropriate class objects.

After the mapping, the myTransfers list collection will look like this:

List <Transfer> myTransfers = 
{
    new Transfer() 
    { 
         TransferId=2, StartYear=2012, TeamIdFK=5, PlayerIdFK=8, 
         Team = new Team() {TeamId=4, TeamName="Fenerbahce", ...}, 
         Player = new Player() { PlayerId=1, FullName="Nani", ...}
    },
    new Transfer() 
    { 
         TransferId=19, StartYear=2000, TeamIdFK=5, PlayerIdFK=8,
         Team = new Team() {TeamId=4, TeamName="Fenerbahce", ...}, 
         Player = new Player() { PlayerId=1, FullName="Nani", ...}
    }, 
    new Transfer() { 
         TransferId=3, StartYear=2011, TeamIdFK=5, PlayerIdFK=8,
         Team = new Team() {TeamId=4, TeamName="Fenerbahce", ...}, 
         Player = new Player() { PlayerId=1, FullName="Nani", ...}
    }, 
    new Transfer() { 
         TransferId=5, StartYear=2012, TeamIdFK=5, PlayerIdFK=8,
         Team = new Team() {TeamId=4, TeamName="Fenerbahce", ...}, 
         Player = new Player() { PlayerId=1, FullName="Nani", ...}
    }, 
} 

CASE 2: RETRIEVING THE TEAM RECORDS ALONG WITH THE PLAYERS

In the practiced scenario above, the Transfer entity is linked to the Player and Team entities:

Player <- Transfer -> Team

That is why we were able to retrieve the player and team records along with the transfer records with two separate Include() methods discussed above. However, when we want to fetch the player records from the database and include the teams (to which players have been transferred) in the dataset then the way we use Include() method will be different. This is because the only way we can access the related team records from the Player entity is through Transfer entity:

Player -> Transfer -> Team

Being dependent on Transfer entity in accessing Team records will slightly complicated the lambda expression that goes into the Include() method (attached to Players DbContext collection –db.Players).

Let’s start with simpler scenario and plan to retrieve only the transfer records along with the players. Then, we would need the following statement:

List<Player> myPlayers = db.Players.Incude(pl => pl.Transfers).ToList();

The statement above would result in an SQL query that joins Players and Transfers tables, carrying no information regarding the teams. Therefore, this would be no use for our purpose. For example, the content of the myPlayers list collection would look like this:

List<Player> myPlayers =
{
    new Player()
    {
        PlayerId=1, PlayerName=Nani, ...
        Transfers = {
            new Transfer()
            {
                TransferId=2, StartYear=2012, ...
                Team=NULL
            },
            new Transfer()
            {
                TransferId=19, StartYear=2000, ...
                Team=NULL
            }
        } 
    }

Not surprisingly, as you may notice, Team properties of Transfer instances would contain a NULL reference. If we attempt to read the name of the teams, we would receive a NULL exception.

To avoid the NULL references in Team values, we need to use Select() method inside the Include() method to indicate that we want to load the Team records associated with the Transfers:

List<Player> myPlayers = db.Players
                           .Incude(pl => pl.Transfers
                                           .Single(tr => tr.Team)
                                  ).ToList();

The reason why we used .Single() method is because we had to include a collection first –Include(pl => pl.Transfers), and then include an entity reference one level down –Single(tr => tr.Team). In other words, when your Include method receives a collection reference (e.g., Transfers), then you have to use Single() method to include the related entity (e.g., Team) of that collection.

Although the Include() method has changed, the SQL statement generated by Entity Framework will be the same as the one in Case 1:

SELECT * FROM Transfers
JOIN Players ON Players.PlayerId = Transfers.PlayerIdFK 
JOIN Teams ON Teams.TeamId = Transfers.TeamIdFK 

The SQL query generated by Entity Framework has two JOIN keywords, merging Transfers table with Teams and Players table. Here is how the output dataset would look like after the SELECT query defined above executed:

5

Actually this dataset is the same with the one we obtained in CASE 1. However, because our Linq statement is different, entity framework will map this dataset to class objects differently. It will map the results into a list of Players this time:

myPlayers = {
     new Player
     { 
         PlayerId=1, FullName="Nani", ...
         Transfers = {
             new Transfer()
             {
                  TransferId=2, StartYear=2012, ...
                  Team = { TeamId=4, TeamName="Fenerbahce", ... }
             },
             new Transfer()
             {
                  TransferId=19, StartYear=2000, ...
                  Team= { TeamId=3, TeamName="Atletico", ...}  
             }
      },
      //other players...
}

As you may notice above, the transfer collection for each player will be loaded, and each transfer instance carries the team information as well.

CASE 3: RETRIEVING THE PLAYER RECORDS ALONG WITH THE TEAMS

In Case 3, the parent entity is Team, as opposed to the Case 2, where the parent entity was Player. That is, in Case 3, we will fetch Team records from the database, and we want to include the related player records in the dataset.

We will basically use the same Linq query structure. However, the lambda expressions will point to different navigation properties.

List<Team> myTeams = db.Teams
                       .Include(t => t.Transfers.Single(tr =>tr.Player))
                       .ToList();

Similar to the Case 2, the reason why we used .Single() method is because we had to include a collection first –Include(pl => pl.Transfers), and then include an entity reference one level down –Single(tr => tr.Player). The resulted output will look like this:

myTeams = {
    new Team
    {
        TeamId=4, TeamName="Fenerbahce", ...
        Transfers = {
           new Transfer()
           {
               TransferId=2, StartYear=2012, ...
               Player = { ...player information... }
           },
           new Transfer()
           {
               TransferId=19, StartYear=2000, ...
               Player = {...player information...} 
           } 
     }, 
     //other players... 
} 

CASE 4: RETRIEVING THE LEAGUE INFO ALONG WITH THE TEAM WHEN FETCHING TRANSFERS 

In this case, when we fetch the transfer records, we want to bring in the team information as well as the league that the team is in. Please note that each transfer involves ONLY ONE team and each team plays in ONLY ONE league. That is, when referencing the related entities in the Linq statement there will be NO collection type: a transfer is link to a one team which is link to a one league. Therefore, we will not need to use .SELECT() method.  Here is what we need:

List<Transfer> myTransfers = db.Transfers
                               .Include(tr => tr.Team.League)
                               .ToList();

In the above code, we used Include() method to first include an entity reference (not a collection), and then include another entity reference one level down that is linked to the Team entity reference. Using a period (‘.’) was enough to achieve this because Team is not a collection type but a single entity reference (since only one Team is involved in a transfer).

In this tutorial, I covered the most common cases when loading the related objects in the query results. Here is a link to a further documentation about specifying the related objects to include in the query results:

https://msdn.microsoft.com/en-us/library/gg671236(v=vs.103).aspx