JPA Fetch and N+1 SELECT problem - Blog by Vincenzo Racca

Block Image

What is JPA Fetch?

It is a feature of JPA that allows entities to load or unload objects with which they have a relationship.
You can declare the fetch value in the @OneToOne, @OneToMany, @ManyToOne and @ManyToMany annotations. These annotations in fact, have an attribute called fetch that serves just to indicate the wished fetch of the relation. It can have two values: LAZY and EAGER. With the first value we are telling JPA that we want a lazy loading of the relation, so when retrieving an entity, its relations will not be loaded as well. On the contrary, with EAGER it will its relation will be loaded too.

By default all ToMany relationships are LAZY, while ToOne relationships are EAGER.

In this article we will analyze the statistics using both FETCHs and will see how to optimize the relationships.

We will use the JpaDao of the Application managed e Container managed project.

Use Case: Company, User and Car Entities

For our use case we will use these three tables:

Block Image

These are the related Java classes:

User:

@Entity
@Table(name = "USERS")
public class User implements JpaEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  private String name;

  private String surname;

  @Column(nullable = false, unique = true)
  private String code;

  @ManyToOne
  @JoinColumn(name = "company_code", referencedColumnName = "code")
  private Company company;

  @OneToMany(mappedBy = "user", cascade = CascadeType.ALL)
  private Set<Car> cars;

  //getters, setters, equals, hashcode
}

Company:

@Entity
@Table(name = "COMPANIES")
public class Company implements JpaEntity {

  @Id
  private String code;

  private String name;

  @OneToMany(mappedBy = "company", cascade = CascadeType.ALL)
  private Set<User> users;

  //getters, setters, equals, hashcode
}

Car:

@Entity
@Table(name = "CARS")
public class Car implements JpaEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @Column(nullable = false, unique = true)
  private String plate;

  private String name;

  @ManyToOne
  @JoinColumn(name = "user_id", referencedColumnName = "id")
  private User user;
  
  //getters, setters, equals, hashcode
}

So the USERS table is in a 1:N relationship with CARS and COMPANIES is in a 1:N relationship with USERS.

Extrapolating JPA queries and statistics

We insert in the persistence.xml the following properties to log JPA queries and statistics:

<property name="hibernate.show_sql" value="true" />
<property name="hibernate.generate_statistics" value="true" />

We test with JUnit the findById

We will test the findById of Company. We preload the DB with 1000 rows in the COMPANIES table, each of which is associated with 500 rows of USERS, each of which is associated with two rows in the CARS table.

Now we write the unit test of the findById:

@Test
public void findByIdTest() {
  Company entity = companyDao.findById("COMPANY1");
  Assert.assertNotNull(entity);
}

Remember that Company is in a One to Many relationship with User and the FETCH is the default, so it is LAZY. Let's run the test and see the query executed and the metrics:

Hibernate: select company0_.code as code1_1_0_, company0_.name 
  as name2_1_0_ from COMPANIES company0_ where company0_.code=?
  
22:50:44.992 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    347086 nanoseconds spent acquiring 1 JDBC connections;
    387565 nanoseconds spent releasing 1 JDBC connections;
    448483 nanoseconds spent preparing 1 JDBC statements;
    548631 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Only one select was made and it took 0.000448483 seconds to prepare it and 0.000548631 seconds to execute it.
The metrics for this example are negligible because they are very low, testing on a local MySQL db, instead, we notice the number of queries made by JPA.

Being the relation LAZY, a simple select is effected on the table COMPANIES, without recovering the relation with USERS.

What if we wanted to retrieve all the Users in the Company with id COMPANY1? Let's add a printout of the list of Users:

@Test
public void findByIdTest() {
  Company entity = companyDao.findById("COMPANY1");
  System.out.println(entity.getUsers());
  Assert.assertNotNull(entity);
}

Let's run the code. We will get this exception:

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: 
com.vincenzoracca.jpaproject.entities.Company.users, could not initialize proxy - no Session

The error is telling: we cannot retrieve a LAZY collection unless we are inside a Session.

We then create a method called findByIdWithUsers:

@Override
@Transactional
public Company findByIdWithUsers(String id) {
  Company entity = entityManager.find(Company.class, id);
  Set<User> users = entity.getUsers();
  System.out.println("SIZE USERS: " + users.size());
  return entity;
}

Here the getUsers is allowed because it is executed inside the same transaction/session as the find.

Let's create the test class and run it:

@Test
public void findByIdWithUsersTest() {
  Company entity = companyDao.findByIdWithUsers("COMPANY1");
  Assert.assertNotNull(entity);
  System.out.println("STAMPA" + entity.getUsers());
}

Let's look at the queries made and the metrics:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_ from COMPANIES company0_ where company0_.code=?
Hibernate: select users0_.company_code as company_5_2_0_, users0_.id as id1_2_0_, users0_.id as id1_2_1_, 
    users0_.code as code2_2_1_, users0_.company_code as company_5_2_1_, users0_.name as name3_2_1_, 
    users0_.surname as surname4_2_1_ from USERS users0_ where users0_.company_code=?

23:48:43.308 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    537225 nanoseconds spent acquiring 1 JDBC connections;
    352008 nanoseconds spent releasing 1 JDBC connections;
    18853810 nanoseconds spent preparing 2 JDBC statements;
    8545463 nanoseconds spent executing 2 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    51666997 nanoseconds spent executing 1 flushes (flushing a total of 501 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

The relevant fact is that two queries are performed this time. If instead of calling the findById, we had called the findAll, JPA would have retrieved the 1000 rows of the COMPANIES table and for each one it would have made a select to recover the relative rows of the table USERS. JPA would have therefore carried out 1001 queries!

This JPA problem is known as the N+1 queries problem and the N+1 SELECT problem.

We change the FETCH of the relationship between Company and Users to EAGER

@OneToMany(mappedBy = "company", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
private Set<User> users;

Let's run the findByIdWithUsers test method again:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_, 
  users1_.company_code as company_5_2_1_, users1_.id as id1_2_1_, users1_.id as id1_2_2_, 
  users1_.code as code2_2_2_, users1_.company_code as company_5_2_2_, users1_.name as name3_2_2_, users1_.surname 
  as surname4_2_2_ from COMPANIES company0_ left outer join USERS users1_ 
  on company0_.code=users1_.company_code where company0_.code=?
  
12:05:21.214 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    455686 nanoseconds spent acquiring 1 JDBC connections;
    333864 nanoseconds spent releasing 1 JDBC connections;
    16132759 nanoseconds spent preparing 1 JDBC statements;
    8139858 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    47884750 nanoseconds spent executing 1 flushes (flushing a total of 501 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

We note that JPA now makes only one query, using a LEFT JOIN on the USERS table. Much more efficient, the problem of N+1 queries no longer occurs!

But then the optimal solution is to enhance the FETCHs of all relationships with EAGER? Let's assume yes, then we modify the FETCH to EAGER also in the relation between USERS and CARS:

@OneToMany(mappedBy = "user", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
private Set<Car> cars;

We run the previous test method again:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_, users1_.company_code as company_5_2_1_, 
  users1_.id as id1_2_1_, users1_.id as id1_2_2_, users1_.code as code2_2_2_, users1_.company_code as company_5_2_2_, 
  users1_.name as name3_2_2_, users1_.surname as surname4_2_2_, cars2_.user_id as user_id4_0_3_, cars2_.id as id1_0_3_, 
  cars2_.id as id1_0_4_, cars2_.name as name2_0_4_, cars2_.plate as plate3_0_4_, cars2_.user_id as user_id4_0_4_ 
  from COMPANIES company0_ left outer join USERS users1_ on company0_.code=users1_.company_code 
  left outer join CARS cars2_ on users1_.id=cars2_.user_id where company0_.code=?
  
12:11:55.005 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    455398 nanoseconds spent acquiring 1 JDBC connections;
    373653 nanoseconds spent releasing 1 JDBC connections;
    24160296 nanoseconds spent preparing 1 JDBC statements;
    21923923 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    90972614 nanoseconds spent executing 1 flushes (flushing a total of 1001 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Only one query is executed this time as well, but an extra LEFT JOIN is made, on CARS, even though we don't need to retrieve the rows of CARS.
The performance got worse of course. And that's having only three tables! If CARS had another EAGER relationship, it would have been automatically recovered as well and, cascading, others again. FETCH to EAGER therefore is not always the optimal solution (in fact, it almost never is).

Also, the FETCH to EAGER only works well by executing EntityManager methods, such as find. More precisely, it does not work with JPQLs. If we called Company's findAll with the EAGER fetch, which executes a JPQL we wrote, JPA would execute N+1 select.

A possible solution: the JOIN FETCH

The basic idea with JOIN FETCH is to load the relationship only when we really need it. In the case of findByIdWithUsers method, we want to retrieve only the USERS of the Company with specified id.
We return all FETCH to LAZY again and modify the method:

@Override
public Company findByIdWithUsers(String id) {
  String sql = "SELECT c FROM Company c LEFT JOIN FETCH c.users";
  Company entity = entityManager.createQuery(sql, Company.class)
                                  .getSingleResult();
  Set<User> users = entity.getUsers();
  System.out.println("SIZE USERS: " + users.size());
  return entity;
}

Let's run the test again:

Hibernate: select company0_.code as code1_1_0_, users1_.id as id1_2_1_, company0_.name as name2_1_0_, 
    users1_.code as code2_2_1_, users1_.company_code as company_5_2_1_, users1_.name as name3_2_1_, 
    users1_.surname as surname4_2_1_, users1_.company_code as company_5_2_0__, users1_.id as id1_2_0__ 
from COMPANIES company0_ left outer join USERS users1_ on company0_.code=users1_.company_code where company0_.code=?

12:36:55.303 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    572896 nanoseconds spent acquiring 1 JDBC connections;
    449356 nanoseconds spent releasing 1 JDBC connections;
    12735636 nanoseconds spent preparing 1 JDBC statements;
    8147869 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Only one query and one JOIN is executed, just what we want!

A more elegant solution: the EntityGraph

The general solution to apply then is:
«retrieve relationship items only when I explicitly ask for them.»

This solution can be applied, as we have just seen, with the JOIN FETCH.
The awkward thing, however, is to write a method for every possible different JOIN FETCH.
To avoid this, we can use another feature of JPA, called EntityGraph:

@Entity
@Table(name = "COMPANIES")
@NamedEntityGraphs({
    @NamedEntityGraph(name = Company.CompanyFetch.ALL, includeAllAttributes = true),
    @NamedEntityGraph(name = Company.CompanyFetch.USERS, attributeNodes = { @NamedAttributeNode("users")})
})
public class Company implements JpaEntity

...

public interface CompanyFetch {
  String ALL= "Company.all";
  String USERS = "Company.users";
}

Let's analyze the code:

The @NamedEntityGraphs annotation takes an array of @NamedEntityGraphs as a parameter.
With the first @NamedEntityGraph, called Company.all, we are telling JPA that we want to retrieve all the relations of the table COMPANIES, thanks to the
includeAllAttributes = true.
With the second @NamedEntityGraph, called Company.users, we are telling JPA that we want to retrieve the relationship with USERS, thanks to
attributeNodes = { @NamedAttributeNode("users")}.

Obviously, since Company only has one relationship, in this case the two @NamedEntityGraphs will return the same result.

We add a general method in the JpaDaoImpl class:

@Override
public T findById(ID id, String entityGraph) {
    T entity = entityManager.createQuery(FIND_ALL + " where " + idField() + " = :id", persistentClass)
            .setParameter("id", id)
            .setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph(entityGraph))
            .getSingleResult();
    return entity;
    
    ...

private String idField() {
    try {
        String id = Arrays.stream(persistentClass.getDeclaredFields())
        .filter(field -> field.isAnnotationPresent(Id.class))
        .findAny()
        .map(field -> field.getName())
        .orElse(null);
    
        return id;
    }
    catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Here the only relevant part is
.setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph(entityGraph)),
which allows us to run the query with an entityGraph.

Let's create a test class to try out the use of EntityGraph:

@Test
public void findByIdWithUsersGraphTest() {
    Company entity = companyDao.findById("COMPANY1", Company.CompanyFetch.USERS);
    Assert.assertNotNull(entity);
    System.out.println("STAMPA" + entity.getUsers());
}

We execute the method and see the queries made and the statistics:

Hibernate: select company0_.code as code1_1_0_, users1_.id as id1_2_1_, 
    company0_.name as name2_1_0_, users1_.code as code2_2_1_, users1_.company_code as company_5_2_1_, 
    users1_.name as name3_2_1_, users1_.surname as surname4_2_1_, users1_.company_code as company_5_2_0__, 
    users1_.id as id1_2_0__ from COMPANIES company0_ left outer join USERS users1_ 
    on company0_.code=users1_.company_code where company0_.code=?

13:31:22.260 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    418745 nanoseconds spent acquiring 1 JDBC connections;
    458270 nanoseconds spent releasing 1 JDBC connections;
    13035992 nanoseconds spent preparing 1 JDBC statements;
    7172620 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

As we expected, only one query is executed, with only one JOIN.
Moreover, using the entityGraph, we can write a general method to which then pass him the name of the EntitiGraph that we want. The JPQL in the createQuery method in fact, will be always the same one, will be then task of JPA to manage the eventual joins according to the EntityGraph.

Other possible solutions (Hibernate)

There are also other possible solutions.

For example, if we are using Hibernate, we can implement these strategies.

One is by using the @BatchSize annotation. Let's look at an example:

@OneToMany(mappedBy = "company", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
@BatchSize(size = 50)
private Set<User> users;

Recall that, without the @BatchSize annotation, JPA would perform N+1 SELECT, where N is the number of rows in the COMPANIES table. Let's run the findAll test:

Hibernate: select company0_.code as code1_1_, company0_.name as name2_1_ from COMPANIES company0_
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
12:49:30.977 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    667681 nanoseconds spent acquiring 1 JDBC connections;
    574034 nanoseconds spent releasing 1 JDBC connections;
    19252743 nanoseconds spent preparing 21 JDBC statements;
    2874138319 nanoseconds spent executing 21 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

As described by the logs, JPA performed 21 SELECTs.
Given N the number of rows of COMPANIES, and M the size of the annotation @BatchSize, the number of queries that JPA makes is N:M+1 (1000:50+1=21)

Another possible optimization, to not have too many cascading JOINs, is to use the property
hibernate.max_fetch_depth = n
where n is the maximum depth at which JPA performs JOINs.

Personally, I still think the JOIN FETCH and EntityGraph solutions are better.

Conclusions

In this article we have seen JPA's Fetch types and possible solutions to the N+1 SELECT problem.
Of course, depending on the project, one solution may be more functional than another.

You can find the complete project on my github at this link: JPA Project

Articles about JPA: JPA