Fetch in JPA e problema delle N+1 SELECT

Block Image

Cos'è il Fetch di JPA?

È una funzionalità di JPA che permette alle entities di caricare o meno gli oggetti con cui hanno una relazione.
È possibile dichiarare il valore di fetch nelle annotazioni @OneToOne, @OneToMany, @ManyToOne e @ManyToMany. Queste annotations infatti, hanno un attributo chiamato fetch che serve proprio a indicare il fetch desiderato della relazione. Può avere due valori: LAZY ed EAGER. Col primo valore stiamo dicendo a JPA che vogliamo un caricamento pigro della relazione, quindi quando si recupererà una entity, non verrà caricata anche la sua relazioni. Al contrario, con EAGER verrà caricata anche la sua relazione.

Di default tutte le relazioni ToMany sono LAZY, mentre quelle ToOne sono EAGER.

In questo articolo analizzeremo le statistiche utilizzando entrambe le FETCH e vedremo come ottimizzare le relazioni.

Utilizzeremo il JpaDao del progetto Application managed e Container managed.

Caso d'uso: entities Company, User e Car

Per il nostro caso d'uso useremo queste tre tabelle:

Block Image

Queste le relative classi Java:

User:

@Entity
@Table(name = "USERS")
public class User implements JpaEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  private String name;

  private String surname;

  @Column(nullable = false, unique = true)
  private String code;

  @ManyToOne
  @JoinColumn(name = "company_code", referencedColumnName = "code")
  private Company company;

  @OneToMany(mappedBy = "user", cascade = CascadeType.ALL)
  private Set<Car> cars;

  //getters, setters, equals, hashcode
}

Company:

@Entity
@Table(name = "COMPANIES")
public class Company implements JpaEntity {

  @Id
  private String code;

  private String name;

  @OneToMany(mappedBy = "company", cascade = CascadeType.ALL)
  private Set<User> users;

  //getters, setters, equals, hashcode
}

Car:

@Entity
@Table(name = "CARS")
public class Car implements JpaEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @Column(nullable = false, unique = true)
  private String plate;

  private String name;

  @ManyToOne
  @JoinColumn(name = "user_id", referencedColumnName = "id")
  private User user;
  
  //getters, setters, equals, hashcode
}

Quindi la tabella USERS è in relazione 1:N con CARS e COMPANIES è in relazione 1:N con USERS.

Estrapoliamo queries e statistiche di JPA

Inseriamo nel persistence.xml le seguenti properties per loggare le queries e le statistiche di JPA:

<property name="hibernate.show_sql" value="true" />
<property name="hibernate.generate_statistics" value="true" />

Testiamo con JUnit la findById

Testeremo la findById di Company. Precarichiamo il DB con 1000 righe nella tabella COMPANIES, ognuna delle quali è associata a 500 righe di USERS ognuna delle quali è associata a due righe nella tabella CARS.

Scriviamo ora il test unitario del findById:

@Test
public void findByIdTest() {
  Company entity = companyDao.findById("COMPANY1");
  Assert.assertNotNull(entity);
}

Ricordiamo che Company è in relazione One to Many con User e il FETCH è quello di default, quindi è LAZY. Eseguiamo il test e vediamo la query eseguita e le metriche:

Hibernate: select company0_.code as code1_1_0_, company0_.name 
  as name2_1_0_ from COMPANIES company0_ where company0_.code=?
  
22:50:44.992 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    347086 nanoseconds spent acquiring 1 JDBC connections;
    387565 nanoseconds spent releasing 1 JDBC connections;
    448483 nanoseconds spent preparing 1 JDBC statements;
    548631 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

È stata fatta una sola select e ci sono voluti 0,000448483 secondi per prepararla e 0,000548631 secondi per eseguirla.
Le metrics per questo esempio sono trascurabili perché sono molto basse, testando su un db MySQL in locale, piuttosto, notiamo il numero di query effettuate da JPA.

Essendo la relazione LAZY, viene effettuata una semplice select sulla tabella COMPANIES, senza recuperare la relazione con USERS.

Se volessimo recuperare tutti gli User della Company con id COMPANY1? Aggiungiamo una stampa della lista di User:

@Test
public void findByIdTest() {
  Company entity = companyDao.findById("COMPANY1");
  System.out.println(entity.getUsers());
  Assert.assertNotNull(entity);
}

Eseguiamo il codice. Avremo questa eccezione:

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: 
com.vincenzoracca.jpaproject.entities.Company.users, could not initialize proxy - no Session

L'errore è parlante: non possiamo recuperare una collezione LAZY se non ci troviamo all'interno di una Sessione.

Creiamo allora un metodo chiamato findByIdWithUsers:

@Override
@Transactional
public Company findByIdWithUsers(String id) {
  Company entity = entityManager.find(Company.class, id);
  Set<User> users = entity.getUsers();
  System.out.println("SIZE USERS: " + users.size());
  return entity;
}

Qui la getUsers è consentita perché viene eseguita dentro la stessa transazioni/sessione del find.

Creiamo la classe di test ed eseguiamola:

@Test
public void findByIdWithUsersTest() {
  Company entity = companyDao.findByIdWithUsers("COMPANY1");
  Assert.assertNotNull(entity);
  System.out.println("STAMPA" + entity.getUsers());
}

Vediamo le query effettuate e le metrics:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_ from COMPANIES company0_ where company0_.code=?
Hibernate: select users0_.company_code as company_5_2_0_, users0_.id as id1_2_0_, users0_.id as id1_2_1_, 
    users0_.code as code2_2_1_, users0_.company_code as company_5_2_1_, users0_.name as name3_2_1_, 
    users0_.surname as surname4_2_1_ from USERS users0_ where users0_.company_code=?

23:48:43.308 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    537225 nanoseconds spent acquiring 1 JDBC connections;
    352008 nanoseconds spent releasing 1 JDBC connections;
    18853810 nanoseconds spent preparing 2 JDBC statements;
    8545463 nanoseconds spent executing 2 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    51666997 nanoseconds spent executing 1 flushes (flushing a total of 501 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Il dato rilevante è che questa volta vengono effettuate due queries. Se invece di chiamare la findById, avessimo chiamato la findAll, JPA avrebbe recuperato le 1000 righe della tabella COMPANIES e per ognuna avrebbe effettuato una select per recuperare le righe relative della tabella USERS. JPA avrebbe quindi effettuato 1001 queries!

Questo problema di JPA è noto come problema delle N+1 queries e problema delle N+1 SELECT.

Modifichiamo il FETCH della relazione tra Company e Users a EAGER

@OneToMany(mappedBy = "company", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
private Set<User> users;

Eseguiamo di nuovo il metodo di test findByIdWithUsers:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_, 
  users1_.company_code as company_5_2_1_, users1_.id as id1_2_1_, users1_.id as id1_2_2_, 
  users1_.code as code2_2_2_, users1_.company_code as company_5_2_2_, users1_.name as name3_2_2_, users1_.surname 
  as surname4_2_2_ from COMPANIES company0_ left outer join USERS users1_ 
  on company0_.code=users1_.company_code where company0_.code=?
  
12:05:21.214 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    455686 nanoseconds spent acquiring 1 JDBC connections;
    333864 nanoseconds spent releasing 1 JDBC connections;
    16132759 nanoseconds spent preparing 1 JDBC statements;
    8139858 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    47884750 nanoseconds spent executing 1 flushes (flushing a total of 501 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Notiamo che ora JPA effettua solo una query, usando una LEFT JOIN sulla tabella USERS. Molto più efficiente, il problema delle N+1 queries non si presenta più!

Ma allora la soluzione ottimale è valorizzare i FETCH di tutte le relazione con EAGER? Ipotizziamo di si, quindi modifichiamo il FETCH a EAGER anche nella relazione tra USERS e CARS:

@OneToMany(mappedBy = "user", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
private Set<Car> cars;

Eseguiamo nuovamente il metodo di test precedente:

Hibernate: select company0_.code as code1_1_0_, company0_.name as name2_1_0_, users1_.company_code as company_5_2_1_, 
  users1_.id as id1_2_1_, users1_.id as id1_2_2_, users1_.code as code2_2_2_, users1_.company_code as company_5_2_2_, 
  users1_.name as name3_2_2_, users1_.surname as surname4_2_2_, cars2_.user_id as user_id4_0_3_, cars2_.id as id1_0_3_, 
  cars2_.id as id1_0_4_, cars2_.name as name2_0_4_, cars2_.plate as plate3_0_4_, cars2_.user_id as user_id4_0_4_ 
  from COMPANIES company0_ left outer join USERS users1_ on company0_.code=users1_.company_code 
  left outer join CARS cars2_ on users1_.id=cars2_.user_id where company0_.code=?
  
12:11:55.005 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    455398 nanoseconds spent acquiring 1 JDBC connections;
    373653 nanoseconds spent releasing 1 JDBC connections;
    24160296 nanoseconds spent preparing 1 JDBC statements;
    21923923 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    90972614 nanoseconds spent executing 1 flushes (flushing a total of 1001 entities and 501 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Viene eseguita solo una query anche questa volta, ma viene fatta una LEFT JOIN in più, su CARS, anche se noi non abbiamo bisogno di recuperare le righe di CARS.
Le prestazioni sono peggiorate ovviamente. E questo avendo solo tre tabelle! Se CARS avesse avuto un'altra relazione EAGER sarebbe stata recuperata automaticamente anche questa e, a cascata, altre ancora. Il FETCH a EAGER quindi non è sempre la soluzione ottimale (anzi, non lo è quasi mai).

Inoltre il FETCH a EAGER funziona bene solo eseguendo metodi dell'EntityManager, come find. Più precisamente, non funziona con le JPQL. Se chiamassimo la findAll di Company con il fetch EAGER, che effettua una JPQL scritta da noi, JPA eseguirebbe N+1 select.

Una possibile soluzione: la JOIN FETCH

L'idea di base con la JOIN FETCH è recuperare la relazione solo quando ne abbiamo davvero bisogno. Nel caso del metodo findByIdWithUsers, vogliamo recuperare solo gli USERS della Company con id specificato.
Riportiamo nuovamente tutti i FETCH a LAZY e modifichiamo il metodo:

@Override
public Company findByIdWithUsers(String id) {
  String sql = "SELECT c FROM Company c LEFT JOIN FETCH c.users";
  Company entity = entityManager.createQuery(sql, Company.class)
                                  .getSingleResult();
  Set<User> users = entity.getUsers();
  System.out.println("SIZE USERS: " + users.size());
  return entity;
}

Eseguiamo nuovamente il test:

Hibernate: select company0_.code as code1_1_0_, users1_.id as id1_2_1_, company0_.name as name2_1_0_, 
    users1_.code as code2_2_1_, users1_.company_code as company_5_2_1_, users1_.name as name3_2_1_, 
    users1_.surname as surname4_2_1_, users1_.company_code as company_5_2_0__, users1_.id as id1_2_0__ 
from COMPANIES company0_ left outer join USERS users1_ on company0_.code=users1_.company_code where company0_.code=?

12:36:55.303 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    572896 nanoseconds spent acquiring 1 JDBC connections;
    449356 nanoseconds spent releasing 1 JDBC connections;
    12735636 nanoseconds spent preparing 1 JDBC statements;
    8147869 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Viene eseguita una sola query e una sola JOIN, proprio quello che vogliamo!

Una soluzione più elegante: l'EntityGraph

La soluzione generale da applicare quindi è: «recuperare gli oggetti della relazione solo quando li chiedo esplicitamente.»

Questa soluzione può essere applicata, come abbiamo visto poc'anzi, con la JOIN FETCH.
La cosa scomoda è però scrivere un metodo per ogni possibile JOIN FETCH diversa.
Per evitare ciò, possiamo usare un'altra funzionalità di JPA, chiamata EntityGraph:

@Entity
@Table(name = "COMPANIES")
@NamedEntityGraphs({
    @NamedEntityGraph(name = Company.CompanyFetch.ALL, includeAllAttributes = true),
    @NamedEntityGraph(name = Company.CompanyFetch.USERS, attributeNodes = { @NamedAttributeNode("users")})
})
public class Company implements JpaEntity

...

public interface CompanyFetch {
  String ALL= "Company.all";
  String USERS = "Company.users";
}

Analizziamo il codice:

L'annotation @NamedEntityGraphs prende come parametro un array di @NamedEntityGraph.
Col primo @NamedEntityGraph, chiamato Company.all, stiamo dicendo a JPA che vogliamo recuperare tutte le relazioni

della tabella COMPANIES, grazie all'attributo
includeAllAttributes = true. 3. Col secondo @NamedEntityGraph, chiamato Company.users, stiamo dicendo a JPA che vogliamo recuperare la relazione con USERS, grazie all'attributo
attributeNodes = { @NamedAttributeNode("users")}.

Ovviamente, essendo che Company ha solo una relazione, in questo caso le due @NamedEntityGraph restituiranno lo stesso risultato.

Aggiungiamo un metodo generale nella classe JpaDaoImpl:

@Override
public T findById(ID id, String entityGraph) {
    T entity = entityManager.createQuery(FIND_ALL + " where " + idField() + " = :id", persistentClass)
            .setParameter("id", id)
            .setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph(entityGraph))
            .getSingleResult();
    return entity;
    
    ...

private String idField() {
    try {
        String id = Arrays.stream(persistentClass.getDeclaredFields())
        .filter(field -> field.isAnnotationPresent(Id.class))
        .findAny()
        .map(field -> field.getName())
        .orElse(null);
    
        return id;
    }
    catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Qui l'unica parte rilevante è
.setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph(entityGraph)),
che ci permette di eseguire la query con una entityGraph.

Creiamo una classe di test per provare l'uso dell'EntityGraph:

@Test
public void findByIdWithUsersGraphTest() {
    Company entity = companyDao.findById("COMPANY1", Company.CompanyFetch.USERS);
    Assert.assertNotNull(entity);
    System.out.println("STAMPA" + entity.getUsers());
}

Eseguiamo il metodo e vediamo le queries effettuate e le statistics:

Hibernate: select company0_.code as code1_1_0_, users1_.id as id1_2_1_, 
    company0_.name as name2_1_0_, users1_.code as code2_2_1_, users1_.company_code as company_5_2_1_, 
    users1_.name as name3_2_1_, users1_.surname as surname4_2_1_, users1_.company_code as company_5_2_0__, 
    users1_.id as id1_2_0__ from COMPANIES company0_ left outer join USERS users1_ 
    on company0_.code=users1_.company_code where company0_.code=?

13:31:22.260 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    418745 nanoseconds spent acquiring 1 JDBC connections;
    458270 nanoseconds spent releasing 1 JDBC connections;
    13035992 nanoseconds spent preparing 1 JDBC statements;
    7172620 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Come ci aspettavamo, viene eseguita una sola query, con una sola JOIN.
Inoltre, utilizzando le entityGraph, possiamo scrivere un metodo generale a cui poi passargli il nome dell'EntitiGraph che vogliamo. La JPQL nel metodo createQuery infatti, sarà sempre la medesima, sarà poi compito di JPA gestire le eventuali join a seconda dell'EntityGraph.

Altre possibili soluzioni (Hibernate)

Esistono anche altre possibili soluzioni.

Ad esempio, se stiamo utilizzando Hibernate, possiamo attuare queste strategie.

Una è utilizzando l'annotation @BatchSize. Vediamo un esempio:

@OneToMany(mappedBy = "company", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
@BatchSize(size = 50)
private Set<User> users;

Ricordiamo che, senza l'annotation @BatchSize, JPA effettuerebbe N+1 SELECT, dove N è il numero di righe della tabella COMPANIES. Eseguiamo il test della findAll:

Hibernate: select company0_.code as code1_1_, company0_.name as name2_1_ from COMPANIES company0_
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select users0_.company_code as company_5_2_1_, users0_.id as id1_2_1_, users0_.id as id1_2_0_, users0_.code as code2_2_0_, users0_.company_code as company_5_2_0_, users0_.name as name3_2_0_, users0_.surname as surname4_2_0_ from USERS users0_ where users0_.company_code in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
12:49:30.977 [main] INFO  o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
    667681 nanoseconds spent acquiring 1 JDBC connections;
    574034 nanoseconds spent releasing 1 JDBC connections;
    19252743 nanoseconds spent preparing 21 JDBC statements;
    2874138319 nanoseconds spent executing 21 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Come descritto dai log, JPA ha effettuato 21 SELECT.
Dato N il numero di righe di COMPANIES, ed M il size dell'annotation @BatchSize, il numero di queries che effettua JPA è N:M+1 (1000:50+1=21)

Un'altra possibile ottimizzazione, per non avere troppe JOIN a cascata, è utilizzare la property
hibernate.max_fetch_depth = n
dove n è la massima profondità con cui JPA effettua le JOIN.

Personalmente, ritengo comunque migliori le soluzioni JOIN FETCH ed EntityGraph.

Conclusioni

In questo articolo abbiamo visto i tipi di Fetch di JPA e delle possibili soluzione al problema delle N+1 SELECT.
Ovviamente, a seconda del progetto, può essere più funzionale una soluzione piuttosto che un'altra.

Potete trovare il progetto completo sul mio github a questo link: JPA Project

Articoli su JPA: JPA