I wrote an app that scrapes internet radio playlists and saves them to a database. To learn Hibernate I migrated the app to it, but a SELECT ... WHERE lookup is now much slower than in my earlier versions. The same procedure (fetching around 17,000 tracks, grouped by which programme they were played on and who played them) took 150ms in my Python SQLite prototype and about 250ms in the initial Java version using Apache DbUtils, but takes about 1100ms in my (probably horrific) Hibernate version.
    @Override
    public DJAllProgrammes getAllProgrammesFromDJ(Collection<String> names) {
        DJAllProgrammes djAllProgrammes = new DJAllProgrammes();
        session.beginTransaction();
        List<Presenter> result = session
                .createQuery("from Presenter p WHERE p.presenter_name in :names", Presenter.class)
                .setParameterList("names", names)
                .getResultList();
        for (Presenter presenter : result) {
            int presenter_id = presenter.getPresenter_id();
            List<Programme> programmes = session
                    .createQuery("from programme prog WHERE prog.presenter_origin_id = :pres_orig_id", Programme.class)
                    .setParameter("pres_orig_id", presenter_id)
                    .getResultList();
            for (Programme programme : programmes) {
                //this is the critical performance death zone
                List<Track> tracksOnThisProgramme = session
                        .createQuery("FROM track t WHERE t.programme.programme_id in :progIds", Track.class)
                        .setParameter("progIds", programme.getProgramme_id())
                        .getResultList();
                djAllProgrammes.addProgramme(new ProgrammeData(presenter.getPresenter_name(), programme.getDate(), tracksOnThisProgramme));
            }
        }
        session.getTransaction().commit();
        return djAllProgrammes;
    }
Debug info:
    INFO: Session Metrics
    {
        33339 nanoseconds spent acquiring 1 JDBC connections;
        71991 nanoseconds spent releasing 1 JDBC connections;
        12938819 nanoseconds spent preparing 258 JDBC statements;
        88949720 nanoseconds spent executing 258 JDBC statements;
        0 nanoseconds spent executing 0 JDBC batches;
        0 nanoseconds spent performing 0 L2C puts;
        0 nanoseconds spent performing 0 L2C hits;
        0 nanoseconds spent performing 0 L2C misses;
        4671332 nanoseconds spent executing 1 flushes (flushing a total of 9130 entities and 0 collections);
        599862735 nanoseconds spent executing 258 partial-flushes (flushing a total of 1079473 entities and 1079473 collections)
    }
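To make those figures easier to compare, here is a small sketch that converts the nanosecond counts above into milliseconds and averages the partial-flush workload. The class and method names (`SessionMetricsSummary`, `toMillis`, `entitiesPerFlush`) are made up for illustration; only the numbers come from the log.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: the names are illustrative, the numbers are copied from the log above.
class SessionMetricsSummary {

    // Convert a nanosecond count to milliseconds.
    static double toMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    // Average number of entities processed per flush (integer division).
    static long entitiesPerFlush(long totalEntities, int flushCount) {
        return totalEntities / flushCount;
    }

    public static void main(String[] args) {
        Map<String, Long> nanosByPhase = new LinkedHashMap<>();
        nanosByPhase.put("preparing statements", 12_938_819L);
        nanosByPhase.put("executing statements", 88_949_720L);
        nanosByPhase.put("final flush", 4_671_332L);
        nanosByPhase.put("partial flushes", 599_862_735L);

        for (Map.Entry<String, Long> phase : nanosByPhase.entrySet()) {
            System.out.printf("%-22s %8.1f ms%n", phase.getKey(), toMillis(phase.getValue()));
        }
        System.out.printf("entities per partial flush: %d%n",
                entitiesPerFlush(1_079_473L, 258));
    }
}
```

Read this way, the 258 partial flushes account for roughly 600ms, against roughly 89ms spent actually executing the 258 statements, and each partial flush covers a little over 4,000 entities on average.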
Looking around the internet, I saw a suggestion for cases with WAY too many entities in the transaction: "use pagination and smaller batch increments". I can find information about what pagination is, but not much about what "using smaller batch increments" means.
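One possible reading of that advice (an assumption, not something the suggestion spelled out) is to split the IDs being looked up into fixed-size chunks and handle one chunk at a time, so that fewer entities are held in the session at once. The sketch below shows only the chunking step in plain Java. `BatchChunker`, `chunk`, the batch size of 4 and the sample IDs are all hypothetical, and the Hibernate calls appear only in comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating "smaller batch increments" as fixed-size chunks.
class BatchChunker {

    // Split a list into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> chunk(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Made-up programme IDs, standing in for the ones loaded per presenter.
        List<Integer> programmeIds = new ArrayList<>();
        for (int id = 1; id <= 10; id++) {
            programmeIds.add(id);
        }

        for (List<Integer> batch : chunk(programmeIds, 4)) {
            // Assumption: with Hibernate, each batch could become one query using
            // setParameterList("progIds", batch), and session.clear() could be
            // called between batches to detach the entities already processed.
            System.out.println("would query tracks for programme ids " + batch);
        }
    }
}
```

Whether this is what the original suggestion meant, and whether it would help with the partial-flush time shown in the metrics, is exactly what I'm unsure about.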
I'm kind of in a bind: this app had fine performance doing basically the same thing with Apache DbUtils (a lightweight JDBC wrapper), and I'm so ignorant I don't even really know what to search for to speed this up. Help a brother out?
The beans (persistence entities?) used here are at https://pastebin.com/pSQ3iGK2