Image Recognition Using TensorFlow in Java

At UnderstandLing, we already do quite a lot of natural language processing. One of our main sources of textual data, though by far not the only one, is social media. While social media started out as mostly text, we now see a shift towards more visual content such as images and videos. It therefore made sense for us to start analyzing the content of these visual posts alongside the text.

Image Recognition Using Deep Learning

Many of the current state-of-the-art image analysis algorithms make use of deep learning. There are already quite a lot of pre-trained models out there that we can use as a starting point and finetune on our own data. The benefit of doing it this way is that we save a lot of time: the model has already learned generic features that are useful on any image, regardless of the task at hand, so we only need to teach it the part that is specific to our task.
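To make the idea concrete, here is a minimal sketch in plain Java of the simplest form of finetuning: the pre-trained network stays frozen and only a small new classifier is trained on the feature vectors it produces. The class name ClassifierHead and its methods are hypothetical and not taken from any framework, and the feature vectors are assumed to come from a pre-trained model. This illustrates the principle; it is not the code we run.

```java
// Hypothetical sketch: a logistic-regression "head" trained on fixed feature
// vectors. The vectors are assumed to be produced by a frozen pre-trained
// network; that network itself is not part of this example.
final class ClassifierHead {
    private final double[] weights;
    private double bias;

    ClassifierHead(int featureCount) {
        this.weights = new double[featureCount];
    }

    /** Returns the probability that the feature vector belongs to class 1. */
    double predict(double[] features) {
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-z)); // sigmoid
    }

    /** Stochastic gradient descent on the log loss; only the head is updated. */
    void train(double[][] features, int[] labels, int epochs, double learningRate) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < features.length; n++) {
                double error = labels[n] - predict(features[n]);
                for (int i = 0; i < weights.length; i++) {
                    weights[i] += learningRate * error * features[n][i];
                }
                bias += learningRate * error;
            }
        }
    }
}
```

Because only the head's weights and bias are updated, training needs far less data and compute than learning the generic image features from scratch.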

Besides the many pre-trained models, there are also a lot of frameworks out there for deep learning, and for image recognition in particular. Since we are a fully JVM (and in fact Scala) oriented company, we would prefer to do this the JVM way. Unfortunately, most deep learning frameworks are very much Python-oriented, with some exceptions like DL4J, which we have used before.

We started off by importing pre-trained Keras models into DL4J. This turned out to be fairly easy and worked quite well (see for example here and here). After running this for a couple of our clients, however, we found that memory consumption kept rising on every run and eventually led to out-of-memory errors. This later turned out to be a bug in DL4J.

We abandoned DL4J and image recognition for a while, until we found out that TensorFlow nowadays has quite fancy Java support as well. We immediately started hacking away, and after a few days of hard work we had our own image recognition code and models using TensorFlow and a pre-trained Inception model.
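Whichever framework runs the model, a classifier such as Inception ends with one score per class, and those scores still have to be mapped back to human-readable labels. The sketch below shows that last step in plain Java. TopLabels and topK are hypothetical names used for illustration only; they are not part of TensorFlow, and the scores and labels arrays are assumed to come from the model output and its label file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TensorFlow: ranks the class scores
// produced by any image classifier and returns the best labels.
final class TopLabels {
    private TopLabels() {}

    /** Returns up to k labels with the highest scores, best first. */
    static List<String> topK(float[] scores, String[] labels, int k) {
        if (scores.length != labels.length) {
            throw new IllegalArgumentException("scores and labels must have the same length");
        }
        // Sort class indices by descending score.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            indices.add(i);
        }
        indices.sort((a, b) -> Float.compare(scores[b], scores[a]));

        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, indices.size()); i++) {
            result.add(labels[indices.get(i)]);
        }
        return result;
    }
}
```

Returning several labels instead of only the best one is a design choice: for social media images the second or third guess is often still useful for tagging.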

The current status is that we are test-driving our finetuned models in production systems, analyzing images from different sources in real time. We will probably come back with an update post on this later!
