Text Recognition App with Google ML Kit - Android

In this post I will show how to build a simple yet effective text recognition app on Android using Google's ML Kit.

To get started you need an empty project. You can use Kotlin or Java, but I will be using Kotlin for this demo.

You can find a demo app here

Dependencies

  1. In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections:
    google()
    
  2. Add the dependencies for the ML Kit Android libraries to your module's app-level gradle file, which is usually app/build.gradle:
dependencies {

  implementation 'com.google.android.gms:play-services-mlkit-text-recognition:17.0.0'
}

This step is optional but recommended by Google, since the model is then ready on the device before the first recognition request.

You can configure your app to automatically download the ML model to the device after your app is installed from the Play Store. To do so, add the following declaration to your app's AndroidManifest.xml file:

<application ...>
  <meta-data
      android:name="com.google.mlkit.vision.DEPENDENCIES"
      android:value="ocr" />
  <!-- To use multiple models: android:value="ocr,model2,model3" -->
</application>

Your dependencies are all set.

Classes and functions used

First, let us get a better understanding of the classes and functions that we will be using for this demo. Most of them are Google classes from the com.google.android.gms.vision package.

  1. TextRecognizer: This class does the actual work of detecting text in a Frame, which is built from a bitmap (photo). We use its detect() and Builder() functions in this tutorial.

  2. Frame: A frame is constructed via its Builder class; it takes a bitmap and holds the image data, dimensions, and sequencing information. The frame is then passed to TextRecognizer via the detect() method.

  3. StringBuilder: A standard Kotlin/Java class that we use to collect the recognized text. TextRecognizer returns a SparseArray of TextBlocks; we iterate over it and add each block's text using the append() method.

  4. TextBlock: A block of text that the OCR engine detects as a paragraph or other continuous run of text.

Code

Now that we know the classes and functions involved, let's get our hands dirty.

Here we will not be dealing with how to get a bitmap from either camera or gallery.

We will assume that we have a bitmap.

This line of code will actually initialize the recognizer

val recognizer = TextRecognizer.Builder(this).build()

To avoid errors, you can check whether the recognizer is operational using the recognizer.isOperational property.

This line of code initializes our frame with the bitmap
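As a minimal sketch of that check, the helper below builds the recognizer and returns null when it is not ready. The function name buildRecognizerOrNull and the log tag are hypothetical, and the imports assume the classes live in com.google.android.gms.vision, which matches the TextRecognizer.Builder call used in this article.

```kotlin
import android.content.Context
import android.util.Log
import com.google.android.gms.vision.text.TextRecognizer

// Hypothetical helper (not part of the library): returns a ready-to-use
// recognizer, or null when the recognizer is not operational yet.
fun buildRecognizerOrNull(context: Context): TextRecognizer? {
    val recognizer = TextRecognizer.Builder(context).build()
    if (!recognizer.isOperational) {
        Log.w("TextRecognition", "Recognizer is not operational yet")
        recognizer.release() // free the detector's resources before giving up
        return null
    }
    return recognizer
}
```

Returning null lets the caller decide what to do, for example showing a message and retrying later, instead of running detection on a recognizer that cannot produce results.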

val frame = Frame.Builder().setBitmap(bitmap).build()

Next we will run the detection algorithm

val stringish = recognizer.detect(frame)

Here stringish is of the type SparseArray<TextBlock!>

Now, to actually turn this into a string, we need a couple of steps

val stringBuilder = StringBuilder()
// Visit every index from 0 to size() - 1 so that the first block is not skipped
for (i in 0 until stringish.size()) {
    val textBlock = stringish.valueAt(i)
    stringBuilder.append(textBlock.value)
    stringBuilder.append("\n")
}

Here we use StringBuilder to append the text of each TextBlock in the SparseArray<TextBlock!>, and the \n puts each block on its own line.

Now, to convert stringBuilder to a String, simply call:
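To look at the append loop in isolation, here is a small sketch that runs without the Android SDK. The helper name joinBlocks and its size and valueAt parameters are hypothetical stand-ins for stringish.size() and stringish.valueAt(i).value; they are not part of any Google library.

```kotlin
// Hypothetical helper: `size` and `valueAt` stand in for SparseArray.size()
// and SparseArray.valueAt(i).value so the loop can run off-device.
fun joinBlocks(size: Int, valueAt: (Int) -> String): String {
    val builder = StringBuilder()
    // Indices run from 0 to size - 1, so every block is visited exactly once
    for (i in 0 until size) {
        builder.append(valueAt(i))
        builder.append("\n") // line break after each block
    }
    return builder.toString()
}
```

For two blocks "Hello" and "World", calling joinBlocks(2) { listOf("Hello", "World")[it] } gives "Hello\nWorld\n": each block's text is followed by a line break, starting with the block at index 0.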

stringBuilder.toString()
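Putting the steps together, the whole flow might be wrapped in one function, roughly like this. It is only a sketch: recognizeText is a hypothetical name, the recognizer is assumed to be built already, and the imports assume the com.google.android.gms.vision package that the calls in this article match.

```kotlin
import android.graphics.Bitmap
import com.google.android.gms.vision.Frame
import com.google.android.gms.vision.text.TextRecognizer

// Hypothetical helper (not part of the library): runs the steps from this
// article on one bitmap and returns the recognized text.
fun recognizeText(recognizer: TextRecognizer, bitmap: Bitmap): String {
    // Wrap the bitmap in a Frame, which is what detect() expects
    val frame = Frame.Builder().setBitmap(bitmap).build()
    // detect() returns a SparseArray<TextBlock>
    val blocks = recognizer.detect(frame)
    // Append each block's text followed by a line break
    return buildString {
        for (i in 0 until blocks.size()) {
            append(blocks.valueAt(i).value)
            append("\n")
        }
    }
}
```

You could then call val text = recognizeText(recognizer, bitmap) wherever you have a bitmap, and call recognizer.release() once you no longer need the recognizer.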

And boom, that's your recognized text.

This is a great project for your resume and to show off to your friends and family. Do build it, and show me some love by reacting to the article.

Thank you, and have a wonderful day.