Text Recognition App with Google ML Kit - Android
In this post I will show how to build a simple yet effective text recognition app on Android using Google's ML Kit.
To get started you need an empty project. You can use Kotlin or Java; I will be using Kotlin for this demo.
You can find a demo app here
Dependencies
- In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections.
google()
mavenCentral()
- Add the dependencies for the ML Kit Android libraries to your module's app-level gradle file, which is usually app/build.gradle:
dependencies {
implementation 'com.google.android.gms:play-services-mlkit-text-recognition:17.0.0'
}
The next step is optional but recommended by Google: it makes the model available before the recognizer first runs, instead of downloading it on first use.
You can configure your app to automatically download the ML model to the device after your app is installed from the Play Store. To do so, add the following declaration to your app's AndroidManifest.xml file:
<application ...>
<meta-data
android:name="com.google.mlkit.vision.DEPENDENCIES"
android:value="ocr" />
<!-- To use multiple models: android:value="ocr,model2,model3" -->
</application>
Your dependencies are all set.
Classes and functions used
First, let us get a better understanding of the classes and functions that we will be using for this demo. Most of them come from Google's vision libraries (the com.google.android.gms.vision package).
- TextRecognizer: This class does the actual work of detecting text in a Frame, which is built from a bitmap (photo). We use its Builder() and detect() functions in this tutorial.
- Frame: A frame is constructed via its Builder class. It takes in a bitmap and specifies the image data, dimensions, and sequencing information. The frame is then passed to TextRecognizer via the detect() method.
- StringBuilder: A standard Java/Kotlin class. detect() returns a SparseArray of TextBlock objects; we iterate over it and collect the text of each block using the append() method.
- TextBlock: A block of text that the OCR engine detected as a paragraph or other continuous run of text.
Code
Now that we know the classes and functions involved, let's get our hands dirty.
We will not cover how to get a bitmap from the camera or the gallery here; we will assume that we already have one.
This line of code initializes the recognizer (this is a Context, for example the Activity):
val recognizer = TextRecognizer.Builder(this).build()
To avoid errors, you can check whether the recognizer is actually operational using the
recognizer.isOperational
property. It is false while the dependencies that the detector needs are not yet available on the device.
This line of code initializes our frame with the bitmap:
val frame = Frame.Builder().setBitmap(bitmap).build()
Next we will run the detection algorithm:
val stringish = recognizer.detect(frame)
Here stringish is of type SparseArray<TextBlock!>.
Now, to actually turn this into a String we need a couple of steps:
val stringBuilder = StringBuilder()
// Visit every index from 0 to size - 1 so that no block is skipped
for (i in 0 until stringish.size()) {
    val textBlock = stringish.valueAt(i)
    stringBuilder.append(textBlock.value)
    stringBuilder.append("\n")
}
Here we use StringBuilder to append the individual TextBlocks from the SparseArray<TextBlock!>, and the \n puts each block on its own line.
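To look at the indexing on its own, here is a minimal sketch of the joining step in plain Kotlin. joinBlocks is a hypothetical helper, not part of any library; its count and valueAt parameters are stand-ins for the size() and valueAt() calls on the SparseArray, so the logic can be tried without an Android device.

```kotlin
// Hypothetical helper: joins `count` text blocks, one per line.
// `valueAt` stands in for SparseArray.valueAt(i) followed by reading the block's text.
fun joinBlocks(count: Int, valueAt: (Int) -> String): String {
    val builder = StringBuilder()
    // 0 until count covers indices 0, 1, ..., count - 1, so the first block is included
    for (i in 0 until count) {
        builder.append(valueAt(i))
        builder.append("\n")
    }
    return builder.toString()
}
```

For example, with val words = listOf("Hello", "World"), the call joinBlocks(words.size) { words[it] } returns the two words on separate lines, each followed by a newline.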
Now, to convert stringBuilder to a String, simply call:
stringBuilder.toString()
And boom, that's your recognized text.
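Putting the steps together, the whole flow might look roughly like the sketch below. The function name recognizeText is hypothetical, and the sketch assumes that the com.google.android.gms.vision classes are available on your classpath and that you already have a Context and a Bitmap.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import com.google.android.gms.vision.Frame
import com.google.android.gms.vision.text.TextRecognizer

// Hypothetical helper (not part of any library): runs the steps from this post on one bitmap.
// Returns null when the recognizer is not operational yet.
fun recognizeText(context: Context, bitmap: Bitmap): String? {
    val recognizer = TextRecognizer.Builder(context).build()
    try {
        if (!recognizer.isOperational) return null

        // Wrap the bitmap in a Frame and run detection
        val frame = Frame.Builder().setBitmap(bitmap).build()
        val blocks = recognizer.detect(frame) // SparseArray<TextBlock>

        // Collect the text of every block, one block per line
        val text = StringBuilder()
        for (i in 0 until blocks.size()) {
            text.append(blocks.valueAt(i).value).append("\n")
        }
        return text.toString()
    } finally {
        // Release the detector once we are done with it
        recognizer.release()
    }
}
```

Since detect() runs synchronously, it is probably a good idea to call a helper like this from a background thread rather than the main thread.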
This is a great project for your resume and to show off to your friends and family. Do build it, and show me some love by reacting to the article.
Thank you, and have a wonderful day.