Vietnamese OCR in C# and .NET
IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Vietnamese.
It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Vietnamese
This package contains the following Vietnamese OCR language variants for .NET:
- Vietnamese
- VietnameseBest
- VietnameseFast
- VietnameseAlphabet
- VietnameseAlphabetBest
- VietnameseAlphabetFast
Download
Vietnamese Language Pack [Tiếng Việt]
Installation
The first step is to install the Vietnamese OCR package into your .NET project.
PM> Install-Package IronOCR.Languages.Vietnamese
Code Example
This C# code example reads Vietnamese text from an image or PDF document.
// You need to install the IronOCR.Languages.Vietnamese package using the following NuGet command before running this code:
// PM> Install-Package IronOCR.Languages.Vietnamese
using System;
using IronOcr;
var Ocr = new IronTesseract();
// Set the OCR language to Vietnamese
Ocr.Language = OcrLanguage.Vietnamese;
using (var Input = new OcrInput(@"images\Vietnamese.png"))
{
    // Perform OCR on the input image
    var Result = Ocr.Read(Input);
    // Extract all recognized text
    var AllText = Result.Text;
    // Example: Output the extracted text to the console
    Console.WriteLine(AllText);
}
In this code sample:
- We create an instance of IronTesseract.
- Set the language to Vietnamese using Ocr.Language = OcrLanguage.Vietnamese.
- Create an OcrInput object with the path to the image or PDF.
- Call the Read method to perform OCR and obtain the extracted text.
- The extracted text is stored in AllText, which can be used as needed, such as displaying it or saving it to a file.
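Saving the recognized text to a file, as mentioned in the last step, might look like the following minimal sketch. It reuses the IronTesseract, OcrInput, and Read calls from the example above. The SaveAsUtf8 helper and the output file name vietnamese-output.txt are assumptions made for illustration and are not part of IronOCR.

```csharp
using System.IO;
using System.Text;
using IronOcr;

var ocr = new IronTesseract();
// Same language setting as in the main example
ocr.Language = OcrLanguage.Vietnamese;

using (var input = new OcrInput(@"images\Vietnamese.png"))
{
    var result = ocr.Read(input);
    // Hypothetical output path, chosen only for this sketch
    SaveAsUtf8(result.Text, "vietnamese-output.txt");
}

// Hypothetical helper (not an IronOCR API): writes text to disk as UTF-8
// without a byte order mark, so Vietnamese diacritics are preserved.
static void SaveAsUtf8(string text, string path)
{
    File.WriteAllText(path, text, new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));
}
```

File.WriteAllText already defaults to UTF-8; the encoding is spelled out here only to make the intent explicit, since Vietnamese text relies heavily on accented characters.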