Vietnamese OCR in C# and .NET

IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Vietnamese.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Vietnamese

This package contains the following Vietnamese OCR language variants for .NET:

  • Vietnamese
  • VietnameseBest
  • VietnameseFast
  • VietnameseAlphabet
  • VietnameseAlphabetBest
  • VietnameseAlphabetFast

Download

Vietnamese Language Pack [Tiếng Việt]

Installation

The first step is to install the Vietnamese OCR package in your .NET project.

PM> Install-Package IronOCR.Languages.Vietnamese

Code Example

This C# code example reads Vietnamese text from an image or PDF document.

// You need to install the IronOCR.Languages.Vietnamese package using the following NuGet command before running this code:
// PM> Install-Package IronOCR.Languages.Vietnamese

using System;
using IronOcr;

var Ocr = new IronTesseract();

// Set the OCR language to Vietnamese
Ocr.Language = OcrLanguage.Vietnamese;

using (var Input = new OcrInput(@"images\Vietnamese.png"))
{
    // Perform OCR on the input image
    var Result = Ocr.Read(Input);

    // Extract all recognized text
    var AllText = Result.Text;

    // Example: Output the extracted text to the console
    Console.WriteLine(AllText);
}

In this code sample:

  • We create an instance of IronTesseract.
  • We set the language to Vietnamese with Ocr.Language = OcrLanguage.Vietnamese.
  • We create an OcrInput object from the path to the image or PDF.
  • We call the Read method to perform OCR; the recognized text is available through Result.Text.
  • We store the extracted text in AllText, which can then be used as needed, for example displayed on the console or saved to a file.
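
As a rough illustration of the last step, the sketch below displays Vietnamese text on the console and saves it to a UTF-8 file using only standard .NET APIs. It does not call IronOCR: the allText string is a placeholder standing in for the AllText value from the example above, and the output file name is an arbitrary choice made for this sketch.

```csharp
using System;
using System.IO;
using System.Text;

// Placeholder standing in for Result.Text from the OCR example (assumption).
string allText = "Tiếng Việt";

// Ask the console to emit UTF-8 so Vietnamese diacritics are not replaced by "?".
Console.OutputEncoding = Encoding.UTF8;
Console.WriteLine(allText);

// Hypothetical output location; any writable path works.
string outputPath = Path.Combine(Path.GetTempPath(), "vietnamese-ocr.txt");

// Encoding.UTF8 writes a byte order mark, which helps some editors detect the encoding.
File.WriteAllText(outputPath, allText, Encoding.UTF8);
```

Passing the encoding explicitly on both the console and the file keeps accented characters such as ế and ệ intact. Whether the console can actually render them still depends on the terminal font.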