Icelandic OCR in C# and .NET

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Icelandic. It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Icelandic

This package contains three OCR language variants for .NET:

  • Icelandic
  • IcelandicBest
  • IcelandicFast

Download

Icelandic Language Pack [Íslenska]

Installation

The first step is to install the Icelandic OCR package in your .NET project.

PM> Install-Package IronOCR.Languages.Icelandic

Code Example

This C# code example reads Icelandic text from an image or PDF document.

using IronOcr;

class Program
{
    static void Main()
    {
        // Create an instance of the IronTesseract OCR engine
        var Ocr = new IronTesseract();

        // Set the language to Icelandic
        Ocr.Language = OcrLanguage.Icelandic;

        // Load the image or PDF file to be processed
        using (var Input = new OcrInput(@"images\Icelandic.png"))
        {
            // Perform OCR on the input file
            var Result = Ocr.Read(Input);

            // Extract all recognized text from the result
            var AllText = Result.Text;

            // Print the extracted text to the console
            System.Console.WriteLine(AllText);
        }
    }
}

Explanation

  • The IronTesseract class is part of the IronOcr library, designed for performing OCR operations.
  • Ocr.Language = OcrLanguage.Icelandic; sets the OCR language to Icelandic.
  • OcrInput takes the path to the input file (an image or a PDF) and prepares it for processing.
  • Ocr.Read(Input) processes the input file and returns the OCR result.
  • Result.Text retrieves all recognized text from the processed input.

Ensure you have the IronOCR library and its Icelandic language package installed in your .NET project to run this example successfully.
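
The package contents also list IcelandicBest and IcelandicFast. The following is a minimal sketch of selecting one of those variants instead of the default. It assumes the variants are exposed as OcrLanguage members with the same names, and that, as the names suggest, IcelandicBest favors accuracy while IcelandicFast favors speed. The class name VariantExample is hypothetical; the file path is the one used in the example above.

```csharp
using System;
using IronOcr;

class VariantExample
{
    static void Main()
    {
        // Create the OCR engine as in the main example
        var ocr = new IronTesseract();

        // Assumption: the variant is exposed as OcrLanguage.IcelandicBest.
        // Swap in OcrLanguage.IcelandicFast to trade accuracy for speed.
        ocr.Language = OcrLanguage.IcelandicBest;

        // Load the same sample image used in the main example
        using (var input = new OcrInput(@"images\Icelandic.png"))
        {
            // Run OCR and print the recognized text
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```

Which variant suits a project depends on the documents being scanned, so it is worth comparing the output of each on a representative sample before settling on one.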