Optical character recognition (OCR)


OCR (optical character recognition) is the recognition of printed or handwritten text characters by a computer. The basic process of OCR involves examining the text of a document and translating the characters into character codes a computer program can understand.

OCR systems are used to convert physical documents into machine-readable text. OCR software can also use artificial intelligence (AI) to implement more advanced methods of intelligent character recognition (ICR), such as identifying languages or styles of handwriting.
OCR is most commonly used to turn hard-copy legal or historic documents into PDFs. Once digitized, the text can be searched and edited as if it were created with a word processor. This is why OCR is sometimes also referred to as text recognition.

How optical character recognition works

The first step of OCR is to scan the physical document. OCR programs typically target one character, word or block of text at a time. When a character is identified, it is converted into a character code, such as ASCII.
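As a small illustration of that conversion step, the Python sketch below maps already-identified characters to their numeric codes. The function names `to_character_codes` and `is_plain_ascii` are made up for this example and do not belong to any OCR library; the recognition itself is assumed to have happened already.

```python
def to_character_codes(text: str) -> list[int]:
    """Return the numeric character code of each recognized character."""
    return [ord(ch) for ch in text]


def is_plain_ascii(text: str) -> bool:
    """True if every character fits in 7-bit ASCII (codes 0 to 127)."""
    return all(code < 128 for code in to_character_codes(text))


recognized = "OCR"  # assumption: these characters were just identified
print(to_character_codes(recognized))  # [79, 67, 82]
print(is_plain_ascii(recognized))      # True
print(is_plain_ascii("café"))          # False: 'é' is outside ASCII
```

Text containing accented or non-Latin characters falls outside ASCII, which is one reason a program might store recognized text in a wider character set.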
Characters are typically identified using one of two algorithms:
  • Pattern recognition - OCR programs are fed examples of text in various fonts and formats, which they then compare against the characters in the scanned document to recognize them.
  • Feature detection - OCR programs apply rules regarding the features of a specific letter or number to recognize characters in the scanned document. Features could include the number of angled lines, crossed lines or curves in a character for comparison. For example, the capital letter "A" may be stored as two diagonal lines that meet with a horizontal line across the middle.
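The two approaches above can be sketched in a few lines of Python. This is a toy illustration, not how any particular OCR product works: the 5x5 bitmaps, the stroke-count rules, and the function names are all invented for the example, and the feature-detection half assumes the strokes of a character have already been counted by some earlier step.

```python
# Toy 5x5 glyph templates ('#' = ink, '.' = background).
# These bitmaps are invented for illustration, not taken from a real font.
TEMPLATES = {
    "A": [".###.", "#...#", "#####", "#...#", "#...#"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}

# Hypothetical feature rules: how many strokes of each kind make up a letter.
FEATURE_RULES = {
    "A": {"diagonal": 2, "horizontal": 1, "vertical": 0, "curve": 0},
    "H": {"diagonal": 0, "horizontal": 1, "vertical": 2, "curve": 0},
    "V": {"diagonal": 2, "horizontal": 0, "vertical": 0, "curve": 0},
    "O": {"diagonal": 0, "horizontal": 0, "vertical": 0, "curve": 1},
}


def match_score(glyph, template):
    """Fraction of pixels on which the glyph and the template agree."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            same += g == t
    return same / total


def recognize_by_pattern(glyph):
    """Pattern recognition: return (character, score) of the closest template."""
    scored = ((ch, match_score(glyph, tpl)) for ch, tpl in TEMPLATES.items())
    return max(scored, key=lambda pair: pair[1])


def recognize_by_features(features):
    """Feature detection: return the character whose rule matches, else None."""
    for ch, rule in FEATURE_RULES.items():
        if rule == features:
            return ch
    return None


# A scanned "A" with one pixel lost to noise still matches the "A" template best.
noisy_a = [".###.", "#...#", "####.", "#...#", "#...#"]
print(recognize_by_pattern(noisy_a))  # ('A', 0.96)

# Two diagonal lines plus one horizontal line are recognized as "A".
strokes = {"diagonal": 2, "horizontal": 1, "vertical": 0, "curve": 0}
print(recognize_by_features(strokes))  # A
```

The pattern-based function tolerates small differences because it picks the closest template, while the rule-based function here requires an exact match on stroke counts and returns `None` otherwise.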

Optical character recognition use cases

OCR can be used for a variety of applications, including:
  • Indexing print material for search engines.
  • Converting handwritten documents into text that can be read aloud to visually impaired or blind users.
  • Archiving historic information, such as newspapers, magazines or phonebooks, in searchable formats.
  • Electronically depositing checks.
  • Recognizing text, such as license plates, with a camera or software.
  • Sorting letters for mail delivery.
  • Translating words within an image into a specified language.

