# ocr

Here are 4,708 public repositories matching this topic...

tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)

machine-learning ocr tesseract lstm tesseract-ocr hacktoberfest ocr-engine

Updated Jun 11, 2024
C++

Suhas-H-C / shared-services

Shared Services provides ready-made solutions for most of the code snippets needed in back-end service development with Spring Boot.

docker pdf ocr csv spring spring-boot excel gherkin data-structures

Updated Jun 11, 2024
Java

SciPhi-AI / R2R

Build and deploy a fully-featured, observable, user-facing RAG backend in minutes.

search pdf machine-learning ocr deep-learning retrieval chatbot artificial-intelligence question-answering data-pipelines retrieval-systems large-language-models llm langchain llama-index retrieval-augmented-generation

Updated Jun 11, 2024
HTML

ds4v / NomNaOCR

Leverage deep learning to digitize old Vietnamese handwritten texts for historical document archiving (made with national pride in every single line of code): https://www.kaggle.com/datasets/quandang/nomnaocr

ocr vietnamese history text-recognition text-detection nom optical-character-recognition digitization

Updated Jun 11, 2024
Jupyter Notebook

veryfi / veryfi-ruby

Ruby gem for communicating with the Veryfi OCR API.

ruby api ocr sdk receipt invoice ocr-library sdk-ruby invoice-parser receipt-reader

Updated Jun 11, 2024
Ruby

Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Updated Jun 11, 2024
HTML

LinXueyuanStdio / LaTeX_OCR_PRO

🎨 Math Formula OCR Pro: recognizes handwritten and printed formulas with mixed Chinese and English, and supports basic symbolic derivation (data structure based on the LaTeX abstract syntax tree).

ocr latex deep-learning cnn lstm rnn seq2seq

Updated Jun 11, 2024
Jupyter Notebook

paperless-ngx / paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

pdf machine-learning django angular ocr archiving dms document-management optical-character-recognition document-management-system

Updated Jun 11, 2024
Python

mindee / mindee-api-python

Mindee API Helper Library for Python

ocr sdk api-client python3 ocr-api

Updated Jun 11, 2024
Python

different-ai / file-organizer-2000

AI-powered organization for people who struggle with organization

ocr gpt obsidian

Updated Jun 11, 2024
TypeScript

regulaforensics / DocumentReader-web-openapi

OpenAPI definitions of Regula Document Reader web application

ocr yml passport mrz barcode-reader idcard document-reader regula mrz-codes openapi-definitions document-recognition regulaforensics

Updated Jun 11, 2024

tomodachi94 / hydrus-ocr

Retrieve files from Hydrus Network and run them through OCR.

ocr optical-character-recognition hydrus hydrusnetwork hydrus-network

Updated Jun 11, 2024
Python

devmehq / extract-text

Node.js module for extracting text from HTML, PDF, DOC, DOCX, XLS, XLSX, CSV, PPTX, PNG, JPG, GIF, RTF, and more.

pdf ocr extractor tesseract-ocr extract-text tessaract

Updated Jun 11, 2024
HTML

jeroenvansweeveldt / DTA_Thesis

Repository for the MA Digital Text Analysis thesis.

python r ocr xml tesseract corpus-linguistics historical-corpus-linguistics

Updated Jun 11, 2024
Jupyter Notebook

siyuan-note / siyuan

Privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.

electron markdown pdf ocr notebook s3 webdav self-hosted openai note-taking evernote anki knowledge-base obsidian pkm notion notes-app local-first chatgpt

Updated Jun 11, 2024
TypeScript

TechnicalOtter / tesseract-scripts

Set of scripts for OCR'ing PDF files with Tesseract.

pdf ocr history tesseract

Updated Jun 11, 2024
Shell

ballerine-io / ballerine

Open-source infrastructure and data orchestration platform for risk decisioning

Updated Jun 11, 2024
TypeScript

PrusWielki / NotesReader

App deployed on Google Cloud Platform allowing for OCR and note summaries, developed under the supervision of Google

firebase google cloud ocr ai gcp google-cloud svelte google-cloud-platform cloud-functions sveltekit svelte5

Updated Jun 11, 2024
Svelte

infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

nlp machine-learning information-retrieval ocr deep-learning chatbot orchestration preprocessing pdf-to-text data-pipelines document-parser rag document-understanding table-structure-recognition llm llmops retrieval-augmented-generation

Updated Jun 11, 2024
Python

sandbox-pokhara / hash-ocr

Fast OCR for reading computer-rendered text

ocr computer-vision

Updated Jun 11, 2024
Python
