this post was submitted on 09 Jan 2025

Self-Hosted Alternatives to Popular Services


This is an automated archive made by the Lemmit Bot.

The original was posted on /r/selfhosted by /u/Spare_Put8555 on 2025-01-09 14:47:56+00:00.


Hey everyone,

I've noticed discussions in other threads about paperless-ai (which is awesome), and some folks asked how it differs from my project, paperless-gpt. Since I’m a newer user here, I’ll keep things concise:

Context

  1. paperless-ai leans toward doc-based AI chat, letting you converse with your documents.
  2. paperless-gpt focuses on LLM-based OCR (for more accurate scanning of messy or low-quality docs) and a robust pipeline for auto-generating titles/tags.

Why Another Project?

  • I didn't know about paperless-ai in Sept. '24: True story :D
  • LLM-based OCR: I wanted a solution that does advanced text extraction from scans, harnessing Large Language Models (OpenAI or Ollama).
  • Tag & Title Workflows: My main passion is building flexible, automated naming and tagging pipelines for paperless-ngx.
  • No Chat (Yet): If you do want doc-based chatting, paperless-ai might be a better fit. Or you can run both—use paperless-gpt for scanning/tags, then pass that cleaned text into paperless-ai for Q&A.

Key Features

  • Multiple LLM Support (OpenAI or Ollama).
  • Customizable Prompts for specialized docs.
  • Auto Document Processing via a “paperless-gpt-auto” tag.
  • Vision LLM-based OCR (experimental) that outperforms standard OCR in many tough scenarios.
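
To make the tag-triggered idea a bit more concrete, here is a rough Python sketch of how a "paperless-gpt-auto" style workflow could pick up documents. It is not paperless-gpt's actual code: the function name process_tagged, the document dict shape (id, content, tags), the suggest_title callback standing in for the LLM call, and the step of removing the trigger tag afterwards are all assumptions made for illustration.

```python
from typing import Callable, Iterable

AUTO_TAG = "paperless-gpt-auto"  # tag name taken from the post


def process_tagged(
    documents: Iterable[dict],
    tag_ids: dict[str, int],
    suggest_title: Callable[[str], str],
) -> list[dict]:
    """Build one update per document that carries the auto tag.

    Hypothetical shapes: each document is a dict with "id", "content"
    and "tags" (a list of tag ids); tag_ids maps tag names to ids.
    """
    auto_id = tag_ids.get(AUTO_TAG)
    if auto_id is None:
        return []  # the trigger tag does not exist, nothing to do

    updates = []
    for doc in documents:
        if auto_id not in doc["tags"]:
            continue  # only documents explicitly tagged are processed
        updates.append({
            "id": doc["id"],
            # suggest_title stands in for the LLM call (assumption)
            "title": suggest_title(doc["content"]),
            # Assumed behaviour: drop the trigger tag so the document
            # is not picked up again on the next run.
            "tags": [t for t in doc["tags"] if t != auto_id],
        })
    return updates


if __name__ == "__main__":
    docs = [
        {"id": 1, "content": "Invoice 2024 ACME", "tags": [5, 7]},
        {"id": 2, "content": "Letter from bank", "tags": [7]},
    ]
    # Placeholder "LLM": use the first word of the text as the title.
    print(process_tagged(docs, {AUTO_TAG: 5}, lambda text: text.split()[0]))
```

In a real setup the documents and tags would come from the paperless-ngx API and the callback would call OpenAI or Ollama; the sketch only shows the selection and update logic.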

Combining With paperless-ai?

  • Totally possible. You could have paperless-gpt handle the scanning & metadata assignment, then feed those improved text results into paperless-ai for doc-based chat.
  • Some folks asked about overlap: we do share the “metadata extraction” idea, but the focus differs.

If You’re Curious

  • The project has a short README, Docker Compose snippet, and minimal environment vars.
  • I’m grateful to a few early sponsors who donated (thank you so much!). That support motivates me to keep adding features (like multi-language OCR support).

Anyway, just wanted to clarify the difference, since people were asking. If you’re looking for OCR specifically—especially for messy scans—paperless-gpt might fit the bill. If doc-based conversation is your need, paperless-ai is out there. Or combine them both!

Happy to answer any questions and hear any feedback you have. Thanks for reading!

Links (in case you want them):

  • paperless-gpt code and docs: github.com/icereed/paperless-gpt
  • paperless-ngx: github.com/paperless-ngx/paperless-ngx

Cheers!
