Hacker News

Show HN: Nano PDF – A CLI Tool to Edit PDFs with Gemini's Nano Banana

171 points by GavCo

by tecoholic

1 subcomments

> Converts an image to a single-page PDF with a hidden text layer using Tesseract. This is the 'State Preservation' step.
Does this mean the text-only PDF page is transformed into an image that covers the full page, but the text is still under there? So any machine-based extraction would still get the text but would probably lose all the bounding-box information, and regular users cannot just use their mouse to select text anymore?

by lxe

1 subcomments

This is nuts and I absolutely love this. So you convert the PDF into an image, edit the image, then convert the image back into a PDF.

by shevis

1 subcomments

A side effect of replacing entire pages with images is that the file size will expand dramatically. Most PDFs contain only a couple of images.
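
To put rough numbers on the size concern above, here is a back-of-the-envelope sketch. It assumes a US Letter page rasterized as 8-bit RGB with no compression; the function name and its defaults are illustrative and not taken from the tool. Real files will be smaller once the page image is compressed, but the scaling with DPI is the point.

```python
def raw_page_bytes(width_in=8.5, height_in=11.0, dpi=300, channels=3):
    """Uncompressed size in bytes of one rasterized page, 8 bits per channel.

    Assumption: US Letter dimensions and RGB by default; adjust as needed.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels


if __name__ == "__main__":
    for dpi in (72, 150, 300):
        size = raw_page_bytes(dpi=dpi)
        print(f"{dpi} DPI: {size / 1_000_000:.1f} MB uncompressed per page")
```

At 300 DPI that is roughly 25 MB of raw pixels per page before compression, compared with a few kilobytes for a typical page of text stored as a content stream.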

by moezd

0 subcomment

Behold, the might of LLMs! Instead of ushering in the age of AGI as advertised 6 months ago, now it cleans your PDFs for you.
Many thanks to humanity for failing to standardise PDF and this project for paying interest on that tech debt with datacenter levels of energy consumption.

by ohans

0 subcomment

Really cool! I reckon a nice UI would be a good addition

by treetalker

0 subcomment

I'd love to see clearer examples: a video, or original pdf / command / result pdf. Very cool!

by struc_so

0 subcomment

Interesting approach. I've spent a lot of time wrangling PDF internals recently, and the issue is usually maintaining the xref table integrity when you inject new content streams.
Does this approach rewrite the entire file structure on save, or are you appending incremental updates to the EOF? Incremental is safer against corruption, but file size bloats quickly with AI-generated diffs.
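
One rough way to check which save strategy a tool uses: each incremental update appends a new cross-reference section and trailer that ends in its own `startxref` offset and `%%EOF` marker, so a file saved incrementally usually contains more than one. A minimal sketch, assuming the PDF bytes are already in memory. The helper names are hypothetical, the sample bytes are a toy stand-in rather than a valid PDF, and counting markers is a heuristic (the byte sequence could also occur inside a stream), not a parser.

```python
import re


def count_eof_markers(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; more than one suggests incremental updates."""
    return len(re.findall(rb"%%EOF", pdf_bytes))


def last_startxref(pdf_bytes: bytes):
    """Return the byte offset named by the final startxref, or None if absent."""
    matches = re.findall(rb"startxref\s+(\d+)", pdf_bytes)
    return int(matches[-1]) if matches else None


if __name__ == "__main__":
    # Toy data: an original save followed by one appended update section.
    original = b"%PDF-1.7\n(objects)\nstartxref\n120\n%%EOF\n"
    update = b"(new objects)\nstartxref\n560\n%%EOF\n"
    sample = original + update
    print(count_eof_markers(sample), last_startxref(sample))
```

A reader follows the last `startxref` first, which is why an appended update can override earlier objects without touching the bytes before it.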

by perfectritone

0 subcomment

It's incredible how many hacks there are to make PDFs semi-usable.

by itsmevictor

0 subcomment

Very nice! I wonder whether that could be used to get LLMs to annotate pdfs. Say an "agentic" CLI like Claude Code or Gemini-cli reviews a pdf and finds typos, could it use this to annotate the pdf like underlining them in red or something of that sort? That could be nice.

by mentalgear

1 subcomments

Nice - but consider adding an animated screengrab like: https://github.com/pythops/oryx

by iamflimflam1

1 subcomments

The lack of examples makes me very reluctant to commit any time to trying this out - despite it being something that I’m interested in.
Has anyone given it a go? Does it work?

by McNulty2

0 subcomment

I like the example of updating the latest market data. Updating a deck one-off is tedious. Keeping it updated long-term was never going to happen. But now it can.

by ThrowawayTestr

1 subcomments

I recently tried to change a single word in a PDF and nearly tore my hair out (thank you, LibreOffice). I'll definitely keep this in mind for next time, thank you.

by toddmorey

1 subcomments

I thought it was kinda funny that Google Slides' own built-in “beautify this slide” button converts the whole slide into an uneditable image.

by mlpoknbji

0 subcomment

Somewhat unrelated, but can anyone recommend a way to edit the text of a PDF using an LLM? Something like AI + Acrobat Pro?

by vood

0 subcomment

Congratulations on the release; that's a really good job.

by informal007

0 subcomment

It would be more exciting if I could use this feature in an application with a GUI. It's not convenient to check the result after editing the PDF; I need to switch between the CLI and a PDF reader.

by John7878781

0 subcomment

Love this.
After several iterations of edits, would the image quality decrease?

by Zopieux

0 subcomment

I am disappointed that this doesn't modify the underlying PDF structure (which is a horror show, I know) but instead relies on fairly lossy OCR back-and-forths.
I wish an agent with validation and rendering tools could instead manipulate the structure to accomplish those edits way less destructively, checking its progress with the tools.
