- I find Wikidata to be perfect for aggregating identifiers. I mostly work with species names, and it's perfect for getting the iNaturalist, GBIF, Open Tree of Life, Catalogue of Life, etc. identifiers all in one query.
I haven't tried it for books. I imagine it's not sufficiently complete to serve as a backbone, but a quick look at an example book gives me the IDs for OpenLibrary, LibraryThing, Goodreads, Bing, and even niche stuff like the National Library of Poland MMS ID.
https://www.wikidata.org/wiki/Q108922801
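To illustrate the "all in one query" point for species names, here is a minimal sketch that only builds a SPARQL query string for the Wikidata Query Service. The helper name `build_identifier_query` and the label-to-property mapping are hypothetical. The property IDs used (P225 taxon name, P846 GBIF taxon ID, P3151 iNaturalist taxon ID) are the ones I believe are correct, but check them on Wikidata before relying on them.

```python
def build_identifier_query(taxon_name: str, properties: dict[str, str]) -> str:
    """Build a SPARQL query that looks up one taxon by name and pulls
    several external identifiers in a single request.

    `properties` maps a result-column label to a Wikidata property ID,
    e.g. {"gbif": "P846"}. Both the labels and the mapping are
    illustrative assumptions, not part of any official API.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    selects = " ".join(f"?{label}" for label in properties)
    # OPTIONAL keeps the row even when an identifier is missing on the item.
    optionals = "\n".join(
        f"  OPTIONAL {{ ?item wdt:{pid} ?{label} . }}"
        for label, pid in properties.items()
    )
    return (
        f"SELECT ?item {selects} WHERE {{\n"
        f'  ?item wdt:P225 "{escaped}" .\n'  # P225 = taxon name
        f"{optionals}\n"
        f"}}"
    )


query = build_identifier_query("Panthera leo", {"gbif": "P846", "inat": "P3151"})
print(query)
```

The resulting string can be pasted into the Wikidata Query Service; adding more identifiers is a matter of adding entries to the mapping.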
- Do you handle books with no ISBN?
I’ve recently acquired some photo books that don’t appear to have any ISBN but are listed on WorldCat, have OCLC numbers, and are catalogued in the Japanese National Diet Library. I'm not sure whether they actually lack ISBNs or I just haven't been able to find them, but from the research I did, missing ISBNs are quite common for self-published books.
by apublicfrog
1 subcomments
- Wow. I don't have any use for this personally, but your post is really well presented, detailed and sourced. I hope it goes well!
- Are you able to pull upcoming titles? All I want is a weekly/monthly list of upcoming books by authors I've read, and I've not been able to find one or to build it.
- I applaud the effort, but the last time I tried this, the major issue was the sheer amount of book data only available from Amazon.com, and scraping that is tedious, to put it mildly.
- Does it handle languages other than English? I remember trying out some APIs like that for some tasks, and while I managed to find titles in English somewhat successfully, any other languages (be it the original title, or a translation of some fairly well-known book) were basically inaccessible.
by mehdi1964
1 subcomments
- Nice approach! Merging metadata from multiple sources is tricky, especially handling conflicts like titles and covers. Curious how you plan to handle scalability as your database grows—caching helps, but will the naive field strategies hold with thousands of books?
- Tried throwing a batch of known-to-be-in-Amazon ISBNs through (from a recent "export my data", so even if they're old, Amazon fundamentally knows them). Got 500s for a handful of the first hundred, then a bunch of 502/503s (so, single-threaded, but part of the HN hug of death, sorry!)
(Only the first 4 or so were json errors, the rest were html-from-nginx, if that matters.)
- Nice, I might try your API for my ISBN extractor / formatter at https://github.com/infojunkie/isbn-info.js
Right now, I use node-isbn https://www.npmjs.com/package/node-isbn which mostly works well but is getting long in the tooth.
by moritzruth
2 subcomments
- What do you think about BookBrainz?
https://bookbrainz.org/
- Would it be possible to use a SQLite file instead of a PostgreSQL instance? Or do you rely on some specific PostgreSQL functionality?
by wizzwizz4
1 subcomments
- Please ensure that your database keeps track of whence data was obtained, and when. It's exceptionally frustrating when automated data ingesting systems overwrite manually-corrected data with automatically-generated wrong data: keeping track of provenance is a vital step towards keeping track of authoritativeness.
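A minimal sketch of what that could look like, assuming each stored field carries its source, fetch time, and a flag for manual corrections. The names `FieldValue` and `merge_field` are hypothetical and not taken from the project; the rule shown (manual beats automated, otherwise newest wins) is just one possible policy.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FieldValue:
    value: str
    source: str            # where the value came from, e.g. "openlibrary"
    fetched_at: datetime   # when it was obtained
    manual: bool = False   # True if a human corrected it


def merge_field(current: Optional[FieldValue], incoming: FieldValue) -> FieldValue:
    """Pick which value to keep for a single field.

    Policy (an assumption for illustration): a manual correction always
    beats an automated value; between two values of the same kind, the
    more recently fetched one wins.
    """
    if current is None:
        return incoming
    if current.manual != incoming.manual:
        # Exactly one of the two is manual: keep that one.
        return current if current.manual else incoming
    return incoming if incoming.fetched_at >= current.fetched_at else current


corrected = FieldValue("The Hobbit", "editor", datetime(2024, 1, 1, tzinfo=timezone.utc), manual=True)
scraped = FieldValue("Hobbit, The", "some-api", datetime(2025, 1, 1, tzinfo=timezone.utc))
print(merge_field(corrected, scraped).value)
```

Because the source and timestamp travel with every value, a later re-ingest can be audited or rolled back instead of silently replacing a fix.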
- 502 Bad Gateway :|
by ocdtrekkie
1 subcomments
- Library of Congress data seems like a huge omission especially for something named after a librarian. ;) It is a very easy API to consume too.
- hella hella cool
good luck