Project: Library Scan

Specification

Use Cases

I have a handheld barcode reader that connects to a USB port on my laptop and acts as though I'd typed the digits of the barcode at the keyboard. Using this, it is fairly easy to scan a shelf of books and end up with a text file containing one ISBN per line.

What I'd like to end up with is a database describing my library, from which I can generate web pages and with which I can interact via scripts and forms, in order to:

Keep track of which books I have - especially useful if I could use my iPhone in a bookshop to check which books in a series I don't have
Help me lend out books to friends - things like ratings, loan history, genre and compatibility with sites like LibraryThing or BookCrossing
Help me find, sort and organise my books - things like size, format, shelf location, on loan, publication date, date added to database and date last read
Help me make an insurance claim and rebuild my library in the event of a fire - things like price, publisher and URL of info source

Scripts

What I'm going to need are the following scripts:

To build the catalog

isbn2info: Given a single ISBN, returns as much info as possible in a useful format - Moonshadow has already done most of the work needed for this in his script, which checks a number of sources, such as Amazon, and parses the result if the book is found
shelf2info: Given a text file containing ISBNs and optionally some tags at the top, uses isbn2info to create an output file ready to be processed into SQL
info2sql: Given a directory full of output files containing duplicates, errors and books already in the database, updates the database sensibly
isbn2sql: A CGI script allowing you to update the database one book at a time
notfound: A CGI script allowing hand entry of all fields for books without ISBNs
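
Since the scanner simply types whatever digits it reads, the catalog-building scripts could reject mis-scans before looking anything up. The following is a minimal sketch of the standard ISBN-10 and ISBN-13 check-digit tests; the function names are hypothetical, and nothing here is taken from Moonshadow's script.

```python
def is_valid_isbn13(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    digits = isbn.replace("-", "").strip()
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def is_valid_isbn10(isbn: str) -> bool:
    """Check the ISBN-10 check digit (weights 10 down to 1, 'X' means 10)."""
    chars = isbn.replace("-", "").strip().upper()
    if len(chars) != 10 or not chars.isascii():
        return False
    if not chars[:9].isdigit() or not (chars[9].isdigit() or chars[9] == "X"):
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(v * (10 - i) for i, v in enumerate(values))
    return total % 11 == 0


def is_valid_isbn(isbn: str) -> bool:
    """Accept either form; hypothetical helper for shelf2info or isbn2sql."""
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)
```

A script such as shelf2info could call is_valid_isbn on each scanned line and write failures to an error list instead of querying any source.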

To use the catalog

summary: Use number of books, prices, age and format to estimate cost to buy originally and cost to replace now. Also statistical breakdown by age, genre, reading frequency, etc. Total mass and volume of books.
suggest: Make a suggestion of what to read or re-read next.
onloan: Who still has what, and when they borrowed it.
listing: Nice compact listing of books matching given criteria, suitable for printing out and taking along to a book convention dealer hall.

Interactive use

browse: Generate a form webpage for a single book, allowing update of its fields and navigation to next book by author, genre, title, date, etc.
navigate: Search and navigation interface, suitable for mobile device browsing
manage: Tool to aid mass changes (e.g. re-shelving authors)

Data formats

Tags / Fields

ISBN - defaults to "AUTHOR-TITLE-FORMAT-PUBDATE" if no ISBN
AUTHOR - full name in order supplied by source
AUTHORNAMELAST - Author's last name
AUTHORNAMEOTHER - Author's other names, including co-authors
SOURCENAME
SOURCEURL
TITLEFULL - foo (bar)
TITLE - foo
TITLEEXTRA - bar, hopefully series name and number
READ - Yes or No, whether the owner has read it yet
LENT - Yes or No, whether the book is currently out on loan
DATEPUB - listed publication date
DATEENT - date entered into system
DATEALT - date entry last updated
DATERED - date last read by owner
DATEOUT - date last lent out
NAMEOUT - name of person last lent to
LINKOUT - email, phone or other contact for person last lent to
GENRE - comma separated list: Fantasy,Science Fiction,Reference, etc.
FORMAT - Hardback, Paperback, Trade, Manga, Graphic Novel, Other
PRICE
LOCATION - House:Room:Case:Shelf works for me, but use your own
PUBLISHER
DIMENSIONS
SOURCERATING - if the info source rated it
RATING
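
As an illustration of how two of these fields might be derived, here is a rough sketch that splits TITLEFULL into TITLE and TITLEEXTRA and builds the fallback ISBN key. It assumes that the extra part is a single trailing parenthesised group and that "AUTHOR-TITLE-FORMAT-PUBDATE" means the field values joined by hyphens; both are interpretations of the list above, and the function names are hypothetical.

```python
import re
from typing import Dict, Tuple


def split_title(title_full: str) -> Tuple[str, str]:
    """Split 'foo (bar)' into ('foo', 'bar'); titles without brackets keep an empty extra."""
    match = re.fullmatch(r"(.*?)\s*\(([^()]*)\)\s*", title_full)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return title_full.strip(), ""


def fallback_key(record: Dict[str, str]) -> str:
    """Build the stand-in ISBN for a book that has none (assumed: values joined by '-')."""
    fields = ("AUTHOR", "TITLE", "FORMAT", "DATEPUB")
    return "-".join(record.get(field, "") for field in fields)
```

With this sketch, split_title("foo (bar)") gives the TITLE "foo" and the TITLEEXTRA "bar", matching the example in the field list.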

Loan, reading date and rating history should be kept in a separate table of the database.
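
One possible shape for that separate table is sketched below using SQLite. The spec does not name a database engine, so the engine, the table names (book, history) and the column names are all assumptions made for illustration.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a cut-down book table plus a history table (hypothetical layout)."""
    conn.executescript(
        """
        CREATE TABLE book (
            isbn  TEXT PRIMARY KEY,
            title TEXT
        );
        CREATE TABLE history (
            id         INTEGER PRIMARY KEY,
            isbn       TEXT NOT NULL REFERENCES book(isbn),
            event      TEXT NOT NULL
                       CHECK (event IN ('loan', 'return', 'read', 'rating')),
            event_date TEXT NOT NULL,  -- "YYYY" or "YYYY-MM-DD"
            detail     TEXT            -- borrower name, rating value, etc.
        );
        """
    )


def record_event(conn, isbn, event, event_date, detail=""):
    """Append one loan, return, read or rating event for a book."""
    conn.execute(
        "INSERT INTO history (isbn, event, event_date, detail) VALUES (?, ?, ?, ?)",
        (isbn, event, event_date, detail),
    )


def history_for(conn, isbn):
    """Return all events for one book, oldest first."""
    cursor = conn.execute(
        "SELECT event, event_date, detail FROM history "
        "WHERE isbn = ? ORDER BY event_date, id",
        (isbn,),
    )
    return cursor.fetchall()
```

Keeping one row per event means the single-valued fields such as DATEOUT, NAMEOUT and DATERED could be derived from the most recent matching row rather than stored twice.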

No fields may contain tabs, as the tab character is used as the field separator in files.

Dates may be written as either "YYYY" or "YYYY-MM-DD".
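
A small check for the two date forms might look like the sketch below. The function name is hypothetical, and treating an empty value as invalid is an assumption; a caller that allows blank dates would need to test for that first.

```python
import re
from datetime import date


def is_valid_date(value: str) -> bool:
    """Accept 'YYYY' or a real calendar date written as 'YYYY-MM-DD'."""
    if re.fullmatch(r"[0-9]{4}", value):
        return True
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        try:
            date.fromisoformat(value)  # rejects impossible dates such as month 13
        except ValueError:
            return False
        return True
    return False
```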

Files

In the shelf file format, lines starting with a # or whitespace are ignored, lines starting with a digit are treated as an ISBN, and lines starting with a letter are treated as TAG=VALUE settings, which then apply to all succeeding books.
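
The shelf file rules can be sketched as a short parser. This is only an illustration: the function name is hypothetical, and skipping empty lines and letter lines with no "=" is an assumption, since the rules above do not cover them.

```python
from typing import Dict, Iterable, List


def parse_shelf(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn shelf file lines into one dict per book, carrying the current tags."""
    tags: Dict[str, str] = {}
    books: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Comments, indented lines and (by assumption) empty lines are ignored.
        if not line or line[0] == "#" or line[0].isspace():
            continue
        if line[0].isdigit():
            # A digit starts an ISBN; copy the tags in force at this point.
            book = dict(tags)
            book["ISBN"] = line.strip()
            books.append(book)
        elif line[0].isalpha() and "=" in line:
            # TAG=VALUE applies to every book that follows.
            tag, value = line.split("=", 1)
            tags[tag.strip()] = value.strip()
    return books
```

For example, a file that sets LOCATION, lists one ISBN, then sets GENRE and lists a second ISBN would yield a first book with only LOCATION and a second book with both tags.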

The output file format is tab-separated TAG=VALUE pairs; a zero-length VALUE is OK.
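
Writing and reading one line of the output format could look like the sketch below. It assumes one book per line, which the spec does not state outright, and the function names are hypothetical.

```python
from typing import Dict


def format_record(record: Dict[str, str]) -> str:
    """Join TAG=VALUE pairs with tabs, refusing values that would break the format."""
    parts = []
    for tag, value in record.items():
        if "\t" in value or "\n" in value:
            raise ValueError(f"field {tag} contains a tab or newline")
        parts.append(f"{tag}={value}")
    return "\t".join(parts)


def parse_record(line: str) -> Dict[str, str]:
    """Split one output line back into a dict; zero-length values are kept."""
    record: Dict[str, str] = {}
    for pair in line.rstrip("\r\n").split("\t"):
        if not pair:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        tag, _, value = pair.partition("=")
        record[tag] = value
    return record
```

Splitting on the first "=" only means a SOURCEURL containing a query string survives the round trip unchanged.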
