Brewster Kahle’s talk on “Universal access to all knowledge”
I love Brewster’s talks and his enthusiasm. Makes me think there are good things going on in the world despite others’ efforts to thwart them.
The question he asks: in our generation/lifetime, can we provide universal (online) access to all knowledge, ever?
Can we put all the world’s text online?
The LoC has ~26M volumes. It costs about $30/book (10c/pg) to do the whole chain from scanning to putting the result up on spinning storage with metadata => ~$800M scans the entire LoC. The text would require <100TB in ASCII format.
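A quick sanity check of those numbers, as a rough sketch. The constants are the figures quoted in the talk; the function names are mine, and the use of decimal units (1 TB = 1,000,000 MB) is an assumption.

```python
# Back-of-envelope check of the book-scanning figures quoted in the talk.
# Function names are illustrative; decimal storage units are an assumption.

LOC_VOLUMES = 26_000_000     # ~26M volumes in the Library of Congress
COST_PER_BOOK_USD = 30       # ~$30/book for scanning through storage
COST_PER_PAGE_CENTS = 10     # ~10c/page
ASCII_BUDGET_TB = 100        # quoted upper bound for all the text in ASCII


def scan_cost_usd(volumes, cost_per_book_usd):
    """Total cost to scan every volume at a flat per-book price."""
    return volumes * cost_per_book_usd


def implied_pages_per_book(cost_per_book_usd, cost_per_page_cents):
    """Average book length implied by the per-book and per-page prices."""
    return cost_per_book_usd * 100 / cost_per_page_cents


def max_mb_per_book(budget_tb, volumes):
    """Largest average text size per book that still fits in the budget."""
    return budget_tb * 1_000_000 / volumes


if __name__ == "__main__":
    total = scan_cost_usd(LOC_VOLUMES, COST_PER_BOOK_USD)
    pages = implied_pages_per_book(COST_PER_BOOK_USD, COST_PER_PAGE_CENTS)
    mb = max_mb_per_book(ASCII_BUDGET_TB, LOC_VOLUMES)
    print(f"total scanning cost: ${total / 1e6:.0f}M")
    print(f"implied pages per book: {pages:.0f}")
    print(f"text budget per book: {mb:.2f} MB")
```

So the $800M figure is $780M rounded up, the two prices imply an average of 300 pages per book, and the 100TB bound leaves just under 4MB of plain text per book.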
Audio? Biggest cooperation has come from (eg) the Grateful Dead and their tribute bands! Open audio costs about $10/disc, or $10/hr for vinyl or cassettes => 2-3M discs = $20-30M to digitize.
Video? archival films and some old films. Eg, govt propaganda/ads, old classroom documentaries, “social training” videos (“Duck and Cover”), stuff in the TV archive… $15/video-hour to digitize.
Software? about 50k commercial SW titles ever. Threatened by DMCA, not technology.
Web? Lifetime of a page (before it changes or is deleted) is ~100 days. Wayback takes snapshots every ~2mo.
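To get a feel for how those two numbers interact, here is a toy model of my own (not from the talk). It assumes that each version of a page lives for an exponentially distributed time with a mean of 100 days, and that snapshots arrive at a fixed 60-day interval with a random offset. Under those assumptions, a version that lives d days is captured with probability min(1, d/T), and averaging over lifetimes gives (m/T)(1 - e^(-T/m)).

```python
import math

# Toy model, not from the talk: what fraction of page versions does a
# periodic crawler ever see? Assumes exponentially distributed version
# lifetimes and evenly spaced snapshots with a uniformly random phase.


def capture_probability(mean_lifetime_days, snapshot_interval_days):
    """Chance that a randomly chosen page version appears in a snapshot.

    Closed form of E[min(1, d/T)] for d ~ Exponential(mean=m):
    (m / T) * (1 - exp(-T / m)).
    """
    ratio = snapshot_interval_days / mean_lifetime_days
    return (1 - math.exp(-ratio)) / ratio


if __name__ == "__main__":
    p = capture_probability(mean_lifetime_days=100, snapshot_interval_days=60)
    print(f"fraction of versions captured: {p:.2f}")
```

With the figures above this works out to roughly three quarters of versions being captured, though the real answer depends heavily on how page lifetimes are actually distributed.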
The Open Content Alliance is building open collections among universities, with support from MS, HP, Adobe, and others. Internet Archive datacenters are being set up in Europe to build up their own collections and then swap among themselves. (Lots of classical recordings that are legal to download in Europe are blocked to US visitors “due to copyright laws”.)
Opportunity they are looking for help with: front ends for searching, browsing, etc. the Archive. “Can 1 person build a whole search engine given the underlying infrastructure of the Internet Archive?” We should take him up on this challenge in the RoR class and RAD Lab apps!