Computer Science Department Events
There will be a CS Colloquium in the E&CS Auditorium on Tuesday, July 26th, at 3:30 PM.
Title: The Internet Archive: Our Collections, Programs and Research Initiatives
Location: E&CS first floor auditorium
Speaker: Kris Carpenter Negulescu
Advisor: N/A
Abstract: Kris Carpenter Negulescu, Director of the Web Group at the Internet Archive, will present an overview of the current holdings of the Archive, review active research and development currently underway at IA, and introduce other major initiatives IA has undertaken or plans to undertake in 2011-2012. Her talk will touch briefly on the following topics:
- Overview of the Internet Archive: Data Repository, Books, TV, Other Special Collections, Web
- Recent Web Archive Statistics and Reports, Changes to Ingest/Update Cycles
- Special Projects:
- Data Mining & Extraction via Hadoop/Pig, etc.
- Generating Link Graphs of an entire domain from 1996-2010 (e.g. .uk)
- Web-wide crawling and HBase
- Automated QA of web data at scale
- ISC SIE and other data collaboratives
- Dynamic, on-demand archiving of video, annotations, etc.
- Semantic Data extraction and IA's TV archives
- IA Data clusters, Cloud computing, VMs
- Digital Archive Services & planned bulk APIs