2024:Program/10 Years and 20 Million Links Fixed
Session title: 10 Years and 20 Million Links Fixed
- Session type: Lecture
- Track: Technology
- Language: en
Over the past decade, the Internet Archive has preserved Wikipedia's integrity by fixing 20 million broken links using InternetArchiveBot (IABot). This presentation will cover the joint efforts of the Internet Archive and the Wikipedia community, showcasing the bot's evolution, its algorithms, key milestones, and the challenges faced. We'll highlight the impact on Wikipedia, the collaboration driving IABot's success, and future enhancements such as AI integration and improved scalability. Discover how these initiatives are securing Wikipedia's content for future generations and keeping it accurate and accessible.
Description
Over the past ten years, the Internet Archive has worked to maintain the accuracy and reliability of information on Wikipedia. By developing and deploying InternetArchiveBot (IABot), it has addressed link rot by fixing 20 million broken links. This effort has preserved the integrity of knowledge on Wikipedia and kept historical and educational content accessible.
The presentation will cover the collaborative journey of the Internet Archive and IABot, emphasizing the impact of their work on digital preservation. By replacing dead links with archived URLs, they have safeguarded the longevity of online content and prevented the loss of valuable information. The discussion will trace the bot's development from its initial conception to its current state, highlighting the key milestones achieved along the way.
We will examine the algorithms that drive IABot and how it identifies and repairs broken links. We will also address the technical challenges encountered and the problem-solving strategies used to improve the bot's efficiency and effectiveness. A major focus will be the impact of these efforts on the Wikipedia community and how they have helped maintain the reliability of the world's largest online encyclopedia.
The presentation will also shed light on the collaborative nature of this project: the partnership between the Internet Archive, Wikipedia editors, and the broader community, and how collective input and feedback have been instrumental in refining IABot's functionality and scope.
Looking forward, we will outline future directions for IABot and digital preservation efforts at the Internet Archive. These include integrating advanced artificial intelligence to improve link detection and repair, scaling the bot to handle increased demand, and expanding digital preservation strategies to cover a wider range of online content. The goal is to make the bot more adaptive and responsive to the evolving landscape of the internet.
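As a rough illustration of what "replacing dead links with archived URLs" means in practice, here is a minimal Python sketch. It is not IABot's actual code or algorithm. It assumes the dead URLs and their snapshot timestamps are already known, and the names `wayback_url` and `repair_links` are hypothetical. The only external fact it relies on is the public Wayback Machine snapshot URL form, `https://web.archive.org/web/<timestamp>/<original URL>`.

```python
import re
from typing import Mapping

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a snapshot URL from a 14-digit timestamp (YYYYMMDDhhmmss)."""
    if not re.fullmatch(r"\d{14}", timestamp):
        raise ValueError("timestamp must be 14 digits: YYYYMMDDhhmmss")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def repair_links(text: str, snapshots: Mapping[str, str]) -> str:
    """Replace each known-dead URL in `text` with its archived snapshot.

    `snapshots` maps a dead URL to the timestamp of a snapshot of it.
    Simplifying assumption: a URL ends at whitespace, one of ] | } <,
    or the end of the text.
    """
    if not snapshots:
        return text
    # Longest URLs first, so the most specific alternative is tried first.
    alternatives = "|".join(
        re.escape(url) for url in sorted(snapshots, key=len, reverse=True)
    )
    # (?<!/) skips URLs already embedded in an archive link, which keeps
    # the function idempotent; the lookahead avoids rewriting a prefix of
    # a longer, unrelated URL.
    pattern = re.compile(rf"(?<!/)(?:{alternatives})(?=[\s\]|}}<]|$)")
    return pattern.sub(
        lambda m: wayback_url(m.group(0), snapshots[m.group(0)]), text
    )


if __name__ == "__main__":
    sample = "See [http://example.com/gone Old page] for details."
    print(repair_links(sample, {"http://example.com/gone": "20150101000000"}))
```

A real bot also has to decide whether a link is actually dead, find a suitable snapshot, and edit citation templates rather than raw text. This sketch covers only the final substitution step.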
Session recording: https://www.youtube.com/watch?v=BbGrkYK8FEk&list=PLhV3K_DS5YfJ1xyY0LNDNX3RKyRQEXOdB&t=6772
- How does your session relate to the event theme, Collaboration of the Open?
My session showcases the collaboration between the Internet Archive and the Wikipedia community through the InternetArchiveBot (IABot) project. It exemplifies how open-source initiatives and community-driven efforts can effectively tackle digital preservation challenges, ensuring the longevity and reliability of online knowledge. This aligns with the "Collaboration of the Open" theme by highlighting the successful partnership and shared goals in the realm of open access and information preservation.
- What is the experience level needed for the audience for your session?
Average knowledge about Wikimedia projects or activities
- Etherpad link
https://etherpad.wikimedia.org/p/WM2024_Day3_Ochrid_-_Room_9
Resources
Speakers
- Cyberpower678
- Cyberpower678 is an established member of the Wikimedia movement, best known as the author, developer, and operator of InternetArchiveBot (IABot). With a passion for digital preservation and open-source technology, Cyberpower678 has been actively involved in the Wikimedia community for over a decade. Through the development of IABot, Cyberpower678 has contributed significantly to maintaining the verifiability and accessibility of Wikipedia's content and references by fixing millions of broken links. This work has been pivotal in combating link rot, ensuring that valuable information remains available for future generations. Cyberpower678's commitment to enhancing the Wikipedia experience through innovative solutions reflects a deep dedication to the principles of open collaboration and knowledge sharing.
- Jake Orlowitz
- Founder of The Wikipedia Library. Lead at WikiBlueprint. Global Wikipedia strategy consultant.