^M00:00:13 >> Mark Sweeney: Good afternoon and welcome to today's LC Digital Future and You program. Today we're going to hear from the Director of Digital Strategy, Kate Zwaard, and her Digital Strategy team. They're going to update us on the library's digital strategy and the work that they're doing across the library to implement it. You know, there's no place quite like the Library of Congress. Whether we're talking about the breadth of the collections or the expertise of staff, I've always been inspired by the possibilities of what we can do in terms of building the collections, the collecting, the preserving, and the connecting with people in many different ways. So as you know, the library's vision is that all Americans are connected to the library. And our strategic plan lays out a path to become more user-centered and data-driven. We've charted an ambitious course to make connections across the nation, engaging and inspiring more people with our unique collections, expertise, and services. And we're doing a lot of work engaging users and enriching their experiences. So as part of this work, we adopted a digital strategy last year. And The Digital Strategy's intended to be bold: a holistic vision that enables the library to accomplish its mission in an increasingly digital world. And we think digital transformation has a lot to do with engaging users and fulfilling the library's mission. And we created The Digital Strategy directorate with Kate Zwaard at the helm to help lead this transformation. Of course, we've been engaged in digital innovation here at the library for a very long time. You know, most of us would think of that, you know, beginning back in the late 1960s with the development of the MARC standard for our cataloging. You know, in the '90s and in the early 2000s we were transforming how Americans connected to legislative information through THOMAS, and now through the expanded Congress.gov. 
You know, more recently we've turned to a decades-long partnership with the National Endowment for the Humanities to expand access to historic newspapers through Chronicling America. And that's a partnership that involves, you know, 50 different states. So we've been at this changing and transforming and connecting through digital technology in many different ways. And there are lots of other examples I could give, but I think those, you know, give you a sense of the range of some of the things that we've been doing. So many of these initiatives involve partnerships and collaboration. That's both within the library as well as with institutions outside of the library. And as you're going to hear today, many hands from many different parts of the library are involved in creating our digital strategies. So that's within the library as well as collaborating with people outside of here. The expertise of the library staff was critical in creating this forward-leaning vision. And as we move forward together with our digital transformation, I know that Kate and her team are eager to collaborate with people in this room and people across the library. We're all invested in one way or another in sharing the library's collections and services to engage the nation. So this is important work. Today, Kate and her team will update us on The Digital Strategy and will share some examples of the work that they've been doing to implement it. It's wonderful to see these projects take root and grow. For example, I'm excited about projects like By the People, and the potential for how digital initiatives like the crowdsourcing program can provide new levels of engagement with our users and of understanding of the library by the American people. So I'm inspired to see colleagues from across the library working together on this, sharing their expertise, building something new in the digital space. And the library's certainly going to benefit from it. 
And I hope that you do as well. So what I'll do now is turn it over to our Director of Digital Strategy, Kate Zwaard. Thanks, Kate. ^M00:04:33 ^M00:04:38 >> Katherine Zwaard: Hi, everyone. I'm Kate Zwaard, the Director of Digital Strategy. I'm so glad that so many of you were able to make it today. Thanks, Mark, for that really kind introduction. My team and I are here today to talk to you about how we're implementing the library's digital strategy. Some of you may remember last year, I'm sorry, last May we spoke to this group about Envisioning 2025. And we talked about how the library's strategic plan and the library's digital strategy are, sort of, working together to try to set a vision for the future of the agency. As many of you may know, we spent about a year writing The Digital Strategy with input from around the library. Last October we released that digital strategy. And after that, we solicited an additional round of feedback from staff and from the general public. And we made a minor update last year, our 1.1 version. All in all, throughout this process we incorporated hundreds of edits and ideas and contributions. And I am enormously grateful to you for lending your good minds to this project. I think that The Digital Strategy as it exists really reflects the desires and needs of the staff and leadership throughout the agency. I think it incorporates a very good balance between being driven by the strategic plan of the agency, but also incorporating your feedback and your plans as well. The Digital Strategy is intended to be sort of a coordinated high-level articulation of the library's priorities for using digital technology to achieve the user-centered vision in the library's strategic plan. And now we're working as a coalition, the library is, to achieve this vision using the strategic planning process. So the next step in this evolution is we spent a couple of months talking to the library's planning units. That's a term of art. 
That means all of the units in the library that made a directional plan. So that's all the service units, and the centers, and digital strategy itself. We were helping them to incorporate The Digital Strategy in their future plans. And we wrote one ourselves. So now we're moving forward with our plans for implementing it. So we have our directional plan. And what we want to talk to you about today is our next steps in implementation. Because my team and I, who are here today, and you'll hear a lot more from them, feel really strongly that implementing The Digital Strategy means working hand-in-hand with everyone throughout the agency. So one of the things we're going to try to do today is to invite you to think of the big and small things we can do to amplify the digital library. We'll later invite you to think about some blue sky things that you'd like to do related to digital. And now I'll hand things over to one of the newest members of our team, Leah Weinryb Grohsgal. ^M00:07:28 ^M00:07:34 >> Leah Weinryb Grohsgal: Thanks, Kate. Hi, everybody. I'm Leah. I'm very new to the library, although not to DC. And I'm very excited to be here and speak with you today. As Kate said, The Digital Strategy is a bold vision for the library. It contains three basic components of what we'd like to do. And this is how we're going to be organizing our discussion today. So I thought I'd remind you and talk a little bit about what we mean here. The three components are throwing open the treasure chest, connecting, and investing in the future. So by throwing open the treasure chest, what do we mean? We mean exponentially growing our collections, maximizing the use of content, and supporting emerging styles of research. By connecting, we mean inspiring a lifelong relationship with every visit, bringing the library to our users, welcoming other voices, and driving momentum in our communities. 
And by investing in our future, we mean cultivating an innovation culture, ensuring enduring access to content, and building toward the horizon. And this is the way we've laid out The Digital Strategy. So how are we going about implementing The Digital Strategy? In 2018 we launched LC Labs to provide a virtual place for experimentation, and to provide a friendly interface to our more technical side. This year we've created a directional plan covering the next three years, which is, kind of, as far forward as we could envision in a pretty rapidly changing environment, as well as an implementation plan for this fiscal year. And we'll do one of those every fiscal year. So today we'll share some of the work we're currently doing in each of these areas. But as Kate said, one of the things that we are really interested in is connecting with all of you. Again, we want to hear what you think the library should be doing in this space. We want to reach out all over the library to enable our digital transformation. So after each section of our talk, we have a short exercise planned for you. You probably found the sheets that we left on your seats. And we would love to have your input however you want to share it. So we have a bowl of pens that you can use to write down things for us. If you don't hand it in today and you want to give it any more thought, feel free to scan it and email it or drop it by to one of us. We have prompts for each of the three sections. And we are hoping to, kind of, give you an opportunity to think about it, and maybe discuss a little bit in between each section. We would, in short, love your thoughts about where we can go with The Digital Strategy in the next couple of years. So with that said, I will now hand things over to the other new member of our team, Laura Allen, who will begin with the section on throwing open the treasure chest. >> Laura Allen: Thanks. This is going to be, like, rapid succession members of the team getting up to talk. 
So I'm Laura Allen. I am the newest member of the team, but only by, like, two weeks. So thank you, Leah. I am super happy to be here. As Leah said, The Digital Strategy identifies a few ways that we want to throw open the treasure chest, right? Including maximizing the use of content and supporting emerging styles of research. So I'm going to talk about one project that's designed to do this throwing open the treasure chest. We were awarded a grant by the Andrew W. Mellon Foundation called Computing Cultural Heritage in the Cloud that we really think will help maximize use of content and support emerging styles of research. Before we get into emerging styles of research, let's talk about how content is used in non-emerging methods. That is, how do we support the use of content in very established methods of research? At its most basic, the library supports users as they identify content, access it, analyze it, or reflect on it, and then report on it in some way. And we have a lot of infrastructure and deep expertise in supporting a huge range of activities that rely on this model, or that support this model where people want to study an item or a few items. Recently we've seen a growth in research questions that require millions of items, or that require little bits from millions of items. What you're looking at here is from a project called Viral Texts, which uses the Chronicling America API. These researchers tried to understand how information went viral in the 19th century via newspapers. That is, they studied the reuse of little bits of text across thousands and thousands of texts. And that method of research requires a different kind of access than the kind that we have so much infrastructure to support. In this case, the API allows for the access they wanted. This is, of course, relatively oversimplified. 
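[Illustration added to the transcript, not spoken in the talk. A minimal sketch of what "studying the reuse of little bits of text" can look like in code: it finds five-word sequences that two texts share. The function names and the two sample sentences are made up for illustration, and shared word n-grams are only one simple way to detect reuse; this is not presented as the Viral Texts project's actual method.]

```python
import re


def word_ngrams(text, n=5):
    """Return the set of n-word sequences (shingles) in a text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(doc_a, doc_b, n=5):
    """Return the n-word sequences that appear in both texts, sorted."""
    common = word_ngrams(doc_a, n) & word_ngrams(doc_b, n)
    return sorted(" ".join(gram) for gram in common)


# Two invented newspaper snippets that reuse the same saying.
paper_1 = "The editor notes that a good name is rather to be chosen than great riches."
paper_2 = "Our correspondent writes: a good name is rather to be chosen, as the saying goes."

for passage in shared_passages(paper_1, paper_2):
    print(passage)
```

[Run across millions of pages instead of two sentences, a comparison like this needs bulk access to full text rather than item-by-item viewing, which is the kind of access the speaker is describing.]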
But the method that they're using, text mining, computational linguistics, it can be called various other things, is just one of many. And Chronicling America is just one of many collections that could lend themselves to this kind of work. There are also audio analysis techniques, image analysis, machine learning, many other methods that require different methods of access. So let me mention another example that calls for an analytic method actually relatively similar to Viral Texts. This was a request from a researcher just to do text mining on the websites of candidates from a recent election. So looking at a particular issue and how it was represented in the text of websites over time: as events came up in the news, how did campaign websites change the way that they reported on a particular issue? It's really, methodologically, quite relatively simple. And yet the library was not able to help this researcher with this question, because we simply don't have the analogous tools to make that content available in a way that this researcher can use it. And so for many emerging methods we find that the situation is a little bit more like this. And the reasons are very good. There are many unknowns about how to best provide infrastructure for these emerging styles. Creating tools for subsetting data is really expensive. We want to maximize use and support these emerging styles, but we want to do it in ways that are cost efficient, that make sense, that will actually meet the real needs of researchers. So the Mellon grant. This Mellon grant was designed to help us learn as much as we can about what people want, and about how they want to use our collections. To really maximize use in a way that helps us learn the most. So basically this green zone that you're looking at here, the idea there is that we are going to create a cloud environment. We're basically going to put a lot of data, including digital collections and metadata, into cloud storage. 
And then we're going to invite four researchers or research teams into the cloud infrastructure, into that environment, to do research on the collections on their own computing or whatever they want. And we'll see how it goes. We will learn what infrastructure makes sense to provide. And what they need in terms of research support. What kinds of expertise do they need to support them. They'll just be four of them. So we get to learn a lot from what they need. And then also what kinds of tools. And what are some of the costs and service models that we might develop out of this. The grant will help the library develop models for supporting large computational research and library collections. Practically, it will fund these four research data experiments using the Amazon computing infrastructure. It will allow us to hire three grant-funded staff members to help support this research. As well as providing funds to help us report on what we learn. I think it's really important to call it out that Mellon funded this project. I didn't write the grant, so I can say it's really beautifully written. Not to set up a program that we will then continue running forever, but actually to help the library figure out what we can learn about this tremendously complicated but absolutely emerging landscape of research. So the program is designed to help us learn so that we can create services and programs and technologies and infrastructure in the future with a bunch more experience under our belts. So right now it's the very beginning of this grant, like, the money came in just now. So we're getting ready right now. We're getting ready to recruit staff. We're developing the call for research experts. And finally, and this is the one right now where I'm really hoping that the people in this room can help, like, today, we're identifying and preparing collections for the cloud. So I'm going to turn it over to Jaime who's going to talk about another project that we've been doing. 
And then after she talks we'll give you the prompt for you to write in that top square box. But for me, I hope that what I've said for the last, hopefully, was exactly five minutes, will help you think about what collections you can think of that you'd like people to use in exciting new ways. And what kinds of methods you think the collections that you know most about would lend themselves to. What do you think people could do with your collections if only they had all the computing power and money in the world? So with that, I will turn it over to Jaime. Thank you. ^M00:17:28 ^M00:17:34 >> Jaime Mears: Is Jamie Brezner [phonetic] here? I have your bust. I tried looking for you. That can be edited out. Okay. So another way that we are trying to throw open the treasure chest is by funding a program called the Innovator in Residence Program. And I'm really curious 'cause it's been around for about a year, raise your hand if you've heard of it before. Whoa, that's like, okay, I had bets. Okay. Great. That's awesome. That's good. So if you have not heard of it that is what I was prepared for actually because it's really new. So we started as a proof of concept with two staff innovators. And we've only had one member of the public become an Innovator in Residence so far. And that's him on the bottom. He's a data artist named Jer Thorp. He's amazing. I'm sure that he got to interview some of you for his podcast, or you found him wandering the halls, which he did a lot when he was here. And I'll talk a little bit more about his work in a second. But I want to start by just framing out what the program is. So the idea is that we want people to submit, members of the public to submit two page concept proposals outlining some type of innovative vision for the collection. And the idea is that their project will have a ripple effect of access. So whatever they create will be in the public domain. 
And then hopefully it will be something that then, in turn, becomes a way for other users to engage with our collections and so on and so on and so on. So the idea is that the work doesn't stop with them. They're creating a tool for other people to engage with our collections. So in order to do this we need you. What happens is when we get these concept proposals in we review them. And then based on what the concept proposal is we tap technical and subject specialists to help evaluate those proposals and give feedback. And then the highest scoring concept proposals are invited to submit a full one. Which I think is really important, because everyone here knows that there are some barriers, right? In working in our context. Especially for someone from the outside. So I think it's a really great way to be collaborative, to lean on your expertise, and to really choose people that we think will be able to get the thing done that we want them to do. Okay. So what do we mean when we say innovative? I think this Roadside America picture is a perfect metaphor. 'Cause we say this word a lot, it's a buzzword. But what I think it means in terms of this residency is essentially someone from the outside that can take their perspective and make staff and other members of the public see our collections in a completely new way, right? Completely new. So the idea would be that if it's really successful, we actually have no idea, like, what the next innovator is going to propose, right? So that's, kind of, what I think of. Is it a whale? Is it a garage? I don't know. So an example of this I'm taking from Jer. So while Jer was here he was researching the idea of bringing serendipitous discovery back to large-scale digital collections. He wanted people to engage with our collections without having a specific research question in mind. And he did this through a podcast called Artist in the Archive. 
And he also did this through a series of proof of concept applications that are on our website. So this one that you see is called Library of Color. And what you're seeing here is our MARC records collection, which we made available around the same time that Jer was onboarding. And what you're seeing are visual items. They're MARC records represented as a color spectrum, that if you were interacting with the application you could scroll over, and it would represent, you know, hundreds of items. And sometimes it's pulling descriptions of those items. And sometimes it's pulling the title. And each of the items is mapped to a color. So how did he do this? He did this with a totally open crowdsourced... There's a fan. Yeah. A crowdsourced data set online. It's a color dictionary that was created by the xkcd webcomic author. And he put a survey up online that over 221,000 people answered where he presented them with a color and he asked them to name it. And there were 5 million colors named. So Jer took this crowdsourced data set, and then applied it to color words in our collection. So for this specific one, you see in this William Hogarth engraving, Dealers in Dark Pictures. So I got this because "dark" someone saw as this, kind of, like, it's black to me. I don't know what it is to you, but it's, like, a black color. It's dark. It's #1B2431. And that is how it ends up here. So I'd invite you to check that out if you can. So this application, like a lot of the ones that Jer did, not only upends traditional paradigms of the way that we think about search. But he also was literally injecting what I think of as, like, a cacophony of public voices, and the way that members of the public name things. And, kind of, like, mashing them together with the way that we ourselves have described things, like, very formally in MARC records. And I love the dialogue that happens in this. He did it for a lot of his collections. Okay. 
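[Illustration added to the transcript, not spoken in the talk. A minimal sketch of the idea described above: look for color words in a catalog title and map each one to a hex color from a crowdsourced color dictionary. The dictionary here is a tiny stand-in. Only the "dark" value is the one quoted in the talk; the other entries are placeholders, and the names `COLOR_WORDS` and `colors_in_title` are hypothetical. This is not Jer Thorp's actual code.]

```python
import re

# Stand-in for a crowdsourced color dictionary (word -> hex color).
COLOR_WORDS = {
    "dark": "#1B2431",   # value quoted in the talk
    "red": "#FF0000",    # placeholder, not taken from the survey data
    "green": "#008000",  # placeholder, not taken from the survey data
}


def colors_in_title(title, color_words=COLOR_WORDS):
    """Return (word, hex) pairs for each color word found in a title, in order."""
    words = re.findall(r"[a-z]+", title.lower())
    return [(word, color_words[word]) for word in words if word in color_words]


print(colors_in_title("Dealers in dark pictures"))
```

[Each matching record can then be drawn as a swatch of its color, which is how a title ends up at a particular point in a spectrum.]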
So I am very, very pleased to announce -- The press release has not gone out yet. But we have two innovators. We're playing with a cohort model for fiscal year '20. Our innovators for fiscal year '20 are Brian Foo and Ben Lee. They were actually here last week onboarding. This is Brian talking with some MBRS people. And his project is going to be called Citizen DJ, where he will collaborate with library staff to identify sonically interesting and culturally relevant audio and moving image collections that are free to use for sample-based hip-hop production. So by embedding these materials in hip-hop, he's hoping that listeners will be able to discover items that they never knew existed. It's going to be so cool. And then Ben, I don't have his face, but you'll see him soon. Ben is going to be applying machine learning and computer vision to basically extract images from corpora at the library. It might be one library collection. It might be multiple. I'm not sure. This is an example of a possible workflow with our Chronicling America collection. Where he's extracting an image, the computer is labeling it horse, person, whatever. And then he organizes those images based on likeness of what is inside of the image. So the idea is that we will have some image data sets. And then we'll have a visualization where people can browse by image at the end of his time with us. >> What is corpora? >> Jaime Mears: Oh, sorry. It's actually a word that he was using a lot that I've just started using. Where it's, like, a corpus of things, like, things that come together. That's what I think of as a corpora. Mm mm. I guess corpora is multiple corpuses. Which we're hoping that he'll do more than one collection. So that's why I'm saying corpora. So if you did not meet them last week they are coming to the holiday party. And I'm going to be bringing them around. So hopefully you'll be able to meet with them in person. 
And if not, they'll be presenting the week of December 9, about their projects. Probably in West Dining Room or something. So we'll put that announcement in the Gazette. Okay. So now you're going to talk. So we didn't know -- I don't know, it might be weird because you don't have something to, like, put your thing on, but we have pens. Abby Potter has pens. So we are really interested in the next 10 minutes in having a conversation about what your ideas are. So what would you like, how would you like to see the library throw open the treasure chest? And we really see your responses, kind of, running a spectrum. They can be directly related to your work, your specific area of expertise that you have. You know, is there some idea that you've been thinking about, like, for a while? Or do you have a big library-wide idea that's, kind of, been in your back pocket and now's the time to let it out, you know, in a healthy way. So, yeah, it's not very serious. We hope it's fun. There'll be five minutes for you to think and write. And then five minutes where we're hoping that we can just hear some ideas. And then hopefully we'll be able to collect the worksheets so that if anyone doesn't have a chance to speak. >> You can put your name on them. And then if you want, there's probably a blank one near you. So if you want a version with your name and a version without, you can probably do both. >> Oh. >> Just saying. >> That's so innovative. Okay. Yeah, so I'm not sure what time it is, but we have five minutes. >> Meghan Ferriter: Okay, this is the best. We love hearing you laugh and chat. But we actually cut our 10 minute short, so we can share a little bit more and do some more exercises. So please continue writing if you're writing. I'll try not to distract you with what I'm going to say. I'm Meghan. I'm also part of the team. And our second goal in The Digital Strategy is we will connect. 
So really this means making our incredible programs, services, content all available to and accessible by Congress and the American people. So I'm going to share a little bit more about some of the ways that we are working to connect with everyone through our communications approaches. So it'll be a bit of broad strokes. A few specific things. If you have questions, please come find me, I'd be happy to share more. If you have ideas to collaborate, we love learning from our community of practice here at the library. So using our blend of communications vehicles, we've really tried to build bridges to help create understanding about the library's Digital Strategy, and also the work that we're doing every day. So I'm going to take a few minutes here and share a little bit more about what we did in the last year. Spoiler, we were very busy. And the ways that we spent it: growing our network and trying new communications approaches, showcasing our work and amplifying that of our colleagues, and connecting with others through our work and a range of different approaches. So through a number of touch points, we really tried to stitch together approaches and harness the incredible resources we have to share here at the library. So that includes the content, services, and programs in which you work. And the work that we're doing to facilitate different uses of our collections, or new ways of approaching the work that we're doing here at the library. And we share our work through channels like our website, The Signal blog, Twitter, and our Listserv. But also through our internal networks here at the library, such as The Gazette, forums and Listservs, open houses, town halls. And our brown bags and working with our community of practice who are sharing their approaches as well. And then we reach further to our peers and new audiences through conferences and hosted events. 
So with all of these interconnected communication channels, I often, kind of, call this our ecosystem, we hope to build stronger ties and stability that help us take the next steps in moving toward inspiring lifelong relationships with the library. So from our metrics, we have evidence that we have done an okay job anyway. We've maintained steady engagement with our visits to the website. We have many increased downloads of data sets and the reports that we share on our website. But we also know from qualitative measures that our peers and other organizations are actually trying some of the approaches that we're trying as well. So that includes creating spaces such as our LC for Robots page, and actually doing a little bit of a better job than us in creating computational access to resources. And also other folks are trying and leveraging the Jupyter notebooks that we've created for use of content, such as derivative data sets that we share on our website. So most of our work really in our communications approaches is about illuminating the work of our colleagues who are stewarding and making available our incredible collections, metadata, and authoritative information. What we're able to do, and what we're able to facilitate, is only possible because of your work and the work of our colleagues here at the library. So we aim to use our communication channels as a window through which people can gain a view into what is available to all Americans in the Library of Congress. And if we approach our work in this way, we really are going to make great strides in bringing the library to our users. In The Digital Strategy this approach is really encouraged. Quoting now: "Among our most important treasures at the library are the knowledge and wisdom of our staff. We will empower our staff with tools and pathways to make it easy for them to share stories, standards, expertise, and data with the broadest audience possible." 
And we want to use our opportunities in communicating across our many channels and in our ecosystem to really communicate with this spirit in mind. And so far it's working. So for example, last year we hosted two Twitter chats. Jaime showed a picture of our serendipity run with our Innovator in Residence Jer Thorp. And we also hosted a web archiving Twitter chat to draw attention to newly available web archives derivative data sets, and to highlight our incredible web archiving team and the practice here at the library. For example, during the web archiving Twitter chat, we sent about 28 tweets over the course of 3 hours. That sounds like not that much, but when you're typing it live and making it go, the time flies, and we had great fun during that time. But we reached over 52 and a half thousand accounts, and we made nearly 290,000 impressions on Twitter users. And that's a pretty big impact for a quick amount of work in a three-hour chat. We also live tweeted events that we were attending, including those we hosted, like the Machine Learning and Libraries Summit and the Arts and Humanities Research Council UK-US Collaboration Workshop. And we went behind the scenes here at the library. We highlighted preservation and conservation. And we rode along with our Innovator in Residence and made that available to people as they were, kind of, following as well. And then, as we mentioned, we go to a lot of events and meet people who are practitioners or who are looking into approaches similar to the ones we're using here at the library. We cohosted a data jam at the Association for Computers and the Humanities during which participants created new apps and visualizations using library APIs. And of course, we continue to collaborate on The Signal blog with the Digital Content Management Section, or DCMS. And through The Signal we continue to publish blog posts that illuminate practice, share lessons learned, and encourage more questions. 
So finally, we continue to aim to bring people together digitally and in person to share and learn from one another. As The Digital Strategy states, "A traditional strength of libraries is a willingness to work together. When we collaborate, we can achieve together what we cannot accomplish alone." And that is a really strong theme through our communications approaches as well. So as you've heard, it seems like the word connect is definitely a buzzword in our presentation today. But we really do want to hear from you. So please get in touch with us. This year we want to improve the ways that we share information, opportunities, outcomes, and our progress towards The Digital Strategy. And we'd like to hear from you about the ways that you find information. So not just from us: what are the methods or the resources from which you gather information? And how do you find out more about what's happening here at the library as well as in your professional sphere? So I'm going to hand over to Abby Potter, my colleague here. And we'll be back to chat with you in a bit. ^M00:32:50 ^M00:32:56 >> Abigail Potter: Thank you, Meghan. And as Meghan talked about, connecting with communities is a big part of The Digital Strategy. And part of that work is translating the needs and opportunities in The Digital Strategy, and from the experiments and pilots that we undertake, to different communities. And connecting to communities can mean a lot of different things. A big part of the experiments we do is learning more about the nontraditional users of the Library of Congress, or sort of new users. We want to explore new ways to reach and engage scholars using digital methods, and artists, like Laura and Jaime described. We're also interested in learning more about how to engage with the general public and underrepresented communities. And how all these different communities can be present in the library today and in the future. So we also want to connect with our existing partner communities. 
So end-users and in this we're including the staff of the Library of Congress. And we want to deeply understand the needs of our, sort of, existing partners and what drives momentum in these communities that we're already partnered with. Both of the categories of user communities, the new and existing, have a lot to offer us. And part of The Digital Strategy and the overall strategy of the library is to create an environment where all communities are inspired to engage with us. And today I'm going to talk specifically about engaging with expert communities. All the people in this room together here are part of an expert community. It's a community of staff. You all know very, very well how this institution works and how it can be improved. And you're all very passionate about what you contribute to the library. And we as the Labs team are trying to explore different methods to harness and capture this expertise and passion to help achieve The Digital Strategy and the vision to connect all Americans to the public. All the public to the library. So this is a very oversimplified drawing of, sort of, what a labs process is. And I circled staff ideas 'cause that's what I want to focus on here. But we have things coming in, ideas, services, questions. We go into the lab. There's collaboration, experimentation. We try new methods, processes. We talk to a lot of people. And we develop new pathways of doing things. And then outcomes, I think it's important to see that not everything that comes out of the lab is a tool or a digital thing. It's services, methods, skills. Those are, sort of, the things that often make the most difference. This diagram came from a book that was recently published called Open a GLAM Lab. So if you have been following the Signal or Twitter, you might have noticed that we took part, I took part in something called a Book Sprint. A Book Sprint is a rapid publishing model. 
It's a method that's been around for about 10 years that came out of the open source software movement. And it's where a group of people, no more than 15, get together and share their knowledge and perspective. And they write a book in five days. So this is something that I did the last week of September. And the book was published, I don't know, last week. So it was a very rapid process. But it was also really interesting. It was intense and it was fast. There's no time for extensive research or wordsmithing. So it really brings the tacit knowledge out. So the kind of thing that you would share in a hallway conversation. And try to get that knowledge out of people's brains and onto paper. It was a great process for producing something from a collective voice and perspective. It was a facilitated process so no one voice dominated. And there was an extensive, sort of, reviewing and rewriting process. And the book itself, it came out great. It really articulated why new ways of approaching the work of galleries, libraries, archives, and museums, that's what the GLAM stands for, why having these, sort of, experimental spaces is important when we're looking at what to do in digital. And we see a lab as a place where you can provide space to be, you can be open to experimentation and risk-taking and iteration. And this theme is also, sort of, what the Book Sprint is all about. This is the map of, sort of, representative of people who participated in the Book Sprint. It gathered expertise from around the world. And the expertise was really important to writing the book. But what the process also required was for the participants to be vulnerable and open to critique and able to accept other people's perspectives. So this process, it was difficult, but it did create, you know, a book-length publication in five days. And it also really bonded the group together. And positioned it to, sort of, do even more. 
That small group was representative of a larger group that is calling itself a GLAM Labs Community. It's about 60 institutions from about 30 countries. And some of them are on the map, not all of them. And you'll see there's one little dot in North America. And that is us. That's the Library of Congress. And hoping to build on this momentum, we're going to host, as part of the Collections as Data series in May, we're going to host this group at the library to try to think about what's the next big thing that this group can do together. So you can find the book at glamlabs.io, GLAM Labs with an S. Or you can find it from our website under our Report section. Okay, so now we're here at the exercise. So you saw that some impactful work can happen that aligns with The Digital Strategy without one line of code being written. I think we want to underscore the fact that a lot of the, sort of, experimentation, innovation that we do is not, sort of, coding. It's a lot of process work. But we're interested in trying out a lot, a method that's similar to the Book Sprint where we can focus intensely on a specific topic over a restricted amount of time to draw out the expertise from the people in the library and cross-pollinate ideas. So we're borrowing language from what Lori and her colleagues came up with when she was at Penn around something called a pop-up lab. So there's some proposals, kind of, floating around the library right now about doing a pop-up lab about a specific, sort of, idea. But we want to ask you, how would you use this method. And what, sort of, topics would you like to explore. So the exercise that we have in the second box is, we're asking you, if you were given a whole week to, sort of, work on one gnarly problem, what would that problem be? And it might not be a problem. It could be an idea. What, sort of, you know, what do you think just needs an extra push or some extra examination to, sort of, get to the next level to, sort of, propose. 
Maybe to propose an experiment with labs. Or to propose something else over a different channel. So that's a question. And you have that space to fill in. We're interested in, sort of, your ideas, and also to, sort of, figure out if this, sort of, pop-up lab idea would work, sort of, on a borrow [phonetic] scale so -- Yes. Oh, yeah. Yeah, but five minutes 'cause we're running a little bit over. So five minutes to write down your idea. Start now. Okay. >> Eileen Jakeway: Hello. Alright. Hey, everyone. Thank you so much for your willing participation. I am Eileen Jakeway. I'm an innovation specialist with Labs. And I'm kicking off the third part of our presentation plan, or sorry, Digital Strategy, which is to invest in our future. Which is, kind of, an overwhelming prospect if you ask me. So what I'm going to talk about for the next three to five minutes that I have of your time is machine learning. Which is a new technology, well, a technology that we have, sort of, been investigating for the past several months. And you'll hear more about the specifics of a project later on from Meghan and Abby. So I want to just start, just, kind of, for my sake, and hopefully for someone else's in the room as well, asking the question what is machine learning. Because in a way the name of the technology is a bit of a misnomer. Because really it's not machines doing all the learning. It's very much a work of human labor that entails as much human intervention as, like, coding a website, right? So in its most boiled down form, machine learning is essentially training computers and computer algorithms to recognize patterns across large data sets, right? And then the second part of this slide is using those patterns to make sense of the new, maybe previously unseen data set, right? 
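The two steps just described, learning patterns from a data set and then applying them to data the system has not seen, can be sketched very roughly in Python. This is only a toy illustration and not the method used in any project mentioned in this program: the two-number feature vectors, the labels, and the nearest-centroid rule are all assumptions made up for the example.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Learn one average feature vector (centroid) per human-assigned label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Give a new item the label whose centroid lies closest to it."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical hand-labeled training data; the two numbers per item are
# invented measurements, not real image features.
training_data = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.3, 0.9), "not cat"),
    ((0.2, 0.8), "not cat"),
]

model = train(training_data)
print(predict(model, (0.85, 0.25)))  # an item the model has not seen before
```

The human labeling step is the list of (features, label) pairs; the quality of those labels largely determines how well the learned pattern carries over to new material.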
So sometimes that includes supervised learning in which humans are actually intervening by labeling or segmenting data. And so I wrote down here, because I wanted to really say it to this group, that as professionals whose work entails identification, right, and categorically structuring data on a daily basis, you're a crucial part of this loop. So I wanted to give a brief example. And I'm sorry if it's a little juvenile. But essentially if I have a data set consisting of dogs, cats, and I think I wrote elephants down here, I go through in my training data and I label all of the cats as cats, right? So presumably now, the algorithm will have been sufficiently trained to pick out all of the cats in a data set that it hasn't seen before by looking for visual features that it now believes a cat to possess, right? So this is actually pretty fun. These are images from the library's collections of free to use image sets. The image on the left would be a cat. And the image on the right would be not a cat or an ostrich, according to the caption of the picture. And I'm hoping one of you can maybe explain that to me at a later date. Because I actually think this picture would be really hard for a machine learning algorithm. Because the caption underneath reads that it also has very particularly un-goose-like features, feathers, sorry. And so I don't know what that means, what an un-goose-like feather looks like. But I think you have it on the screen here. So why, might you ask, is machine learning relevant to libraries and to this library in particular? Well, I think, you know, part of that is throwing that question back to you in terms of people who have a very, very intimate knowledge of taxonomies used in library settings and labels and structures of data that are helpful for exploration and discovery. But I do know a little bit about how other libraries, and this library in some ways, is using machine learning for a series of tasks. So some of them are up here on the screen. 
I don't want to bore you by reading all of them. But thinking about identifying and extracting types of content, right? So visual content versus textual content. Being able to generate labels and tags that actually increase and enhance discoverability by being entered in as metadata or used in discovery tools. Doing quality assessment of some of the digitized collections that we have. And maybe even being able to identify subjects in photographs. So on September 20, this is the same blazer, okay, so you were wondering, we hosted a conference that essentially convened around 75 professionals who are combining cultural heritage and machine learning in some way. So the purpose of the summit was in part to actually run a, sort of, survey of what are people doing and how could we potentially be using this technology. So I wanted to just showcase three quick examples that came and presented at this conference. Because I think their work demonstrates some of the uses that might be the most applicable for you in your work. And hopefully will, sort of, prompt further ideas from the group. So the first is called the Civil War Photo Sleuth Project. I don't know if some of you have heard of this before. It's based out of Virginia Tech. So not too far. And the core purpose of this project is to identify the soldiers who were in I think over 40,000, 4 million, excuse me, portraits of soldiers from the Civil War. So it's a combination of crowdsourcing and machine learning. So essentially the process, I'm sorry, I didn't realize this is cutting into my time. So essentially the process is they have created an archive using materials from the library's collections and other major cultural heritage institutions. But also from private collections. So allowing people to actually upload and share pictures from their family. Or from their local library. Or from other places where we might not already have absorbed these materials. 
Mapping, sort of, doing facial recognition software to map the reference points on someone's face against the reference points in a database that we already have. So whether that's people who've already been identified in other archives. Or whether that's through crowdsourcing by people going in and actually manually tagging people and their identities. So that's, kind of, one example. A second example was actually a project that took place here in DC at the United States Holocaust Memorial Museum. So the lead on this project is Ben Lee, who is also now, as Jaime mentioned, one of our current Innovators in Residence. And his problem was one, I don't know if you're familiar with it, but essentially was one of searchability. Because he was interested in looking at the reference cards held in the central name index of the International Tracing Service. So these are reference cards that actually point back to death certificates that were, sort of, issued from people in the concentration camps in the Holocaust. So the problem is, these reference cards are interspersed with the other 39 million items in the index. And so Ben used essentially machine learning to train an image set on what is a death certificate reference card and what is not, right? So his training set, I wrote this down because I did not want to lead you astray, was 22,117 cards that he hand labeled. And he got his algorithm to be so effective that he was able to run it over all 39 million scans in the index and retrieve 312,183 death certificate reference cards that were previously only indexed by name. So there is an article that I'd be happy to share if you want to know more about the specifics of that. Or how that, kind of, worked out. Actually, you can ask Ben when he's here. And then lastly, this is a tool that is currently in the making at the Smithsonian Data Science Lab. So also pretty local. And they're using machine learning to identify duplicative images on their hardware and network drives. 
So using a hash algorithm to, and I'll confess this is, like, the one that I probably know the least about technically, but to essentially, like, run a series of the hashes against the databases that they have to see where they retrieve identical hashes. So it's essentially, like, compressing an image to see where there are duplicates. And whether or not that was intentional. And which one is higher quality. So that's just to get you, kind of, thinking about machine learning. I hope it was instrumental and you learned something. And, yeah, I will hand it over now to Meghan and Abby to talk more about a particular project that we did. Thank you. ^M00:50:29 ^M00:50:36 >> Meghan Ferriter: We're back. So we're going to share a little bit about a project that has been ongoing since July. And will wrap up actually in January. But tomorrow we will hear some in-progress results from this team at the University of Nebraska-Lincoln. So we were interested, as Eileen mentioned, in what began as a summer of machine learning and has turned into, blossomed into a season of machine learning. We wanted to really know about the ways that machine learning processes work on library collections. And what information could be created. As well as directions or indicators for machine learning applied to our collections broadly. So to move forward on this idea, we created a statement of work. And we released a request for proposals. And specifically, which seems to maybe be a little different than these processes, we really wanted to hear about how it works inside the black box. So rather than just asking for a full-scale solution, we wanted a research collaborator who would partner with us, help us to co-design, and make decisions together. And also understand from a professional perspective, a computer science professional perspective, why and how decisions are made with using certain types of material. So we were very fortunate that we put the contract out for bid. 
And we received a proposal from the University of Nebraska-Lincoln and the Project Aida team, who've worked closely with Chronicling America content and identifying poetry content within newspapers. And we worked with them and they, within their proposal, articulated a plan of research that would address our goals. And also result in a prototype, a report, collaborative research design, and research support for two graduate students. And then we had -- And that was Doctors Liz Lorang and Leen-Kiat Soh. And we had doctoral candidates Mike Pack and Yi Liu come to work with us on-site at the library for six weeks before they returned back to Lincoln. And they said they really enjoyed it because they had the freedom to focus on just the work and, kind of, get lost in collections. So the project in total is 21 weeks. But throughout the process, since the start of July, we've been having weekly calls and now biweekly calls to, kind of, check in on progress and to answer questions, to connect Mike and Yi with staff here at the library from across library services and OCIO. And they really began working pretty quickly with what was available from the library's API, APIs, excuse me, and they specifically were looking at digitized newspaper content, handwritten materials, and rare book content. When we started this process we also had a kickoff meeting. And we really discussed some of the possibilities and concerns of applying machine learning to our collections here. And so we were chatting about things such as improving discoverability of resources, wanting to know how machine learning can help us learn more about our collections. So leveraging the work again. I mentioned earlier leveraging the work of staff in this space. And some good discussions around the apprehension of how to deal with bias in the data and the ethics of using these technologies in our spaces. So as we moved forward, Mike and Yi designed five small projects in just six weeks. 
It took about a week and a half for each project. And their goal was, with their time on campus here at the library, to get enough progress on each of their projects that they could go back and iterate on those in Lincoln. Which they did successfully. And I also gave them the bid with less than 24 hours' notice to give us a presentation the day before they left. And they nailed it. It was really incredible. So some of the things specifically that they are focusing on. This slide is shared via Leen-Kiat. So this presentation at our Machine Learning and Libraries event. And in essence the main approaches that Mike and Yi are exploring in their own research, and so they brought them to bear on this project, are segmentation, figure extraction, and image quality assessment. And in essence what they were able to determine pretty quickly is that they, we had wondered if they would be able to use some of our crowdsourcing content as a training set. So from our Beyond Words application. And it turns out that that was not clean enough data for them to use. So great finding for us. And makes a lot of sense since it was not designed for that purpose. It was designed to help augment and caption images. They were able to use the European Newspaper Project as training data. And one of the really exciting components is that they were able to transfer the model that they created around newspaper content to rare book content for figure extraction. And the demonstration of that showed that just with a little bit of extra training, they were able to really refine that. They also explored several models to determine which would be the best to use for these particular approaches. So as you can see here in this slide, it's pretty small actually, so I'll read to you. They were working on document segmentation, figure and graph extraction, text extraction from those spaces, document type classification. 
And in the document type classification, this is specifically working with material that is currently presented in the By the People Project. And they were able to, kind of, determine some ideas about complexity of those materials based on basically density of text within those images. And then they also were able to do some quality assessment of the images that are presented in Chronicling America. And create a few preliminary recommendations around the feasibility of using those collections within some of these models. It was a pretty exciting outcome for us. And one last thing to wrap up, that we will be sharing the results of their project openly on our website when we receive them in January. And if you have other questions we would be happy to answer them. >> Abigail Potter: I want to say one thing. >> Meghan Ferriter: Yeah. >> Abigail Potter: And then I'm just going to wrap up that by, sort of, mentioning the summer of machine learning theme that we were talking about. That is a theme of doing an experiment, having a meeting, doing an internal experiment. That's a sort of way that we're thinking about tackling these, sort of, more forward horizon issues, like, machine learning. We're thinking about different kinds of, applying these methods to different content, like, AV or maps and what, you know, what would the experiments around that be. You know, what would different events or different, sort of, conversations that we could have around those different formats. So I just wanted to call that out that there's this, sort of, seasonal learning is to move us along on reaching our, sort of, horizon goals. So that we can, sort of, work with you all about how to operationalize these ideas and the things that we learn. >> Meghan Ferriter: So we are now into our third exercise. You can turn your sheet over and find a new blank field. So we've shared a lot about some of the things that we're currently working on. 
And we have wonderful hallway conversations with you all about ideas that you're interested in. But we really would like to hear from you, you know, what is the most exciting thing you can imagine the library doing. And then associated with that, what would it really take to make that possible. So this is an opportunity to engage in some blue sky thinking with maybe a little bit of practicality, kind of, tucked in there as well. >> Abigail Potter: And you don't have 20 minutes, you have five. >> Meghan Ferriter: Yes. >> Abigail Potter: So -- >> Meghan Ferriter: Sorry. Moving quickly. >> Abigail Potter: And then we're going to come back. And we'll do some Q&A or a little wrap-up and Q&A. >> Meghan Ferriter: And a little bit of a preview of the next LC's Digital Future and You as well. >> Abigail Potter: Okay. >> Lauren Algee: Hello. So hi, I'm Lauren Algee. I'm the final member of the Labs team. And I'm one of the community managers for By the People, the crowdsourcing program that just celebrated its first anniversary here at the library. So to circle back to Mark's introduction, you didn't hear anything about By the People yet today. And that's because we're having an entire other Digital Future and You about crowdsourcing next month. So you have to come back. But I will quickly throw some numbers at you since we celebrated our birthday. And we're really excited about what we've accomplished in our first year. So in one year of By the People, which looks like this, we've launched 11 different crowdsourcing campaigns. We've had over 11,000 volunteers register with the site. And many, many more have contributed anonymously. You don't have to register. Thirty-four thousand pages have been completed in the last year. We have another 56,000 awaiting review, peer review, also by our volunteers. And 8,000 are back in loc.gov. So that's just a teaser. And I have two final assignments for you, though, before you can leave and before I hand it back to Kate to close out. 
And then we can answer all of your questions. First, you have to go try By the People. How many people in this room have tried By the People? Yay. Those of you who haven't, I've noted what you all look like. But especially, you know, I hope that you'll come back for next month. But if you do we really, you know, it's going to be much more fruitful for you if you've given it a try. And we can have a really great conversation about its future and things you might think would make it better. And then also there's one final little panel on your card that will help us shape next month's program. Which is write down one question that you have about By the People as a program, as a platform, about crowdsourcing at the library. And we will do our best to answer all of those questions when we talk to you next month. So that's all I have to say for now. Stay tuned. And Kate, do you want to close up? >> Katherine Zwaard: Okay. Thanks Lauren. I'd like to thank Judith and Angela, too, for planning this and for doing all of the actual work to make this possible. And thank you all for coming today. Thanks, Mark, for the introductions. I do want to note that many people told me they had to leave at 3. What time is it? At 2 today. So if you notice empty seats it's not because people stormed out. It's just because they had other meetings. So don't panic. But I think we've got, like, 10 minutes. So if there are comments or questions or things you want to talk about, now would be a great time to do that. Oh, and also there are people in the back, hand them your worksheets. Do that. Yeah. Yeah. Any questions or comments, or concerns, requests? Song titles that you -- Ideas? Things that we should be thinking about? Ways you would like us to get to know you and your interests and needs and -- Let me think. What else? What else? Yes. >> Hi, first of all, thank you so much for putting on this wonderful presentation. A lot of this is news to me, even though I do work in a digital division. 
So it's really wonderful to see you out and engaging with us. I was just wondering in general, this is one of my questions but my pen ran out of ink, do you do any open demos to the LC staff of some of this technology? >> Katherine Zwaard: Great. Thank you for the kind words. And the question was do we do any open demos to LC staff for this technology. So I think Meghan shared a little bit about our communications work in the past. And we've been really heads down in doing things. And now I think now that we have a little bit of additional capacity, it's time for us to pick our heads up and think more strategically and coherently, cohesively. Coherently? Both. Cohesively and coherently about how we can engage with you all and, sort of, you mentioned that even though you work in a digital department, a lot of this was news to you. That's not what we'd like, right? We would like for you to be, sort of, at the level of awareness that you'd like to be. So I think one of things we're considering is open houses and things like that. But I think actually what would be best is if you could give some examples of things that you'd like to see. I mean, would you like to see that, sort of, thing? Like, a regular check in or -- >> Yeah. I'm looking to get more involved on Listservs. And, granted, I've only been a library employee since the end of July, so that might also be why. But I think I'd be really interested in the open house idea. I think that would be terrific. More of a come as you can, especially with, I'm sure a lot of us have lots of responsibility that we have to attend to as well. But I like that you have events like these networking us with what's going on maybe in different departments of the library. I noticed that there's a communications initiative in place where you engage on behalf of all the departments that are working with digital technology, what everybody's accomplishing. 
So I'm definitely going to look at ways that we can engage our projects over in my division with you. >> Katherine Zwaard: Great. Thank you. Oh, I should mention, that one pager, please share that widely. It contains all the information about the directional plan and the strategic plan and the FY20 plan. So please do take a look at it. And share with anybody you think might be interested. Yes, ma'am. >> Terry: First question, is the one pager on Confluence? >> Katherine Zwaard: It sure is. >> Terry: Awesome. >> Katherine Zwaard: Yeah. >> Terry: Okay. Have you -- >> Katherine Zwaard: A fantastic piece of technology, courtesy of our friends in OCIO. >> Terry: Have you done anything with kids since they think of technology in a whole different way than we do? Like, a workshop in the Young Readers Center to, kind of, say, you know, what's your blue sky idea as opposed to all us old folks. >> Katherine Zwaard: I think that's -- So Terry asks have we engaged kids? Because they think of technology differently than we do. And I think that's such a super important question. A couple of years ago I read a WIRED explainer about Snapchat. Which, it used to be a magazine for people who knew things. And that explained -- A toy for children, right? And I've been, like, checking out TikTok. I don't know, how many of you have seen, have been on TikTok? It's so weird. And it definitely doesn't fit with my, like, old person brain. And so it will continue to happen that there will be platforms and places that people that we want to engage with crop up that we are not even aware of. And so we've done some things with the Young Readers Center. We've been engaged in the past with the teen board. You know, I continue to play on ridiculous apps at night while my husband tries to tell me about his day. So I think that's a really good point. ^M01:05:09 ^M01:05:17 >> Was there a mouse? Okay. Is that it? Laura looks like she wants to say something. You do. Yeah. Yeah. I can tell. 
>> Laura: It was just striking me. I found it all so interesting, all the things you were talking about. And trying to think of ways that the thinking can be brought back in, so the processes that people go through in planning these projects. If there was some way of making brainstorming sessions open to people to, kind of, so we could be privy to how the, things that people are thinking about. It also struck me recently there are a lot, the questions that come in about digital scholarship. And all those Twitter questions. Which are, you know, not dirty laundry. But there's a whole database of them. And I was thinking, god, can we just use them to teach ourselves, you know, what people are looking for. Also, if there could be some information shared in the kinds of questions that come in like that. What do people want to do? Because I feel like I don't know. I'd love to know more. >> Katherine Zwaard: Yeah. >> Laura: And understand the thinking. And the language that you all use that I don't, you know? Doesn't come easily to my tongue. >> Katherine Zwaard: So I've been instructed by Judith to repeat the question. But that one was, sort of, complicated, so I'm not going to do it. But I think I'm hearing what you're saying. And I think there's a couple of things that we could do here. And one is that, you know, a while ago, in 2017, the library convened a Digital Scholarship Working Group Report, which you know about. A Digital Scholarship Working Group, which produced a report. And that report has been circulated. I think we're looking to figure out what of that we can publish to the wider world and, sort of -- Because it creates a necessary framework upon which some of this work that we're doing rests. And part of that is we did some research, the group did some research into what QuestionPoint questions were there about digital scholarship. So one of the things we wanted to analyze was what demand is there that we're not meeting. 
And I think that the ways in which we're meeting user demand are very visible, because we're doing it. And we get, you know, we get user numbers. And we get research outcomes. But the things that we're not able to meet are invisible because it's a, sort of, a no and it doesn't really go any further. So as part of that working group report, we collected a bunch of those requirements and use cases and documented them. And I think that if you are personally interested in that, we would love to chat more. But I think that that is a valuable tool. One of the things I've heard from a bunch of people is the interest in the model of British Library Labs. Which does a, like, quarterly introduction to digital scholarship. And digital library at the British Library. And I think that, you know, sort of, resourcing is a question, how we would actually pull that off. And who would do it. But I think, I know when I was a new person here that would've been super useful to me. So I think that's worth chatting about as well. Yes, Elaine. ^M01:08:32 ^M01:08:39 >> Elaine: So my question is as you're thinking, or you've been in The Digital Strategy, kind of, doing this for the last year, and what is your vision around operationalizing? So there's a bunch of stuff that is happening, that's gonna be happening with machine learning. From those findings, or even from tools that are created from those, or even an outcome of a tool, are there any, like, a vision for how you will share those tools to the staff that could use those tools within their own work? >> Katherine Zwaard: So Elaine asks about how do we operationalize the things that we're working on. And I have a nuanced not very satisfying answer for you. I think that Abby showed that chart of inputs and outputs. 
And I think that it's important to note that a lot of the outputs of these experiments are our own greater degree of understanding of a solution set or a problem space and/or a better fidelity in questions that we're asking and not actually tools. But sometimes tools will be the outcome, right? Sometimes we will try a thing and, you know, we're working with a division on something, and we try a tool. And it's just perfect, and we just want to implement that as is. I think that's actually going to be a very small percentage of the projects we work on. I think more likely it will be things like the Mellon project that Lori talked about. Where we'll do two years of experimentation. And then we'll have a better sense. Management will have a better sense. Staff will have a better sense of how or if this fits into our broader work planning. When we come out with a new tool that we want to put into production, then we go through the IT investment process. So that gets prioritized and resourced in the same way that any new technology investment would be done. Does that answer your question? Thank you. It was a great question. I mean, all the questions were great. You know, I don't want to say yours was the best because it was just as good as all the other ones. Anybody else? Okay. Please get in touch. You have our email. We would love to chat. Thank you.