Summing Up

After five months of hands-on tutorials, guest speakers, and group projects, we have reached the end of Data Scientist Training for Librarians knowing much more than we did before about data, where we can get it, and what we can do with it.  This blog has been my vehicle for communicating what we were learning each week in all its nitty-gritty detail.  For this post, I want to leave you with a broader picture of what I think we learned in this class.

These skills are important.

Knowing how to find and access data, clean it up, manipulate it, analyze it, and present it puts you in a strong position to make informed management decisions, articulate the value of your library, and expand the kinds of services you offer to patrons.

There are GUIs (graphical user interfaces) to get you started.

Tableau and OpenRefine are two powerful desktop applications that allow you to do some pretty amazing things with data without any programming knowledge at all.  These tools and others like them have a relatively shallow learning curve, which means you can start using them right away—and teach them to your colleagues.

But you will probably want to learn to program.

If you end up doing a lot of data wrangling and analysis, chances are you will eventually find that the ability to code would make you more efficient and give you more control over your work.  Sometimes writing a script is the only way to do what you want to do.  Also, a script can be reused every time it is needed, and chunks of code can be repurposed for different projects and datasets, which means you’ll invest most of your time in the beginning and reap the rewards repeatedly over the long term.

Learning to program takes time.

Learning to program is more difficult than learning a markup language or a sophisticated software application.  It takes time and a sustained commitment, much like learning a natural language does.  And it can only be learned hands-on, through problem-solving and trial and error.

You don’t have to become an expert at all of it.

We heard time and again from our guest speakers that work in this area is done in teams, from James Turk’s team of developers and data wranglers at the Sunlight Foundation, to the journalists and programmers on Matt Carroll’s data visualization team at the Boston Globe, to the different kinds of talent that David Dietrich described as necessary for big data analytics.  In the real world, projects are often most successful when people with different skills and knowledge come together to offer their strengths.

But the more you know, the more effective you will be.

Even if you will not be the one designing the tools and processes or conducting the sophisticated analysis, having a foundational knowledge of the concepts and vocabularies covered in this class will improve your ability to communicate effectively with those who are doing that work and to think creatively about what your institution could begin to offer.  And as Melanie Radik discussed in her data story, it can also give you a boost in your conversations with patrons about their data needs.

With a committed group of people, you can do some impressive things.

Before ending my last blog post for DST4L, I’d like to acknowledge all the people who made this class possible.  First of all, a great big thank you to Chris Erdmann, Head Librarian at the Harvard-Smithsonian Center for Astrophysics John G. Wolbach Library, whose vision this was. The class happened because Chris thought it should happen and because he had the energy and conviction to see it through.  As I said at our recap event, it was a learning experience in itself to witness how Chris was able to take this class from idea to reality in a matter of months.

On behalf of everyone in the class, I’d also like to thank our excellent guest presenters—Rahul Dave, James Turk, Seth Woodworth, Tom Morris, David Dietrich, Ray Randall, Alex Storer, Erin Braswell, Matt Carroll, and Jay Luker—all of whom took time out of their busy lives to meet with us after hours and share their expertise.

Finally, a shout-out to my fellow participants, who came week after week, giving up their Thursday nights and even many of their Saturdays to attend this class.  It was a varied group of people at different points in their careers and with different levels of experience with technology.  But there was also a common commitment to the class, a spirit of experimentation, and a willingness to be bewildered at times and press on despite setbacks and frustrations.  As a student of library science and not yet a practitioner, I was especially privileged by this arrangement, which allowed me to work side by side with professionals in the field and learn from their experience.  I’m glad we stuck with it.  It was a great challenge, a unique opportunity, and a fun ride.  There’s plenty more to learn.  Let’s keep it going!
