Data Scientists: Tools of the Trade


Today our Data Scientist, David Meier, talks about some of the tools he uses on a daily basis and why he chose them.

Trying to change the oil in your car with a paintbrush would be quite a debacle. Most people have heard the adage “the right tool for the job,” and it is as valid for social science research as it is for car maintenance. Often, we find that several similar tools could do the job, but one does it just a little more easily, quickly, or better than the others. Sometimes that one tool has a price tag or learning curve that is just too much to bear. Below I’ll outline what works for me.

Please keep in mind that what follows are only my personal opinions and not the official views of the Institute for Learning Innovation. My experience is genuine, with no paid endorsements for any of the tools mentioned below.

Online Surveys

The Data Scientist can’t do much without some data. I’ve used both Qualtrics and SurveyMonkey in the past, and while I prefer Qualtrics overall, I’m currently using SurveyMonkey. Qualtrics certainly has more bells and whistles; however, I’ve found that the changes to its business and pricing model over the last few years make it no longer a viable choice for smaller organizations. Although neither Qualtrics nor SurveyMonkey offers a perpetual license option, SurveyMonkey’s paid tier plans are far more reasonably priced. While SurveyMonkey lacks the myriad intricate options of modern-day Qualtrics, for the most part it does what I need it to. Its Team Premier paid tier includes functionality such as question and page skip logic, crosstabs, multilingual surveys, and up to 15,000 responses per year. We relied heavily on the skip logic functionality for our California State Library-funded California Cultural Collections Protection Survey Project.

Quantitative Analyses

Okay, I’ve got the data, now what? When it comes to quantitative analyses and inferential statistical techniques, my preference has long been the Statistical Package for the Social Sciences (SPSS). Honestly, I’ve always been a bit biased in favor of SPSS because it is what we used in my graduate program, so I was already familiar with it. However, as with Qualtrics, the pricing and package model for SPSS has changed over the last decade. Gone are the days of buying a perpetual license for a particular version. Now it’s all about monthly “rental” and modular add-ons. Instead of taking on the continual expense of SPSS, I’ve chosen to use Stata. Stata offers perpetual licenses and sits in the sweet spot between the expense of SPSS and the steep learning curve of R. The quantitative analyses for many of our projects, such as Rural Gateways and STEM 360, were performed with Stata.

Qualitative Analyses

So that covers the numbers, but what about the letters? For qualitative analyses I’ve used both Atlas.ti and NVivo 11 Pro. Atlas.ti worked well enough for me, but I found NVivo 11 Pro to be “cleaner” and more user-friendly, with a lot more advanced functionality. Given this and its perpetual licensing, I would definitely recommend NVivo 11 Pro to both the novice and the experienced qualitative researcher. Right now we are using NVivo 11 Pro for the thematic coding of hundreds of documents for our Women in STEM Conference.

File Sharing

Analyses are done, time to share with the team. There are a lot of file-sharing software options out there. Honestly, I find many of them to be similar with regard to price, storage, and usability. However, I have found Basecamp to be the most user-friendly for the broadest range of people. Basecamp is costlier than Dropbox if you have fewer than five users, and it has fewer features, but I’ve found its simplicity of use to be more than worth these tradeoffs in the long run.

Reference Management

Okay, time to share the results with the world. A common dissemination method is the journal article or book chapter. Reference management software such as Zotero, Mendeley, and EndNote can make your writing time a lot easier, helping you to almost magically insert citations and create bibliographies. Honestly, I’ve only now come around to the idea of using one of these pieces of software. I’ve been steadily creating my own libraries of articles and splicing together my Frankenstein Master Reference List (FMRL) over the years. However, a recent demonstration of the ease of citation insertion and “tracking” within a Word document has convinced me to drop the notion that using this kind of software is “cheating,” and I think I’m going to give Zotero a chance. While you can’t properly code media within Zotero, or within any other reference management software that I’m aware of, you can add tags and create projects/libraries within it.

I hope my experience with the above-mentioned tools will help you as you continue your good work!

Be well, 



Posted Dec 8, 2021