Assistant Professor, Drexel University
Matthew Schneider develops data protection methodologies for real-world data sources (e.g., text, customer-level data, or third-party datasets) to protect consumer privacy. Importantly, his methodologies also preserve the value of the data across a variety of business use cases. He is an Assistant Professor of Business Analytics in the Decision Sciences Department at the LeBow College of Business.
Schneider’s most recent research shows firms how to maintain the value of customer data in the presence of data privacy. His work has been published in the Journal of the Royal Statistical Society, Marketing Science, the International Journal of Research in Marketing, the Journal of Consumer Psychology, the International Journal of Forecasting, and the International Conference on Data Mining. Dr. Schneider teaches Statistics for Business Analytics and Multivariate Analysis to professional master’s students at the LeBow College of Business.
Prior to Drexel, Dr. Schneider was an Assistant Professor of Marketing at Northwestern University and a Visiting Scholar at the Samuel C. Johnson Graduate School of Management at Cornell University. He was also the Director of Research at Fort Rock Asset Management LLC, a fund of hedge funds based in Portland, Oregon. He holds a PhD and MS in Statistics from Cornell University, an MS in Public Policy and Management from Carnegie Mellon University, and a BS in Quantitative Economics from the United States Naval Academy. Before finishing his PhD, he was employed at the RAND Corporation from 2008 to 2013 and served in the U.S. Navy as an Officer of the Deck and Surface Warfare Officer on the USS Boxer from 2003 to 2005.
WATCH LIVE: November 2nd at 10:30 am
User-generated content (e.g., online reviews) is an important source of information on products and services for consumers and firms. Although incentivizing high-quality reviews is an important business objective for any review platform, we show that it is also possible to identify anonymous reviewers by exploiting the characteristics of posted reviews. Using data from major review platforms and our two-stage de-anonymization methodology, we demonstrate that the ability to identify an author is determined primarily by the amount and granularity of structured data (e.g., location, first name) posted with the review, and secondarily by the author’s writing style across reviews. When the number of potential authors with identical structured data ranges from 100 down to 5 and sufficient training data exists for text analysis, the average probability of identification ranges from 40% to 81%. Our findings suggest that review platforms concerned about the potential negative effects of privacy-related incidents should limit or aggregate their reviewers’ structured data when it is adjoined with textual content or mentioned in the text itself. We also show the probabilities of identification on Twitter, Yelp, emails, and student essays, and explain the “black box” reasons behind authorship using Shapley values on machine learning predictions.