Thousands of researchers use social media data to analyze human behavior at scale. The underlying assumption is that millions of people leave digital traces and that, by collecting these traces, we can reconstruct the activities, topics, and opinions of groups or societies. Some data biases are obvious. For instance, the user base of most social media platforms does not reflect the sociodemographic composition of society. Social bots can also obscure actual human activity on these platforms. Consequently, it is not trivial to draw conclusions about societal questions from social media analyses. In this presentation, I will focus on a more specific question: do we even get good social media samples? In other words, do the social media data that are available to researchers represent overall platform activity? I will show how nontransparent sampling algorithms create non-representative data samples and how technical artifacts of hidden algorithms can create surprising side effects with potentially devastating implications for data sample quality.