2017 GitHub Open Source Survey

Description: The enclosed documents include two data sets comprising the results of the 2017 GitHub Open Source Survey (survey_data.csv, negative_incidents.csv), the full text of the English questionnaire (questionnaire.txt), and notes for working with the data (notes.txt). Free text responses have been removed to protect respondent privacy.

Because of the high level of visibility of negative incidents in the open source community, the combination of detailed incident information, demographic data, and contribution practices may make respondents identifiable. To maintain respondent anonymity, harassment and related questions have been unlinked from the rest of the data and the order has been randomized. Researchers who need to link responses on the harassment questions to other variables in the dataset may contact us with a detailed explanation of their research needs, their plans for securing the data, and, if applicable, IRB approval.

Methodology: The data here covers two distinct samples, sourced differently.

GitHub.com sample: Between March 21 and 31, 2017, a small percentage of eligible visitors to licensed open source repositories on GitHub.com were invited to take the survey through a dialog box that linked to an off-site survey site. Eligibility was determined based on activity indicating sincere interest in open source projects (visits to 3 distinct projects or 3 clicks in a single project in 30 minutes). Invitations persisted across 3 subsequent page views or until dismissal. The introductory text on the survey landing page informed respondents that anonymous results would be publicly released as an open data set and that all questions were optional, and it provided instructions for accessing translated versions of the survey (available in Traditional Chinese, Japanese, Spanish, and Russian).
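
The eligibility rule for the GitHub.com sample can be illustrated with a short sketch. This is not GitHub's actual implementation, which is not described in these documents. The event format (timestamp, project, kind), the function name is_eligible, and the reading that the 30-minute window applies to both conditions are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: the 30-minute window applies to both conditions.
WINDOW = timedelta(minutes=30)
THRESHOLD = 3


def is_eligible(events):
    """Return True if a visitor's activity meets the (assumed) eligibility rule.

    `events` is a hypothetical list of (timestamp, project, kind) tuples,
    where kind is "visit" or "click".
    """
    events = sorted(events, key=lambda e: e[0])
    start = 0
    for end in range(len(events)):
        # Advance the left edge until the window spans at most 30 minutes.
        while events[end][0] - events[start][0] > WINDOW:
            start += 1
        window = events[start:end + 1]
        visited = {project for _, project, kind in window if kind == "visit"}
        clicks = Counter(project for _, project, kind in window if kind == "click")
        # Condition 1: visits to 3 distinct projects.
        if len(visited) >= THRESHOLD:
            return True
        # Condition 2: 3 clicks within a single project.
        if clicks and max(clicks.values()) >= THRESHOLD:
            return True
    return False
```

Under these assumptions, three visits to different projects within ten minutes would qualify a visitor, while the same three visits spread over 45 minutes would not.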
Off-site sample: Between April 11 and 21, 2017, 20 open source communities that work primarily outside of GitHub.com were invited to participate in the survey. Communities were sourced via public calls for participation in the project GitHub repository, via Twitter, through partner communities of the Open Source Initiative, and through targeted outreach to community managers and open source software foundations. For each community, a template invitation was provided to a liaison for distribution to the community. Each community received a unique link so that data collection could be monitored for data quality problems (e.g. distribution outside the community, such as on Twitter). Links led to the same third-party landing page described above.

Terms of Use: Use of this data is licensed under the CC0 1.0 Universal License and governed by GitHub's Terms of Service (https://help.github.com/articles/github-terms-of-service/). Please note that while this data is public, our respondents have not waived their privacy rights: please see our Privacy Statement (https://github.com/site/privacy) regarding Public Information on GitHub (https://github.com/site/privacy#public-information-on-github). In particular, do not attempt to reidentify survey participants. Please contact us at https://github.com/contact with questions or concerns.

If you use this dataset in a publication, a link or citation would be appreciated. If you extend this dataset, we hope you'll share your additions as open data.