Reviewer 4 (AC)

Expertise: Knowledgeable
Recommendation: Between neutral and possibly accept; 3.5

Review

Paper is relevant to CHI as it considers the topical issue of data sharing, a topic that is often discussed and is gaining force in the community right now (R1). This is an important topic to the community in terms of the type of data that should be shared, and how this sharing fits with the new GDPR rules about storing data and so on. There are some interesting findings in this work, as well as some actionable, useful suggestions.

The significance and merit of the paper come from both the empirical survey it conducted to collect CHI authors’ perspectives on and practices of research data sharing and its discussion of both the potential benefits of and barriers to such data sharing (R2). I would recommend that the authors focus on addressing the issues raised by 2AC and R2.

The authors could improve this work by providing more explanation about the taxonomy, describing study materials and also explaining the main reasons for not wanting to share study materials.

Other questions that need to be addressed: Why did the authors not include a discussion of restrictions on data sharing coming from IRBs, and what would be the impact of data sharing on informed consent? What could be some suggestions for "data processing procedures" for qualitative research? Is sharing data on request a good practice from the authors' perspective?

The authors need to present an assertive reflection on what is considered sensitive data and on the ethical concerns which may arise from preserving data sets and making them public. The authors also need to focus on addressing 2AC's comments on 'misconceptions', and on commenting on other policies beyond GDPR that might impact sharing.

The authors point out that "there are clearly different concerns in sharing qualitative data and quantitative data, however it is not fully reflected in the recommendations section" (R2). The authors should describe whether and how they might engage with this concern in the rebuttal.

Reviewer 3 (2AC)

Expertise: Knowledgeable
Recommendation: Between possibly reject and neutral; 2.5

Review

This paper presents a survey of CHI authors about their data sharing practices, with a focus on reasons why they choose not to. I think that this is an incredibly important and relevant topic for CHI, and I really appreciate research that helps us look inward at our research community and our own practices. There are some really interesting findings here, as well as some actionable, useful suggestions. I think that this paper is well on its way to being a strong contribution to CHI. However, I have some serious concerns with some of the assumptions that seem to underlie some of the interpretations of this data by the authors. Because I think that this work is important, I have tried to be detailed in my comments, and some of the issues are quite small and easily fixed; others I think will require revisiting interpretations of the findings.

First, a few methodological notes. The survey design section notes that responses to the 2018 survey indicated that the taxonomy of research artifacts was flawed. However, there is no information about how that original taxonomy was created. In Figure 1, because “survey” is separate from “empirical,” I assume that means that “survey” means “survey of the literature” or something similar rather than a survey to collect data from human subjects? This is confusing.

I appreciate the (appropriate, given the topic of this paper!) great deal of transparency and methodological detail in this paper (e.g., in the participant recruitment section). However, given this level of transparency and detail for much of the methods, the detail provided for the qualitative analysis seems very thin. It was good to see a bit more detail in the supplemental materials, though there is no information about, for example, how many people performed open coding, whether it was independent, etc. The level of detail here isn’t necessarily unusual for a methods section for CHI (though it is typical to include a citation for an analysis method, unless the paper goes into detail about the steps of that method), though it is just a contrast to the unusually high level of detail for the other components of the methods.

The results note re: study materials that it is surprising that the top five reasons for not sharing included concerns about participant data and permission, given that they are supposed to be referring to artifacts produced by researchers and presented to participants, and not results or data collected. This finding to me suggests it is possible that respondents misunderstood or misinterpreted the question, which might impact the findings and interpretation; I wonder if there is a way through the qualitative analysis that the authors could speculate if this is true? Though I’ll note that another possibility is that, for human subjects work, the respondents were imagining materials that the authors were not - for example, perhaps they thought this meant actual questions asked during semi-structured interviews rather than initial interview scripts, since the questions would include follow-up questions to responses that might reference those previous responses.

I really appreciate the effort put into understanding the relevance of GDPR to scientific data (though a summary in the paper itself would have been useful), though given discussion of other regulatory factors (respondent concerns about IRBs as noted on page 6), I was surprised not to see discussion in the paper (ideally based on existing literature) about the role of IRBs and other ethics regulatory bodies and the kinds of restrictions that they actually place on human subjects data. This seems really important to contextualizing these findings about respondents’ reasoning.

It would be great to see examples of “data processing procedures” since I wasn’t sure I had a great sense of what these might be, particularly for qualitative data. I was wondering where, for example, codebooks fit into this schema.

The authors note that “it is puzzling” that only low percentages of study materials and processing code are shared upon request. Why is this puzzling? I also was surprised not to see much more of a discussion of sharing upon request, given that this is commonly suggested as good ethical practice when it comes not just to human subjects data but to, e.g., datasets created from publicly available data. I expected sharing upon request to be a recommendation of this paper, so though I appreciate that historically there have been problems with this working in practice, I’m curious if the authors actually think that this is not good practice or if they would recommend it if it is done properly.

Similarly, I was concerned by the statement that there are “no reasons” in some of the DOI-mapped papers to be sensitive to participant privacy, but one of the examples given is “scraped data from public websites.” Publicly available information can absolutely lead to privacy violations, particularly if it is sensitive subject matter and/or includes identifying information such as names or online handles. Moreover, data that is publicly available at the time of a scrape may be deleted later, which means that its preservation in a shared dataset is something that researchers should have as an ethical consideration. I worry that these kinds of assumptions underlying the analysis performed in this paper about respondent motivations might be damaging some of the interpretation in these results. It is noted later that there may be a lack of consensus as to what is considered sensitive: there absolutely is (as evidenced by the position taken by the authors of this paper), and this is something that deserves more attention in this analysis.

To be clear, there are some GREAT findings in this paper. In several places (e.g., the discussion of lack of resources, and lack of incentives) I wrote “yes!!!” in the margins as I read this. I think that this is important work, and I would like to see it published. However, one of the core findings (as stated in the abstract) is that a lack of sharing is due to “misconceptions” - but it appears that some of those misconceptions (e.g., that participant privacy is important in certain kinds of artifacts where it might be non-obvious) are not misconceptions at all, but rather may be based on assumptions by the authors in the interpretation of these findings. To me, this is a serious issue that might require revisiting the data with a broader perspective. I look forward to reading the rebuttal, in which I imagine the authors will be able to address at least some of these concerns.

As a smaller note, the paper needs proofreading. There are a number of typos and mistakes, including e.g., an unfinished sentence at the end of the “preregistration and ethics approval” section.

Reviewer 1 (reviewer)

Expertise: Expert
Recommendation: Strong Accept: I would argue strongly for accepting this paper; 5.0

Review

This paper presents the results of an extensive survey of CHI authors' behaviour and intentions around the sharing of research artefacts.

At a time when research ethics, statistical methods and open science are becoming hotly discussed issues, this creation of a state of play within the CHI community is of great value.

STRENGTHS ========= Many of the results are not surprising to those of us who have considered these issues over the years, but this paper takes these vague feelings and backs them up with hard evidence. In the days of small research communities and relatively leisurely publishing, explicit sharing may have been less necessary – you knew everyone else working in your area well and if needed you could exchange informally. However, it is clear that now we need to fundamentally overhaul the value systems embedded in academia, and this paper offers key evidence and data to inform this.

I particularly liked the way the paper turned the (somewhat depressing) results into positive recommendations for researchers, reviewers and programme chairs. As is evident, I believe there is an even wider political message for academia, perhaps this work could be the inspiration for a workshop on this at a future CHI!

WEAKNESSES ========= There is some inevitable self-selection amongst respondents, but the response rate is far higher than I would have expected. If anything I would assume we would have seen the most diligent authors responding :-/

Some Detailed Points ================= 1. p1, RHC, 2nd para. I'd not actually come across this distinction before, I guess because I've tended to do stats in areas where 'reproducibility' in the computational sense here is taken for granted and hence I've always heard 'reproducibility' as the property that allows 'replication' (in Claerbout terminology). I'm not at all sure how common this distinction is, but it is helpful and clearly described here.

2. p1, RHC, 3rd para. Although the paper is particularly focusing on reproducibility, the sharing of research artefacts has many advantages beyond this. Sharing data allows reanalysis to address different questions; sharing code allows others to build on one's work.

3. p1, RHC, last para. – It is not just CHI; I recall a paper from some years ago where the authors had asked machine learning paper authors for data and code and found few able to provide it. I think nowadays the ML community is better, though.

4. page 3, RHC, 2nd para 'researchers' discretion' – I liked this distinction and terminology, it gets rid of some of the value judgements around the terms subjective/objective.

5. page 5, table 1. – I was interested in the relatively high number of master's students (no MAs!). I'm guessing this partly reflects the tendency to encourage master's students to publish dissertation work, so they tend to be the first author, plus I assume a self-selection effect with a greater likelihood that more junior authors will respond to a survey :-/

6. page 6, LHC, 1st para. "do not see the benefits" – My immediate thought was "for whom?" and this is something that gets picked up later in the paper.

7. page 6, LHC, 3rd para. – This does highlight the quite narrow remit of ethics boards: if the practices they permit at best encourage weaknesses in the research (hence devaluing the contributions of research subjects) and at worst encourage an ecology of malpractice, then it doesn't feel very ethical!

8. page 6, LHC, 5th para (c.f. comment 6 above) – "how can [the junior author] benefit from this process" – This is so critical – we need to create academic career structures that encourage rather than discourage good research practice.

9. page 6, RHC, para 4, the effort – This is the other side of comment 8: it is hard and not valued :-( :-(

10. page 8, RHC, para 4. "recognition of the benefits of sharing" – A sad example of this is the UK REF process that on a 7-year cycle assesses university research performance. In the humanities panels, curated datasets are one of the submittable outputs of research. However, in the science and engineering panels (which include CS and HCI), data is NOT regarded as an acceptable submission (papers, books, software, and patents are), and even papers that focus heavily on the collection (rather than the processing or interpretation) of data are treated with very low regard. That is, the UK national research assessment structure for computing officially devalues the sharing of data … even though the UK research councils would say it is a good thing.

11. page 9, LHC, 2nd para and footnote 12, data management plans – In the UK the AHRC (Arts and Humanities Research Council) demands such plans and at one point demanded that all the data produced by funded projects be deposited in its own repository … and then, pretty much overnight, decided to ditch the repository (I assume for cost reasons), effectively destroying years' worth of research data … and this was not a single researcher or university, but the government-funded agency responsible for funding and overseeing UK humanities research.

12. page 10, LHC, 2nd para list, immutable – I think of university websites and even research project websites that constantly destroy past links when they are restructured and despair!

Reviewer 2 (reviewer)

Expertise: Knowledgeable
Recommendation: Between neutral and possibly accept; 3.5

Review

This paper sets out to explore research data and artefact sharing practices within the CHI community through surveys of CHI authors from 2018 and 2019, and provides a set of recommendations for guidelines for an open research data initiative at CHI. The significance and merit of the paper come from both the empirical survey it conducted to collect CHI authors’ perspectives on and practices of research data sharing and its discussion of both the potential benefits of and barriers to such data sharing. It also reviews well the previous work that has been carried out in other disciplines regarding open data sharing. As the paper itself argues, the merit of such an exploration and push for open research data sharing is that it would help with the replicability and reproducibility of certain research (e.g. quantitative research and software) and potentially save resources in the case of software. It is overall a very well written paper on interesting research!

Therefore my review of this paper is not really about how to improve the specifics of the paper, but offers some reflections I would like to invite the authors, and the committee too, to consider. As the authors rightfully point out in the paper, CHI is a very diverse community, and this diversity lies not just in research disciplines and methodologies but also in the cultural contexts and different countries the research and researchers are in. I appreciate the fact that the authors considered the impact of GDPR on research data sharing practices and guidelines, but I think there are more context/region-specific policies, regulations, and understandings that might impact open data sharing than this one policy. This leads me to the next point I would like to invite the authors to think about: given the diversity within one community, is one set of policy guidelines to guide data sharing the approach we should explore, or could the exploration also take into consideration the granularity of policies?
The authors point out that there are clearly different concerns in sharing qualitative data and quantitative data; however, this is not fully reflected in the recommendations section. Given the difference, should data sharing practices be under the same guideline, or should there be more methodologically specific data sharing guidelines? This is also the same question I had for the sharing of design prototypes. So the ultimate point the authors could perhaps make extra explicit is how one set of guidelines for such a diverse community could or could not guide different types of data/artefact sharing. I understand the authors also recommend the option of not sharing and providing explanations. Though this is an option for authors who might have difficulties sharing data, it does not really help them address the question of how, if they were to share the data, they could overcome those difficulties. Another question the authors raise in the paper concerns the IRB process and consent, which is not fully addressed in the recommendations. If data sharing were to become a practice (and I can see the value in it), I wonder how this would impact the push for informed consent, as there would be questions the authors may not be able to answer themselves after sharing the data with the research community. It would be great if the authors could also discuss the implications for consent more explicitly (it could be just one more sentence).

Lastly, I wonder if the CHI conference is actually the best venue for this paper. It is a great policy discussion, and I think its contribution to this community comes from its implications; to evaluate its contribution one cannot avoid thinking about its potential implications, which would influence the review. Though it is a very interesting discussion, I do find that, for this paper to be accepted, it ought to address in more detail my concerns about the different impacts of data sharing on different sub-disciplines in CHI, as it might be used as a policy briefing if accepted.