Sunday, May 19, 2013

Bibliography


Bosch, Torie. “How Kate Middleton’s Wedding Gown Demonstrates Wikipedia’s Woman Problem.” Slate. Online Magazine. 13 July 2012. Web. Accessed on 6 May 2013 from http://tinyurl.com/6wmqeo5.

Cohen, Noam. “Define Gender Gap? Look Up Wikipedia’s Contributor List.” The New York Times. Online Newspaper. 30 Jan. 2011. Web. Accessed on 5 May 2013 from http://tinyurl.com/9wkpgqv.

Gardner, Sue. “Nine Reasons Why Women Don’t Edit Wikipedia (in Their Own Words).” Sue Gardner’s Blog, 19 Feb. 2011. Web. Accessed on 4 May 2013 at http://tinyurl.com/7fsjyhj.

“Gender Gap Stories.” Wikimedia Meta-Wiki. Meta-Wiki Page. 27 Apr. 2013. Web. Accessed on 6 May 2013 at http://tinyurl.com/aok6wmd.

Haralanova, Christina. “Wikipedia: Why So Few Women Edit?” Ludost 24 May 2012. Web. Accessed on 8 May 2013 from http://tinyurl.com/aga6rh9.

Lam, Shyong (Tony) K. et al. “WP:Clubhouse? An Exploration of Wikipedia’s Gender Imbalance.” Mountain View, California: ACM, 2011. 10. PDF. Accessible at http://www.wikisym.org/ws2011/_media/proceedings:p1-lam.pdf.

Potter, Claire. “Prikipedia? Or, Looking for the Women on Wikipedia.” Tenured Radical 10 Mar. 2013. Web. Accessed on 6 May 2013 from http://tinyurl.com/b8vhqv3.

Walker, Tim. “What Has Wikipedia’s Army of Volunteer Editors Got Against Kate Middleton’s Wedding Gown?” The Independent. Online Magazine. 16 Aug. 2012. Web. Accessed on 7 May 2013 from http://tinyurl.com/aqmvusc.

Wikipedia Editors Study: Results from the Editor Survey, April 2011. The Wikimedia Foundation, 2011. PDF. Accessible here: https://upload.wikimedia.org/wikipedia/commons/7/76/Editor_Survey_Report_-_April_2011.pdf

Friday, May 17, 2013

Executive Summary


According to the second official Wikipedia Editor Survey conducted in December of 2011, women editors comprise only 9% of contributors to Wikipedia's “sum of all human knowledge”1. This significant lack of women and women's voices in the Wikipedia community has led to systemic bias towards male histories and culturally “masculine” knowledge (Bosch, 2012; Lam et al., 2011; Haralanova, 2012; Gardner, 2011; Potter, 2013; Walker, 2012; Wikimedia Meta-Wiki, 2013), and an editing environment that is often hostile and unwelcoming to women editors (Gardner, 2011; Lam et al., 2011; Wikimedia Meta-Wiki, 2013). The Wikipedia “gender gap”, as it has come to be known in Wikimedia circles, has increasingly become a large concern for the Wikimedia community, and a fair body of scholarly and non-scholarly work investigating and addressing the gender gap has materialized over the last few years. However, as much of this research has been on the “general” Wikipedia editing community, the vast majority of the outputs and dialogue generated by these endeavours revolve predominately around the experiences of Western women on the English-language Wikipedia, and there has been little to no discourse on the significantly larger gender gaps in editing communities in the developing world.

According to the same Editor Survey of 2011, India's editing community is only 3% female, but there has been little discussion on mainstream Wikipedian forums on why the participation of women in India is markedly lower than that of the Wikipedian population on average. Further, no directed research of any kind has attempted to investigate this phenomenon.

This research thesis will attempt to investigate the gender gap in the Wikipedia community in India through an exploration of the contextual nature of the real and perceived barriers that both editors and non-editors face to contributing to Wikipedia. It is my hope that this research helps to generate a deeper understanding of those obstacles that prevent Indian women from becoming editors as well as demonstrate that context-specific research is needed to better understand those barriers and challenges that are faced by women from different regional, linguistic and socio-cultural backgrounds.

This research study has three main objectives: to generate a better understanding of the demographic composition of the current editing community in India; to investigate the barriers and challenges that Indian women face to their participation in the editing of Wikipedia through the exploration of the experiences of women who are currently editors and the perceptions of female non-editors; and to determine whether the barriers and challenges identified by the research subjects are unique to both the lived experiences and realities of these women as well as to the Indian context. The research has a fourth, more long-term research objective, to produce research outputs that can be used to increase the effectiveness of initiatives aimed at developing the Wikipedia editing community in India, but this objective will be given less focus during the data-gathering process.

This study hopes to work with the following three populations of research subjects in order to gather the data required to meet the research objectives: the Indian Wikipedia editor community, the female members of the editing community and a group of Indian women who do not currently edit Wikipedia (the specific group will be decided at a later date). Data will be gathered from the Indian Wikipedia editing community through an online demographic study and one-on-one or group interviews with female editors, and an online qualitative and quantitative questionnaire will be circulated to the community of non-editors. This mixed methodology will hopefully lead to the generation of a large pool of data with significant potential for astute and insightful analysis.

While the research aims to generate a context-specific understanding of the barriers and challenges experienced by Indian women, considering the tremendous degree of cultural, ethnic, linguistic and socio-economic diversity found within the Indian population and the time and resource limitations of the project itself, this study cannot realistically produce a complete account of the barriers that any one group of Indian women face to their participation in India, nor can it hope to generate a highly nuanced and sophisticated exploration of the complexities and causal mechanisms that contextualize those barriers within Indian society. Instead, the goal of this research is to perform an initial exploration of the themes that characterize the gender gap in the Indian Wikipedia editor community in hopes that it will lead to the discovery of avenues for more focused research in the future.

As very little regional, population or even context-specific research on the gender gap in Wikipedia has been carried out, and no such research has been done on the Indian editor population, any attempts to address the gender gap in India, or indeed the gender gap in any editing population whose cultural, socio-economic and societal contexts differ from those of the Western, Caucasian, English-speaking world, are at risk of reproducing and even further entrenching the patterns of exclusion that already characterize the Wikipedia editing communities. The outputs of this research will not only be useful in designing more effective, context-appropriate development projects for the Indian Wikipedia editor community (particularly initiatives that aim to bridge the gender gap), but will also help the Wikipedia community at large to better understand the complex nature of the gender gap, which will hopefully stimulate similar investigations in India and elsewhere.



1 This is an oft-quoted phrase originally coined by Jimmy Wales, a co-founder of Wikipedia. While it is not Wikipedia's official catchphrase, it is often used in reference to the online encyclopedia.

Research Problem


Research Problem
A fair body of academic work has been carried out on subjects related to the gender gap and women editors on Wikipedia, including research on the content and style differences between female and male editors, but only a handful of studies have actually attempted to address the causal mechanisms of the gap. The majority of the research that has investigated the barriers and challenges that women face to their participation in Wikipedia is non-academic, and these works tend to suffer from the usual sampling complications of any electronic opt-in survey, namely that the self-selection aspect is liable to produce unrepresentative and possibly unreliable results and, in these cases, predominately represent the majority voice on Wikipedia: English-speaking, formally educated, from a developed nation in the Northern Hemisphere (UNU, 2011). No research investigating the barriers that women face to their participation in Wikipedia has been done on a specific region or sub-population within the larger Wikipedia contributor population, and therefore there is no research that would allow for a comparison between those barriers faced by Western women and those faced by women belonging to different societal contexts. This is not to say, however, that relevant, useful research has not been performed on this topic, but simply that much of it may not be applicable to women who do not belong to the Wikipedian status quo.

Academic Research
In a conference paper for the Conference on Computer Supported Cooperative Work 2012, Benjamin Collier and Julia Bear performed a statistical analysis of an international sample of 176,192 readers, contributors and former contributors to Wikipedia in order to investigate the factors that hinder women from transitioning from readers to contributors. In their sample, they found that significantly more women than men reported conflict or fear of conflict on Wikipedia, lack of expertise, or discomfort with editing other people's work as factors that discouraged them from editing Wikipedia. Their fourth hypothesis, however, that women have less free time to edit, was not supported by their findings. This is curious, as it is contrary both to the findings of similar research projects and to research that examines the amount of free time that men have in comparison to women1. While the authors themselves present various limitations of the study associated with over-representation of contributors in the responses and the inability of their empirical methodology to allow for the inclusion of other potentially significant survey questions, the study itself was limited by its set hypotheses, which allowed the researchers to test the significance of those factors that they felt were most hindering to women's participation within their sample without allowing for any exploration of other factors influencing women's participation in Wikipedia.
Eckert and Steiner argue in their conference paper for the Annual Meeting of the International Communication Association that women are more put off than men by Wikipedia's editing culture and editor community, are discouraged by a lack of positive feedback, are intimidated by the Wikipedia interface, and/or lack the free time or expertise to contribute (or both). They base these arguments on their findings from 53 e-mail interviews with Wikipedia editors and contributors, which is, they admit, a small and unrepresentative sample. Indeed, their editor sample consisted of eight women and twelve men, which underscores the limited legitimacy of these research findings as accurate portrayals of the experiences of women editors as a whole. Furthermore, they approached their potential editor respondents through three mailing lists for technology-related researchers, which has led to a substantial skew towards respondents employed and/or heavily involved in educational or academic environments, meaning that the majority of respondents were most likely highly educated, technologically literate and had consistent access to a computer and the Internet, which is the typical demographic make-up of the majority of Wikipedia editors (UNU, 2011). It is therefore difficult to determine whether these research findings would be consistent with those experiences of women belonging to the minority sub-populations of Wikipedia.
Using an empirical study of 113,848 Wikipedia Users, Lam et al. found that there is a distinct male-skewed gender imbalance on English Wikipedia. Statistically speaking, women editors edit less, are more liable to leave Wikipedia, are more likely to have their first seven edits reverted/deleted, are more likely to have their edits reverted for vandalism and are significantly more likely to be blocked indefinitely. As the authors themselves point out, the study is limited to those Wikipedians who explicitly identify as either male or female, which requires the assumption that these Users are honest in reporting their gender. Further, in order for these findings to be representative, one must assume that editors who choose to display their gender behave in similar ways to those who choose not to display their gender. Again, it is difficult to comment on the representativeness of these findings, particularly for women in the Indian context, so it will be interesting to see whether or not these findings are qualitatively reproduced in my research.

Non-academic Research


The purpose of this research is to explore the real and perceived barriers that both female editors and non-editors face to contributing to Wikipedia in order to determine whether these obstacles and challenges are contextually situated in the experiences and lived realities of women in India.


1 For example, see this study by Mark Aguiar and Erik Hurst available here: http://qje.oxfordjournals.org/content/122/3/969.short, as well as this article in The Economist which presents the results of an OECD study on the topic of leisure time in 18 countries: http://www.economist.com/node/13717514?story_id=13717514

Objectives and Research Questions


1. Generate a better understanding of the demographic composition of the editing community in India
   Research questions: What are the gender, linguistic/ethnic, age, frequency of editing, educational variation, employment status, marital and familial distributions in the editing community in India? Is this population characterized by predominately one group of people, or is there an equally diverse range of groups involved in editing? Whose contributions/voices are being included in the editing community? Whose contributions/voices are missing? Why?

2. Investigate the barriers and challenges that Indian women face to their participation in the editing of Wikipedia
   i) Explore how women that are currently editors experience obstacles and challenges to their participation in contributing to Wikipedia
      Research questions: What does this population perceive as barriers and challenges to their participation? What are the causes of these barriers and challenges? How do these barriers and challenges change over time? Which challenges and barriers do they perceive to be the most hindering to women's participation in editing? What do they feel would be the obstacles that an average Indian woman would face to her participation in contributing to Wikipedia, and how do these barriers differ from the ones currently faced by women already participating in the editing of Wikipedia?
   ii) Explore how women that are currently non-editors perceive and experience barriers to their participation in editing Wikipedia
      Research questions: Do editors and non-editors experience or perceive different barriers and challenges? How has any prior experience with Wikipedia shaped these women's perception of the barriers and challenges that they face to their participation in Wikipedia? How do perceived but unexperienced barriers to their participation affect women's choice or ability to contribute to Wikipedia? What do they feel would be the obstacles that an average Indian woman would face to her participation in contributing to Wikipedia?

3. Investigate the contextual nature of the barriers and challenges that Indian women face to their participation
   Research questions: Do women from different geographic, linguistic, ethnic and class groups experience or perceive different barriers and challenges to their participation, or do they experience or perceive similar barriers and challenges? If so, do they experience these same barriers and challenges differently? How are these barriers and challenges defined/created by their contexts? How do these barriers and challenges compare to those faced by women in other societies (for example, the Western World)?

4. Generate research outputs that can be used to render current and future projects aimed at developing the Indian Wikipedia Editor community more effective, particularly those initiatives designed to address the gender gap and attract and retain female editors

Context

[Not complete yet]

Study Areas


 This research will take place throughout India, though I will be carrying out much of my research while being based out of Bangalore, India. I have chosen to be based here during the research because the current India Chapter of the Wikimedia Foundation is based here, and their physical presence allows me better access to their resources as well as their activities and events. My interactions and co-operation with the Chapter will help to provide legitimacy to my attempts to engage the editing community, as well as strengthen my ability to gain access to the various Indic-language editing communities.

However, due to the nature of Wikipedia and the Wikipedian community as well as the various other online communities I may choose to engage, I predict that the majority of the fieldwork for this research project will take place in online spaces via online means. Because Wikipedia itself is an online platform that does not possess one single specific physical location, its editing communities also exist predominately in online spaces and tend to be comprised of members from varying geographical locations. This would make it very difficult to interact with multiple editing communities in the physical world; therefore, I plan to use the mailing lists, Village Pumps1 and community pages of each Indic-language Wikipedia project as a means to communicate with and engage with each language-specific editing community. I have chosen these three resources as my “locations” of research as they are the main sites of interaction between the members of each community, and are therefore the principal online spaces where these communities exist.
Further, as the other online communities that I hope to use as research subjects are similarly characterized by their online existence, I will be using each community's respective mailing list as the main “location” of my research.

As for those data-gathering activities that must take place in the physical world (such as face-to-face interviews), due to the wide geographical distribution of the members of the editing communities in India, I cannot predict the exact locations where the research will take place at this point in time. However, the locations will most likely be dictated by the physical location of each editor that I would like to use as a research subject.


1 A “Village Pump” is a Wikipedia page that is used to discuss the technical issues, policies and operations of a specific Wikipedia project. Most Indic-language Wikipedia projects have their own Village Pump pages, though Wikipedia India, as it is a community of Indian Wikipedians and not a specific Wikipedia project, does not have a Village Pump.

Study Sample


 The first group of individuals I plan to study will be the Wikipedia editing community in India. This editing community consists of the various members of the English language and the Indic language editing communities. This is a relatively large population, though there is significant overlap between the English language and Indic language community members as many editors edit in both English and Indic languages. As the first objective in my research is to generate a better understanding of the composition of the Indian editing community, the data that I gather from this population will contribute significantly to my analysis of whose voices are heard and whose voices are missing in the editing communities in India.

I have selected twenty communities to work with: the English language editing community and 19 Indic language1 communities. I selected these communities based on their accessibility (whether or not they could be contacted via a mailing list, Village Pump and/or community page) as well as their inclusion in the official list of Indic language editing communities2 created by the Centre for Internet and Society (whose Access to Knowledge team is currently working as the India Chapter of the Wikimedia Foundation). I have chosen not to include the Sanskrit Wikipedia community in this population as Sanskrit is not a spoken language nor is it the language of any particular ethnic identity in India. The participants will not be selected; instead, they will be invited to participate in a survey via their mailing lists, Village Pumps and community pages, and the respondents will be included in the research. The requests for participation and the surveys will be translated into the various Indic languages.

The second group of individuals I plan to study is current female Indian editors of Wikipedia from either the English editing community or the Indic community (or, as is most probable, both). These women will be my main research subjects, as their experiences and stories will help me answer a significant number of the research questions listed under my second research objective. These participants will be invited to be interviewed through the survey as well as through the mailing lists, Village Pumps and community pages.

The third population I plan to gather data from is a non-contributor population (individuals who do not currently edit Wikipedia); specifically, I've chosen to survey EITHER individuals that have edited Wikipedia in the past, but stopped doing so more than six months ago OR a community of potential past editors as well as those who have never edited. I am not certain which community I will choose to study at this point in time, as I am not certain if I will be able to access my potential community of individuals who were once editors but stopped. No matter which group I choose, however, their significance to my research will be the same. It is very important that I gather data from a non-contributor population, as their responses will help to highlight those barriers that are the most hindering to and/or difficult to overcome for women's participation in the editing of Wikipedia.

If I am able to gain access to this population, I plan to do research on the students who were involved in the India Education Program Pune Pilot Project3 that was carried out by the Wikimedia India Chapter in 2011. These students were required to edit Wikipedia as a part of a course at their university or college. I plan to gain access to them through the individuals who helped organize and run the Pune Pilot Project. They will be invited by email to participate in a questionnaire, and the respondents will be included in the research.
However, if I am not able to access this population, I plan to gather data from various Indian communities related to openness (open access, open data, open educational resources, etc.), as these individuals will most likely be quite knowledgeable about Wikipedia but may or may not be involved. Again, they will be asked to participate in a questionnaire through their community mailing lists, and the respondents will be included in the research.


1 Assamese, Bengali, Bishnupriya Manipuri, Bhojpuri, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Newar (Nepal Bhasa), Odia, Pali, Punjabi, Sindhi, Tamil, Telugu and Urdu
2 This list can be found on one of the Centre for Internet and Society's Access to Knowledge Meta-Wiki pages at this link: http://meta.wikimedia.org/wiki/India_Access_To_Knowledge/Indic_Languages
3 This project was an attempt to engage students and professors in the Wikipedia editing process and increase the number of editors in India by using Wikipedia as a teaching tool and assigning Wikipedia editing as an assignment for various courses. A report on this project can be found here: http://en.wikipedia.org/wiki/Wikipedia:India_Education_Program/Analysis/Independent_Report_from_Tory_Read

Information to be Collected


I plan to gather primary empirical demographic data and primary qualitative data. Specifically, the qualitative information will be in the form of anecdotal, opinion and experiential data produced through one-on-one interviews, questionnaires and a possible group interview.

The quantitative demographic data is highly relevant for two reasons. Firstly, as no specific research has been done on the demographic composition of the editing community in India, this data is required in order for the first research objective of this study to be achieved. Secondly, as this study aims not only to discover but to analyze the barriers and challenges that women face to their participation in contributing to Wikipedia in the Indian context, this demographic information will generate insight into what types (age, gender, level of educational attainment, etc.) of individuals are currently involved in editing, and who is being excluded. This information will aid in the fulfillment of my second research objective.

As for the qualitative data, the questions pertaining to the barriers and challenges that women face to their participation in the editing of Wikipedia listed under my second objective will be answered through the anecdotal, opinion and experiential data that is gathered in this study. 

Methods for Gathering Information


Population Survey
The purpose of this survey is to gather demographic data on the editing communities in India. I have chosen to use an electronic survey to gather this information for two reasons: one, I would like to gather empirical data from as large a sample as possible; and two, the majority of the community members are very accessible via online means, so the dissemination of an electronic survey will be effective and timely. The majority of the questions in the survey will have set answers, but a handful of questions will allow participants to enter their own answers. Participants will be asked to provide their Wikipedia User name for purposes of verification; all surveys with false or missing User names will be discarded.

The survey will be created using a survey software called SurveyMonkey, and the survey will be hosted on SurveyMonkey. I have chosen to use SurveyMonkey as it allows for easy distribution of the survey and has sophisticated data analysis tools. This survey will be disseminated on the Wikimedia India community mailing list as well as the various mailing lists and Village Pumps of 19 Indic-Language Wikipedia projects1. The survey as well as the request for participation in the survey will be translated into each required language before it is released.

Aside from the fact that I cannot assume that all Indian editors can read and write in English (though the vast majority can), the survey must appear in 19 different languages for two reasons: one, India is extremely ethnically and linguistically diverse, therefore excluding those editors that are not English-literate will not provide a complete picture of the demographic composition of the Indian editing community at large; two, the survey itself will be used as the main method of recruiting participants for interviews that will be carried out later on in the research, and an important aspect of this research is to gather qualitative data from women from different linguistic groups as they may have differing or distinct editing experiences.

The data collected through this survey will help me to answer the research questions associated with my first research objective, as well as help guide parts of my analysis of the themes in subsection (i) under my second research objective and in my third research objective. Further, much of the information gathered may be very helpful for the various entities involved in carrying out developmental projects for the editing community in India, which is in agreement with my fourth research objective.

Drawbacks: The results of my demographic survey are restricted by the size of my response. If I get a very small number of responses, my research findings will not be particularly representative of the population at large. Further, the responses from the editors may not represent the editing communities at large but instead just those editors that are actively involved in the community at this point in time. There may be editors that are not subscribed to the mailing list, who do not check the Village Pumps, and who edit anonymously. However, the aim of this research is not to generate a significantly profound understanding of the demographic composition of the editing community in India, but instead to provide the initial groundwork for more directed future research.

Semi-structured One-on-One Interviews
The purpose of these interviews is to gather profound experiential, anecdotal, emotive and opinion data. Interviews will be carried out either in person (with an interpreter, when needed), over Skype or similar software or, if absolutely necessary, over email. The interviewees will be asked to choose a location/method which is most comfortable for them. I plan to use semi-structured interviews because while I have a clear list of questions that I would like to explore, I would like the interviews to have more of a conversational structure so that the women feel that they are able to freely expand on their thoughts and discuss related topics and I am able to ask unplanned questions.

I have chosen to gather qualitative information through this method because I would like to explore the barriers and challenges that are experienced by women who currently edit Wikipedia. Much documentary research can be performed that will speculate on the barriers, but in order to gain true insight into those obstacles and struggles that are relevant to the current population, experiential and anecdotal data must be gathered from the lived experiences and realities of current female Wikipedians. Furthermore, their emotive responses and experiences will help me to understand which barriers are the most hindering and discouraging for current editors. All of this data will be instrumental in answering the research questions in subsection (i) of my second research objective as well as in the provision of astute research findings that will help me to meet my third and fourth research objectives.

Drawbacks: The amount of data that I gather from my interviews will depend on how many women agree to be interviewed, so I may face issues of a very small, unrepresentative sample size. However, I will attempt to ensure that each interview is as profound and explorative as possible, which is why I have chosen to use a semi-structured strategy instead of structured. Similar to the possible complications with the online survey that I have discussed above, the responses that I receive may not be representative of the actual population of women that edit Wikipedia but instead of the population that is currently active in the community and subscribed to the mailing lists. Again, as this is initial research into this topic, I am not aiming to produce research findings that are generalizable for the female Indian editor population as a whole, but instead to generate basic insights that will hopefully stimulate future research in this area.

Group Interviews
Depending on the quantity of responses that I get for my invitation to be interviewed, I may carry out a group interview with female members of the English-language editing community as I feel this may allow me to gather a larger array of data in the limited time I have for my research. In using group interviews, I will be able to collect a large array of different opinions and experiences in one session, as well as allow for a more profound exploration of various challenges and barriers as participants build on their peers’ ideas and opinions. Further, having the participants discuss the barriers and challenges that they have faced and continue to face as editors will enable me to view trends in responses during the interview itself instead of during the comparison of individual interviews after the fact. My reasons for choosing to gather this type of qualitative data are identical to those for the one-on-one interviews: they will effectively contribute to the information demands of subsection (i) of my second research objective, my third research objective and my fourth research objective.

Considering that India is geographically very large, it is likely that the responding editors will be located in various places throughout India. It would be very expensive and time-consuming for each editor to travel to one location for the focus group, and therefore many respondents may not be able to participate. In order to maximize participation should this situation arise, I may elect to hold the focus group over the internet, either via IRC (or a similar IM client) or a similar platform (like Google Hangout). In order to avoid issues of false representation, the chatroom or Hangout will require a passcode or keyword to enter, which will be provided to the participants beforehand.

Drawbacks: Aside from those drawbacks listed for the one-on-one interviews, I may also face issues related to participants' inability to participate due to geographical or technical limitations. Furthermore, holding the group interview in an online space may attract more participants than originally expected, and not all participants may be able to voice their opinions and/or share their experiences. In a group interview setting, participants may become preoccupied with discussing topics amongst themselves during the interview, and may not respond to my questions. However, this may not be a drawback; participants may bring up new discussions and topics that I had not thought of.


Questionnaire
The questionnaire will gather qualitative and quantitative data. It will be in electronic form and will be created and hosted on SurveyMonkey. I have chosen to use SurveyMonkey as it allows for easy distribution of the survey and has sophisticated data analysis tools. The questionnaire will be in English, and will request demographic data as well as experiential, anecdotal, feeling and opinion data. The majority of the quantitative questions will have set answers, whereas all of the qualitative questions will require participants to enter their own answers. Questionnaires will be answered anonymously, and each participant will be assigned a participant number that will be recorded with their individual responses.

I have chosen to use a mixed qualitative/quantitative electronic questionnaire because I do not have the same direct access to a non-editing community as I do to an editing community, so an in-depth qualitative study of this population would be logistically complicated and time-consuming. An electronic questionnaire will hopefully collect a large enough data pool to identify themes and perform analysis that will contribute to my research while leaving ample space for further research. Further, the demographic information that will be collected will help to inform the analysis of the qualitative data. The findings of this questionnaire will help to answer the questions pertaining to subsection (ii) of my second research objective, as well as help to meet the information demands of my third and fourth research objectives.

Drawbacks: As respondents will be allowed to answer anonymously, there is always a risk of “repeat-offenders”, or individual participants that take the questionnaire multiple times, as well as participants supplying false and/or misleading data. In addition, as with any questionnaire, my research findings will be restricted by the number of respondents. However, I am not attempting to generate data that is representative of the population; instead, I hope that some of the themes that I am able to identify from this data-gathering exercise will lead to future research on this topic.



1 Not all Indic-language Wikipedia projects have their own mailing list, though most have both a mailing list and a Village Pump. Two languages, Pali and Newari (Nepal Bhasa), possess neither a Village Pump nor a mailing list, but both have a community page that could possibly be used to disseminate the survey.

Significance of the Study and Expected Outcomes


Through the identification and exploration of the various obstacles faced by both female editors and non-editors in India, this proposed study will show that many of these challenges are specific to the societal, social and economic realities of individual Indian women and/or groups of Indian women. As was pointed out in the research problem, much of the prior research on the editing gender gap has looked at the experiences of the female editing population in general, which puts the findings of those studies at risk of predominantly representing the experiences of women from the English-speaking Western world. Any efforts to resolve the gender gap in the Indian contributor communities that do not take into account the contextual nature of the challenges faced by Indian women are likely to recreate the same exclusionary trends that already exist within these groups. The outputs of this study will therefore not only be highly conducive to the production of more effective efforts and projects to attract and retain female editors in India, but will also help the Wikimedia community at large to better understand the complexities of the gender gap in Wikipedia. Accordingly, it is my hope that this study will stimulate, and serve as a basic framework for, similar regional and population-specific explorations of the Wikipedia gender gap.

Furthermore, I do not expect this study to generate a highly nuanced, representative account of the barriers and challenges that Indian women face to their participation in the editing of Wikipedia; instead, as it is the first region- and population-specific research on the gender gap in Wikipedia, the findings of this study will provide an initial exploration of the topic that will hopefully stimulate more extensive research in this area.

Timetable of Activities/Data Gathering Schedule

Week / Activity

May 20th-May 27th
Preparation to begin data-gathering activities:
-Waiting for Ethics approval
-Collecting quotes on translation services and, if budget permits, getting the surveys and invitations to participate in the surveys translated; if budget does not permit, I will solicit the aid of the Wikimedia Chapter in India
-Inputting survey on SurveyMonkey
-Inputting questionnaire on SurveyMonkey
-Disseminating invitation to participate in interviews

May 27th-June 3rd
(If Ethics approval has been awarded)
-Disseminate invitations to participate in survey and questionnaire
-Begin data-gathering for survey and questionnaire
-Begin interviews/co-ordinate interviews

June 3rd-June 10th
-Carry out interviews
-Begin organization for group interview, if needed and possible at this stage
-Transcribe any interviews that have been completed

June 10th-June 17th
-Send out reminders for survey and questionnaire responses
-Carry out interviews
-Begin organization for group interview, if needed and possible at this stage
-Transcribe any interviews that have been completed

June 17th-June 24th
-Carry out interviews
-Begin organization for group interview, if needed and possible at this stage
-Transcribe any interviews that have been completed

June 24th-July 1st
-Send out reminders for survey and questionnaire responses
-Carry out interviews
-Transcribe any interviews that have been completed

July 1st-July 8th
-Carry out interviews
-Transcribe any interviews that have been completed

July 8th-July 15th
-Send out reminders for survey and questionnaire responses (last week to respond)
-Carry out interviews
-Transcribe any interviews that have been completed

July 15th-July 22nd
-Organize, categorize, code, encrypt and store questionnaire data
-Translate (if needed), organize, categorize, code, encrypt and store survey data
-Encrypt and store interview data

Friday, May 10, 2013

Current Problems: My Methodology and Research Subjects...


I have reached a sticking point in my thesis work. Yesterday, I sat down to write a rough draft of my research proposal, and found that I was still struggling with my research design. In particular, I am having much internal conflict over which population I want to work with, how I want to collect my data (my methodology), and how I will design those methods of data collection to provide answers to my research questions. The third issue is really the main issue, as it is the cause of my struggles with the first and second.

Without resolving these issues, I cannot move forwards in my research design.

Here are the three issues that I'm currently struggling with, along with some of the major questions associated with each issue:

Populations: 
Currently, I'm worried about finding the right populations to work with. The editing communities themselves are quite accessible via my contacts and their mailing lists. However, I am worried I will get very limited results if I just work with this community of already-established editors. I would therefore like to data-gather from a non-editing population. But which population? I thought of trying to work with the students who were part of the Pune Pilot Project of 2011 that was run by the India Wikimedia Chapter (the report on which can be found here: http://en.wikipedia.org/wiki/Wikipedia:India_Education_Program/Analysis/Independent_Report_from_Tory_Read) and women who have been participating in the current outreach programs put on by the A2K team, and to focus on women that had started editing but had stopped. My other idea was to reach out to various FOSS mailing lists in India and use them as my non-editing population (as a similar study, "Wikipedia's Gender Gap" by Linda Steiner and Stine Eckert, used to survey Wikipedia readers).

In order to collect this data, I wasn't sure if I should identify participants and do interviews (particularly with the students and women that have been involved in outreach projects) or design a survey similar to the one that Sarah Stierch designed, the Women and Wikimedia Survey (https://meta.wikimedia.org/wiki/Women_and_Wikimedia_Survey_2011#Demographics), which collected both quantitative and qualitative data.

Further, I would really like to do some research on both the English language editing community and the 20 other Indic language editing communities. Most of the communities have a very small population of editors, with, from what I gather, only a few women. However, I'm not sure how do-able this is. Do I do a basic demographic survey of all the populations, just to see how many women editors there are, what background they come from, etc.? Or do I try with the quantitative and qualitative survey? Or, do I abandon all of that and just work with the English Language editing community for qualitative data?

Questions: Should I try to work with both editors and non-editor communities? Should I attempt to work with both the English Language and Indic-language communities? Should I use a survey to try to gather both qualitative and quantitative data for these populations?

Methodology: 
I'm worried about the scope of my research. I don't want to overreach myself (by proposing to do too much data-gathering), but I also want to be able to generate useful data. Right now, I'd like to perform a basic demographic survey on all of the language communities just to get a better idea of how many women are editing in the various communities, what their backgrounds are, etc. Then I was thinking that I would perform either: five interviews with women editors from the community and five interviews with women that are non-editors (hopefully, depending on response) OR a focus group with each of the two populations. Then I began thinking about how large these populations are, and if this would really be representative, and came back to my joint qualitative-quantitative survey idea just so I could possibly get a larger sample size (depending, of course, on survey response). Also, because India is a big country, I'm a bit worried about how much I'll have to move around, and began wondering if focus groups could be done online (maybe through Google Hangout?), and how legitimate that would be.

Questions: Should I use a mixed methodology? How do I choose a sample size that is large enough without overreaching myself? How legitimate is qualitative data that is gathered online in the academic world?

Abstract questions: 
I'm struggling to figure out how to investigate my more abstract questions through my methodology.

Questions: How do I find out what kinds of socio-economic barriers Indian women face if I'm working with a mostly English-speaking, middle-class Indian population? How do I investigate the barriers that prevent local knowledge from becoming part of Wikipedia in this population?


Hoping a Skype meeting with Professor Chan will help me overcome some of these conflicts, and enable me to complete my research proposal and ethics review!

Emerging Ideas about Content of Final Report...

Unfortunately, most of the research I find myself collecting (after partially reading it) is probably only going to make an appearance in the final report of the research, which I will begin writing in September, 2013.

As I've been collecting research (and my thoughts!), I've been writing notes on various themes that I will write about in the final report. These are my very rough thought-notes on these themes:


-How many women edit
-Survey, issues with that, active editors, editors that identify as male/female, how many women edit in Indian community
-Mention decrease from 13% to 9%. Look at survey sizes. Decrease or disproportionate increase in male editors?
-Mention the average Wikipedia user (systemic bias page)
-“WP:Clubhouse? An Exploration of Wikipedia's Gender Imbalance” + articles for research proposal
-Why is this happening on Wikipedia and not on other platforms?
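A quick sanity check on the "decrease or disproportionate increase in male editors?" question above. This is only a sketch with made-up head counts (the real survey sizes still need to be looked up); `female_share` is a hypothetical helper. It shows that the share of women can fall from 13% to 9% even while the number of women goes up, as long as the number of men grows faster.

```python
def female_share(women, men):
    """Return women as a fraction of all respondents."""
    return women / (women + men)

# Hypothetical counts, chosen only to reproduce the 13% and 9% shares.
earlier = {"women": 130, "men": 870}    # 1,000 respondents
later = {"women": 180, "men": 1820}     # 2,000 respondents

earlier_share = female_share(**earlier)
later_share = female_share(**later)

# More women in absolute terms, yet a smaller share of the total.
more_women = later["women"] > earlier["women"]
smaller_share = later_share < earlier_share
```

So the percentages on their own cannot tell the two explanations apart; the absolute numbers from each survey are needed.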

-How does this compare to their participation in other online communities/activities?
-About the same amount of women and men read Wikipedia...
-Women and men tend to use the internet about the same amount (PEW research study)
-EMPHASIS: Other FOSS communities/activities...Lots of research done on this.
-Social media, journalism, other forms of online activities.

-Barriers that have been identified
-See resources: 9 reasons why women don't edit, Wikipedia page on systemic bias, outcomes from the Wikiwomen Camp 2012 (http://www.ludost.org/content/wikipedia-why-few-women-edit)
-Systemic bias, culture that is unfriendly to women and to women's knowledge, notability and verifiability, traditional knowledges
-“WP:Clubhouse? An Exploration of Wikipedia's Gender Imbalance”
-Multiple examples of stereotypical male knowledge articles versus stereotypical female knowledge articles: http://www.nytimes.com/2011/01/31/business/media/31link.html

-Lack of dialogue about women editors and unique barriers in the global south
-Heather Ford's “The Missing Wikipedians”
-"Wikipedia is not the sum of all human knowledge: do we need a wiki for open data?"
-Some more discussion of traditional knowledges, etc.
-What are some of the unique barriers faced by women in the south (Mentioned in Wikipedia page on systemic bias)
-What is missing? As pointed out by Warewitz in her brief article “Who Speaks for the Women of Wikipedia?”, little to no work has been done with current female editors. Many outsiders are speaking about it, but little is being asked of the already established/involved editors. This is where I come in. Even Sue Gardner takes sources from outside of Wikipedia in her article “9 reasons why women don't edit”
-See CIS' work and reports on this

-Research Justification (Why should we identify these barriers)
-Look how Wikipedia is being used (who is citing, how often it's being cited, etc.)
-what happens when we have a bias in knowledge repositories? What is the societal effect of content bias on Wikipedia?
-Halavais & Lackaff argue: “If an encyclopedia is only as good as its weakest areas, it is important to identify these weaknesses” (431).
-What happens when women's voices aren't heard?
-Wikipedia shaping knowledge of the offline world (Pew study: http://www.pewinternet.org/Reports/2011/Wikipedia.aspx) and also p.8 of “WP:Clubhouse? An Exploration of Wikipedia's Gender Imbalance”
-Wikipedia is not expanding at the rate it is expected to—See Heather Ford's “The Missing Wikipedians”
-Women difference in editing (“Gender Differences in Wikipedia Editing” article)
-Lack of dialogue about women editors in the developing world, particularly in India
-Example of women behaving as the keepers of “local” or “traditional” knowledge in India—this is gendered knowledge, but there is no place for it on Wikipedia, even though many argue that preserving local culture and knowledge is very important.




Important resources (as of May 10th):
-S. C. Herring. Gender and power in on-line communication. In J. Holmes and M. Meyerhoff, editors, The Handbook of Language and Gender, pages 202–228. Blackwell, 2003.
-Unlocking the Clubhouse: Women in Computing. Margolis, J. And Fisher, A. 2001.
-Hargittai, E. & Shafer, S. (2006). Differences in actual and perceived online skills: The Role of Gender. Social Science Quarterly. 87(2), 432-448.
-Krieger, B., & Leach, J. N., Dawn. (2006). FLOSSPOLS gender: Integrated report of findings. Retrieved July 26, 2011, from http://flosspols.org/deliverables/FLOSSPOLS-D16-Gender_Integrated_Report_of_Findings.pdf
-Rafaeli, S., & Ariel, Y. (2008). Online motivational factors: Incentives for participation and contribution in Wikipedia. In A. Barak (Ed.), Psychological aspects of cyberspace : Theory, research, applications (pp. 243-267). Cambridge, NY: Cambridge University Press.
-Henderson, J. J. (forthcoming). Toward an ethical framework for online participatory cultures. In A. Delwiche, & J. Henderson (Eds.), The Routledge handbook of participatory cultures. New York: Routledge.
-Fallis, D. (2008). Toward an epistemology of Wikipedia. Journal of the American Society for Information Science & Technology, 59(10), 1662-1674. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1263781#
-IAMAI for stats about computer and internet penetration (http://www.iamai.in/rsh_pay.aspx?rid=avDLOK1zAI8=)
-"Wikipedia is not the sum of all human knowledge: do we need a wiki for open data?" Finn Årup Nielsen
-Comparison to other online encyclopedias and Free and Open Source online societies.  

Friday, May 3, 2013

Data-gathering plan and Definitions


Hello Prof. Chan,

Here is a rough outline of the data-gathering activities I'd like to complete for this thesis. I don't know how realistic gathering this large an amount of data is, so I would really appreciate any advice you can give in this respect.

It looks like I'm heavily leaning towards working mostly with the editing community, mostly because I think they will be the easiest to access, and there is much to be learnt from the experiences of current women editors from this group. I think that most of the insight about what barriers are faced by women when they are attempting to become editors will be coming from the A2K team, which is working to bring more female editors in, though I'm sure current editors will have faced interesting barriers, as well.

Please do look over my plan and tell me if it's realistic and, more importantly, useful!



Data-gathering activities:
Quantitative:
1. Perform survey of English Language and (maybe only the largest) Indic Language editing communities to see how many women editors are part of the editing communities
Methods:
-interaction with A2K team and communities themselves to do a head count (possible language/access complications, but hopefully A2K team will be able to help with this)
-counting and analyzing names on mailing lists ((a) Less reliable, as some editors may not be currently active in the community or even editing, etc. May be useful for getting an idea of increase/decrease of female editors over time? (b) Language barrier, as I will not be able to recognize female versus male names in some of the Indic languages)

2. Perform survey of English Language and Indic Language community mailing lists and IRC chats to see how many female editors have been actively posting on the mailing lists and participating in IRC chats in the last 5 years
Method:
-Make a list of all editors who have posted to the mailing list in the last 5 years, with post counts
-Identify female editors through interaction with communities (they may know of past editors who
-Compare how many posts have been made by female editors versus male editors
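
A rough sketch of how the post-count comparison in activity 2 could be tallied, assuming the poster names have already been copied out of the mailing-list archives and that genders have been identified with the communities' help. The function name, the input format and the example names are all hypothetical; this is not tied to any particular archive tool.

```python
from collections import Counter

def tally_posts(posters, gender_of):
    """Count posts per poster and per gender.

    posters   -- one entry per post, naming who sent it (hypothetical input,
                 e.g. copied by hand from a mailing-list archive)
    gender_of -- poster name -> "female" or "male", as identified with the
                 community's help; anyone missing is counted as "unknown"
    """
    per_poster = Counter(posters)
    per_gender = Counter()
    for poster, count in per_poster.items():
        per_gender[gender_of.get(poster, "unknown")] += count
    return per_poster, per_gender

# Made-up example data, not real list members.
example_posts = ["EditorA", "EditorB", "EditorA", "EditorC", "EditorA"]
example_genders = {"EditorA": "female", "EditorB": "male"}
per_poster, per_gender = tally_posts(example_posts, example_genders)
```

Keeping an explicit "unknown" bucket matters here, since not every poster's gender will be identifiable and not every poster is necessarily an editor.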


3. [tentative idea] Create a data map of all the edits performed by all active female editors over the last calendar year, along with any revisions/deletions associated with their edits
Method:
-Using editors' names, look up their edit counts, etc. If this turns out to be less labour-intensive than I thought, maybe I can go back another year or more.
-OR put together survey for members of the community that would require them to identify this information themselves[1]

[1] A survey may be able to encompass all or a large part of the information I'm looking for in activities 1-3.
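
For the edit-count lookup in activity 3, one option might be the MediaWiki web API, which can return edit counts for a list of usernames (`action=query`, `list=users`, `usprop=editcount`). The sketch below only builds the request URL and reads the counts out of a response; the helper names and the example usernames are hypothetical, and it assumes the English Wikipedia endpoint. It does not cover the revisions/deletions part of the data map, which would need more work.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia; other language editions have their own host.
API_URL = "https://en.wikipedia.org/w/api.php"

def editcount_url(usernames):
    """Build a MediaWiki API URL asking for the edit counts of these users."""
    params = {
        "action": "query",
        "list": "users",
        "ususers": "|".join(usernames),  # the API separates names with "|"
        "usprop": "editcount",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)

def parse_editcounts(response):
    """Map username -> edit count from a decoded JSON response.

    Users with no "editcount" field (e.g. names that do not exist)
    are skipped.
    """
    counts = {}
    for user in response.get("query", {}).get("users", []):
        if "editcount" in user:
            counts[user["name"]] = user["editcount"]
    return counts

# Made-up usernames and a hand-written response of the expected shape.
url = editcount_url(["ExampleUserA", "ExampleUserB"])
sample_response = {
    "query": {
        "users": [
            {"userid": 1, "name": "ExampleUserA", "editcount": 250},
            {"name": "ExampleUserB", "missing": ""},
        ]
    }
}
counts = parse_editcounts(sample_response)
```

Fetching the URL and decoding the JSON is left out on purpose, so that the sketch can be checked without a network connection.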

Qualitative:
4.
a) Interviews (and possibly in surveys for editors):
i) With A2K/Wikipedia team, and others working for/on Wikipedia in India:
-How would you describe the condition of female editors in the English-language/Indic language community? Do they tend to be active in the community? Are their voices heard? Why or why not?
-What are the barriers faced by women editors? What is stopping more women from becoming editors?
-What are the challenges in attracting more women editors?
-What kind of work have you been doing to encourage more women to take up editing?
-What kind of women are you targeting? How are you going about this?
-How has the community been receiving your attempts to bring in more women editors?
-What kind of work have you been doing with current women editors? Have they been receptive to your aid? Are they supportive or unsupportive of your efforts?
-What else can be done to bring in more women editors, and encourage the already-existing editors to become more active?
ii) With female editors active in the community
-Can you tell me a bit about yourself? What do you do? How old are you? What kind of certifications do you hold (educational attainment)?
-What prompted you to become involved in editing Wikipedia? What prompted you to continue to stay involved? If you are not involved, why not?
-Tell me a bit about your editing experiences. What kind of articles do you tend to edit? How often do you edit? Do you feel that you make more additions or revisions? How are those edits received? Do you find that many of your posts are revised and/or deleted? How do you feel that your knowledge additions/revisions are received?
-Do you feel that you've faced barriers to your participation in the editing of Wikipedia? If so, what were/are they? Have these changed?
-What has been your experience interacting with the editing community a) in India b) internationally?
-How do you feel that you are received by the editing community? Do you feel like your voice is heard? Do you feel like your inputs are considered?
-Do you face barriers or adversity to your involvement with the editing community? If so, what were/are they?
-Have you ever felt like you've been treated differently because of your gender, either in your editing, during your participation in the community, from other editors, from non-editors, from the general public, etc.?
-What do you feel could be done to improve the experiences of women editors, and bring in more editors?
iii) [possibly] From women editors not involved in the community
-Same questions as above for women editors involved in the community, but with fewer questions about their involvement (I'll simply ask: “Why are you not involved in the editing community?”)
b) Documentary research
-What barriers are women facing to their involvement in Wikipedia? In the Indian context, how much of this lack of women participation is due to physical/infrastructural barriers? How much is it due to educational barriers? How much is it due to language barriers? How much is it due to cultural barriers?
-Verification requirements of Wikipedia, the issue of authoritative or verifiable sources, and epistemological debates within Wikipedia about what knowledge is.
-What roles do cultural narratives surrounding the authority of knowledge play in women's participation in editing/adding to the "sum of all knowledge"?
-What role does the inclusionist versus deletionist debate play in cases where things that are typically viewed as "Women's knowledge" (knowledge that is traditionally/culturally feminine or associated with the female gender) are revised/deleted for being insignificant or having "no indication of importance"?


Notes on data-gathering activities:
Quantitative:
1. Required to determine exactly who I will be working with as my research subjects, and to understand how large a sample I am working with. Also gives me a better idea of the male-to-female editor ratio, which is a required data-point for the research, particularly when it comes to the justification of this research project. This is basically the first step of the research.
2. While this doesn't give us much of an idea of how many women are editing Wikipedia actively, it does give us an idea of how active women are in the editing community, from which inferences can be drawn about how often their voices are heard, whether they are having input into the on-going development of Wikipedia and the Wikipedia platform, etc. Analyses of the repercussions of “missing” female voices can then be performed.
3. This is where I get into the hardcore quantitative editing data that will hopefully lead to some indices on what women are editing, how they are editing (are they making additions, or revisions?), how often their posts are being revised/deleted, etc. This could make a really cool and useful chart/graph. I should petition the Wikimedia Foundation to commission the creation of a data map of all edits/additions of all declared female editors, as well as the deletions/revisions afterwards. I'm sure the outputs of that could be compared to statistics about how often pages are edited/revised, etc., if those kinds of stats are available.

Worries:
-I'll need statistics on male editors to make a comparison for analysis
-This, of course, limits my sample size to active editors, and cannot reach editors that are not active in the communities. How do I reach non-active editors? How do I reach infrequent editors? How does this diminish the legitimacy of my research and its intended outputs?
-Mailing lists: Not just editors posting (I myself have posted on the mailing list), so requires me to look up each poster's name, see if they have an editing account, label them as infrequent/average/frequent editors, etc. I believe that these would still be useful statistics, however, so I am more than willing to put in the leg work for this.
-These seem quite labour-intensive (though I'm not actually too worried about this—I have the time and the determination!), and I wonder if a survey could be put together that could be circulated on the mailing lists that could gather a large amount of the data I'm looking for (like gender, account names, edit counts, data from both male and female editors...As well as some possible qualitative data points which will be discussed below)
-However, the issue with surveys is response. I would just have to hope I got a good response. Further, for the Indic language communities, I would have to get the surveys translated, and the responses translated. I'm sure this could be done.

Qualitative:
4. a) iii) Only issue I foresee is getting access/in touch with women that are editors but are not active in the community. May not even include them as research subjects, as I seem to be leaning heavily towards working only with the community...Problematic?
   b) The majority of which will probably be done once the writing of thesis begins in September...



- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


Research Definitions:

“Editor”: Someone who edits or who has edited Wikipedia, and possesses an editing account
“Active Editor”: Someone who currently edits Wikipedia
“Infrequent Editor”: Someone who currently edits Wikipedia research to find out number
“Average Editor”: An editor who currently edits Wikipedia on this measurement shall be determined with more literature review)
“Frequent Editor”: An editor who currently edits Wikipedia
“Involved Editor”: An editor who is active and involved in their respective editing community
“Community”/ “Editing Community”: Come up with definition for this, not just mailing list but...etc.