Ethics and open source software development

Over the last few months, there has been a lot of discussion around ethics and open source software development. Since open source software is increasingly a key component of every industry and of society’s digital transformation, I think the discussion is worth having. This post is my first set of humble ideas about the topic.

First of all, I think that any suggestion about how to improve the ethical use of software is going to be well received, or at least taken into account. Who is gonna be publicly against that?

But, we need to think carefully about the implications of any measure put in place to achieve an ethical use of the software.

And last, but not least, we, as contributors to open source software (and to any sort of technology in general), bear some responsibility for the things we co-create.

What does ethical use mean?

Words like “Don’t be evil” or “Do no harm” are the ones that first came to my mind when I started to hear about this ethical open source software development discussion months ago.

But Tobie Langel was the first one to point me to a good approach: “Anything against the human rights would be considered not ethical”. Agree!

Let’s create an ethical license for copyleft

So, what is the easiest way to keep software from being used against human rights? Let’s have a software license that prevents you from harming anyone.

But this approach seems to be in conflict with the copyleft concept, and in particular with Freedom 0 of the Free Software Definition:

The freedom to run the program as you wish, for any purpose (freedom 0).

https://en.wikipedia.org/wiki/The_Free_Software_Definition

And with some points of the Open Source Definition. The ones most often referenced are:

5. No Discrimination Against Persons or Groups. The license must not discriminate against any person or group of persons.
6. No Discrimination Against Fields of Endeavor. The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

https://en.wikipedia.org/wiki/The_Open_Source_Definition

Given this scenario, Coraline Ada Ehmke is fostering the idea of a new and very interesting license: The Hippocratic License. And the core clause says:

No Harm: The software may not be used by anyone for systems or activities that actively and knowingly endanger, harm, or otherwise threaten the physical, mental, economic, or general well-being of other individuals or groups, in violation of the United Nations Universal Declaration of Human Rights (https://www.un.org/en/universal-declaration-human-rights/).

https://firstdonoharm.dev/version/1/2/license.html

Do we really need a new license?

The Open Source Definition has been around since 1999, based on the Debian Free Software Guidelines, from 1997, and with roots in the 4 freedoms of the Free Software Definition, from 1986. It has been here for a while and, I would say, has been working well for the software development industry. Of course, we could probably blame the Open Source Initiative for not being more active in making the definition as clear as possible. And we might have been playing Chinese Whispers with the definitions all this time.

And what do we want to achieve with this ethical debate? There might already be mechanisms to report people harming each other and to try to prevent it, like International Human Rights Law. Yes, the Law. There are local, regional, national, and even international laws in place. And even with that, people keep harming each other. And if the Law doesn’t meet our expectations, what can we do? Go into politics!

So, do we really think that a software license is gonna change this? Of course, I support the idea of the ethical use of technology. And especially ways to be more proactive instead of reacting to bad uses of the technology we co-create. So, could there be another way to do it instead of breaking a working set of definitions?

The Hippocratic Oath

A couple of months ago, I discovered an interesting post from Mariesa Dale proposing the idea of an optional oath for building ethically considered experiences. It was from 2018 and titled The Technologist’s Hippocratic Oath. It made complete sense to me.

We don’t need another license. We need to swear that we will do our best to comply with certain values, and if we fail, we should be accountable for it. And how can it be done in my open source projects?

Well, when we write the README file of a project, we are saying how it works. And if it doesn’t work as described, we are accountable for it. People will submit issues to help fix the bugs. And usually, we do our best to solve them (note: this could be a topic for another post).

Why not do something similar to express our oath to harm no one? Let’s add an ETHICS file containing the Technologist’s Hippocratic Oath (we should probably have a better template, but this one works for me), and get ready for issues if we don’t comply with it.

Additionally, if it’s part of an open source project, anyone cloning it or forking it should be making the oath theirs. And of course, if they don’t feel comfortable, they can remove the file. It’s open source software. But, would you do that?

And for me, this is the key aspect. It’s not about stopping others from using my source code, but about pledging to be a good citizen myself, and letting others decide for themselves whether they want to do the same. And being accountable for it.

And this is not really a new way to manage ethical development. Last week I had the chance to participate in Sustain OSS 2020, and one of the design sessions was about how corporations could be good open source citizens and be accountable for it. One of the proposals was for them to make public their values related to open source, as Salesforce, for example, is already doing.

Update Feb. 11th: Thank you Mathew S. Wilson for sharing with me on Twitter the ACM Code of Ethics and Professional Conduct. The page is really worth reading.

Final thoughts

I don’t consider myself an expert in licenses and law, so I would like to know what the open source community thinks about this approach. And perhaps the Technologist’s Hippocratic Oath text is not the best one, but I have started to use it and to add it to my open source projects as part of the license section.

What do you think?

Analyzing Open Source development (part 3)

In the last post about analyzing open source development I mentioned that this one would be about massaging people’s information to have unique identities for all the project contributors.

But before that, I would like to explore something different. How to get data from multiple repositories? What happens when I want data from a whole GitHub organization’s or user’s repositories?

The obvious answer would be:
1. Let’s get the list of repositories:


import requests

def github_git_repositories(orgName):
    """Return the clone URLs of the repositories found for a GitHub organization."""
    url = 'https://api.github.com/search/repositories'
    # Let requests build and encode the query string; ask for the largest page size
    params = {'q': 'org:{}'.format(orgName), 'per_page': 100, 'page': 1}
    repos = []

    while True:
        r = requests.get(url, params=params)
        # An error response (for example, rate limiting) has no 'items' key
        if r.status_code != 200:
            break
        items = r.json().get('items', [])
        if not items:
            break
        repos.extend(item['clone_url'] for item in items)
        params['page'] += 1

    return repos

2. And now, for each repository in the list, run the code seen in the previous post to get a dataframe, and concatenate them all with:


df = pd.concat(dataframes)

For organizations or users with a few repositories, it would work. But for those with hundreds of repositories, how long would it take to go one by one fetching and extracting info?

Is there a faster approach? Let’s play with threads and queues…
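To give a rough idea of what playing with threads and queues could look like, here is a minimal sketch that uses only Python’s standard library. The names process_in_parallel, worker and num_threads are assumptions made up for illustration, and worker stands for whatever function turns one repository URL into its data (for example, the dataframe code from the previous post). It is one possible shape of the idea, not necessarily the code used in the rest of this post.

```python
import queue
import threading

def process_in_parallel(repos, worker, num_threads=4):
    """Apply `worker` to every repository URL using a small pool of threads.

    `worker` is a hypothetical callable supplied by the caller.
    Returns a dict mapping each repository to its result, or to the
    exception raised while processing it.
    """
    tasks = queue.Queue()
    results = queue.Queue()

    # Enqueue all the work before any thread starts
    for repo in repos:
        tasks.put(repo)

    def run():
        while True:
            try:
                repo = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this thread
            try:
                results.put((repo, worker(repo)))
            except Exception as error:
                # Record the failure instead of killing the thread
                results.put((repo, error))

    threads = [threading.Thread(target=run) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All threads have finished, so the results queue can be drained safely
    collected = {}
    while not results.empty():
        repo, outcome = results.get()
        collected[repo] = outcome
    return collected
```

Since fetching repositories is mostly waiting on the network, threads can overlap those waits even with Python’s global interpreter lock. The list returned by github_git_repositories could be passed as repos, and the values of the returned dict that are not exceptions could then be handed to pd.concat.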