Big data analytics

What Is Data Ethics?

Data can be used to drive decisions and make an impact at scale. Yet, this powerful resource comes with challenges. How can organizations ethically collect, store, and use data? What rights must be upheld? The field of data ethics explores these questions and offers five guiding principles for business professionals who handle data.

Data ethics describes a code of behavior, specifically what is right and wrong, covering the following:

Data Handling: generation, recording, curation, processing, dissemination, sharing, and use.

Algorithms: AI, artificial agents, machine learning, and robots.

Corresponding Practices: responsible innovation, programming, hacking, and professional codes.

Data ethics builds on the foundation of computer and information ethics but narrows the focus, shifting ethical inquiry from an information-centric level of abstraction to a data-centric one. For example, data ethics concentrates on what third parties do with individuals’ data, while information ethics extends more broadly to media, journalism, and library and information science.

Data ethics scandals, such as the mass surveillance revealed by Edward Snowden’s leaks and the misuse of Facebook data to influence the 2016 U.S. presidential election, have prompted legal action. Governments at the national and international level draft, publish, and enforce data ethics rules. Examples include the European Union’s General Data Protection Regulation (GDPR) and, in the United States, the Health Insurance Portability and Accountability Act (HIPAA) and the Family Educational Rights and Privacy Act (FERPA). Individual states, regions, and provinces are also developing more expansive regulations (e.g., the California Consumer Privacy Act).

Although private organizations do not have the authority of a government, various groups are coming together to provide sets of guidelines. For example, Bloomberg, BrightHive, and Data for Democracy are developing a code called Community Principles on Ethical Data Sharing (CPEDS) to codify data ethics for data scientists.

At a local or corporate level, both Data Governance and legal counsel are responsible for overseeing compliance with data ethics rules and remediating any breaches of them.

Businesses are interested in using data ethics to:

  • Comply with regulations.
  • Prove themselves trustworthy.
  • Ensure fair and reasonable data usage.
  • Minimize biases and social inequities.
  • Develop a positive public perception.

Data collection and analysis are inseparable from modern business. You’d be hard-pressed to find a medium-sized or large company that doesn’t gather data these days. As customer information plays a more prominent role in industry, though, data ethics becomes a more pressing concern.

You’ll hear more and more people talk about data ethics today. It’s certainly getting a lot of press, but does it affect your company that much? How concerned should you be about how your business handles data ethics?

If you don’t already, you need to practice ethical data management at all levels. It’s essential for any company today, and this trend will only grow. Here’s why.

Consumer Relations

Practicing good data ethics is a smart business decision. When you’re careful with and respectful of people’s personal information, they’ll appreciate it, so you build loyalty. On the flip side, unethical data governance can harm your company’s public image.

Just think about how the Cambridge Analytica scandal affected Facebook. After the scandal became public knowledge in 2018, 32% of Facebook users stopped using the site as often. As the issue continued into 2019, that number jumped to 38% of users.

As the public becomes more aware of how much of their data is available, they’ll take it more seriously. Consumers will expect more from brands, so if you don’t showcase respectable data ethics, it could mean less profit. If you want to retain loyal customers, especially in the future, you need to use their data ethically.


Legal Compliance

Data governance is starting to become a legal matter in some areas as well. Europe led the way with the GDPR, and California followed suit with the California Consumer Privacy Act (CCPA). This precedent of legally mandating how companies handle data will likely lead to more laws.

If you do any business in these areas, you already have to comply with these regulations. Even if you don’t, you should consider reviewing your data ethics in case you ever want to expand. It’s not out of the question that national data privacy laws could come to America soon, either.

You should note here that these current regulations don’t necessarily cover data ethics. Instead, they’re about data privacy, which is part of ethics, but not all of it. Practicing more ethical behavior, though, will help you stay in the clear as far as privacy laws go.

Implementing Data Ethics

So if you hope to stay in business in the future, you’ll need to handle data ethically. How do you do that, though?

How you can protect people’s data depends on what you need from it. Whatever you collect and whatever you do with it, you should be transparent, though. If you’re honest about what you record and use from the beginning, customers will think highly of you.

Privacy by Design (PbD) is another good idea. If you build privacy protocols in from the start, it’ll be easier to expand or modify your data governance later. Bolting new protections on after the fact can be challenging.

If you use any algorithms or AI to analyze your data, audit them regularly. Unchecked, AI can amplify the human biases present in its training data, which can harm the people affected and lead to a PR disaster. Establish an ethics committee to check on things like that and keep you in the clear.

Ethical Data Governance Helps Everyone

When you practice respectable data ethics, all parties benefit from it. Your customers will be happy that you’re keeping their private information safe. You’ll benefit from a loyal consumer base and the selling point of ethical data use.

Data ethics is already a prevalent concern, and that’ll be doubly true in a few years. If you take steps to clean up your data governance today, though, you have nothing to worry about. Data will drive the future, so data ethics will save it.

5 Principles of Data Ethics For Business Professionals:

1. Ownership

The first principle of data ethics is that an individual has ownership over their personal information. Just as it’s considered stealing to take an item that doesn’t belong to you, it’s unethical, and in many jurisdictions unlawful, to collect someone’s personal data without their consent.

Some common ways you can obtain consent are through signed written agreements, digital privacy policies that ask users to agree to a company’s terms and conditions, and pop-ups with checkboxes that permit websites to track users’ online behavior with cookies. Never assume a customer is OK with you collecting their data; always ask for permission to avoid ethical and legal dilemmas.
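The rule "always ask for permission" can also be enforced in software. The following is a minimal Python sketch, not a reference implementation: the `ConsentRecord` class, the `track_page_view` function, and the purpose name `analytics_cookies` are all hypothetical names chosen for illustration. It shows one way to record what a user agreed to and to refuse collection when consent is missing or has been withdrawn.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """What one user has agreed to, and when (hypothetical structure)."""
    user_id: str
    purposes: set = field(default_factory=set)
    updated_at: Optional[datetime] = None

    def grant(self, purpose: str) -> None:
        self.purposes.add(purpose)
        self.updated_at = datetime.now(timezone.utc)

    def revoke(self, purpose: str) -> None:
        self.purposes.discard(purpose)
        self.updated_at = datetime.now(timezone.utc)

    def allows(self, purpose: str) -> bool:
        return purpose in self.purposes


def track_page_view(consent: ConsentRecord, page: str, log: list) -> bool:
    """Record a page view only if the user opted in to analytics cookies."""
    if not consent.allows("analytics_cookies"):
        return False  # no consent on file: collect nothing
    log.append({"user_id": consent.user_id, "page": page})
    return True
```

The default here is "no consent", so a new user is never tracked until they explicitly opt in, and revoking consent stops collection immediately.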

2. Transparency

In addition to owning their personal information, data subjects have a right to know how you plan to collect, store, and use it. When gathering data, exercise transparency.

For instance, imagine your company has decided to implement an algorithm to personalize the website experience based on individuals’ buying habits and site behavior. You should write a policy explaining that cookies are used to track users’ behavior and that the data collected will be stored in a secure database and used to train an algorithm that provides a personalized website experience. It’s a user’s right to have access to this information so they can decide to accept your site’s cookies or decline them.

Withholding or lying about your company’s methods or intentions is deceptive, unfair to your data subjects, and in many jurisdictions unlawful.

3. Privacy

Another ethical responsibility that comes with handling data is ensuring data subjects’ privacy. Even if a customer gives your company consent to collect, store, and analyze their personally identifiable information (PII), that doesn’t mean they want it publicly available.

PII is any information linked to an individual’s identity. Some examples of PII include:

  • Full name
  • Birthdate
  • Street address
  • Phone number
  • Social Security number
  • Credit card information
  • Bank account number
  • Passport number

To protect individuals’ privacy, ensure you’re storing data in a secure database so it doesn’t end up in the wrong hands. Data security methods that help protect privacy include two-factor authentication on top of password protection, and file encryption.

Even professionals who regularly handle and analyze sensitive data can make mistakes. One way to prevent slip-ups is by de-identifying a dataset. A dataset is de-identified when all pieces of PII are removed, leaving only anonymous data. This enables analysts to find relationships between variables of interest without attaching specific data points to individual identities.
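As a rough illustration of de-identification, the Python sketch below strips direct identifiers from a record before it reaches an analyst. The column names (`full_name`, `customer_id`, `birthdate`, and so on), the function names, and the assumption that birthdates are ISO strings such as `1990-05-17` are all hypothetical. The sketch also replaces the customer ID with a keyed hash so rows from the same person can still be linked. That step is pseudonymization, which is a weaker guarantee than full anonymization, because anyone holding the key can rebuild the mapping.

```python
import hashlib
import hmac

# Hypothetical column names; adjust to the dataset's real schema.
DIRECT_IDENTIFIERS = {"full_name", "street_address", "phone_number", "ssn"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers."""
    cleaned = {
        k: v for k, v in record.items()
        if k not in DIRECT_IDENTIFIERS and k != "customer_id"
    }
    # Keep a stable pseudonym in place of the raw customer ID.
    cleaned["subject_key"] = pseudonymize(str(record["customer_id"]), secret_key)
    # Generalize the birthdate to a birth year (assumes "YYYY-MM-DD" strings).
    if "birthdate" in cleaned:
        cleaned["birth_year"] = int(cleaned.pop("birthdate")[:4])
    return cleaned
```

Keep the secret key separate from the analysis environment. Even after these steps, combinations of the remaining fields can sometimes single out a person, so treat the output as lower-risk data rather than risk-free data.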

4. Intention

When discussing any branch of ethics, intentions matter. Before collecting data, ask yourself why you need it, what you’ll gain from it, and what changes you’ll be able to make after analysis. If your intention is to hurt others, profit from your subjects’ weaknesses, or pursue any other malicious goal, it’s not ethical to collect their data.

When your intentions are good—for instance, collecting data to gain an understanding of women’s healthcare experiences so you can create an app to address a pressing need—you should still assess your intention behind the collection of each piece of data.

Are there certain data points that don’t apply to the problem at hand? For instance, is it necessary to ask if the participants struggle with their mental health? This data could be sensitive, so collecting it when it’s unnecessary isn’t ethical. Strive to collect the minimum viable amount of data, so you’re taking as little as possible from your subjects while making a difference.
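One way to make "collect the minimum" concrete is to require a documented purpose for every field before it is stored. The Python sketch below assumes a hypothetical survey for the healthcare app example. The field names and the `FIELD_PURPOSES` mapping are invented for illustration and are not a recommended schema.

```python
# Hypothetical mapping from each collected field to the reason it is needed.
FIELD_PURPOSES = {
    "age_range": "segment results by life stage",
    "appointment_wait_days": "measure access to care",
    "insurance_type": "compare experiences across coverage",
}


def minimize(response: dict) -> dict:
    """Keep only the fields that have a documented purpose."""
    return {k: v for k, v in response.items() if k in FIELD_PURPOSES}


def unjustified_fields(response: dict) -> list:
    """List submitted fields that nobody has justified collecting."""
    return sorted(k for k in response if k not in FIELD_PURPOSES)
```

With this approach, adding a sensitive question such as mental health history means someone first has to write down why it is needed, which is exactly the question the intention principle asks.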

5. Outcomes

Even when intentions are good, the outcome of data analysis can cause inadvertent harm to individuals or groups of people. When that harm falls disproportionately on a protected group, it is called disparate impact, which the Civil Rights Act outlines as unlawful.

In Data Science Principles, Harvard Professor Latanya Sweeney provides an example of disparate impact. When Sweeney searched for her name online, an advertisement came up that read, “Latanya Sweeney, Arrested?” She had not been arrested, so this was strange.

“What names, if you search them, come up with arrest ads?” Sweeney asks in the course. “What I found was that if your name was given more often to a Black baby than to a white baby, your name was 80 percent more likely to get an ad saying you had been arrested.”

It’s not clear from this example whether the disparate impact was intentional or a result of unintentional bias in an algorithm. Either way, it has the potential to do real damage that disproportionately impacts a specific group of people.

Unfortunately, you can’t know for certain what impact your data analysis will have until it’s complete. By asking beforehand who could be harmed by the results, though, you can catch many potential occurrences of disparate impact before they happen.
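A simple numeric check can support that review. The Python sketch below compares how often each group receives a favorable outcome and reports the lowest rate divided by the highest. The function names and the `(group, outcome)` input format are assumptions made for this example. A ratio below 0.8 is often treated as a warning sign, following the "four-fifths" rule of thumb used in U.S. employment contexts, but it is a screening heuristic and not a legal determination.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """outcomes: iterable of (group, got_favorable_outcome) pairs."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, ok in outcomes:
        totals[group] += 1
        favorable[group] += 1 if ok else 0
    return {g: favorable[g] / totals[g] for g in totals}


def disparate_impact_ratio(outcomes):
    """Lowest group selection rate divided by the highest."""
    rates = selection_rates(outcomes)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody received the favorable outcome
    return min(rates.values()) / highest
```

For example, if 8 of 10 people in one group and 4 of 10 in another receive the favorable outcome, the rates are 0.8 and 0.4, and the ratio is 0.5, which is well below the 0.8 rule of thumb and worth investigating.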


Dataversity https://

Originally published June 18, 2020 6:16 am, updated January 31 2022 for relevance and comprehensiveness.

