Organizations of all sizes benefit from equipping people in many roles—IT, finance, marketing, HR, etc.—to use data to improve quality and efficiency. Almost all of them need guidance, support, and coordination to help them pursue the full value of data while avoiding pitfalls that could undermine public trust or cause legal complications.
I offer a scalable consulting solution that allows your organization to capture the long-term value of your data while building in safeguards to protect your relationships with customers, employees, shareholders, and regulators. I identify key issues you should be concerned about, such as where data is sourced, how it can best be used, and how to avoid using it in ways that lead to trouble. I coordinate communication across functional teams, including data owners, analysts, software developers, end users in sales and marketing, HR, finance, and leadership. This eliminates silos, avoiding dead ends and nasty surprises such as unintended algorithmic bias. Without this coordination, each team in your organization might reasonably assume that someone else is responsible for identifying and addressing critical data and AI issues.
Step 1: Identify Scope
Because your organization most likely has more than one data realm, rather than boiling the ocean by examining all of it, we’ll start by defining a narrower focus. Some examples might be:
- One or more data-driven projects that you are working on, such as a recommendation engine (recommending purchases to customers) or an HR recruiting system.
- One or more data sets, such as customer data or employee data.
- All of the projects and/or data sets for one or more departments, for example, IT and data science, marketing and sales, or HR.
Key factors that go into determining the scope of a consulting engagement include the data maturity of your organization (Do you have data scientists? Do you have an active program for data governance, or a CDO?), the regulatory environment (What stage of GDPR compliance are you pursuing?), and the number and type of pending or ongoing projects you want to focus on (Are you currently building or revising any data-driven systems? Are any of your high-visibility systems likely candidates for algorithmic bias?).
Step 2: Identify Stakeholders
We will include those who touch the data before and after it’s processed, as well as those who promote and protect your organization, including:
- The “owners” of the data, internal and external (vendors), who can speak to how it is/was/will be acquired.
- Those who maintain the data, internally and externally (such as SaaS vendors), who can speak to security and access.
- Those who define the purposes for which the data will be used.
- Those who transform your raw data into actionable information (like recommendations or analytics) such as data scientists, software engineers, analysts or vendors.
- The recipients of the output (such as recommendations or analytics) when data is processed, who can speak to how the output is stored, who has access to it, and what limits, if any, exist on its use.
- Those with an interest in understanding, explaining, and providing guidance around the ways data is being used, which may include legal and compliance, sales and marketing, PR, or leadership.
Differences in data literacy and functional silos are the two biggest obstacles to productive, safe use of data. Functional users from areas like sales and HR may be willing to take data quality and outputs on faith, rather than asking important questions about relevance, privacy, and safety. On the other end of the spectrum, data scientists and analysts may not be well equipped to spot issues that could damage your organization’s reputation or raise red flags with regulators.
Step 3: Gather Facts, Assumptions, Goals, and Constraints
With data projects it’s common for teams to assume they already have all the facts they need to complete their specific piece. The people who separately create a data model, deploy that model, and use the output from that model may be siloed from one another, and each may assume that someone else has the expertise and responsibility to ask and answer all necessary questions. These assumptions often break down in reality when, with the best of intentions, data projects get thrown over the wall from one team to another.
Different functional teams usually have different levels of data maturity. For instance, HR teams may be less data centric than marketing, and less frequently include people with data expertise.
Hard-core data experts—even data scientists—may not have been informed about the baked-in limitations of the data they have been given. Their expertise might not extend to a nuanced understanding of the potential brand or regulatory impact that the data, or their AI model, might have. But once the system is built, end users generally assume that the experts have spoken and all is said and done, without inquiring any further.
Step 4: Connect Stakeholder Agendas
To make the joint participation of every stakeholder productive rather than chaotic, a high level view of the inputs and objectives of all stakeholders is collected and shared, with dependencies between stakeholders highlighted. Stakeholders will be explicitly asked to coordinate with other stakeholders concerning these dependencies, and as-yet-undiscovered dependencies, to avoid collisions or missteps.
Step 5: Keep Connections Open
As these teams discover new factors—for example, when new opportunities to acquire and process data are uncovered, or new regulatory and brand communication challenges are recognized—these new factors are defined and shared with all stakeholders.
Example: To illustrate this framework, let’s imagine a consulting engagement centered around a product recommendation engine in an early stage of development.
Recommendation engines are often used to suggest products to customers. A feature that announces “people who bought X also bought Y” is one type of recommendation engine; auto-completing search results is another. A machine learning model can be trained to find similarities between customers, including actions they’ve taken such as purchasing specific products, and then used to predict what individual customers want to do next.
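The “people who bought X also bought Y” idea can be sketched in just a few lines as co-occurrence counting over purchase histories. This is a deliberately simplified illustration, not the engine described in this example; the product names and data are hypothetical:

```python
from collections import Counter
from itertools import permutations

def build_cooccurrence(purchase_histories):
    """Count how often each pair of products appears in the same
    customer's purchase history."""
    co_counts = {}  # product -> Counter of co-purchased products
    for history in purchase_histories:
        for a, b in permutations(set(history), 2):
            co_counts.setdefault(a, Counter())[b] += 1
    return co_counts

def recommend(co_counts, product, n=3):
    """Return up to n products most often bought alongside `product`."""
    return [p for p, _ in co_counts.get(product, Counter()).most_common(n)]

# Hypothetical purchase data: each inner list is one customer's orders.
histories = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "lantern", "stove"],
]
print(recommend(build_cooccurrence(histories), "tent"))
```

Production systems replace the raw counts with trained models and many more signals, but the core question is the same: given what this customer has done, what do similar customers do next?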
For this example, we’ll assume that a new recommendation engine is being built from scratch to add search auto-complete and “you might also like” functionality to an e-commerce site, email and direct mail marketing.
Successful implementation of this engine can lead to delighted customers who spend more time engaging with company marketing, purchase more frequently, purchase a wider selection of products, remain loyal longer, and recommend the company to others.
Potential risks of this feature include:
- Privacy and security issues, if permission to obtain and use data isn’t properly obtained or data is exposed. (For example, a father discovered his teenage daughter was pregnant by seeing Target’s marketing offers to her.)
- Regulatory issues. For example, businesses must comply with the US COPPA statute if children may be customers and the European Union’s GDPR if EU citizens may be customers, and should prepare for compliance with California’s impending CCPA.
- Discrimination issues due to algorithmic bias. For example, Google’s search auto-completion is sometimes sexist and racist, and Amazon’s recommendation engine sometimes promotes hate speech.
The product recommendation engine was chosen because it is a high-priority project for the marketing and IT teams. There are high expectations for increasing customer satisfaction, loyalty, and revenue, but at the same time the recommendation engine potentially exposes the company to negative impacts on customer experience, brand reputation, and compliance. In scope, it’s big enough to help push the company forward while small enough to avoid consuming excessive resources.
Stakeholders (including but not necessarily limited to):
- the marketing team that owns the recommendation engine project, plus those who “own” part of the data (acquired via tracking customer engagement with the company website, email marketing, and social media), and those who “own” the company’s brand message, as well as those who will decide how the recommendations generated by the engine will be shared with customers;
- any external vendors used to acquire customer data for marketing;
- the IT team that manages any data that is stored in-house, as well as (if different) the IT team that will develop the recommendation engine’s software;
- a data scientist, responsible for creating and training the machine learning model that will be given to the software development team to power the recommendation engine;
- the finance team which “owns” customer identity and sales data;
- the legal team responsible for ensuring compliance with applicable privacy regulations; and
- people from PR and leadership who need to be kept abreast of the organization’s new push into a public-facing AI experience.
Facts, assumptions, goals, and constraints (there are too many to be comprehensive here, but some examples include):
- Marketing team(s): brand reputation versus revenue—how many customers will be “creeped out” by how well the recommendations reflect their interests? Has a response been planned for customer complaints or negative publicity should something go sideways?
- External vendors: is an online advertising platform being used that generates data about customer engagement but also collects and shares customer information with third parties?
- Data Science: where does the training data come from, and what kinds of algorithmic bias can we anticipate based on these sources?
- Finance team: did customers give the sort of explicit permission required by GDPR to use the data managed by the Finance team?
- Legal: are any EU citizens customers or likely to become customers?
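One way a data science stakeholder might start to answer the algorithmic bias question above is a simple selection-rate comparison across demographic groups, in the spirit of the “four-fifths” rule of thumb used in US adverse-impact analysis. The group labels, audit data, and threshold below are illustrative assumptions, not a definitive fairness test:

```python
def selection_rates(outcomes):
    """outcomes: list of (group, selected) pairs.
    Returns the fraction of positive outcomes per group."""
    totals, selected = {}, {}
    for group, was_selected in outcomes:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest.
    Values well below 1.0 (commonly below 0.8) flag potential bias
    worth investigating."""
    return min(rates.values()) / max(rates.values())

# Hypothetical audit data: (demographic group, received a premium offer?)
audit = [("A", True), ("A", True), ("A", False), ("A", True),
         ("B", True), ("B", False), ("B", False), ("B", False)]
rates = selection_rates(audit)
print(rates, disparate_impact_ratio(rates))
```

A check like this doesn’t prove or disprove bias on its own, but it gives data science, legal, and marketing a shared, concrete number to discuss.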
Stakeholder agendas (examples include):
- The marketing team that owns the recommendation engine can consult with data scientists in advance about data sourced from third parties, focusing on quality and potential hang-ups (such as permission and algorithmic bias).
- IT and software developers need to consult with marketing about security, how access is granted, and to whom it is granted.
- Legal needs to consult with marketing and data science about regulatory and privacy requirements that might prohibit certain sources or uses of data;
- PR and leadership spokespeople need to be advised about how value to customers is being increased, as well as the protective measures taken and risks addressed, so that they are prepared to promote the new feature (and potentially deflect criticism from it) without major missteps.
Keep Connections Open
While working on this project, it may come to light that customer service teams already have access to all of this customer data and could use the recommendation engine while talking one-on-one with customers to increase satisfaction, loyalty, and sales. In that event, guidelines for when and how customer service can use processed data should be discussed from a delivery standpoint (software development), a brand reputation standpoint (will it creep customers out?), and a compliance standpoint (e.g., is it a permissible use of the data?).
For more information, contact Bruce Wilson at email@example.com.