June 6, 2024

Data Governance and Security in the AI Era

Written by Stephen Myers, VP and Head of Legal at People.ai and Aman Sirohi, Chief Information Security Officer at People.ai

When ChatGPT burst onto the scene in 2022, it brought generative AI to the forefront of our collective consciousness and launched a new era in technology that is disrupting the way employees interact with technology to get work done. Suddenly, every organization wants “AI.”

In a 2024 survey of 3,000+ business and IT leaders, 92% of respondents agreed their organization needs to shift to an AI-first operating model within the next 12 months to stay competitive. In another survey by Salesforce, sales professionals said AI will help them increase productivity, increase sales, and better serve customers. 

Despite broad agreement that AI will help organizations stay competitive, there is still a lack of education in the industry about how the technology can be implemented safely and at scale. And since AI relies on massive amounts of sensitive data to work effectively, a lot of work needs to be done around AI data security and legal governance to protect organizations. Much of the burden of creating these new policies will fall on the shoulders of cybersecurity and legal teams. 

When AI education doesn’t keep up with demand 

What we’re learning as we work with customers at People.ai is that the interest in AI products is far greater than most people’s understanding of the technology. There’s a good reason for that - and it’s not the customer’s fault. 

AI presents a completely new way of working and solving problems - it’s not just another tool to adopt. There are very few processes that can’t be improved by AI. This tectonic shift in the way we work is similar to the transition from on-prem to cloud. It took individuals and organizations years to understand what the cloud was and how it worked. We’re in a similar early stage with AI. And the evolution of this technology is moving so fast that it’s hard for education to keep up with demand. 

Here are some examples of where AI confusion comes into play during contract discussions with our customers: 

  • Unclear on risks and biases: Most customers have heard many AI-related data risk buzzwords and security warnings, but have no idea if any of it applies to their own company or any AI company they’re evaluating. 
  • Misunderstanding the vendor’s AI stack: Customers often don't realize that we (and many other AI vendors) rely on subprocessors, like OpenAI, for specific tasks rather than building these AI services from scratch. This misunderstanding leads to questions that are better directed at base genAI providers like OpenAI than at a company that has built custom, industry-focused AI services on top of those base models.
  • Unclear on roles and responsibilities: Some of our customers include guidelines on how their own employees are expected to interact safely with our AI platform - something we have no control over. 

We spend a lot of time up front with our customers talking about what AI does and what we do as a company. We often end up taking the role of advisor and guiding customers and prospects on what they should be asking of an AI vendor when it comes to legal protections and data security. 

Here are the top things companies SHOULD be asking of their AI vendor:

  1. Does the product include generative AI or some other type of AI (e.g., NLP or ML)?
  2. Does the Agreement allow for customer data to be used to improve generative AI models?
  3. What type of customer data is used/needed for the training?
  4. Does the platform use shared or tenant-specific generative AI models?
  5. Who owns the input and output to the generative AI models?

If you’re in the procurement phase, here is a comprehensive guide and RFP for evaluating AI sales vendors. 

The growth of AI ushers in new threats to sensitive data

As AI continues to grow with no signs of slowing down, so does the threat to the sensitive data that feeds AI models. Organizations need to get even more serious about data protection - both within their own walls and in the security standards they require of their AI vendors. 

Multi-factor authentication, firewalls, and malware protection have all been the data protection gold standard for many years. But even with these protections, breaches still happen as determined hackers learn to circumvent even the strongest MFA methods. A threat actor can infiltrate an enterprise system in less than 10 minutes, making short-term access codes ineffective. 

If an employee or vendor’s identity is compromised, a large part of any organization is instantly exposed, giving hackers access to highly sensitive customer and company data — including financial records, PII, and trade secrets. Data exposure of any kind damages a company’s brand and erodes trust with customers, partners, and in some cases, employees. Sometimes irreversibly.

What does good data security look like at an AI company?

As tech stacks grow, employees and vendors inevitably gain full administrative privileges to different critical environments across the organization. And those standing privileges mean that when there’s a breach, the hacker instantly gains entry to everything that employee or vendor has access to. Excessive access and standing privilege are problems that need to be addressed internally and among your vendors. Here’s how we’re approaching it at People.ai. 

When it comes to data protection and security, two things are top of mind for every employee and customer:

  1. Who has access to the data? 
  2. How long do they have access to that data? 

It’s our goal to limit both of those as much as possible, and we do that in two ways. 

#1: Data access due diligence and auditability. If anyone requests to make changes to a customer dataset or create a new dataset on behalf of a customer, we automatically send three things to our point of contact at that company:

  • What is being requested. 
  • How long they will have access to the data.
  • The name and title of the requestor.

By recording and sharing these things with our customer for every single data-related request, we are creating an auditable record of access. No one interacts with customer data without both our team and our customer proactively knowing about it. 
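To make that concrete, here is a minimal sketch of what such an auditable notice could look like as a structured record. The field names, example values, and delivery mechanism are illustrative assumptions, not our actual implementation.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta, timezone
import json

@dataclass
class DataAccessNotice:
    """One auditable record per customer-data request, shared with the customer's point of contact."""
    request_description: str  # what is being requested
    access_expires_at: str    # how long the requestor will have access to the data
    requestor_name: str       # name of the requestor
    requestor_title: str      # title of the requestor

# Hypothetical example: a customer success manager asks to modify a dataset for four hours.
notice = DataAccessNotice(
    request_description="Update field mappings in the Acme Corp activity dataset",
    access_expires_at=(datetime.now(timezone.utc) + timedelta(hours=4)).isoformat(),
    requestor_name="Jane Doe",
    requestor_title="Customer Success Manager",
)

# Serialize for delivery to the customer contact (e.g., email or webhook) and for the audit log.
print(json.dumps(asdict(notice), indent=2))
```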

#2: Zero Standing Access. No one at People.ai has persistent access to any customer data. And when we say no one, that means absolutely no one at our company, including leadership. When someone needs to access a customer’s data for any reason, they must go through a documented data-access approvals process. Then, they are granted access for a specific period of time for a specific task. Once they’re done with the task, they automatically lose access. 

We do this at scale by leveraging a tool that specializes in zero standing access. For each request, the system evaluates hundreds of attributes drawn from our systems of record - like Salesforce, GitHub, AWS, and Databricks - to understand why access is needed. It then determines the most secure authorization based on detailed policies approved by the data security team.

For example, if one of our customer success managers is granted access to a specific customer’s data based on an open JIRA ticket, PagerDuty working hours, and known IP address, that access is granted until the JIRA ticket is closed, at which point it is immediately terminated. No manual interventions are required to change, grant, or remove access, ensuring data is only accessed when required. 
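As a rough illustration of that kind of attribute-based decision, the evaluation might look like the sketch below. The attribute names and the policy itself are assumptions for readability, not the tool's actual policy language.

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    """Attributes gathered from systems of record for a single data-access request."""
    requestor_role: str         # e.g., pulled from the identity provider
    jira_ticket_open: bool      # is there an open JIRA ticket tying the requestor to this customer?
    within_working_hours: bool  # does the PagerDuty schedule show the requestor on shift?
    source_ip_known: bool       # does the request come from a recognized IP address?

def access_allowed(req: AccessRequest) -> bool:
    """Access is allowed only while every policy condition holds. Re-evaluating this
    continuously means that closing the JIRA ticket (or any condition turning false)
    revokes access automatically, with no manual intervention."""
    return (
        req.requestor_role == "customer_success_manager"
        and req.jira_ticket_open
        and req.within_working_hours
        and req.source_ip_known
    )

# While the ticket is open and all conditions hold, access is allowed...
print(access_allowed(AccessRequest("customer_success_manager", jira_ticket_open=True,
                                   within_working_hours=True, source_ip_known=True)))  # True

# ...and as soon as the ticket is closed, the same evaluation denies (revokes) access.
print(access_allowed(AccessRequest("customer_success_manager", jira_ticket_open=False,
                                   within_working_hours=True, source_ip_known=True)))  # False
```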

Learn more about data security and governance at People.ai. 
