utterworks

06/01/2023

PII Anonymisation

Natural language processing (NLP) models can be very effective at anonymizing personally identifiable information (PII), which helps protect the privacy of individuals and comply with privacy regulations. Anonymization is the process of removing or obscuring PII in a text or other data source in such a way that the individual can no longer be identified.

There are several ways in which NLP models can be used for anonymization, including:

  1. Named entity recognition (NER): NER models can be used to identify and extract named entities (such as names, addresses, and phone numbers) from a text, and then replace these entities with placeholders or pseudonyms to anonymize the text.
  2. Redaction: NLP models can be used to identify sensitive information in a text and redact it (i.e., obscure it or remove it entirely) in order to protect the privacy of individuals.
  3. De-identification: NLP models can be used to de-identify texts by replacing or removing specific pieces of information (such as names, addresses, and phone numbers) that could be used to identify an individual.
  4. Pseudonymization: NLP models can be used to create pseudonyms for individuals by replacing their names with unique identifiers that cannot be traced back to the individual without separately held additional information (such as a lookup key).
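
As a rough illustration of the detect-then-replace pattern these approaches share, here is a minimal Python sketch. It assumes the detector returns character spans of the form (start, end, label), as NER models typically do. The regex-based `find_pii_spans` is only a hypothetical stand-in for a trained model, and all names here are illustrative rather than part of any product API.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity label)

# Assumption: two simple regexes stand in for a trained NER model.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def find_pii_spans(text: str) -> List[Span]:
    """Return the spans of detected PII, sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def replace_with_placeholders(text: str, spans: List[Span]) -> str:
    """Replace each span with a <LABEL> placeholder.

    Spans are assumed not to overlap. Replacing from right to left
    keeps the offsets of the remaining spans valid.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"<{label}>" + text[end:]
    return text


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or call 020 7946 0958."
    print(replace_with_placeholders(sample, find_pii_spans(sample)))
```

Swapping the regex detector for a real NER model would leave `replace_with_placeholders` unchanged, since it only depends on the span format.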

Overall, the use of NLP models for anonymization can help organizations protect the privacy of their customers or clients and comply with privacy regulations.

Fluent One uses pre-trained deep learning models to anonymize personally identifiable information (PII) in a piece of text. This feature uses the same pre-trained deep learning model used for entity recognition, but trained specifically to identify and anonymize PII in the input text. The model can identify PII such as names, addresses, phone numbers, email addresses, and customer reference numbers, and a number of different anonymisation techniques can be configured to protect the individual’s privacy whilst keeping the anonymised text meaningful. The techniques include:

  • Replacing with a fixed value
  • Replacing with a fake value
  • Masking
  • Removing
  • Hashing
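
To make the five techniques concrete, the following Python sketch applies each one to a single detected value. It is an illustration under stated assumptions, not Fluent One's implementation: the function names, the `FAKE_VALUES` pools, and the choice of keyed HMAC-SHA256 for hashing are all hypothetical.

```python
import hashlib
import hmac

# Assumption: small hard-coded pools of fake values, for illustration only.
FAKE_VALUES = {
    "NAME": ["Alex Taylor", "Sam Morgan", "Chris Bailey"],
    "EMAIL": ["user1@example.com", "user2@example.com"],
}


def replace_fixed(value: str, label: str) -> str:
    """Replace with a fixed value: every NAME becomes <NAME>, and so on."""
    return f"<{label}>"


def replace_fake(value: str, label: str) -> str:
    """Replace with a fake value, chosen deterministically so the same
    input always maps to the same fake (different inputs may collide)."""
    pool = FAKE_VALUES[label]
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def mask(value: str, keep_last: int = 4, char: str = "*") -> str:
    """Mask all but the last keep_last characters; mask everything if the
    value is too short to leave anything visible safely."""
    if keep_last <= 0 or keep_last >= len(value):
        return char * len(value)
    return char * (len(value) - keep_last) + value[-keep_last:]


def remove(value: str) -> str:
    """Remove the value entirely."""
    return ""


def hash_value(value: str, key: bytes) -> str:
    """Replace with a keyed hash (HMAC-SHA256, truncated to 16 hex chars)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
```

Fake values and masking tend to keep the text readable for downstream processing, while removal and hashing give up more readability in exchange for revealing less. A keyed hash is used in this sketch because an unkeyed hash of a short value such as a phone number can be recovered by brute force.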

The Fluent One PII feature also provides a service to reverse the anonymisation process. This is ideal when calling a less secure third-party cloud service from a secure internal service: anonymise the text before passing it to the cloud service, then reverse the anonymisation to continue processing in the secure environment.
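
One way such a round trip could work is to keep a mapping from placeholder tokens back to the original values inside the secure environment. The Python sketch below illustrates that idea only. It assumes detected PII arrives as non-overlapping (start, end, label) spans, and the function names and token format are hypothetical, not the Fluent One API.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity label)


def anonymise(text: str, spans: List[Span]) -> Tuple[str, Dict[str, str]]:
    """Replace each span with a numbered token such as [NAME_1] and return
    the anonymised text plus the token-to-original mapping.

    The mapping must stay in the secure environment.
    """
    mapping: Dict[str, str] = {}
    counters: Dict[str, int] = {}
    parts: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        counters[label] = counters.get(label, 0) + 1
        token = f"[{label}_{counters[label]}]"
        mapping[token] = text[start:end]
        parts.append(text[cursor:start])  # untouched text before the span
        parts.append(token)
        cursor = end
    parts.append(text[cursor:])  # untouched text after the last span
    return "".join(parts), mapping


def deanonymise(text: str, mapping: Dict[str, str]) -> str:
    """Restore original values wherever their tokens still appear.

    Assumes the original values do not themselves contain token-like text.
    """
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

Because tokens are restored by string replacement, the text returned by the cloud service can be edited or reordered and still be reversed, provided the tokens themselves come back unchanged.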

Copyright © 2023 UTTERWORKS LTD Company no: 12186421 Registered in England and Wales