DataScava and RPA

We commissioned a series of articles from Scott Spangler, former IBM Watson Health Researcher, Chief Data Scientist, and author of the book “Mining the Talk: Unlocking the Business Value in Unstructured Information.” In these articles, he discusses how and why DataScava’s patented precise approach to mining unstructured text data perfectly complements real-world big data applications in AI, LLMs, ML, RPA, BI, Research, Talent, and BAU. He also contrasts our Tailored Topics Taxonomies, Domain-Specific Language Processing, and Weighted Topic Scoring methodologies with standard approaches such as NLP and NLU.

In “Consistent High-Quality RPA Requires Deep Customer Understanding,” Scott discusses:

  • His views on the difference between knowing and understanding when it comes to implementing RPA.
  • The drawbacks of using a pure Machine Learning/NLP approach to RPA.
  • The need for customer understanding through three fundamental capabilities: classification of content, characterization of the customer, and customization of features.
  • How DataScava can be employed to fill in these critical gaps and provide a better customer experience by readily capturing existing in-house expertise.

Here’s an excerpt:

“Customers love being understood. It’s just human nature to want to be seen as a unique individual by those we interact with. Therefore, good RPA systems have to work by first understanding the customer’s needs (all of them!), being aware of what the customer doesn’t need, what the customer prioritizes, and only then suggest a course of action (or maybe several, or none).

The DataScava approach enables this level of deep understanding. Multiple customer intents within text can be determined based on a detailed analysis of the unstructured text. Business rules that encode the Boolean logic of the solution space combined with Weighted Topic Scoring can be designed to apply the right solutions to the right situation. This includes the ability to encode rules of form X AND Y BUT NOT Z, as well as to assign different levels of importance to each topic. This precise level of characterization is what’s required to make each customer feel heard and understood.”
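To make the rule form in the excerpt concrete, here is a minimal Python sketch of combining a Boolean rule (X AND Y BUT NOT Z) with per-topic weights. It is an illustration only, not DataScava's implementation: the topic names, trigger phrases, weights, and the functions `topic_counts`, `weighted_score`, and `matches_rule` are all hypothetical, and simple substring counting stands in for whatever text analysis a real system would use.

```python
# Hypothetical taxonomy: each topic maps to phrases that signal it.
TOPICS = {
    "billing": ["invoice", "charge", "refund"],
    "cancellation": ["cancel", "close my account"],
    "upgrade": ["upgrade", "premium"],
}

# Hypothetical importance assigned to each topic.
WEIGHTS = {"billing": 2.0, "cancellation": 3.0, "upgrade": 1.0}


def topic_counts(text):
    """Count how often each topic's phrases occur in the text (case-insensitive)."""
    lowered = text.lower()
    return {
        topic: sum(lowered.count(phrase) for phrase in phrases)
        for topic, phrases in TOPICS.items()
    }


def weighted_score(counts):
    """Sum the topic counts, each multiplied by its topic weight."""
    return sum(WEIGHTS[topic] * count for topic, count in counts.items())


def matches_rule(counts, all_of, none_of):
    """Return True if every topic in all_of is present and no topic in none_of is."""
    has_required = all(counts.get(topic, 0) > 0 for topic in all_of)
    has_excluded = any(counts.get(topic, 0) > 0 for topic in none_of)
    return has_required and not has_excluded


if __name__ == "__main__":
    message = "Please refund the duplicate charge on my invoice and cancel my plan."
    counts = topic_counts(message)
    # Rule of the form X AND Y BUT NOT Z: billing AND cancellation BUT NOT upgrade.
    if matches_rule(counts, all_of=["billing", "cancellation"], none_of=["upgrade"]):
        print("Rule matched; weighted score:", weighted_score(counts))
```

In this sketch the Boolean rule decides whether a solution applies at all, while the weighted score could be used to rank or prioritize messages that match; a message mentioning several intents is captured because each topic is counted independently.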


Our Patented Approach


Domain-Specific Language Processing (DSLP)



Weighted Topic Scoring (WTS)


Tailored Topics Taxonomies (TTT)





Taxonomies for Financial and I.T. Domains



Taxonomies for Talent Matching and Skills Analytics