DataScava helps companies leverage the digital world’s explosive growth of data. By 2022, more than 90 percent of the world’s electronic information will be unstructured, delivered in business reports, research papers, emails, user comments – all of them written in different styles and using terms whose meaning differs from sector to sector.

You don’t need to be a Data Scientist or Software Engineer to use it, but you can be.

Domain-Specific Language Processing (DSLP) and Weighted Topic Scoring (WTS), a methodology we’ve patented, use your business language and jargon to index and curate the most relevant documents from large datasets for use in AI/Machine Learning, prediction engines, business analytics and production systems, delivering verifiable results fast.

DataScava provides a true competitive advantage to companies in finance, technology, healthcare, insurance, manufacturing, the public sector or any other industry seeking to maximize the transformative power of data.


How We Do It

INDEX your selected unstructured textual content using your customized DSLP.

SCORE and generate Metadata including Topic Scores, Percentile Rankings, Data Tags and more.

MATCH your weighted search topics using your WTS and Company-Specific Search Templates.

CURATE a subset of highly precise files for your use and FILTER new data around the clock.


Domain-Specific Language Processing

Our Domain-Specific Language Processing converts unstructured textual content into precisely structured derived data and searchable document indices.

Built upon our two U.S. patents in “Profile Matching of Unstructured Data,” DataScava uses your business language, jargon and acronyms to index, search, score and match unstructured textual data points.

We don’t use any form of generalized Natural Language Processing (NLP), Natural Language Understanding (NLU), Semantic Search, linguistics or fuzzy logic, which keeps DataScava accessible to both technical and nontechnical users.
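As a rough illustration of indexing text against a user-supplied vocabulary without general-purpose NLP, the Python sketch below counts literal, case-insensitive matches of each topic's terms. It is a simplified assumption of how such an index could look, not DataScava's patented implementation; the topic names, terms and the `index_document` function are hypothetical.

```python
import re

# Hypothetical topic vocabulary: each topic maps to terms a business user supplies.
TOPICS = {
    "fixed_income": ["bond", "yield curve", "coupon"],
    "risk": ["var", "stress test", "counterparty"],
}

def index_document(text, topics):
    """Count literal, case-insensitive whole-word hits for each topic's terms."""
    lowered = text.lower()
    index = {}
    for topic, terms in topics.items():
        counts = {}
        for term in terms:
            # \b anchors keep "bond" from matching inside "bondholder".
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[term] = hits
        index[topic] = counts
    return index

doc = (
    "The bond desk ran a stress test after the yield curve inverted. "
    "Bond coupon unchanged."
)
doc_index = index_document(doc, TOPICS)
print(doc_index)
```

Because matching is literal, every hit can be traced back to a term the user chose, which is the property the white-box description relies on.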


Weighted Topic Scoring (WTS)

DataScava is the first and only product to offer patented Weighted Topic Scoring, which finds and filters the most relevant documents from large unstructured data sets, providing highly precise results you can see, control and measure.

WTS scores topic keywords and phrases found using DSLP, creating a shortlist of highly precise results that can be seen, controlled and measured. Users select search topics of interest, set the minimum score a file must reach in each topic to match, add “nice-to-haves,” and rank the resulting output.

WTS only matches files that meet or exceed ALL weighted topic score requirements, and highlights color-coded found topics in the file textual content. Users can adjust WTS and add or edit new topics on the fly, and home in further, using multi-level sort to drill down and surface key results.

Users can create, select and adjust all topics of interest on the fly, weight their significance, and rank the output. They set minimum “required” and “nice-to-have” score thresholds to be met in each topic, and home in further using multi-level sort to drill down on specific topics of interest, bringing strong matches quickly to the top.
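The matching rule described above (every required topic must meet its threshold, nice-to-haves only affect ranking) can be sketched in a few lines of Python. This is an illustrative assumption, not the patented WTS algorithm: the `TopicRule` class, the weights, the thresholds and the sample documents are all hypothetical, and the documents are assumed to be already indexed as per-topic term counts.

```python
from dataclasses import dataclass

@dataclass
class TopicRule:
    weights: dict          # term -> weight chosen by the user (hypothetical values)
    minimum: float = 0.0   # score threshold for this topic
    required: bool = True  # False marks a "nice-to-have" topic

def topic_score(term_counts, weights):
    """Weighted sum of term occurrences for one topic."""
    return sum(weights.get(term, 0.0) * n for term, n in term_counts.items())

def match(doc_index, rules):
    """A document matches only if ALL required topics meet their minimum."""
    scores = {t: topic_score(doc_index.get(t, {}), r.weights) for t, r in rules.items()}
    matched = all(scores[t] >= r.minimum for t, r in rules.items() if r.required)
    return matched, scores

def rank(docs, rules):
    """Keep matching documents; sort by required score, then nice-to-have score."""
    results = []
    for name, doc_index in docs.items():
        matched, scores = match(doc_index, rules)
        if not matched:
            continue
        required_total = sum(scores[t] for t, r in rules.items() if r.required)
        bonus = sum(
            scores[t] for t, r in rules.items()
            if not r.required and scores[t] >= r.minimum
        )
        results.append((name, required_total, bonus))
    # Multi-level sort: required score first, nice-to-have score as tie-breaker.
    results.sort(key=lambda row: (-row[1], -row[2]))
    return results

rules = {
    "fixed_income": TopicRule({"bond": 2.0, "yield curve": 3.0}, minimum=4.0),
    "risk": TopicRule({"stress test": 2.0}, minimum=2.0),
    "esg": TopicRule({"carbon": 1.0}, minimum=1.0, required=False),
}
docs = {
    "memo_a": {"fixed_income": {"bond": 2, "yield curve": 1}, "risk": {"stress test": 1}},
    "memo_b": {"fixed_income": {"bond": 1}, "risk": {"stress test": 3}},
    "memo_c": {"fixed_income": {"bond": 2, "yield curve": 1},
               "risk": {"stress test": 1}, "esg": {"carbon": 2}},
}
ranked = rank(docs, rules)
print(ranked)
```

In this sample, `memo_b` is dropped because it misses the `fixed_income` threshold even though its `risk` score is high, and `memo_c` ranks above `memo_a` only because of its nice-to-have `esg` score.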

Customized Company-Specific Search Templates and Topics Libraries, percentile rankings and “not” capability help you get to the precise data you need. Real-time scoring, visualization and highlighting of data points empower you to draw better insights and make more accurate business decisions.
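For readers who want a concrete picture of percentile rankings and the “not” capability, here is a small Python sketch. The percentile definition used (share of corpus scores less than or equal to a given score) and the exclusion rule are assumptions made for illustration; the function names and sample data are hypothetical and do not describe DataScava's actual implementation.

```python
def percentile_rank(score, all_scores):
    """Percentage of corpus scores that are less than or equal to `score`."""
    if not all_scores:
        return 0.0
    return 100.0 * sum(s <= score for s in all_scores) / len(all_scores)

def passes_not_filter(doc_index, excluded_topics):
    """Reject a document if any excluded topic has at least one term hit."""
    return not any(doc_index.get(topic) for topic in excluded_topics)

# Hypothetical topic scores for four documents in a corpus.
corpus_scores = {"doc_a": 7.0, "doc_b": 2.0, "doc_c": 9.0, "doc_d": 4.0}
values = list(corpus_scores.values())
percentiles = {name: percentile_rank(s, values) for name, s in corpus_scores.items()}
print(percentiles)

# Hypothetical per-topic term counts; "marketing" is the topic to exclude.
press_release = {"fixed_income": {"bond": 1}, "marketing": {"campaign": 3}}
research_note = {"fixed_income": {"bond": 4}}
print(passes_not_filter(press_release, ["marketing"]))
print(passes_not_filter(research_note, ["marketing"]))
```

A percentile places a raw score in the context of the whole corpus, while the exclusion check removes documents that mention an unwanted topic regardless of how well they score elsewhere.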

White-Box Approach, Human In Command

DataScava keeps the Human in Command through a white-box approach that uses human intelligence (not artificial intelligence) and machine training (not machine learning).

It empowers business users, data scientists, data analysts, and software engineers by automating time-consuming unstructured textual data preparation and mining tasks, positively impacting the bottom line.

Through use, as you continue to train DataScava using your subject matter expertise and automatic data filters that work for you around the clock, its capabilities grow, providing measurable benefits you can count on.

Control Input, Measure Output in AI/ML

In AI/ML and other data-driven systems, the quality of the output depends on the quality of the input. With bad data, applications produce results that are inaccurate, incomplete or incoherent.

By extracting precise user-defined domain-specific information from unstructured textual data, DataScava can help ensure input data is high quality, relevant and useful to your business, avoiding the problem of garbage in, garbage out.

Since DataScava operates on an alternate principle to AI, it provides a unique adjunct to AI environments to filter input and measure output to help you do a reality check on your assumptions.

It also uses a top-down analysis approach that informs the data filter based on corpus-level statistics, unlike AI systems which begin at the word level.

DataScava Powers TalentBrowser

DataScava’s story is backed by real-world results.

It powers TalentBrowser, the industry’s only Automated Job Matching, Skills Assessment and Domain-Specific Search Engine.

For more than a decade, management consultants, investment banks, Fortune 500 firms, staffing companies and others have benefited from TalentBrowser’s ability to find exceptional talent for jobs.