By Cynthia Hetherington

As cyber intelligence investigators, we need to assess what is a real issue and what is simply white noise. To respond quickly and efficiently, we need a platform and a format that help organize everything coming in over the wires. Consider a Twitter-popular mega celebrity tweeting her “hot new eyelashes are on fire, OMG…so excited!” The phrase “on fire” will flag every social media monitor programmed to watch for public safety incidents.

With social media posts, monitoring systems, message boards, and local television news programs, it can quickly become a herculean effort to whittle thousands of messages down to the two truly important ones, often within an hour’s deadline. The good news is that vendors are building platforms based on artificial intelligence to help identify and refine copious amounts of raw content into useful intelligence. Each new resource helps with the process.

In this two-part blog series, Hg discusses how to capture and record your findings and explores the ever-growing field of tools and resources to make your searches thorough and efficient. Last week we explored tools and resources. This week, we dive into using taxonomy and word clouds to assist in your investigations and reports.

Drinking from a Firehose

Today’s cyber intelligence investigators are tasked with drinking from a firehose of information.

Investigators must filter through the noise of oversharing on social media accounts, online and print media, 24/7 news channels, as well as trade and journal publications. The tsunami of information overwhelms even the simplest of searches for valuable case content.

With the overabundance of information come many new technology resources that help the searcher reduce the flood to a trickle and extract from it the drops of data that are relevant. To mix metaphors, think of the fishing phrase “catch and release”; let’s call this approach “capture and awareness.”

For example: If you are investigating Carlos Gonzalez, a man known to be a resident of Norwalk, California, you can start with the following Google search terms to see, rather quickly, which might work best:

  1. carlos gonzalez
  2. “carlos gonzalez”
  3. “carlos gonzalez” norwalk
  4. “carlos gonzalez” * norwalk

The * (asterisk) operates as a 15-word proximity Boolean operator between the two search sets of Carlos Gonzalez and Norwalk. That is, the phrase carlos gonzalez needs to appear within 15 words of the word Norwalk. This technique should reduce the search results from millions to maybe 10. This is an example of using available technology to its best ability to capture and learn from the Internet. Need to do this regularly or in larger volume? The following guidance might be helpful in your own investigations.
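If you run name searches like these regularly, the four variants above can be generated in code. What follows is a minimal Python sketch, not a definitive tool: the function names query_variants and google_url are hypothetical, and it assumes the standard Google search address with a q parameter. It only builds the search strings and links; it does not retrieve or count results.

```python
from urllib.parse import quote_plus

GOOGLE_SEARCH = "https://www.google.com/search?q="


def query_variants(name, place):
    """Return the four search strings from the example, broadest first."""
    phrase = f'"{name}"'  # straight quotes request an exact phrase
    return [
        name,                    # 1. bare words
        phrase,                  # 2. exact phrase
        f"{phrase} {place}",     # 3. exact phrase plus location
        f"{phrase} * {place}",   # 4. exact phrase, asterisk, location
    ]


def google_url(query):
    """URL-encode a query so it can be opened in a browser."""
    return GOOGLE_SEARCH + quote_plus(query)


if __name__ == "__main__":
    for q in query_variants("carlos gonzalez", "norwalk"):
        print(q, "->", google_url(q))
```

Opening each link in turn lets you compare the variants side by side and see which one narrows the results best for your subject.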

Capture and Awareness Information

When starting your investigation, create an initial word list of terms, then expand it with helpful words that you might not have thought of at first. For example, suppose your case involves maintenance men who had been in escalator accidents. The client doesn’t want information about accidents involving patrons (of which there are hundreds); the client wants to know specifically about the workers who repair and maintain movable floors and stairs.

Using a Taxonomy

Taxonomy is a classification system. For our escalator example, use Google or another search engine to search for the key terms “escalator, accident, maintenance.” Your search will reveal that there are other terms that can also work for this search. You are now developing a word taxonomy—a list of all possible terms and expressions that might get you closer to relevant research that might produce answers.

Organized into three columns, your taxonomy list for the escalator example would look like this:

During your search, you may find a brand-new lead and want to pursue it. Don’t go down that rabbit hole. You will be better served if you finish the original inquiry, and then return to the new lead. This disciplined, “to-the-end” approach keeps you focused and prevents you from wandering, scatterbrained, around the Internet.

Record this list in a computer file; also keep descriptive notes for yourself in that file.

Documenting your research in an electronic file will pay off when you later (often much later) have to refer back to your notes to follow up on the investigation. You can electronically search one document rather than spend time digging through piles of handwritten paper notes (which, over time, might get lost). By electronically managing a word and subject directory, you can easily incorporate the items you searched for and where you looked into your final report.
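One way to keep a three-column taxonomy in an electronic file is sketched below in Python. This is an illustration under stated assumptions, not the original taxonomy list: the column names (equipment, event, role) and every term in TAXONOMY are hypothetical placeholders, as are the function names. The sketch combines one term from each column into a search string and renders the whole list as CSV text that can be saved and searched later.

```python
import csv
import io
from itertools import product, zip_longest

# Hypothetical three-column taxonomy. The column names and terms are
# illustrative placeholders, not the original list from this article.
TAXONOMY = {
    "equipment": ["escalator", "moving walkway"],
    "event": ["accident", "injury"],
    "role": ["maintenance", "repair", "mechanic"],
}


def build_queries(taxonomy):
    """Combine one term from each column into a search string."""
    queries = []
    for combo in product(*taxonomy.values()):
        # Quote multi-word terms so they are searched as exact phrases.
        terms = [f'"{t}"' if " " in t else t for t in combo]
        queries.append(" ".join(terms))
    return queries


def taxonomy_to_csv(taxonomy):
    """Render the taxonomy as CSV text, one column per category."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(taxonomy.keys())
    # Columns can differ in length; pad the short ones with blanks.
    writer.writerows(zip_longest(*taxonomy.values(), fillvalue=""))
    return buffer.getvalue()


if __name__ == "__main__":
    print(taxonomy_to_csv(TAXONOMY))
    for query in build_queries(TAXONOMY):
        print(query)
```

Because every combination is generated from the same file, adding a newly discovered term to one column automatically extends the list of searches to run.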

Recording Your Findings

It’s important in research to establish a methodology that keeps you attuned to your research results. Try the combination approach of finding and recording information in the initial stages of gathering research.

Record the findings in a consistent way so that you can return to the search and repeat the steps. This consistency will help you create a more professional-looking report that will benefit the client.
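A consistent record can be as simple as a log with the same fields for every search. The Python sketch below shows one possible layout; the SearchRecord fields (searched_on, source, query, notes) and the function names are assumptions chosen for illustration, so adapt them to whatever your reports require.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SearchRecord:
    """One search step, recorded the same way every time (hypothetical fields)."""
    searched_on: str  # date of the search, e.g. "2024-01-15"
    source: str       # where you looked
    query: str        # exactly what you typed
    notes: str        # what you found, or that you found nothing


def write_log(records, handle):
    """Write records as CSV with a header row to an open text handle."""
    names = [f.name for f in fields(SearchRecord)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


def read_log(handle):
    """Read a log written by write_log back into SearchRecord objects."""
    return [SearchRecord(**row) for row in csv.DictReader(handle)]


if __name__ == "__main__":
    log = io.StringIO()
    write_log(
        [SearchRecord("2024-01-15", "Google", '"carlos gonzalez" norwalk',
                      "Placeholder note for illustration")],
        log,
    )
    print(log.getvalue())
```

Recording the exact query text alongside the source and date is what makes a search repeatable months later.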

Use a writing style manual to help your writing be clear, clean, and precise. There are many to choose from (e.g., The Associated Press Stylebook, The Chicago Manual of Style, and the MLA Handbook, to name but a few). The key is choosing one and being consistent.

Not sure your writing is of high quality? Show a colleague a redacted copy of your report and ask for feedback on flow, content, and readability. Doing so will help you refine your reports, which will pay off later when a potential client wishes to see a sample of your work.

Also be consistent in your method, keeping in mind how you arrived at your findings. This way your client can follow your work, and you will be ready to explain, in as much depth as necessary, why you chose those key items.

Write in Word Clouds

Draw your client’s attention to all the words that can be used to describe a single item. For example, the Scots have over 400 words to describe snow. If you were researching a Scottish-based firm that had losses due to weather, you would likely need to search on 400+ words in your Capture and Awareness method. Demonstrate the volume of available research words in your client report by using a word cloud. Free online tools can generate one for you.
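A word cloud sizes each word by how often it appears, so the underlying step is a simple word count. The Python sketch below shows that step under stated assumptions: the stopword list, the 10 to 40 point size range, and the function names are all hypothetical choices, and drawing the cloud itself is left to whichever tool you prefer.

```python
import re
from collections import Counter

# Hypothetical stopword list; extend it to suit your material.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "on", "to", "was", "while"}


def term_counts(text):
    """Count how often each non-stopword appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def font_sizes(counts, min_pt=10, max_pt=40):
    """Scale counts linearly so the most frequent word gets max_pt."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for word, count in counts.items():
        if span == 0:
            sizes[word] = max_pt  # every word is equally frequent
        else:
            sizes[word] = round(min_pt + (count - low) * (max_pt - min_pt) / span)
    return sizes


if __name__ == "__main__":
    sample = "escalator repair escalator mechanic escalator repair"
    print(font_sizes(term_counts(sample)))
```

Feeding your saved search terms and notes through a count like this shows the client, at a glance, which words dominated the research.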








Cynthia Hetherington, MLS, MSM, CFE, CII is the founder and president of Hetherington Group, a consulting, publishing, and training firm that leads in due diligence, corporate intelligence, and cyber investigations by keeping pace with the latest security threats and assessments. She has authored three books on how to conduct investigations, is the publisher of the newsletter, Data2know: Internet and Online Intelligence, and annually trains thousands of investigators, security professionals, attorneys, accountants, auditors, military intelligence professionals, and federal, state, and local agencies on best practices in the public and private sectors.