Bitsight GIA: AI-Powered Asset Mapping and Attribution

Last month, my colleague Arzu Ozbek Akay shared some insights about the impact that Bitsight Groma, our next-generation scanner, is already having on our products. Today, I’m going to follow that up with an update on the momentum we’re seeing with the second core component of our data engine: Bitsight Graph of Internet Assets (GIA).

As a quick refresher, GIA uses advanced graph technology and AI models to map assets to specific organizations and build Ratings Trees at a global scale. It’s critically important to Bitsight’s products since it allows our customers to understand their assets and those of their third parties and fourth parties. It minimizes visibility gaps and supports our human curation efforts.

Let’s take a closer look at how GIA works and the early impact it is having on the Bitsight product experience.

Key takeaways

  • Bitsight GIA combines AI with human curation to map the world’s internet-connected assets
  • The curated training data that powers GIA includes millions of data points spanning over 540,000 organizations, and it grows by about 10,000 organizations per month
  • GIA has already made asset assignment refreshes for Bitsight’s inventory 2X faster for some attribute types, expanded asset discovery, and improved the accuracy and coverage of initial human-led mapping

Combining AI with human curation

One of the most transformational aspects of GIA is its innovative use of AI models. Many existing mapping and attribution approaches rely on static rules to automate various steps in the process. Next-generation approaches like GIA instead use AI models to:

  • Discover relationships between assets with a high degree of confidence
  • Create multi-step associations as they navigate the graph, finding more complex linkages and extending what we provide our customers
  • Continually improve mapping speed and accuracy over time
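To make the idea of multi-step associations concrete, here is a minimal sketch in Python. It is not Bitsight's implementation: the edge list, the confidence values, and the function name `best_association_confidence` are all hypothetical. It assumes each link between two assets carries a confidence between 0 and 1, and that the confidence of a multi-step chain is the product of its links.

```python
import heapq


def best_association_confidence(edges, source, target):
    """Return the highest chain confidence linking source to target.

    Assumption (illustrative only): a chain's confidence is the product of
    its edge confidences, each in [0, 1]. Because multiplying by a value in
    [0, 1] never increases the product, a Dijkstra-style best-first search
    finds the strongest chain.
    """
    # Build an undirected adjacency list from (node_a, node_b, confidence).
    graph = {}
    for a, b, confidence in edges:
        graph.setdefault(a, []).append((b, confidence))
        graph.setdefault(b, []).append((a, confidence))

    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated confidence
    while heap:
        negated, node = heapq.heappop(heap)
        confidence = -negated
        if node == target:
            return confidence
        if confidence < best.get(node, 0.0):
            continue  # stale heap entry; a stronger chain was already found
        for neighbor, edge_confidence in graph.get(node, []):
            candidate = confidence * edge_confidence
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0  # no chain connects the two nodes


# Hypothetical evidence: an IP linked to an organization directly (weakly)
# and through a certificate and a domain (more strongly).
edges = [
    ("ip:203.0.113.10", "cert:example.com", 0.9),
    ("cert:example.com", "domain:example.com", 0.95),
    ("domain:example.com", "org:Example Corp", 0.8),
    ("ip:203.0.113.10", "org:Example Corp", 0.5),
]
score = best_association_confidence(edges, "ip:203.0.113.10", "org:Example Corp")
```

In this toy graph the three-step chain through the certificate and domain scores higher than the weak direct link, which is the kind of linkage a single static rule would be likely to miss.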

GIA has already proven itself to be highly effective at these tasks for two primary reasons.

1. Depth of training data

As with many AI applications, output quality is driven by the quantity and quality of training data. To support GIA’s training needs, we’ve amassed a human-tagged training data set that spans over 540,000 organizations and millions of assets and evidentiary data points. To the best of our knowledge, this is the largest set of training data of this kind in existence, and it expands by about 10,000 organizations monthly. Because of the human curation process, GIA models are continually learning and evolving based on the most up-to-date and accurate data.

2. A commitment to human data curation

While we view AI as a transformational technology, we don’t see it as a complete replacement for human analysis. In addition to being large, GIA’s training data set was expertly curated by our research team. This unlocked the full potential of GIA’s models by giving them a pristine foundation to build on. We’ve also avoided the temptation to trust automated mapping outputs implicitly, applying human curation on both the input and output sides of our AI process. This is extending our lead in data quality.

GIA deep dive


Data signals GIA uses

Another important way that GIA is evolving our mapping and attribution approach is by correlating disparate data points to create more precise, higher-confidence mappings. The primary sources of asset information we are targeting with GIA include scanning results, WHOIS information, DNS queries, BGP lookups, and TLS/SSL certificate inspection. We are also building out GIA’s capability to use other types of corroborating information to increase mapping and attribution confidence levels, and we will keep adding these signals to the product over time.
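As a rough illustration of why corroborating signals raise confidence, the sketch below combines per-signal confidences with a noisy-OR rule. This is an assumption made for the example, not a description of how GIA scores mappings: the function name `combine_signals`, the signal labels, and the numbers are hypothetical, and the rule treats the signals as independent, which real-world signals often are not.

```python
from math import prod


def combine_signals(signal_confidences):
    """Combine per-signal confidences with a noisy-OR rule.

    Assumption (illustrative only): each signal independently supports the
    mapping with the given probability, so the combined confidence is
    1 - product(1 - c_i) over all signals.
    """
    for name, confidence in signal_confidences.items():
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence for {name!r} must be in [0, 1]")
    return 1.0 - prod(1.0 - c for c in signal_confidences.values())


# Hypothetical evidence that one IP address belongs to one organization.
evidence = {
    "whois": 0.5,
    "dns": 0.5,
}
combined = combine_signals(evidence)
```

Under this rule, two moderate signals that agree yield a higher combined confidence than either one alone, and a mapping with no supporting signals scores zero.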

GIA’s early impact on Bitsight’s products

While it has only been a relatively short time since we introduced GIA, it is already impacting the data that powers Bitsight’s products in numerous ways.

Here are a few examples:

  • GIA has made attribution refreshes 2X faster for IPs assigned based on certificates
  • It’s supporting initial asset mapping and learning from human responses to its suggestions, which improves the model and helps the curated maps become more complete and available faster
  • A new probable assets feed will bring additional exposure data to customers and help our human curators increase the number of high-confidence asset mappings

Most importantly, GIA’s AI models keep evolving as they are exposed to new training data, inputs from our researchers, and direct feedback from customers.

Learn More About Bitsight’s Data Discovery, Attribution, and Observation Capabilities

For a more comprehensive overview of Bitsight’s internet scanning and AI-based graphing capabilities, download our white paper, “A Data-Driven Approach to Asset Discovery and Risk Measurement.”

You’ll learn how Bitsight GIA and Bitsight Groma, our scanning engine, work in concert to build a living map of the world’s digital ecosystem and capture deep insights about organizational risk posture.

Download your copy today.

Download this white paper to learn about Bitsight’s approach and how it sets itself apart from alternatives in the market.