Vendor Alias Resolution at Scale: How Normalisation Engines Handle 400+ Manufacturer Variants
A technical deep-dive into the three-layer alias resolution architecture — exact match, suffix stripping, and fuzzy matching — that handles acquisition history, casing, and typos at production scale.
Resolving 'Liebert' to 'Vertiv', 'LENOVA' to 'Lenovo', and 'APC by Schneider Electric' to 'Schneider Electric' sounds simple. At scale, with 400+ variants and acquisition history to track, it requires a structured three-layer approach.
Key Takeaways
- Effective vendor alias resolution requires three layers: exact match (fast, handles the majority), suffix stripping (handles corporate suffixes), and fuzzy match (handles typos and abbreviations).
- Acquisition history is the hardest part of vendor normalisation. 'Liebert' is now 'Vertiv', but 'Liebert by Vertiv' and 'Liebert Corp' are still in the wild. Each variant needs an explicit alias.
- ALL_CAPS brands (QNAP, STULZ, IBM, APC) and preserved-case brands (CoolIT, VMware, NetBox) must be handled separately from standard title-casing to avoid incorrect output.
- Confidence scoring on alias matches allows downstream systems to treat high-confidence matches differently from low-confidence ones that need manual review.
- A compiled-in alias library (embedded in code at build time) is faster and more reliable than a file-based or database-backed library for per-record normalisation at production throughput.
The Scale of the Problem
A single enterprise data centre inventory with 1,000 to 2,000 assets will typically contain 40 to 80 unique manufacturer name variants. A multi-site inventory aggregated from a dozen source systems can contain 200 or more. Across the full range of DC hardware vendors — servers, networking, storage, power, cooling, security — the total universe of variants in the wild exceeds 400.
This is not a problem you can solve manually at scale. Processing 2,000 records with 80 manufacturer variants by hand takes hours and produces inconsistent results. The same engineer will resolve "Dell EMC" to "Dell Technologies" in the morning and "Dell" in the afternoon. A normalisation engine that applies a consistent, deterministic algorithm to every record is the only reliable approach.
This article explains how that algorithm works.
Layer 1: Exact Match
The first layer of alias resolution is exact match. The input string is normalised — converted to lowercase, leading and trailing whitespace stripped — and compared directly against a lookup table of known aliases.
The lookup table maps every known variant to its canonical form:
| Input (normalised) | Canonical Output |
|---|---|
| dell emc | Dell Technologies |
| dell inc | Dell Technologies |
| dell inc. | Dell Technologies |
| dell | Dell Technologies |
| hpe | Hewlett Packard Enterprise |
| hp | Hewlett Packard Enterprise |
| hewlett-packard | Hewlett Packard Enterprise |
| apc | Schneider Electric |
| apc by schneider electric | Schneider Electric |
| american power conversion | Schneider Electric |
| liebert | Vertiv |
| emerson network power | Vertiv |
| emerson | Vertiv |
Exact match handles the majority of cases, typically 70 to 80% of all records in a real-world inventory. It is fast (an O(1) lookup against a hash map) and deterministic (the same input always produces the same output). For a broader overview of why vendor normalisation matters for DCIM imports, see the article "What Is DCIM Data Normalisation?"
The limitation of exact match is that it requires every variant to be explicitly listed in the lookup table. A variant that is not in the table will not be resolved. This is why the lookup table must be comprehensive — and why it must be actively maintained as new variants appear.
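The exact match layer can be sketched in a few lines of TypeScript. This is a minimal illustration, not the engine's actual code: the names `EXACT_ALIASES`, `normaliseKey`, and `exactMatch` are hypothetical, and the table holds only a handful of the entries shown above.

```typescript
// Illustrative subset of the alias table. Keys are stored in normalised
// form (lowercase, no surrounding whitespace).
const EXACT_ALIASES: ReadonlyMap<string, string> = new Map<string, string>([
  ["dell emc", "Dell Technologies"],
  ["dell inc.", "Dell Technologies"],
  ["dell", "Dell Technologies"],
  ["hpe", "Hewlett Packard Enterprise"],
  ["hewlett-packard", "Hewlett Packard Enterprise"],
  ["apc by schneider electric", "Schneider Electric"],
  ["liebert", "Vertiv"],
  ["emerson network power", "Vertiv"],
]);

// Normalise the raw input the same way the keys were normalised.
function normaliseKey(raw: string): string {
  return raw.trim().toLowerCase();
}

// Returns the canonical name, or undefined so the caller can fall
// through to the next layer.
function exactMatch(raw: string): string | undefined {
  return EXACT_ALIASES.get(normaliseKey(raw));
}
```

Here `exactMatch("  Dell EMC ")` returns "Dell Technologies", while `exactMatch("Lenovo Group")` returns `undefined` because that variant is not in this small table.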
Layer 2: Suffix Stripping
The second layer handles corporate suffixes — the ", Inc.", ", Ltd.", " Group", " Corporation", " Holdings" suffixes that appear on legal entity names but are not part of the brand name used in practice.
Suffix stripping works by applying a set of regex patterns to the input string after exact match has failed:
| Suffix | Action |
|---|---|
| ", Inc." or ", Inc" | Remove |
| ", Ltd." or ", Ltd" | Remove |
| " Corp." or " Corp" | Remove |
| " Corporation" | Remove |
| " Group" | Remove |
| " Holdings" | Remove |
| " Technologies" | Remove, then re-match |
After stripping the suffix, the result is re-submitted to the exact match layer. "Cisco Systems, Inc." strips to "Cisco Systems", which the exact match layer resolves through its "Cisco Systems" entry. "Lenovo Group" strips to "Lenovo", which resolves through the "Lenovo" entry.
Suffix stripping handles a significant portion of the variants that exact match misses — particularly variants from legal entity names in procurement systems and financial asset registers.
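A hedged sketch of the strip-then-re-match step follows. It assumes a single combined regex anchored at the end of the string and strips at most one suffix per pass; `SUFFIX_PATTERN`, `stripSuffix`, `resolveWithSuffixStripping`, and the three-entry `SUFFIX_DEMO_ALIASES` table are illustrative names, not part of any real library.

```typescript
// Hypothetical alias subset, keyed by lowercased, trimmed input.
const SUFFIX_DEMO_ALIASES = new Map<string, string>([
  ["cisco systems", "Cisco Systems"],
  ["lenovo", "Lenovo"],
  ["dell", "Dell Technologies"],
]);

// One pattern for the suffixes listed above. It consumes the separating
// comma/whitespace before the suffix and an optional trailing full stop,
// and only matches at the very end of the string.
const SUFFIX_PATTERN =
  /[\s,]+(inc|ltd|corp|corporation|group|holdings|technologies)\.?$/i;

function stripSuffix(raw: string): string {
  return raw.trim().replace(SUFFIX_PATTERN, "").trim();
}

// Try exact match first; if that fails, strip one suffix and try again.
function resolveWithSuffixStripping(raw: string): string | undefined {
  const key = raw.trim().toLowerCase();
  const direct = SUFFIX_DEMO_ALIASES.get(key);
  if (direct !== undefined) return direct;

  const stripped = stripSuffix(key);
  // Nothing was stripped (or nothing is left): no point re-matching.
  if (stripped === key || stripped === "") return undefined;
  return SUFFIX_DEMO_ALIASES.get(stripped);
}
```

Because the pattern requires a separator before the suffix, a bare input such as "Group" is left untouched rather than stripped to an empty string.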
Layer 3: Fuzzy Match
The third layer handles typos, abbreviations, and variants that cannot be resolved by exact match or suffix stripping. Fuzzy matching uses edit distance (Levenshtein distance) or token overlap (Jaccard similarity) to find the closest match in the alias table.
Fuzzy match is slower than exact match — O(n) against the alias table rather than O(1) — and it requires a confidence threshold to avoid false positives. A fuzzy match with a similarity score below the threshold is not applied; instead, the record is flagged for manual review.
Common cases that fuzzy match resolves:
| Input | Resolved To | Match Type |
|---|---|---|
| LENOVA | Lenovo | Typo (edit distance 1) |
| Cisco Sys | Cisco Systems | Abbreviation (token overlap) |
| Juniper Net | Juniper Networks | Abbreviation |
| Schneider Elec | Schneider Electric | Abbreviation |
| Palo Alto Net | Palo Alto Networks | Abbreviation |
Fuzzy match is the safety net for the long tail of variants that are too unusual to be in the exact match table but too close to a known vendor to be genuinely unknown.
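The typo case can be illustrated with a small Levenshtein-based matcher. This sketch makes several assumptions: similarity is defined as 1 minus the edit distance divided by the longer string's length, the threshold of 0.7 is an arbitrary illustrative value, and the names (`levenshtein`, `editSimilarity`, `fuzzyMatch`, `FUZZY_CANDIDATES`) are hypothetical. It covers edit distance only; abbreviations such as "Cisco Sys" score poorly on this measure and would need a token-based or prefix-based comparison, which is not shown.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed similarity measure: 1.0 for identical strings, 0.0 for no overlap.
function editSimilarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

interface FuzzyResult {
  canonical: string;
  similarity: number;
}

// Hypothetical candidates: [normalised alias, canonical name].
const FUZZY_CANDIDATES: ReadonlyArray<[string, string]> = [
  ["lenovo", "Lenovo"],
  ["cisco systems", "Cisco Systems"],
  ["juniper networks", "Juniper Networks"],
];

const FUZZY_THRESHOLD = 0.7; // illustrative value, tune against real data

// O(n) scan over the candidates; returns undefined below the threshold
// so the record can be flagged for manual review.
function fuzzyMatch(raw: string): FuzzyResult | undefined {
  const key = raw.trim().toLowerCase();
  let best: FuzzyResult | undefined;
  for (const [alias, canonical] of FUZZY_CANDIDATES) {
    const score = editSimilarity(key, alias);
    if (best === undefined || score > best.similarity) {
      best = { canonical, similarity: score };
    }
  }
  return best !== undefined && best.similarity >= FUZZY_THRESHOLD ? best : undefined;
}
```

With these definitions, "LENOVA" is one substitution away from "lenovo", giving a similarity of 5/6 (about 0.83) and a match to "Lenovo".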
Handling Acquisition History
Acquisition history is the hardest part of vendor alias resolution. When a company is acquired, its products continue to appear in asset inventories under the old brand name for years — sometimes decades — after the acquisition. The alias library must track not just current brand names but the full chain of acquisitions.
The most complex acquisition chains in the DC hardware space:
Vertiv: Liebert → Emerson Network Power → Emerson → Vertiv (Emerson Network Power was sold to Platinum Equity in 2016 and rebranded as Vertiv that year). Variants include: Liebert, Liebert Corp, Liebert Corporation, Liebert by Vertiv, Emerson Network Power, Emerson, Emerson Electric (the parent, not the DC brand), Geist (acquired by Vertiv in 2018).
Hewlett Packard Enterprise: HP → Hewlett-Packard → HPE (split from HP Inc. in 2015). Variants include: HP, Hewlett-Packard, Hewlett Packard, H.P., HPE, H.P.E., HP Enterprise. Note: HP Inc. (printers and PCs) is a separate entity from HPE (servers and networking) — the alias library must distinguish between them.
Schneider Electric: APC → APC by Schneider Electric → Schneider Electric. Also: American Power Conversion (APC's original name), MGE UPS Systems (a Schneider Electric UPS business that was combined with APC after Schneider acquired APC in 2007), Clipsal (acquired by Schneider in 2003).
Cisco: Cisco Systems → Cisco. Also: Cisco Meraki (acquired 2012, kept as distinct brand), Cisco Webex (acquired 2007), Cisco AppDynamics (acquired 2017). Note: Cisco Meraki should resolve to "Cisco Meraki", not "Cisco Systems" — it is a distinct product line with its own device type library entries.
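One way to keep acquisition chains maintainable is to author aliases grouped by canonical vendor and flatten them into the lookup map, failing loudly if the same variant is claimed by two vendors. The sketch below is an assumption about how such a library could be organised; `VendorEntry`, `VENDOR_ENTRIES`, and `buildAliasMap` are hypothetical names, and the entries are a small subset of the variants listed above.

```typescript
interface VendorEntry {
  canonical: string;
  aliases: string[]; // legacy brands, legal names, and co-branded variants
}

// Illustrative subset only.
const VENDOR_ENTRIES: VendorEntry[] = [
  {
    canonical: "Vertiv",
    aliases: [
      "Vertiv", "Liebert", "Liebert Corp", "Liebert Corporation",
      "Liebert by Vertiv", "Emerson Network Power", "Geist",
    ],
  },
  {
    canonical: "Schneider Electric",
    aliases: [
      "Schneider Electric", "APC", "APC by Schneider Electric",
      "American Power Conversion", "MGE UPS Systems",
    ],
  },
  // Kept as its own canonical vendor rather than folded into its parent.
  { canonical: "Cisco Meraki", aliases: ["Cisco Meraki"] },
];

// Flatten the grouped entries into a normalised-key lookup map.
// Throws if one alias would resolve to two different canonical names.
function buildAliasMap(entries: VendorEntry[]): Map<string, string> {
  const aliasMap = new Map<string, string>();
  for (const entry of entries) {
    for (const alias of entry.aliases) {
      const key = alias.trim().toLowerCase();
      const existing = aliasMap.get(key);
      if (existing !== undefined && existing !== entry.canonical) {
        throw new Error(
          `Alias "${alias}" maps to both "${existing}" and "${entry.canonical}"`
        );
      }
      aliasMap.set(key, entry.canonical);
    }
  }
  return aliasMap;
}
```

The conflict check matters most for ambiguous legacy names: a bare "HP" or "Emerson" has to be assigned to exactly one canonical vendor, and the build fails if two entries disagree.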
Special Casing: ALL_CAPS and Preserved-Case Brands
Standard title-casing (capitalising the first letter of each word) is the default output format for normalised vendor names. But some brands use ALL_CAPS or non-standard capitalisation as part of their official identity.
ALL_CAPS brands that must not be title-cased:
- QNAP (not "Qnap")
- STULZ (not "Stulz")
- IBM (not "Ibm")
- APC (not "Apc")
- NVIDIA (not "Nvidia")
- ABB (not "Abb")
- F5 (title-casing leaves it unchanged, but it must not be split into "F 5")
Preserved-case brands with non-standard capitalisation:
- CoolIT (not "Coolit" or "COOLIT")
- VMware (not "Vmware" or "VMWARE")
- NetBox (not "Netbox" or "NETBOX")
- MinIO (not "Minio")
- WekaIO (not "Wekaio")
The normalisation engine must check a preserved-case lookup table before applying title-casing, and must check an ALL_CAPS set before applying any casing transformation.
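The lookup order described above might look like the following. This sketch assumes casing is applied word by word, which is a simplification; `ALL_CAPS_BRANDS`, `PRESERVED_CASE_BRANDS`, and `applyBrandCasing` are illustrative names.

```typescript
// Brands written entirely in capitals, stored lowercased for lookup.
const ALL_CAPS_BRANDS = new Set<string>([
  "qnap", "stulz", "ibm", "apc", "nvidia", "abb", "f5",
]);

// Brands with non-standard capitalisation: lowercased key to official form.
const PRESERVED_CASE_BRANDS = new Map<string, string>([
  ["coolit", "CoolIT"],
  ["vmware", "VMware"],
  ["netbox", "NetBox"],
  ["minio", "MinIO"],
  ["wekaio", "WekaIO"],
]);

function titleCaseWord(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
}

// Check the ALL_CAPS set first, then the preserved-case table, and only
// then fall back to title-casing. Words are split on whitespace only, so
// a token like "f5" is never broken apart.
function applyBrandCasing(vendorName: string): string {
  return vendorName
    .trim()
    .split(/\s+/)
    .map((word) => {
      const key = word.toLowerCase();
      if (ALL_CAPS_BRANDS.has(key)) return key.toUpperCase();
      const preserved = PRESERVED_CASE_BRANDS.get(key);
      if (preserved !== undefined) return preserved;
      return titleCaseWord(word);
    })
    .join(" ");
}
```

For example, `applyBrandCasing("VMWARE")` yields "VMware" and `applyBrandCasing("f5 networks")` yields "F5 Networks", while an ordinary name such as "schneider ELECTRIC" falls through to title-casing.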
Confidence Scoring on Alias Matches
Not all alias matches are equally reliable. An exact match against a hand-authored alias entry is highly reliable. A fuzzy match with a similarity score of 0.72 is less reliable. A match inferred from the hostname when the manufacturer field is empty is the least reliable.
Assigning a confidence score to each match allows downstream systems to treat high-confidence matches differently from low-confidence ones. Records with high-confidence matches can be imported directly. Records with low-confidence matches should be flagged for manual review before import.
A practical confidence scoring scheme:
| Match Type | Confidence |
|---|---|
| Exact match, hand-authored alias | 1.00 |
| Exact match, library alias | 0.95 |
| Suffix-stripped exact match | 0.90 |
| Fuzzy match, similarity > 0.85 | 0.80 |
| Fuzzy match, similarity 0.70–0.85 | 0.65 |
| Inferred from hostname | 0.50 |
| No match found | 0.00 |
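The table above translates directly into a small scoring function. In this sketch the match-kind labels, the function names, and the review threshold of 0.8 are assumptions chosen for illustration; the confidence values themselves come from the table. A fuzzy score below 0.70 is treated here as no match.

```typescript
type MatchKind =
  | "exact-hand-authored"
  | "exact-library"
  | "suffix-stripped"
  | "fuzzy"
  | "hostname-inferred"
  | "none";

type FixedKind = Exclude<MatchKind, "fuzzy">;

// Match types whose confidence does not depend on a similarity score.
const FIXED_CONFIDENCE: Record<FixedKind, number> = {
  "exact-hand-authored": 1.0,
  "exact-library": 0.95,
  "suffix-stripped": 0.9,
  "hostname-inferred": 0.5,
  "none": 0.0,
};

// Map a match type (plus similarity, for fuzzy matches) to a confidence.
function confidenceFor(kind: MatchKind, similarity: number = 0): number {
  if (kind === "fuzzy") {
    if (similarity > 0.85) return 0.8;
    if (similarity >= 0.7) return 0.65;
    return 0.0; // below the lower band: treated as no match
  }
  return FIXED_CONFIDENCE[kind];
}

// Hypothetical routing rule: anything under the threshold goes to review.
const REVIEW_THRESHOLD = 0.8;

function needsManualReview(confidence: number): boolean {
  return confidence < REVIEW_THRESHOLD;
}
```

Under this illustrative threshold, exact, suffix-stripped, and strong fuzzy matches are imported directly, while weaker fuzzy matches and hostname inferences are queued for review. Where the line sits is a policy decision for each deployment.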
The Compiled-In Library Advantage
A vendor alias library can be stored in three ways: as a file (CSV or JSON) loaded at startup, as a database table queried at runtime, or compiled into the application code at build time.
For per-record normalisation at production throughput, the compiled-in approach is fastest. The alias map is a JavaScript/TypeScript Map or object literal that is initialised once when the module loads and is available in memory for every subsequent lookup. There is no file I/O, no database query, and no network round-trip.
The trade-off is that updating the library requires a code change and a deployment. For a library that changes infrequently (vendor acquisitions happen a few times a year; new variants are added as they appear in customer data), this is an acceptable trade-off. For a library that changes frequently (for example, a customer-editable alias table), a database-backed approach is more appropriate.
Struktive uses a compiled-in library for the core alias map (400+ entries, zero I/O overhead) and a database-backed layer for customer-specific overrides (editable via the Ops Centre Reference Libraries UI). The compiled-in library handles the vast majority of cases; the database layer handles the edge cases that are specific to a customer's environment. For the full DCIM migration context in which alias resolution sits, see The DCIM Migration Project Playbook.
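The two-tier arrangement can be sketched as a resolver that consults customer overrides before the compiled-in map. This is an illustration under stated assumptions, not Struktive's implementation: `CORE_ALIASES` and `makeResolver` are hypothetical names, the overrides are passed in as an in-memory map standing in for whatever the database layer returns, and giving overrides precedence is an assumed policy.

```typescript
// Compiled-in core map: initialised once when the module loads.
const CORE_ALIASES: ReadonlyMap<string, string> = new Map<string, string>([
  ["dell", "Dell Technologies"],
  ["hpe", "Hewlett Packard Enterprise"],
  ["liebert", "Vertiv"],
]);

// Build a resolver for one customer. Overrides are assumed to have been
// loaded from the database ahead of time, so each lookup stays in memory.
function makeResolver(
  customerOverrides: ReadonlyMap<string, string>
): (raw: string) => string | undefined {
  return (raw: string): string | undefined => {
    const key = raw.trim().toLowerCase();
    const override = customerOverrides.get(key);
    return override !== undefined ? override : CORE_ALIASES.get(key);
  };
}

// Example: a customer who prefers the short form "Dell".
const resolveVendor = makeResolver(new Map<string, string>([["dell", "Dell"]]));
```

In this example `resolveVendor("Dell")` returns the customer's override "Dell", while `resolveVendor("HPE")` falls through to the core map and returns "Hewlett Packard Enterprise".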