From Symbol Count to Bid-Ready Estimate: Inside Aginera's Electrical Extraction Pipeline
Most "AI takeoff" tools stop at symbol counting. They'll scan a drawing, identify that there are 89 duplex receptacles and 247 light fixtures, and hand you a list. That's useful, but it covers perhaps 25% of what an estimator actually needs to produce a bid.
The other 75% is the hard part: figuring out the conduit, the wire, the panel connections, the labor, and the assemblies that turn a device count into a priced estimate. That's what Aginera's electrical extraction pipeline does, and this post explains how it works.
The Five Layers of Electrical Extraction
Think of electrical estimation as five stacked layers, each building on the one below:
Layer 5: Priced Estimate ← material costs + labor rates + margin
Layer 4: Assembly Expansion ← each device → full material + labor assembly
Layer 3: Conduit & Wire ← conductor sizing, raceway type, run lengths
Layer 2: Panel & Circuit Logic ← breaker distribution, feeder sizing
Layer 1: Device Extraction ← symbol recognition, counting, classification
Most tools only do Layer 1. A few attempt Layer 2. Aginera's pipeline covers all five.
Layer 1: Sheet Classification and Device Extraction
A construction drawing set is not a flat list of pages — it's a structured document with cover sheets, architectural plans, structural details, mechanical layouts, and electrical drawings all interleaved. Feeding every page through the same extraction model wastes compute and produces noise.
Aginera's sheet classifier reads each page's title block, sheet number prefix, and content characteristics to route it:
| Sheet Prefix | Classification | Processing |
|---|---|---|
| E-1xx | Electrical Site/Power Plans | Symbol extraction + conduit tracing |
| E-2xx | Lighting Plans | Fixture extraction + circuit mapping |
| E-3xx | Fire Alarm Plans | Device extraction + loop mapping |
| E-4xx | Low Voltage / Telecom | Device extraction |
| E-5xx | Panel Schedules / Details | Table parsing + circuit extraction |
| E-6xx | One-Line / Riser Diagrams | Distribution topology extraction |
Each classified sheet goes through a specialized vision model trained on that drawing type. A lighting plan extractor knows to look for fixture symbols, switching groups, and circuiting designations. A panel schedule extractor knows to parse tabular data from schedule blocks.
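As a rough illustration of the prefix-based part of that routing, here is a minimal sketch. The prefix table is copied from above; the function name, the regex, and the decision to return None for non-electrical sheets are assumptions made for the example, and the real classifier also uses title-block and content signals that are not modeled here.

```python
import re

# Hundreds digit of the sheet number -> (classification, processing), per the table above.
SHEET_ROUTES = {
    "1": ("Electrical Site/Power Plans", "Symbol extraction + conduit tracing"),
    "2": ("Lighting Plans", "Fixture extraction + circuit mapping"),
    "3": ("Fire Alarm Plans", "Device extraction + loop mapping"),
    "4": ("Low Voltage / Telecom", "Device extraction"),
    "5": ("Panel Schedules / Details", "Table parsing + circuit extraction"),
    "6": ("One-Line / Riser Diagrams", "Distribution topology extraction"),
}

# Assumed sheet-number format: "E-101", "E201", or "e-501".
SHEET_NUMBER = re.compile(r"^E-?(\d)\d{2}$", re.IGNORECASE)


def route_sheet(sheet_number: str):
    """Return (classification, processing) for an electrical sheet, else None."""
    match = SHEET_NUMBER.match(sheet_number.strip())
    if match is None:
        return None  # not an E-series sheet, so skip electrical extraction
    return SHEET_ROUTES.get(match.group(1))  # None for unknown series such as E-7xx


print(route_sheet("E-201"))
print(route_sheet("A-101"))
```

Returning None instead of a default route keeps architectural and structural sheets out of the electrical extractors entirely.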
Garbage Filtering
Construction PDFs are messy. OCR picks up border text, revision clouds, general notes, and title-block boilerplate. Without filtering, these produce phantom items that inflate counts.
The garbage filter applies domain-specific rules:
- Reject descriptions shorter than 3 characters or longer than 200
- Reject items with unrealistic electrical values (a "1796A branch breaker" is OCR noise, not a real component)
- Reject text that contains no recognizable electrical nouns (conduit, wire, panel, fixture, receptacle, switch, etc.)
- Reject pure alphanumeric strings that are clearly label fragments
On a typical 150-page project, this filter removes 10–15% of raw OCR output that would otherwise pollute the takeoff.
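The four rules above might be sketched as follows. This is an illustration, not Aginera's implementation: the noun list beyond the six words quoted above, the 600A ceiling, and the definition of a "label fragment" (a single alphanumeric token containing a digit) are all assumptions chosen for the example.

```python
import re

# The first six nouns come from the rule list above; the rest are assumed additions.
ELECTRICAL_NOUNS = ("conduit", "wire", "panel", "fixture", "receptacle",
                    "switch", "breaker", "feeder")
MAX_BREAKER_AMPS = 600  # assumed ceiling for a plausible breaker rating

AMP_RATING = re.compile(r"\b(\d+)\s*A\b")
# Assumed definition: one alphanumeric token with at least one digit, e.g. "E101".
LABEL_FRAGMENT = re.compile(r"(?=[A-Za-z0-9]*\d)[A-Za-z0-9]+")


def rejection_reason(description: str):
    """Return why an OCR item should be dropped, or None if it passes."""
    text = description.strip()
    if not 3 <= len(text) <= 200:
        return "length"
    if LABEL_FRAGMENT.fullmatch(text):
        return "label fragment"
    if any(int(amps) > MAX_BREAKER_AMPS for amps in AMP_RATING.findall(text)):
        return "unrealistic value"
    lowered = text.lower()
    if not any(noun in lowered for noun in ELECTRICAL_NOUNS):
        return "no electrical noun"
    return None


for item in ["20A duplex receptacle", "1796A branch breaker", "GENERAL NOTES", "E101"]:
    print(item, "->", rejection_reason(item))
```

Returning a reason rather than a bare boolean makes it easy to audit what the filter removed on a given project.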
Layer 2: Panel Schedule Parsing
Panel schedules are the Rosetta Stone of electrical drawings. A single panel schedule table encodes:
- Every branch circuit's breaker size and pole count
- The connected load description for each circuit
- Phase assignments (A, B, C for three-phase panels)
- Main breaker rating and bus ampacity
- Voltage configuration (120/208V, 277/480V, etc.)
Aginera reads panel schedule tables using a combination of OCR and table-structure recognition. For a panel labeled LP-2A, the parser might extract:
| Circuit | Breaker | Poles | Description | Phase |
|---|---|---|---|---|
| 1 | 20A | 1 | Lighting - Corridor 2nd FL | A |
| 3 | 20A | 1 | Receptacles - Office 201-204 | B |
| 5 | 30A | 2 | HVAC Unit - RTU-2 | C,A |
| 9 | 20A | 1 | Emergency Lighting - 2nd FL | B |
This structured data feeds directly into the conduit and wire inference engine. Without it, the system would have to guess circuit sizes — and guessing is how estimates go wrong.
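To make "structured data" concrete, the sketch below turns pipe-delimited schedule rows into typed records. It assumes OCR and table-structure recognition have already produced clean rows in the column order shown above; the Circuit class and parse_row function are hypothetical names for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    panel: str
    number: int
    breaker_amps: int
    poles: int
    description: str
    phases: tuple


def parse_row(panel: str, row: str) -> Circuit:
    """Parse one row: circuit | breaker | poles | description | phase."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}: {row!r}")
    number, breaker, poles, description, phase = cells
    return Circuit(
        panel=panel,
        number=int(number),
        breaker_amps=int(breaker.upper().rstrip("A")),  # "20A" -> 20
        poles=int(poles),
        description=description,
        phases=tuple(p.strip() for p in phase.split(",")),
    )


rows = [
    "| 1 | 20A | 1 | Lighting - Corridor 2nd FL | A |",
    "| 3 | 20A | 1 | Receptacles - Office 201-204 | B |",
]
circuits = [parse_row("LP-2A", row) for row in rows]
for circuit in circuits:
    print(circuit)
```

Typed fields such as breaker_amps and poles are what a downstream conductor-sizing step would consume, so parsing failures are raised early instead of being passed along as strings.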
Panel Deduplication
In many drawing sets, the same panel appears on multiple sheets — once on the power plan, once on the panel schedule detail sheet, and sometimes again on the one-line diagram. Naive extraction counts it three times.
The pipeline deduplicates panels by matching on the panel reference name. If LP-2A appears on sheets E-101, E-501, and E-601, it's recognized as one panel with data merged from all three appearances.
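A minimal sketch of name-based deduplication, assuming each sighting is a dict with a name, a sheet, and optional attributes. The normalization rule (ignore case, spaces, and hyphens) and the "first non-empty value wins" merge policy are assumptions for the example; the attribute values are made up.

```python
def normalize_panel_name(name: str) -> str:
    """Assumed matching rule: ignore case, spaces, and hyphens ("lp 2a" == "LP-2A")."""
    return "".join(ch for ch in name.upper() if ch.isalnum())


def merge_panels(sightings):
    """Collapse per-sheet panel sightings into one record per panel."""
    merged = {}
    for sighting in sightings:
        key = normalize_panel_name(sighting["name"])
        panel = merged.setdefault(
            key, {"name": sighting["name"], "sheets": [], "attributes": {}}
        )
        panel["sheets"].append(sighting["sheet"])
        for field, value in sighting.get("attributes", {}).items():
            if value is not None:
                # First non-empty value wins; later sheets only fill gaps.
                panel["attributes"].setdefault(field, value)
    return list(merged.values())


# Illustrative sightings of LP-2A on the three sheets mentioned above.
sightings = [
    {"name": "LP-2A", "sheet": "E-101", "attributes": {"voltage": None}},
    {"name": "LP-2A", "sheet": "E-501", "attributes": {"voltage": "120/208V"}},
    {"name": "LP 2A", "sheet": "E-601", "attributes": {"fed_from": "MDP"}},
]
panels = merge_panels(sightings)
print(panels)
```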
Layer 3: Conduit and Wire Inference
This is the layer most AI takeoff tools skip entirely, and it's the layer that accounts for the largest share of material cost.
The Problem
Construction drawings show device locations but rarely show every conduit run. An electrical plan might indicate that a group of receptacles is on circuit 3 of panel LP-2A, but it won't draw the exact conduit path from each receptacle back to the panel. The estimator is expected to infer:
- What size wire serves each circuit (based on breaker size, voltage, and NEC ampacity tables)
- What size conduit houses that wire (based on NEC conduit fill calculations)
- How many linear feet of conduit and wire are needed (based on floor plan scale and routing)
How Aginera Handles It
The conduit/wire inference engine works in three passes:
Pass 1: Conductor Selection. For each circuit extracted from the panel schedule, the engine looks up the conductor size from the NEC Table 310.16 ampacity ratings (copper, 75°C column), subject to the small-conductor limits in NEC 240.4(D). A 20A/120V single-phase circuit gets 12 AWG THHN. A 100A/480V three-phase feeder gets 3 AWG THHN (or larger where voltage drop on a long run requires it).
Pass 2: Raceway Sizing. Given the conductor count and size, the engine calculates the minimum conduit size from the NEC Chapter 9 tables: Table 1 sets the fill limit (40% for more than two conductors), and Tables 4 and 5 give the conduit and conductor areas. Four 12 AWG THHN conductors fit in 1/2" EMT. Four 3 AWG THHN feeder conductors (three phases and a neutral) with an 8 AWG ground need 1-1/4" EMT.
Pass 3: Quantity Estimation. Run lengths are estimated from the floor plan scale and typical routing patterns. Branch circuits average 25–40 LF depending on the building footprint. Feeders between panels and the MDP are estimated from riser diagram distances or floor-to-floor heights.
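Passes 1 and 2 reduce to two table lookups, sketched below. The values are a small subset transcribed from NEC Table 310.16 (75°C copper, capped for 14, 12, and 10 AWG per 240.4(D)) and Chapter 9 Tables 1, 4, and 5; verify them against the code edition your jurisdiction has adopted. The function names are hypothetical, and derating, voltage drop, and the next-size-up rule are deliberately left out.

```python
# Largest breaker (amps) each copper conductor size may serve in this sketch.
CONDUCTOR_LIMIT_AMPS = [
    ("14", 15), ("12", 20), ("10", 30), ("8", 50), ("6", 65), ("4", 85),
    ("3", 100), ("2", 115), ("1", 130), ("1/0", 150), ("2/0", 175), ("3/0", 200),
]

# Approximate THHN cross-sectional areas in square inches (Chapter 9, Table 5).
THHN_AREA_SQIN = {
    "14": 0.0097, "12": 0.0133, "10": 0.0211, "8": 0.0366, "6": 0.0507,
    "4": 0.0824, "3": 0.0973, "2": 0.1158, "1": 0.1562,
    "1/0": 0.1855, "2/0": 0.2223, "3/0": 0.2679,
}

# EMT total internal areas in square inches by trade size (Chapter 9, Table 4).
EMT_AREA_SQIN = [
    ("1/2", 0.304), ("3/4", 0.533), ("1", 0.864),
    ("1-1/4", 1.496), ("1-1/2", 2.036), ("2", 3.356),
]


def conductor_for_breaker(breaker_amps: int) -> str:
    """Pass 1: smallest listed conductor whose limit covers the breaker."""
    for size, limit in CONDUCTOR_LIMIT_AMPS:
        if limit >= breaker_amps:
            return size
    raise ValueError(f"no conductor in this subset for {breaker_amps}A")


def fill_limit(conductor_count: int) -> float:
    """Chapter 9, Table 1: 53% for one conductor, 31% for two, 40% for more."""
    return {1: 0.53, 2: 0.31}.get(conductor_count, 0.40)


def emt_size(conductors) -> str:
    """Pass 2: smallest EMT trade size that keeps fill within the limit."""
    used = sum(THHN_AREA_SQIN[size] for size in conductors)
    limit = fill_limit(len(conductors))
    for trade_size, area in EMT_AREA_SQIN:
        if used <= area * limit:
            return trade_size
    raise ValueError("conductors exceed the largest EMT size in this subset")


print(conductor_for_breaker(20), emt_size(["12"] * 4))
print(conductor_for_breaker(100), emt_size(["3"] * 4 + ["8"]))
```

The second call models a 100A feeder as three phases plus a neutral in 3 AWG with an 8 AWG ground, which is an assumption about what "a set of feeders" contains.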
The output is a line-by-line material list:
| Item | Size | Quantity | Unit | Source |
|---|---|---|---|---|
| EMT Conduit | 3/4" | 1,250 | LF | Branch circuit inference |
| EMT Conduit | 1" | 450 | LF | Branch circuit inference |
| EMT Conduit | 2" | 180 | LF | Feeder inference |
| THHN Wire | 12 AWG | 5,600 | LF | 20A branch circuits |
| THHN Wire | 10 AWG | 1,800 | LF | 30A branch circuits |
| XHHW-2 Wire | 3/0 AWG | 600 | LF | 200A feeders |
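A roll-up like the table above can be sketched as a simple aggregation over per-circuit runs: conduit footage equals the run length, and wire footage equals the run length times the conductor count. The run records, field names, and lengths below are invented for illustration and do not reproduce the quantities in the table.

```python
from collections import defaultdict


def material_list(runs):
    """Total linear feet per (item, size) across all estimated runs."""
    totals = defaultdict(int)
    for run in runs:
        totals[("EMT Conduit", run["conduit_size"])] += run["length_lf"]
        wire_item = run["wire_type"] + " Wire"
        totals[(wire_item, run["wire_size"])] += run["length_lf"] * run["conductors"]
    return dict(totals)


# Hypothetical runs: two 20A branch circuits and one 200A feeder.
runs = [
    {"length_lf": 30, "conduit_size": '3/4"', "wire_type": "THHN",
     "wire_size": "12 AWG", "conductors": 3},
    {"length_lf": 40, "conduit_size": '3/4"', "wire_type": "THHN",
     "wire_size": "12 AWG", "conductors": 3},
    {"length_lf": 60, "conduit_size": '2"', "wire_type": "XHHW-2",
     "wire_size": "3/0 AWG", "conductors": 4},
]
totals = material_list(runs)
for (item, size), feet in totals.items():
    print(f"{item:12} {size:8} {feet:5} LF")
```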
Why This Matters for Pricing
On a typical commercial electrical project, the cost breakdown looks roughly like:
| Category | % of Total Cost |
|---|---|
| Labor | 40–55% |
| Wire and Cable | 15–25% |
| Conduit and Fittings | 10–20% |
| Devices and Fixtures | 5–15% |
| Distribution Equipment | 5–10% |
If your takeoff tool only counts devices (5–15% of cost), you're missing 85–95% of the estimate. Conduit and wire inference is what closes that gap.
Layer 4: Assembly Expansion
A device on a drawing is never just the device. A GFCI receptacle on an electrical plan represents an assembly of materials and labor:
| Component | Material Cost | Labor (hrs) |
|---|---|---|
| GFCI Receptacle, 20A/125V | $18.50 | — |
| Single-gang weather-resistant box | $3.20 | — |
| Single-gang SS cover plate | $2.40 | — |
| 3/4" EMT conduit, 20 LF avg | $1.85/LF | — |
| 12 AWG THHN, 60 LF (3 conductors) | $0.12/LF | — |
| EMT compression connectors (2) | $1.60 | — |
| Installation labor | — | 0.65 |
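Extending the per-foot lines ($1.85/LF × 20 LF = $37.00 of conduit, $0.12/LF × 60 LF = $7.20 of wire) puts materials alone at about $70 per GFCI. Add 0.65 hours of labor at, say, $65–95 per hour and the total installed cost is roughly $110–130, depending on your labor rate and material pricing.
If your takeoff says "89 GFCI receptacles" and you price them at $18.50 each, you get $1,646. The actual installed cost is closer to $10,000–12,000. That's a 6–7x gap, and it's the gap between a symbol count and an estimate.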
Aginera's assembly engine expands every extracted device into its full material and labor assembly. The expansion rules are configurable per project type and specification, but the defaults follow RSMeans and NECA labor-unit standards.
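Here is a minimal sketch of that expansion for the GFCI assembly, using the quantities and unit costs from the table above. The Component class and expand function are hypothetical, and the $1.60 connector line is treated as the price for the pair, which is an assumption. Labor is kept in hours so that a rate can be applied later at the pricing layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    description: str
    quantity: float
    unit_cost: float  # dollars per unit (each, or per LF)


GFCI_ASSEMBLY = [
    Component("GFCI Receptacle, 20A/125V", 1, 18.50),
    Component("Single-gang weather-resistant box", 1, 3.20),
    Component("Single-gang SS cover plate", 1, 2.40),
    Component('3/4" EMT conduit (LF)', 20, 1.85),
    Component("12 AWG THHN (LF)", 60, 0.12),
    Component("EMT compression connectors (pair)", 1, 1.60),  # assumed per pair
]
GFCI_LABOR_HOURS = 0.65


def expand(device_count, components, labor_hours_each):
    """Expand a device count into total material dollars and labor hours."""
    material_each = sum(c.quantity * c.unit_cost for c in components)
    return {
        "material": round(material_each * device_count, 2),
        "labor_hours": round(labor_hours_each * device_count, 2),
    }


result = expand(89, GFCI_ASSEMBLY, GFCI_LABOR_HOURS)
print(result)
```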
Layer 5: Priced Estimate
The final layer applies pricing — material costs from your cost library or vendor quotes, labor rates from your company's rate sheet, and markup/margin.
Aginera's pricing engine tries these sources in order of preference, stopping at the first match:
- Your pricing library — if you've uploaded vendor quotes or maintained material prices, those are used first
- Cost library lookup — matched by item type code, conductor size, conduit type, and other attributes
- Description-based matching — regex patterns match items like "3/4 EMT" or "12 AWG THHN" to known rates
- Flagged for review — if no confident match exists, the item is flagged rather than priced with a bad default
The output is a complete estimate with sections, line items, material/labor split, and subtotals — ready for final review and markup adjustment.
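The preference order described above is a fallback chain, which might look like the sketch below. The data shapes (a dict keyed by description, a dict keyed by type code and size, and a list of compiled patterns) and the function name are assumptions for the example; the prices reuse figures quoted earlier in this post.

```python
import re


def price_item(item, user_library, cost_library, patterns):
    """Return (unit_price, source); unit_price is None when flagged for review."""
    key = item["description"].strip().lower()
    if key in user_library:                      # 1. your pricing library
        return user_library[key], "pricing library"
    attrs = (item.get("type_code"), item.get("size"))
    if attrs in cost_library:                    # 2. attribute-based lookup
        return cost_library[attrs], "cost library"
    for pattern, rate in patterns:               # 3. description regex match
        if pattern.search(item["description"]):
            return rate, "description match"
    return None, "flagged for review"            # 4. no confident match


# Hypothetical libraries for illustration only.
user_library = {"12 awg thhn": 0.12}
cost_library = {("EMT", "3/4"): 1.85}
patterns = [(re.compile(r'\b3/4"?\s*EMT\b', re.IGNORECASE), 1.85)]

items = [
    {"description": "12 AWG THHN"},
    {"description": "EMT run", "type_code": "EMT", "size": "3/4"},
    {"description": 'Install 3/4" EMT in corridor'},
    {"description": "Custom pendant fixture"},
]
for item in items:
    print(item["description"], "->", price_item(item, user_library, cost_library, patterns))
```

Returning the source alongside the price lets a reviewer see which items were priced from weaker evidence, and the None price keeps unmatched items visible instead of silently defaulted.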
What This Means in Practice
For a 150-page commercial electrical drawing set, the full pipeline produces a bid-ready estimate in under 5 minutes. A manual estimator doing the same work needs 2–3 days.
The quality is different too. Manual estimators are good at judgment calls (is this fixture custom or standard? does this spec require rigid conduit or EMT?), but they make arithmetic mistakes. They lose conduit runs when tracing circuits across multiple sheets. They forget to count the home-run from the last receptacle to the panel.
The AI doesn't make arithmetic mistakes. It doesn't lose track of circuits. It doesn't forget home-runs. What it does need is human review for the judgment calls — and that review takes 30 minutes, not 3 days.
Try It on Your Next Bid
If you're an electrical, low-voltage, or communications contractor and you want to see what this looks like on your actual drawings, upload a PDF to Aginera DesignOps and run an extraction. The takeoff results are available immediately, and the estimate can be generated with one click.
No demo call required. Just drawings and results.